Every day, Clover devices handle the core card and point-of-sale processing for hundreds of thousands of merchants. Behind the scenes, we operate a cloud platform that provides processing, storage and collaboration for merchants, application developers, service providers and our merchants’ customers. Our devices and platform form the backbone of millions of payment interactions between merchants and their customers daily. To support all of this, we have a team of engineers working around the clock to ensure our systems remain operational, safe and secure. We are now scaling our operation further and are looking for an experienced Site Reliability Engineer to join our Operations Team.
Availability, reliability, and security are paramount. In this role, you will help build and operate complex systems that allow our large fleet of smart payment terminals to process tens of millions of transactions a day. We are hoping to find individuals who are a hybrid between system administrators and software engineers.
Desired Skills and Experience
- Monitor site reliability, availability, and performance
- Evaluate and deploy critical patches needed for production systems
- Design and deploy algorithms
- Write automation frameworks
- Create alerts to detect and respond to production issues
- Scale disaster recovery testing
- You take things personally and act like an owner
- Strong CS fundamentals. BS degree in Computer Science or related technical field, or equivalent practical experience.
- Super strong Linux skills
- Strong MySQL (preferred) or other RDBMS skills.
- Some configuration management experience. The specific product does not really matter (any of Puppet, Chef, CFEngine, Fabric, Ansible, Salt is fine).
- Strong scripting and automation skills. We mostly use Python.
- Ability to read and debug code. Most of our applications are Java on a simple, custom app server.
- Supreme troubleshooting skills
- Highly organized and detail oriented
- You’re willing to be on call most of the time
- Cloud experience is nice, but not required; the platform does not matter
- Skills with monitoring tools
- Experience with tools like Elastic/Kibana, Jenkins, PagerDuty, Wavefront