Desired Skills and Experience

  • Continuously refine monitoring processes, thresholds, and configuration
  • Work closely with product developers to ensure new features have proper operational support and maintainability, and provide deep technical guidance to development teams
  • Help design, build, and maintain the cloud-native platform needed to support our growth plans, managing infrastructure as code and automating as much as we can
  • Mentor and support team members on production readiness and best practices
  • Develop software to automate, monitor, and maintain deployed infrastructure and services
  • Handle high-severity internal or customer incidents, ensuring we meet all SLAs
  • Help teams create and maintain documentation and runbooks/playbooks
  • Participate in Scrum processes and ceremonies
  • Respond to issues and escalations
  • Participate in on-call rotation
  • Track record of leading a team of Software or Systems Engineers
  • Track record of working as a Site Reliability Engineer, DevOps Engineer, or a Software Engineer
  • Must be able to code and to learn new programming languages
  • Experience in at least one scripting language: Python, Ruby, Bash, Perl
  • Experience working with infrastructure-as-code tools such as Puppet, Chef, SaltStack, Ansible, CloudFormation, or Terraform
  • Track record of working with Linux systems in production
  • Experience in working with container technologies such as Docker
  • Experience in working with cloud platforms such as AWS
  • Experience using Agile practices
  • Experience with modern open-source infrastructure services and concepts such as Redis, Elasticsearch, Kafka, and Docker
  • Experience in software development in any language. Our focus languages are Go and Scala.
  • Experience working with a functional programming language such as Clojure, Haskell, or OCaml
