OfferUp is looking for a Senior Site Reliability Engineer to join our Operations Team. We provide tools and services that help every team at OfferUp manage an increasingly complex production infrastructure in AWS. Our success is measured by how well we enable everyone to stand up and deploy services quickly with no downtime. In this role you will be at the forefront of driving and developing the technology that automates everything.

Responsibilities:

  • Work with other SREs to build a comprehensive set of tools to automate and monitor our production infrastructure
  • Work with Engineering to build resilient, operable, self-healing services
  • Participate in reasonable on-call rotations with the rest of Engineering

Desired Skills and Experience:

  • Experience managing groups of servers at scale, preferably in AWS
  • Reasonably deep knowledge of Linux and internet technologies
  • Proficiency in modern scripting languages like Python or Ruby
  • Experience with configuration management tools like Ansible or Salt
  • A track record of using advanced metrics to solve hard problems
  • Experience managing Big Data or high-throughput distributed systems like Hadoop and Kafka
  • Experience with continuous integration
  • Contributions to open source projects
  • Acts as a team player
  • Avoids doing things twice
  • Solves hard problems for tomorrow, not just for today
  • Prefers fixing problems to complaining about them
  • Investigates, considers and adopts new technology where it makes sense
  • Doesn’t tolerate brilliant jerks