Senior Site Reliability Engineer at Peloton Interactive (New York, NY)

Peloton is looking for a Senior Site Reliability Engineer to work with teams across the organization to build and maintain monitorable, performant, reliable, and highly scalable software systems. We are a small, fast-paced, growing team of engineers tackling challenging problems at scale, based in a brand-new headquarters in the heart of Manhattan. Software and systems engineers with interest or experience in system automation are encouraged to apply.

THE ROLE:

Evangelize best practices for building and operating highly reliable systems
Serve as subject matter expert in observability and monitoring
Consult in system design to meet reliability and capacity requirements
Automate infrastructure and configuration management
Conduct timely post-mortems of production infrastructure incidents
Assist with all aspects of operational security and compliance
Seek out potential threats to security and reliability and advocate solutions
Participate in an on-call rotation to receive escalations
We work with Amazon Web Services, Chef, Python, Ubuntu, Nginx, Jenkins, Terraform, Akamai, and Elemental

Desired Skills and Experience

Passion for reliable, scalable, observable software with a strong sense of ownership
Deep experience with Linux system administration
Experience developing and monitoring mission-critical systems
Substantial experience with a programming language such as Python, Perl, Ruby, Bash, Java, or C
Working knowledge of a centralized configuration tool like Chef, Puppet, or Ansible
Experience with or interest in learning about streaming applications and media servers
Bonus: experience configuring and monitoring CDNs. We use Akamai, CloudFront, and Cloudflare