Desired Skills and Experience
- Evangelize best practices for building and operating highly reliable systems
- Serve as a subject matter expert in observability and monitoring
- Consult on system design to meet reliability and capacity requirements
- Automate infrastructure and configuration management
- Conduct timely post-mortems of production infrastructure incidents
- Assist with all aspects of operational security and compliance
- Seek out potential threats to security and reliability and advocate solutions
- Participate in an on-call rotation to receive escalations
- We work with Amazon Web Services, Chef, Python, Ubuntu, Nginx, Jenkins, Terraform, Akamai, and Elemental
- Passion for reliable, scalable, observable software with a strong sense of ownership
- Deep experience with Linux system administration
- Experience developing and monitoring mission-critical systems
- Substantial experience with a programming language such as Python, Perl, Ruby, Bash, Java, or C
- Working knowledge of a centralized configuration management tool such as Chef, Puppet, or Ansible
- Experience with or interest in learning about streaming applications and media servers
- Bonus: experience configuring and monitoring CDNs. We use Akamai, CloudFront, and Cloudflare