System Reliability Engineer - Applications at Fastly (San Francisco, CA) (allows remote)
System Reliability Engineer - Core Engineering
We are looking for talented System Reliability Engineers to help design the next generation of Fastly’s production service platform and drive the reliability, speed, efficiency, and security of our backend systems. You will work with engineers on many teams to tackle the challenges of building a better, more flexible CDN.
System Reliability Engineer - Applications
We are looking for a talented Applications System Reliability Engineer (ASRE) to work alongside our application developers and help them design, deploy, and operate the next generation of Fastly’s production service platform. System Reliability Engineering (SRE) drives the reliability, speed, efficiency, and security of our backend systems. Each ASRE is embedded in a particular application team for 2-6 months, consulting with and assisting that team in areas such as scalability, reliability, deployment, operational documentation, and production readiness. This embedding model lets the ASRE gain real depth in a particular Fastly application and build ongoing relationships with its engineering team, while adding breadth of experience through team rotation.
Desired Skills and Experience
- Develop infrastructure to enable reliable and rapid deployment, effective monitoring, and high availability in a large-scale Linux environment.
- Diagnose and resolve performance and reliability issues across the entire stack (hardware, kernel, application, and network), including cross-application dependencies.
- Write tools to automate maintenance and deployment of machines, services, and applications.
- Work closely with your development teams to ensure that services are designed with scale, operability, performance, and ease-of-use in mind.
- Build and maintain a robust Continuous Integration environment.
- 4+ years of experience running high-availability systems and supporting infrastructure.
- Strong understanding of Linux systems at both high and low levels.
- Working knowledge of shell scripting and one or more scripting languages (e.g. Python, Ruby, Perl).
- Experience with compiled languages (Go, C/C++, Java).
- Good understanding of configuration management best practices and standards.
- Experience with cloud providers such as Amazon Web Services and Google Cloud Platform.
- Adaptable to a wide variety of technologies and people.