Site Reliability Engineer
We’re looking for a seasoned Site Reliability Engineer to ensure RealSelf is fast, reliable, and available for millions of users. You will work in a close-knit team to proactively monitor and improve end-to-end system performance and to identify bottlenecks and potential failures throughout our infrastructure.
Responsibilities:
- Ensure our platform exceeds goals for availability, capacity, efficiency, scalability, and performance
- Drive the company through “Disaster Recovery Tests”, where we manually turn down pieces of infrastructure to test RealSelf’s overall resilience to failures
- Proactively monitor and improve end-to-end system performance, and identify bottlenecks and potential failures
- Develop and maintain scalable alerting tools for debugging and monitoring
- Manage the system and processes that RealSelf engineers use to package, test, and deploy their software
Desired Skills and Experience
See application page for details