Desired Skills and Experience
- Design, build, or improve current systems that focus on scalability, availability, and efficiencies of Hulu’s services.
- Identify mission critical problems and solve them via automation and design improvements.
- Build or improve monitoring and instrumentation to predict future scalability or latency risks and solve them before they ever manifest into customer facing issues.
- Develop best practices with development teams to improve scalability and reliability of Hulu’s services.
- Design and improve the developer platform and infrastructure so that reliability and availability become an even more natural part of our software development process.
- Experience as a Site Reliability Engineer or a Software Engineer focused on infrastructure and/or operations.
- Experience with the building blocks of large scale systems including load balancing, fault tolerance, containers, instrumentation, predictive monitoring, etc
- Familiarity with commonly available services and tools (AWS, Docker, Redis, New Relic, Heroku, Hadoop, etc)
- Strong passion for automation, testing and code quality.
- BS in Computer Science or equivalent preferred.
- Familiarity with one or more of the following: Java, Python, Go.