Desired Skills and Experience

  • Design, build, or improve current systems that focus on scalability, availability, and efficiencies of Hulu’s services.
  • Identify mission critical problems and solve them via automation and design improvements.
  • Build or improve monitoring and instrumentation to predict future scalability or latency risks and solve them before they ever manifest into customer facing issues.
  • Develop best practices with development teams to improve scalability and reliability of Hulu’s services.
  • Design and improve the developer platform and infrastructure so that reliability and availability become an even more natural part of our software development process.
  • Experience as a Site Reliability Engineer or a Software Engineer focused on infrastructure and/or operations.
  • Experience with the building blocks of large scale systems including load balancing, fault tolerance, containers, instrumentation, predictive monitoring, etc
  • Familiarity with commonly available services and tools (AWS, Docker, Redis, New Relic, Heroku, Hadoop, etc)
  • Strong passion for automation, testing and code quality.
  • BS in Computer Science or equivalent preferred.
  • Familiarity with one or more of the following: Java, Python, Go.