Site Reliability Engineer at Sky (Brentford, UK)
Site Reliability Engineer
Sky is the world market leader in digital satellite broadcast technology with over 10 million Sky subscriber homes across the UK and Irish Republic.
Making bold decisions is a big part of our history at Sky – and with talented people like you on board, we’re confident it’s going to be a big part of our future. OTT delivers an exciting range of internet TV products that’s revolutionising the industry, with fresh ideas and the latest technologies.
Joining us in the Reliability Engineering team, you will be at the core of ensuring that we deliver all these fantastic products at an industry-beating level of availability and performance.
Background
Site Reliability Engineering (SRE) is what you get when you treat operations as if it’s a software problem. Our mission is to progress, protect, and provide for the software and systems behind all of Sky’s public services – NowTV, SkyStore, SkyGo, to name just a few – with an ever-watchful eye on their availability, latency, performance, and capacity.
This is an unusual job, unlike others in the industry. Like traditional operations groups, we keep important, revenue-critical systems up and running. We hire people from both systems and software backgrounds. Strong candidates will have experience with both.
In SRE, we flip between the fine-grained detail of disk driver I/O scheduling and the big picture of service capacity, across a range of systems and a user population measured in millions. We drive reliability and performance at massive scale by mastering the full depth of the stack.
As a Software Engineer on the Persistence Reliability Engineering team, you will have the opportunity to tackle the complex problems of scale while using your expertise in coding, algorithms, complexity analysis and large-scale system design.
Key Responsibilities:
- Review software to improve the availability, scalability, latency, and efficiency of Sky’s services.
- Solve problems relating to mission-critical services and build automation to prevent problem recurrence.
- Influence and create new designs, architectures, standards and methods for large-scale distributed systems.
- Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
Desired Skills and Experience
- Experience with algorithms, data structures, complexity analysis and software design.
- Experience in one or more of: C, C++, Java, Python, Go.
- Expertise in designing, analysing and troubleshooting large-scale distributed systems.
- Familiarity with running web services at scale; understanding of Unix systems internals and networking.
- Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.
- Networking: knowledge and understanding of network theory, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing.
- Systematic problem-solving approach, coupled with a strong sense of ownership and drive.
Our benefits package is designed to recognise the essential part you play in Sky’s success. We offer free Sky+HD, broadband and talk services; private medical insurance; generous holiday entitlements; a contributory pension scheme; a Share scheme so you can have a stake in Sky and our success; plus Sky Choices, a benefit scheme to help you make valuable tax and National Insurance savings.
It’s our people that make Sky the UK and Ireland’s leading entertainment company. That is why we work hard to be an inclusive employer, so everyone at Sky can be their best.
If you are appointed to this role you will be subject to the successful completion of a Criminal Record Check.
Sky - Believe in Better