What you tell your friends you do?

Drink amazing coffee and work with technology!

What you’ll really be doing?  

As a Site Reliability Engineer (SRE) you will ensure our customers get the best quality of service and uptime we can give them. You will identify where we can expect IT failures, and how we can tolerate them, both in our own systems and in those we depend upon. You will work closely with our developers and architects to build and run services and systems that respond consistently to failures by degrading gracefully.

You will be responsible for ensuring the systems and applications we launch remain available, reliable and efficient at accomplishing their duties, even as those duties scale and evolve. You will be involved in every part of our site, from the conception and development of products to deployment, troubleshooting and analysis.

Design, build and automate tools and processes to ensure and improve scalability, availability and performance across areas of technology.  Build, integrate and run tools to inject, predict and identify infrastructure and service failures on an ongoing basis to help optimize our sites.

How will you be doing this? 

You will primarily use open source technologies and products in a LAMP environment, so you’ll have extensive commercial experience in supporting and developing high-volume commercial websites using object-oriented PHP and MySQL.

Data will underpin your decisions, and you will take care to ensure qualitative metrics are held in as high regard as quantitative ones.

What a day in the office looks like?

Start with a brew (Yorkshire Tea, of course!), as there will be lots to do. Deal with any incidents or near misses from the night before. Proceed with any open tickets or attend meetings to plan for the new day.

The team:

Web Operations is made up of 6 engineers who have strong operational backgrounds across a number of disciplines and skills, from developers to infrastructure specialists. The team works both as a central operations function for the entire Bet Tribe and as embedded members of development teams, providing operational support, training and guidance.

Desired Skills and Experience

  • We are a RHEL/CentOS house so a very good understanding of Linux is essential.
  • We have some typical LAMP stacks, though Mongo, Redis, Memcached and RabbitMQ also feature highly.
  • We write our code in PHP and JavaScript, making heavy use of Node.js.  There’s the usual mixture of Bash, a little Python, and some Ruby.  Our source control is Git.
  • We make heavy use of Chef for our configuration management; experience of this or another CM tool is necessary.
  • We have heavy integration with OpenBet systems underpinning our sportsbook and gaming services.
  • We make use of Graphite, Grafana, New Relic, Splunk and Opsview for monitoring our services.
  • Experience working in a fast-paced agile development environment.