Be part of an amazing team that’s building the world’s largest social platform for play. Every month, over 48 million players come to Roblox to become an overnight retail tycoon, compete as a professional racecar driver, solve a murder mystery, or simply build a dream home and hang out with friends. We’re looking for engineering rock stars who are inspired to power the imaginations of people around the world, one player at a time.

Our ambition is to build a global infrastructure to rival that of any big social network out there. We currently operate more than 15,000 servers in 40 data centers globally. Over the next few years, we expect that number will grow to over 50,000 servers.

As a Site Reliability Engineer, you’ll play a critical role in helping us scale our software stack and hardware infrastructure at a time of incredible growth for our business. At Roblox, you’ll have boundless opportunities to shape the future of the Imagination Platform™ and demonstrate your passion for delivering awesome solutions in front of a global audience. If you know what it takes to build systems that can sustain over one million concurrent players year-round and you take play as seriously as we do, you’ll fit right into our highly skilled and ever-expanding engineering team.

Desired Skills and Experience

You are:

  • Experienced: you have a BS degree (or equivalent professional experience) in Computer Science or a related engineering field, with several years of experience.
  • A coding champion: you’ve been around the block a few times and understand the challenges of building large-scale systems.
  • A Linux (Ubuntu) and/or Windows expert: you have solid administration skills in either OS and good system-analysis, configuration, and troubleshooting experience.
  • Passionate about automation: you have 3+ years of hands-on experience with at least one configuration management solution (Chef, Puppet, etc.).
  • Up-to-speed on all things Cloud: you have working experience with public cloud (AWS preferred) and private cloud (OpenStack preferred) solutions.
  • Ambitious: you boldly go where no man or woman has gone before; Consul.io, Vault.io, etcd, Docker, Mesos, InfluxDB, etc. might not be technologies you’ve used, but you are keen to learn and grow.
  • Adaptable: you are capable of adjusting to new challenges, and experimentation is in your blood.

You will:

  • Develop and deliver solutions that meet our large-scale, real-time, and five-nines (99.999%) uptime requirements, so our community has an awesome experience on the Imagination Platform™ from anywhere in the world.
  • Identify and solve critical problems and prevent them from recurring via root cause analysis and automation.
  • Create, influence, and improve the development platform, infrastructure, standards, and methods to ensure our goals of scalability and high availability.
  • Develop and share best practices with development teams to improve the scalability and reliability of the Imagination Platform™.
  • Work with a team that is currently distributed throughout the US and Canada, and will soon be global.
  • Participate in the on-call rotation for critical infrastructure components, as you may be asked to do while we are growing at our current pace.

  • Work with an awesome team of smart and motivated people on cool and unique projects that are used by millions of active users every day
  • Roblox Admin badge for your avatar, and rock star status with our community
  • Unlimited paid vacation
  • Gym reimbursement
  • Free catered lunches & a fully stocked kitchen with unlimited snacks
  • 401(k)
  • Robust medical, dental, and vision insurance
  • Free onsite parking & other commuter benefits