ROBLOX has grown almost 4x in the past year and continues to grow at the same rate. Through this growth, we blew through our original cage, expanded into another, and are now expanding to multiple physical locations. Our production environment is a hybrid of AWS and physical data centers. We serve 30M monthly active users and are rapidly approaching 1M peak concurrent users via a gaming cloud that spans 30+ data centers throughout North America, Europe, and Asia.

Our infrastructure management system is highly automated, enabling engineers to interact with thousands of Windows and Linux servers across multiple environments in parallel via the command line. Our application space includes database scalability, caching, message queuing, search, recommendations, Elasticsearch, Redis, mobile, and proprietary gaming cloud management.

Come join the ROBLOX data center team and help us grow our footprint globally.

Responsibilities:

  • Contribute to the growing library of automation scripts and tools that will ultimately deliver hardware as a service to our application teams
  • Eliminate all possible human interaction from the build and maintenance of hardware, from device type recognition through PXE boot to managed upgrades
  • Design, develop, and support new automation code for application servers, networking gear, third-party systems, and maintenance processes
  • Continuously automate solutions to failures in hardware, configuration, networking, vendor outages, and continuous software upgrades
  • Expand our automated monitoring and alerting system to improve team response, triage, and time-to-solution
  • Maintain uptime SLAs for a 24/7/365 production environment
  • Participate in on-call rotations

Desired Skills and Experience:

  • 5 years of experience in production data center support and management
  • Bachelor’s degree in Engineering or Computer Science, or equivalent work experience
  • Scripting and debugging experience in PowerShell, Python, or Bash (we use PowerShell)
  • Expertise with automation tools (Chef, Puppet), including administration and recipes
  • Experience automating the configuration and maintenance of Linux and Windows servers in a mixed OS environment (Linux integration with Active Directory a plus)
  • Programming experience in one or more of C++, C#, Java, Python, Ruby, or equivalent
  • Experience administering and automating Elasticsearch or Redis is a plus
  • Windows and Linux server administration

Benefits:

  • Robust medical, dental, and vision insurance
  • 401(k)
  • Flexible time-off
  • Wellness reimbursement
  • Free onsite parking & other commuter benefits
  • Free catered lunches & a fully stocked kitchen with unlimited snacks!
  • Chance to work with a top-notch team on cool and unique projects!