Site Reliability Engineer

With Domino's Pizza in Ann Arbor MI US

Posted on January 13, 2021

About this job

Compensation: $85k - $100k
Location options: Paid relocation
Job type: Full-time
Experience level: Mid-Level
Role: DevOps, System Administrator
Industry: eCommerce, Information Technology, Software Development / Engineering
Company size: 1k–5k people
Company type: Public

Technologies

linux, kubernetes

Job description

The Platform Engineer – Site Reliability Engineering (SRE) is responsible for the overall maintenance and provisioning of the Red Hat Linux environment within eCommerce at Domino’s, on both VMware guest and Kubernetes platforms. This position requires a wide base of knowledge, from basic Linux administration through capacity planning.

Duties and Responsibilities:

  • Perform regular operating system patching, rebooting, and remediation of identified security vulnerabilities
  • Participate in regular security analysis and operating system hardening requirement discussions
  • Ensure platform consistency is achieved between each stack and environment, prior to each release cycle
  • Ensure base server platforms are upgraded quarterly to N, or to N-1 where required by the business
  • Ensure services are upgraded quarterly to N, or to N-1 where required by the business
  • Perform service benchmarking to determine the impact of applying upgrades, tuning parameters, or business requirements
  • Provide capacity planning and trending analysis with regard to system and service performance over time
  • Ensure a standard platform is available, current, and extensible for both eCommerce and Corp environments
  • Ensure server provisioning practices and documentation are current and maintained
  • Participate in automation activities related to their functions, managing content in revision control

Qualifications:

  • Bachelor’s degree in computer science or equivalent experience
  • 5+ years of production application support experience in a high-uptime environment
  • 5+ years of UNIX administration experience, including diagnosis of performance issues, package management, load estimation, kernel tuning, networking configuration, etc.
  • 5+ years of hosting experience in a large, heavy-traffic environment
  • Excellent troubleshooting and analytic skills
  • Extensive knowledge of platform management in VMware and Kubernetes
  • Ability to manage and execute scripts in languages such as Bash and Python
  • Ability to manage content in Bitbucket
  • Experience with middleware tools such as ActiveMQ, RadiantLogic, and PingFederate preferred
  • Ability to create systematic and manual operations procedures in both technical and user-friendly language
  • Familiarity with process and efficiency enhancements