Senior Site Reliability Engineer (SRE)

With Ultimate Software in San Francisco, CA, US

Posted on April 02, 2019

About this job

Location options: Paid relocation
Job type: Full-time
Experience level: Mid-Level, Senior
Role: System Administrator
Industry: Computer Software, Human Resources, Software Development
Company size: 1k-5k people
Company type: Public


linux, python, puppet, chef, go

Job description

Ultimate Software is seeking a Senior Site Reliability Engineer (SRE) with a robust and diverse background in Software Engineering, Software Design and Systems Architecture with a focus on automation, reliability, and system integration. Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. An SRE ensures that Ultimate Software's services—both our internal and external systems—are reliable with uptime appropriate to users' needs while keeping an ever-watchful eye on capacity and performance.

At Ultimate Software, our Site Reliability Engineers (SREs) come from both development and operations backgrounds and share a passion for running products at scale in production. Our SREs always seek to understand how our systems work end to end, without boundaries.

Primary/Essential Duties and Key Responsibilities:

  • Engage in and improve the whole lifecycle of services, from conception through production, including system design, build, and deployment
  • Define and implement standards and best practices for system architecture, deployment, metrics, and operational tasks
  • Support services by monitoring availability and system health and by responding to incidents
  • Improve system performance, application delivery, and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis
  • Communicate across all areas of the organization

Required Qualifications:  

  • Experience with algorithms, data structures, complexity analysis, and software design
  • Experience with highly resilient systems as well as anti-fragility design patterns
  • Experience with distributed systems
  • Experience with service-oriented architectures
  • Experience in one or more of the following: Python, Go, Perl, C, C++, Java or Ruby
  • Experience with Unix/Linux operating system internals and administration (e.g., filesystems, inodes, system calls) and networking (e.g., TCP/IP, routing, network topologies)
  • Experience with Amazon Web Services and Google Cloud Platform products
  • Ability to multitask and adapt quickly to changing priorities
  • Ability and willingness to work evenings or nights on occasion (participation in the on-call rotation)
  • Experience with Configuration Management (Puppet/Chef/Ansible)
  • Experience with Linux command-line shell and shell scripting
  • BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience, preferred

Typical Interview Process:

  • If your application is selected, a Talent Acquisition Manager will reach out to schedule a phone screen.
  • If selected to move forward, you will complete a HackerRank Coding Assessment.
  • If you pass, you will move forward either to a technical phone call for additional screening or directly to an onsite interview.
  • Offer stage.
