Site Reliability Engineer (SRE)
With Ultimate Software in Fort Lauderdale, FL, US
Posted on April 02, 2019
About this job
Location options: Paid relocation
Job type: Full-time
Experience level: Mid-Level
Role: DevOps, System Administrator
Industry: Computer Software, Human Resources, Software Development
Company size: 1k-5k people
Company type: Public
devops, python, linux, bash, teamcity
Ultimate Software is seeking a Site Reliability Engineer (SRE) with a robust and diverse background in Software Engineering, Software Design and Systems Architecture with a focus on automation, reliability, and system integration. Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Ultimate Software’s services (both our internally critical and our externally visible systems) have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance.
At Ultimate Software, our SREs come from both development and operations backgrounds, with a common passion for running products at scale in production. Our SREs are always seeking to understand how our systems work end-to-end without boundaries.
Primary/Essential Duties and Key Responsibilities:
- Engage in and improve the whole lifecycle of services including: system design, build, deployment, and support
- Define and implement standards and best practices related to: system architecture, deployment, metrics, operational tasks
- Support services through activities such as monitoring availability, system health, and incident response
- Improve system performance, application delivery, and efficiency through automation, process refinement, post-mortem reviews, and in-depth configuration analysis
- Engage in communications across all areas of the organization
- Experience with highly resilient systems as well as anti-fragility design patterns
- Experience with distributed systems
- Experience with service-oriented architectures
- Experience in one or more of the following: Python, Ruby, C#
- Experience with Linux, Unix, and Windows operating system internals and administration (e.g., filesystems, inodes, system calls) and networking (e.g., TCP/IP, routing, network topologies)
- Experience with OpenStack
- Ability to multitask and adapt quickly to changing priorities
- Ability and willingness to work evenings / nights on occasion (participation in the on-call rotation)
- Experience with Configuration Management (Chef, Ansible, Puppet)
- Experience with shell scripting (Bash, PowerShell, or Batch)
- Experience with development pipelines (TeamCity, Jenkins, Concourse)
- BS degree in Computer Science, or a related technical field involving coding (e.g. physics or mathematics), or equivalent practical experience preferred.
Typical Interview Process:
- If your application is selected, a Talent Acquisition Manager will reach out to schedule a phone screen.
- If selected to move forward, you will complete a HackerRank Coding Assessment.
- If you pass, you will move forward either to a technical phone call for additional screening or directly to an onsite interview.
- Offer stage.