Site Reliability Engineer
Site Reliability Engineers (SREs) at Birst fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams, from the early stages of design all the way through identifying and resolving production issues. The ideal candidate is passionate about an operations role that requires deep knowledge of both the application and the product, and believes that automation is a key component of operating large-scale systems.
Responsibilities
• Serve as a primary point of responsibility for the overall health, performance, and capacity of our customer-facing services
• Gain deep knowledge of our complex applications
• Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth
• Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale environment
• Work closely with development teams to ensure that platforms are designed with "operability" in mind
• Function well in a fast-paced, rapidly changing environment
• Troubleshoot issues across the entire stack: hardware, software, application, and network
• Submit and implement change requests, and produce root cause analyses of infrastructure issues
• Conduct performance testing and capacity planning
• Take part in a 24x7 on-call rotation
Desired Skills and Experience
• 5+ years' experience managing Unix/Linux and Windows infrastructure in a fast-paced, mission-critical environment
• Extensive experience with TCP/IP, IIS/Apache, and Tomcat/JBoss/WebLogic
• Practical knowledge of shell scripting and at least one scripting language (Python, Ruby, Perl)
• Prior experience with configuration management tools such as Chef or Puppet
• Extensive experience supporting and maintaining a monitoring system such as Nagios or SolarWinds
• Advanced understanding of SAN solutions (NetApp / 3PAR / Hitachi)
• Ability to work successfully with cloud architectures (AWS, VMware ESX, OpenStack, etc.)
• Strong troubleshooting skills that span systems, network, and applications
• Self-starter who is able to take ownership of technical issues and be a productive member of the on-call rotation
• Strong personal and professional initiative with a focus on the success of the team and organization
• BS or higher degree in Computer Science or another technical discipline
Additional Information
All your information will be kept confidential according to EEO guidelines.