What’s Elementum?

Good question. Elementum makes supply chain simple. You know, supply chain: that thing that moves the $25T product economy. That’s “T” as in “Tell me more.” We do this with intuitive apps built on a single platform, powered by the world’s first product graph™. By digitally mapping the product economy, we help companies get the next “gotta have it!” to people faster and more efficiently. We’re missing this one piece though. It’s you. No, really.

What’s exciting about a Site Reliability Engineer role at Elementum?

At Elementum, our SRE team owns our highly available, scalable, and secure production, development, and test environments on the AWS cloud. The team is part of the core engineering team, provides deployment and monitoring tools and automation, and works closely with our Support and Development teams to ensure 24/7 availability of Elementum enterprise cloud services and mobile applications. Equally important, you will contribute to defining and delivering the engineering operations roadmap, and influence the Test roadmap to deliver high-quality products. This position also requires strong communication, interpersonal, and leadership skills, and a good candidate is expected to contribute to the vision and direction of the product line.

Competitive Benefits
- Medical, Dental, and Vision are 100% covered by Elementum for employees
- 401k matching
- Free, daily catered lunches
- Commuter benefits: CalTrain GoPass & WageWorks
- Company outings
- Casual dress code
- Open vacation policy
- Pets at work!
- Engage with (and give high-fives to) senior management regularly
- Get in on the ground floor of a huge opportunity

Desired Skills and Experience
- 3+ years of experience as an SRE or Software Engineer developing customer-facing, high-availability, large-scale web-based applications.
- Bachelor’s degree in Computer Science or a related field, with 3+ years of industry experience.
- 3+ years of experience with Amazon Web Services (AWS).
- Build infrastructure to deploy and upgrade server applications and software platforms with zero-downtime requirements.
- Develop systems to monitor and support Elementum’s SaaS platform, targeting 24/7 availability and high performance.
- Respond to and resolve service incidents in a timely manner while handling multiple tasks.
- Perform root-cause analysis, instituting preventive measures where indicated.
- Strong knowledge of Linux systems (Ubuntu, RHEL, CentOS).
- Strong knowledge of system architecture, performance tuning concepts, and web applications.
- Experience with source code repositories and version control systems (Git, Perforce, SVN).
- Experience using JIRA, Confluence, or other ticketing and collaboration systems.
- Experience with monitoring tools such as Nagios (open source) and New Relic.
- Experience with web technologies such as Java/Tomcat, Node.js, Ruby, MySQL and NoSQL databases, Apache, and Nginx.
- Experience with scripting languages (e.g., Ruby, Python, Perl, basic shell scripting).
- Experience with reporting and log analysis tools for real-time and historical search, such as ELK, Sumo Logic, or Splunk.
- Must be legally authorized to work in the US.