Site Reliability Engineer (SRE)

With Keelvar Systems Ltd in US

Posted on April 28, 2021

About this job

Location options: Remote
Job type: Full-time
Role: System Administrator



Job description

About Us

This is an exciting opportunity to join a cutting-edge company that is disrupting its industry. Currently optimizing over $100bn in procurement spend for the world's largest companies, Keelvar is more than just a software company. Keelvar is an evolution of how companies work.

Our technology is unparalleled in its space. We are on a fast-paced journey to herald a new era of SaaS 3.0. Using AI, Machine Learning, and Game Theory to build intelligent systems that optimize and automate the procurement sourcing process, we save our customers millions of dollars every year, and help their suppliers find the best customers for them. Many of the world’s top blue chip companies use Keelvar to aid negotiations; they set high standards that we relish achieving because it helps us be the best at what we do.

We believe we can change the world and have fun doing it. We are a hard-working team who love what we do. We believe that a culture of curiosity, experimentation, and risk-taking is the key to finding breakthrough approaches, and we don’t settle for conventional ones.

We strive for excellence, challenging ourselves and each other, with independent thinking, a lot of focus, and plenty of collaboration. In our eyes, the bigger the challenge, the bigger the reward. We’re not content with just equipping users with good tools; we want to help our customers achieve success and excellence, and sometimes this requires lateral or unconventional thinking. We want you to share your knowledge readily and learn every day. We like to ask questions, and answer questions when we can. We invite you to a workplace that is inclusive and celebrates diversity. We support everyone in being themselves, feeling empowered and inspired to make a difference.

If you are passionate about how technology is changing the world of work and want to work with a great team, this is the role for you.

What you will be doing

As a Site Reliability Engineer (SRE), you will be responsible for building solutions that revolutionize how the best procurement teams in the world are sourcing. On a typical day you will:

  • Work closely with SRE/DevOps colleagues and across the organisation on challenges related to scalability, reliability, performance, security and efficiency of systems.
  • Identify opportunities to automate and optimise.
  • Design, write, test, integrate, and maintain programmatic solutions to automate tasks and improve reliability.
  • Lead initiatives to improve and scale our infrastructure.
  • Investigate incidents and provide technical resolutions, with root cause analysis and supporting post-mortem documentation.
  • Report on key service metrics such as availability, capacity, performance, and latency across all systems.
  • Review colleagues’ code, designs, and procedures, and mentor junior team members.
  • Collaborate with colleagues across the organisation to ensure the vision is delivered.
  • Contribute to the ongoing success of this fast-paced, rapidly growing and evolving organization.
  • Participate in the on-call rotation, and determine and implement solutions to reduce production interrupts.

What you can expect

We are a well-treated bunch, with awesome benefits! If there’s something important to you that’s not on this list, talk to us!

  • Competitive salary in a fast-growing start-up
  • MacBooks are our standard, but we’re happy to get you whatever equipment helps you get your job done.

We are a diverse bunch of people and we want to continue to attract and retain a diverse range of people into our organisation. We're committed to an inclusive and diverse Keelvar! We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.

Remote work?

  • To help with site reliability coverage, this remote role is preferably located in the North American Central or Mountain time zone.

Skills & requirements

Must have

  • 4+ years of site reliability or relevant operations & development experience in a SaaS organization.
  • Strong experience scripting in Python and relevant shell scripting languages.
  • Strong experience in server configuration and automation tools such as Pulumi, Fabric, or Ansible.
  • Strong experience in infrastructure configuration using Terraform, CloudFormation or similar technologies.
  • Experience in configuration and management of logging and alerting systems.
  • Operations experience in configuring and managing AWS environments, networks and services.
  • Strong knowledge of Ubuntu or other Linux-based operating systems.
  • Experience in source control systems such as Git.
  • A strong focus on delivery and a commitment to quality.
  • Excellent written and verbal communication skills.
  • Strong organizational and analytical skills.
  • A Computer Science honours degree or related qualification.

Nice to have

  • Knowledge/experience of MySQL, Postgres, Redis, Django.
  • Experience in establishing continuous delivery systems is a plus.
