Site Reliability Engineer

With Catawiki in Amsterdam - NL

Posted on January 14, 2022

About this job

Compensation: €70k - 90k
Location options: Paid relocation
Job type: Full-time
Role: DevOps, System Administrator
Industry: eCommerce
Company size: 501-1k people
Company type: VC Funded

Technologies

kubernetes, cloud, docker, nosql

Job description

Location: We're happy to support full relocation from anywhere in the world. Remote work from the Netherlands, France, Italy, Spain, the UK, Belgium, or Germany is also possible.

What’s the job

With your expertise in software and systems engineering, you will be responsible for Catawiki Platform reliability and automation by:

  • Proactively assessing reliability aspects and addressing concerns
  • Developing Platform automation and eliminating menial tasks
  • Making sure the Platform is properly instrumented and monitored
  • Identifying and establishing Service Level Indicators and Objectives
  • Sharing on-call responsibilities
  • Guiding Engineering teams on reliability best practices and approaches

Here’s Dmitrii, our Team Lead

“Hi! I am Dmitriy! This is my third year at Catawiki, and I never stop being surprised by how Catawiki constantly challenges our team to help businesses grow faster. Site Reliability is a young but solid team with open and passionate members. Together with the other developers, we’ve been on an exceptional journey scaling the Catawiki Platform from just a few manually managed servers to a dozen Kubernetes clusters running in the Cloud and powering tens of microservices. Cloud-Native technologies and automation are just a couple of the things we employ at every step to take us further with the greatest efficiency. We are proud of all the projects we have completed and look forward to new people joining our team.”

You'll move in sync with…

As part of a team of professionals (software/systems/data/test engineers and product/project managers) within a functional area, you’ll make sure scalability and reliability are built in and delivered at every step of the development lifecycle, ensuring smooth operations.

A little bit about you

You measure everything, implement gradual changes, and accept failure as normal. By sharing ownership with developers and using the same tools, you reduce organisational silos.

You leverage tooling and automation, and you solve and communicate problems effectively.

Next to this, it’s likely you’ll also have:

  • Experience in software and systems engineering
  • Experience as an SRE (observability, on-call rotation, incident management)
  • Practical knowledge of Kubernetes and other Cloud Native technologies (Docker/Helm/Terraform)
  • Good knowledge of relational databases and familiarity with NoSQL databases
  • Strong problem-solving, analytical, and troubleshooting skills
  • Good knowledge of monitoring and alerting tools (Prometheus/AlertManager, ELK, Grafana)
  • Passion for details and a great level of pragmatism