Desired Skills and Experience

  • You will take ownership of our production systems, specifically deployment, monitoring, and some debugging.
  • You will keep our database available, ensuring reliability, scalability and performance. You will tackle problems relating to critical services and prevent problem recurrence.
  • You will automate processes, streamline delivery, deploy new core functionality, and build tools that are a joy to use. You’ll aim to automate responses to all non-exceptional service conditions.
  • You will work on service capacity management, demand forecasting, software performance analysis, and system tuning.
  • You’ll drive the company through disaster recovery tests, where we manually turn down sections of CockroachDB to test its overall resiliency to failures.
  • You love analyzing, monitoring, and troubleshooting large-scale distributed systems.
  • You have extensive knowledge of networking and operating systems (e.g. processes, threads, concurrency).
  • You are comfortable using programming languages like Go, C++, Java, Python, and Ruby, and scripting languages like Shell and Perl. You’re interested in learning and using Kubernetes on the infrastructure side.
  • You’re familiar with algorithms, data structures, and complexity analysis.
  • You love building relationships with your colleagues. You enjoy being part of the code review process and collaborating with your teammates on challenging problems.
