Cockroach Labs is the team behind CockroachDB, an open source database whose mission is to Make Data Easy. CockroachDB (so named for its survivability and scalability) will survive data center-scale outages, maintain strong consistency through ACID transactions, and enable developers to build scalable applications.

Our Engineering team is architecting the core of our database. We work on complex technology that provides the backbone of how a business’s data is stored. This allows businesses running on CockroachDB to give their customers consistent access to the data they need in their day-to-day lives.

We need extraordinary Site Reliability Engineers to join our team! You will be responsible for keeping our database up and available, ensuring reliability, scalability, and performance. You will be an integrated member of our engineering team and will take ownership of reliability, automation, and other issues related to CockroachDB’s stability. You love automating processes, streamlining delivery, deploying new core functionality, and building great tools that are a joy to use. You will help make CockroachDB more friendly by bringing your expertise to our database and, ultimately, make the world a more friendly place for SREs everywhere.

Help us Make Data Easy as a Site Reliability Engineer! You will be part of a small family within a team that builds a product that is distributed across the globe.

Responsibilities

  • Run production systems, including deployment, monitoring, and some debugging.
  • Keep a complex system running and solve problems related to mission-critical services.
  • Build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions.
  • Influence and build new designs, architectures, standards and methods for CockroachDB.
  • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
  • Drive the company through disaster recovery tests, where we manually turn down pieces of CockroachDB to test its overall resiliency to failures.

Requirements

  • Bachelor’s Degree in Computer Science / Math / Physics or related field with 1+ years industry experience or 4+ years engineering experience.
  • Experience in one or more of C, C++, Java, Perl, Python, or Go, or scripting experience in Shell or Perl.
  • Expertise in analyzing, monitoring, and troubleshooting large-scale distributed systems.
  • Familiarity with algorithms, data structures, and complexity analysis.
  • In-depth knowledge of operating systems (e.g. processes, threads, concurrency) and networking.
  • Previous on-call experience, with a sense of urgency.

Cockroach Labs is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status. If you need assistance with applying for a position, please email our office at jobs@cockroachlabs.com.

Desired Skills and Experience

See application page for details