DomainTools is seeking a Site Reliability Engineer to join our team. Our engineers support all aspects of our infrastructure, providing services that are used by our customers for cybersecurity research and threat intelligence. This is a full-time, permanent position at our office in Seattle (Belltown).

We are a growing organization, and while our operations and engineering teams are still small, we see a need for a dedicated Site Reliability Engineer to oversee the infrastructure that our customers depend on. You can expect to spend the majority of your time driving reliability through the automation and optimization of our infrastructure, while providing technical leadership in building fault-tolerant systems from development to delivery. You will work in an environment where productivity is fostered. You will spend very little time in meetings, and you won’t be distracted by processes that get in the way of high-quality results.

Job Responsibilities

  • Drive and support DomainTools infrastructure from design to deployment.
  • Analyze infrastructure to plan and scale for massive growth and high performance.
  • Build automation, tooling, and platforms to enable reliability, efficiency, and scalability.
  • Develop and implement evidence-based metrics to evaluate reliability and performance of systems and services.
  • Provide design, architecture, and implementation guidance and support through technical leadership.
  • Establish best practices and grow Site Reliability Engineering at DomainTools.
  • Work with operations to provide 2nd-tier on-call support.

Desired Skills and Experience

  • Deep software engineering experience: 5+ years.
  • Deep operational or systems engineering experience: 5+ years.
  • Expert in distributed systems, architecture, design, algorithms, and/or data structures.
  • Expert in large-scale data stores such as QFS, Cassandra, and MySQL.
  • Experience with operations and development in a Linux/Unix environment.
  • Independent, self-motivated, and proactive approach to the prevention and resolution of problems.
  • Positive attitude with strong attention to detail and a desire to produce high-quality results.
  • Bachelor’s degree or higher in Computer Engineering or related field.
  • Experience with distributed computing systems like Hadoop.
  • Experience with microservices and/or lambda architecture.
  • Passion for building reliable, easy-to-use services, APIs, and tools.
  • Bash, Python, Java/Scala, C, or R development experience.
  • History of working effectively in a small-team environment.