Site Reliability Engineer
Meraki’s customer base has grown by a factor of 2-3 every year since we started, leading to a current request rate of over 200 million page views per day. Our Site Reliability Engineers comprise the Backend Infrastructure Team and are responsible for everything from our server hardware and operating systems to code deployment and monitoring tools. We provide the stable environment and infrastructure needed for our development teams to iterate quickly, gain insight into application health and performance, and respond quickly when issues arise.
As a Site Reliability Engineer on our Backend Infrastructure Team, you will use and gain exposure to a wide variety of open-source and commercial systems. We deliberately choose the right tool for each job and update those choices as our system grows. You will work with and learn nginx, graphite, grafana, statsd, ansible, flapjack, Debian, Pingdom, and New Relic among others.
A Day in the Life of a Meraki Site Reliability Engineer:
- Adding network topology awareness to our code deployment pipeline, so that deploys use less bandwidth and complete faster.
- Improving service monitoring to include automated anomaly detection.
- Identifying local and distributed performance bottlenecks and evaluating whether they can be alleviated by caching, precomputation, or other similar techniques.
- Optimizing tools and processes by identifying and eliminating inefficiencies.
You are an ideal candidate if you:
- Have 2+ years of experience on a pager rotation where you responded to escalations quickly to minimize customer downtime. You like organizing chaos!
- Automate, automate, automate! You script all the things.
- Believe in the Unix way. You build large systems out of small components that each do one job and do it well.
- Are familiar with systems debugging tools such as strace, atop, iotop, netstat, lsof, iptables, valgrind, gdb.
- Are comfortable digging into other people’s source code in search of the root cause of a problem.
Keywords: Production Engineering, Site Reliability Engineering, DevOps, System Administration
Desired Skills and Experience
See application page for details