Desired Skills and Experience
- Eat, sleep, and breathe services. You have experience balancing live-site management, feature delivery, and retirement of technical debt.
- Experience designing, developing, debugging, and operating resilient distributed systems.
- Experience with managing large, complex systems in cloud-based infrastructure.
- Resolve complex technical issues and drive innovations that improve system availability, resilience and performance.
- Familiarity with crash-only and recovery-oriented software design.
- Excited by building reliable, self-healing services on unreliable hardware.
- Agilista capable of driving and delivering thin slices of functionality on a regular cadence with data-driven feedback loops.
- Be passionate about automation and avoid doing things manually.
- Create, maintain and share technical documentation used by engineers and other team members.
- Having fun!
- 5+ years of professional experience in systems engineering in large-scale Linux/UNIX data center environments.
- 5+ years of professional experience in Java, Go, Scala, C++, Python, Ruby, Perl, or another language.
- Solid understanding of how to configure, deploy, manage, and maintain large cloud-hosted systems, including auto-scaling, monitoring, performance tuning, troubleshooting, and disaster recovery.
- Experience delivering on strategic initiatives effectively in a fast-paced environment while supporting day-to-day issues.
- In-depth, hands-on experience with Linux, networking, server, and cloud architectures.
- Knowledge of metrics and monitoring tools (e.g., Splunk, Nagios) and configuration management tools (e.g., Chef, Puppet).
- Deep understanding of network technologies such as DNS, load balancing, SSL, TCP/IP, and HTTP, as well as SQL.
- Proficiency with source control, continuous integration, and testing pipelines.
- Bachelor’s degree in Computer Science or any engineering discipline, or equivalent experience.
- Experience with software-based compute infrastructure such as AWS, Azure, GCE, OpenStack, or CoreOS.
- Experience with container orchestration systems such as Kubernetes or Docker Compose.
- Experience with resource management systems such as Borg, Mesos, Aurora, Marathon, or YARN.
- Expertise in live-site operations for stateful services such as Hadoop and HBase.
- Understanding of industry security best practices.