Senior Site Reliability Engineer at VoiceBase Inc. (San Francisco, CA)

What are we building?

Providing cloud-scale APIs for speech recognition, natural language processing, and predictive analytics
Building highly scalable, distributed microservices that run across 10,000 CPU cores in an architecture designed for 1 billion minutes per month
Implementing a seamless developer experience and showcase UX for voice/speech analytics applications

What will you be working on?

Cloud-scale distributed systems. You will make sure that our automation is running and our services are never down. You will shorten our deploy time. You will work in a highly agile, fast-moving environment with a complex, mixed AWS and data center infrastructure.

Who are you?

A Senior Site Reliability Engineer with a passion for cloud-scale microservice architecture, ready to take us to a billion minutes per month and beyond.

What will your responsibilities be?

Design, test and implement solutions to the hardest Ops problems in a mixed public/private cloud and data center environment
Drive the next level of automation for scalability, reliability and uptime
Champion the best security practices for our systems
Work with the Ops and Dev teams to improve quality, uptime and monitoring
Continuously improve Ops tools and processes

Desired Skills and Experience

At least 7 years of experience and a proven track record in IT Ops and cloud maintenance
Strong proficiency in scripting with Python, Ruby, Bash, and Perl
Demonstrated ability to provide technical ownership in a dynamic, fast-paced environment
Hands-on experience with Chef, Puppet, SaltStack, Ansible, or another configuration management system. We are a Chef shop.
Prior experience with Linux, networking and routing, IT security, monitoring and metrics, MongoDB, Elasticsearch, ELK, and ZooKeeper preferred
Working knowledge of PCI, SOC, ISO, and HIPAA compliance requirements will be beneficial