Site Reliability Engineer at VoiceBase Inc. (San Francisco, CA)

What are we building?

Cloud-scale APIs for speech recognition, natural language processing, and predictive analytics
Highly scalable, distributed microservices that run across 10,000 CPU cores in an architecture designed for 1 billion minutes per month
A seamless developer experience and showcase UX for voice/speech analytics applications

What will you be working on?

Cloud-scale distributed systems. You will make sure that our automation is running and our services are never down. You will shorten our deploy time. You will work in a highly agile, fast-moving environment with a complex, mixed AWS and data center infrastructure.

What will your responsibilities be?

Design, test, and implement solutions to the hardest Ops problems in a mixed public/private cloud and data center environment
Drive the next level of automation for scalability, reliability, and uptime
Champion the best security practices for our systems
Work with the Ops and Dev teams to improve quality, uptime, and monitoring
Continuously improve Ops tools and processes

Desired Skills and Experience

IT Ops and cloud maintenance experience
Python, Ruby, Bash, or Perl scripting
Hands-on experience with Chef, Puppet, SaltStack, Ansible, or another configuration management system. We are a Chef shop.
Prior experience with Linux, networking and routing, IT security, and monitoring and metrics
Working knowledge of PCI, SOC, ISO, and HIPAA compliance is a plus