Site Reliability Engineer at VoiceBase Inc. (San Francisco, CA)
What are we building?
- Providing cloud-scale APIs for speech recognition, natural language processing, and predictive analytics
- Building highly scalable, distributed microservices that run across 10,000 CPU cores in an architecture designed for 1 billion minutes per month
- Implementing a seamless developer experience and showcase UX for voice/speech analytics applications
What will you be working on?
Cloud-scale distributed systems. You will make sure that our automation is running and our services are never down. You will shorten our deploy time. You will work in a highly agile, fast-moving environment with a complex, mixed AWS and data center infrastructure.
What will your responsibilities be?
- Design, test, and implement solutions to the hardest Ops problems in a mixed public/private cloud and data center environment
- Drive the next level of automation for scalability, reliability, and uptime
- Champion the best security practices for our systems
- Work with the Ops and Dev teams to improve quality, uptime, and monitoring
- Continuously improve Ops tools and processes
Desired Skills and Experience
- IT Ops and cloud maintenance experience
- Python, Ruby, Bash, Perl scripting
- Hands-on experience with Chef, Puppet, SaltStack, Ansible, or another configuration management system. We are a Chef shop.
- Prior experience with Linux, networking and routing, IT security, and monitoring and metrics
- Working knowledge of PCI, SOC, ISO, and HIPAA certifications is beneficial