Site Reliability Engineer
Key Accountabilities
As a Senior Site Reliability Engineer, you will join a talented team of ops-focused engineers that delivers world-class cloud-based infrastructure to support our global customer base. You and the team will own the cloud infrastructure, design and build automated frameworks for code delivery to test and production environments, plan for growth, analyze and fix problems in real time, and design and implement secure networks.
- Design and build the architecture to run PowerReviews applications
- Work closely with dev teams to build highly available, cost-effective systems
- Own all cloud infrastructure:
- Design and launch CloudFormation stacks in AWS, relying on Puppet, Ruby, and Linux
- Create and harden custom Linux AMIs
- Manage the AWS services we use, including DynamoDB, Redshift, SQS, VPC, EC2, S3, CloudFormation, and ECR/ECS
- Support and transition Rackspace platforms to AWS
- Design the build, test, and release frameworks for various technologies
- Create new tools and scripts designed for auto-remediation of incidents
- Write well-documented, well-tested code intended for automated execution
- Design platforms for extremely high uptime metrics
- Implement log storage, monitoring, alerting, and metrics gathering
- Own the security posture of the platforms
- Fully understand the application interactions
Requirements and Preferred Skills
- 5+ years of experience in site reliability, systems engineering, DevOps, or systems architecture on a high-volume platform
- Expert-level Linux engineering skills
- Experience with a majority of the following tools and concepts: Puppet, Chef, Ansible, Ruby, Python, Tomcat, Java, PostgreSQL, Bash scripting, service-oriented architecture, public/private APIs, SSO, Git, Docker
- Expert-level knowledge of the Amazon Web Services platform. You should have built complex AWS implementations before.
- Past experience writing automation tools
- A strong understanding of what lies below application-level abstractions
- Mastery of documentation and diagramming
- Thorough comprehension of networking, firewalls, load balancers, IPv4, and security standards
- Ability to hand off platforms to Systems Engineers to run
- Strong communicator: able to effectively work with remote engineers
- A pragmatic approach to architecture and problem-solving
- Lifelong learner, not afraid to take on new technologies
Our Tech Stack
At PowerReviews, we use lots of open source software and use Amazon Web Services (almost) exclusively. Our current stack consists of Linux, Java, Jenkins, Tomcat, Nginx, PostgreSQL, MySQL, Elasticsearch, React.js, Docker, and some Ruby on Rails. In AWS we make use of EC2, DynamoDB, RDS, Redshift, Elastic Beanstalk, S3, ElastiCache (both Redis and Memcached), Elastic MapReduce, and CloudFront. Other tools we use include Sumo Logic, Datadog, Selenium, and Packer.
Desired Skills and Experience
See application page for details