Desired Skills and Experience
- Design, develop, configure and maintain our CI/CD pipeline to support the scale of running several application stacks in the cloud that are consumed worldwide
- Design and develop cloud solutions for EMS Software applications that meet stringent high availability and disaster recovery requirements.
- Automate the deployment and maintenance of cloud platform technologies
- Help support production operations, log management, data warehousing, and database operations, including management of Splunk services
- Design, develop, configure and maintain all monitoring systems (IT, development, service management, Apdex) which support cloud operations
- Enforce consistency of monitoring, reporting, and alarming systems
- Help drive process improvements for service management, including: outage/incident management, rollbacks and reporting
- Research emerging virtualization techniques and advise management
- Perform capacity management, load and scalability planning
- Ensure compliance with deployment and operations documentation
- Assist management in development and optimization of operational cost models
- Build strategic and tactical plans for continued improvement of cloud architecture and operations
- Assist in the establishment of 24x7 performance monitoring and response protocols
- Provide on-call support outside of normal work hours/days when needed
- Help enable effective information security practices for EMS Software and the Cloud Operations team
- You’re driven, humble, and autonomous
- You’re a quick study, a strong communicator, and you’re able to adapt to a fast-paced environment
- You have a working knowledge of Agile development practices (e.g., Scrum, TDD)
- You have the mindset of a developer, but are intrigued by the operational aspects of hosting developed solutions
- You’re an expert in Linux or Windows (IIS, SQL Server)
- You have at least 2 years of hands-on production experience with Amazon Web Services (AWS), Google Cloud Platform (GCP) or Microsoft Azure. This includes:
- Configuration of VPCs with multiple Availability Zones. Experience with multi-region cloud deployments is a huge plus
- Experience setting up, maintaining and monitoring global production environments, QA environments and staging environments, with a strong understanding of the differing needs of such environments.
- At least 6 months of experience in a professional production environment
- At least 6 months of experience managing networking infrastructure and monitoring at an application level
- You have performance optimization experience, including: troubleshooting and resolving network and server latency issues; performing hardware evaluation/selection tasks; performance vs cost vs time analysis
- You are devoted to automation and have strong prior automation experience
- You have at least 1 year of experience with automation or scripting tools (e.g., Go, Python, Shell, PowerShell)
- You have at least 3 years of experience in an application development role (e.g., Java, C#, Ruby, Go) or an automation development role (e.g., Ansible, Puppet, Chef, Jenkins, Shell, PowerShell)
- You have at least 1 year of experience with continuous integration/continuous delivery tools (e.g., GitLab CI/CD, Terraform, Travis CI, AWS CodePipeline), and you have hands-on experience building out and maintaining a continuous integration and delivery pipeline
- You’re detail-oriented, with excellent documentation skills, and you’re someone who can successfully manage multiple priorities
- You have troubleshooting skills that range from diagnosing hardware/software issues to large scale failures within a complex infrastructure
- Bachelor's degree in Computer Science or equivalent work experience
- Experience with Mongo, MS SQL Server, Windows Installation/Deployment, Splunk, Grafana, Terraform and Prometheus
- Experience working with Docker, Kubernetes and Go
- Hands-on experience with performance and information security testing
- We have current Production and Continuous Integration footprints in Google Cloud (primary) and Azure
- We have a well-built CI/CD pipeline using GitLab CI/CD that allows us to deploy and stand up customers on demand
- We leverage Ansible heavily, Splunk (JSON logs) is our lifeblood, and we enjoy operational efficiency and accessibility through Hubot and StackStorm
- Our front-end applications leverage React and React Native, Redux, Node, C#, and Knockout
- Our APIs are built with Go, .NET and .NET Core
- Our back end runs on MS SQL Server