Desired Skills and Experience

  • Design, develop, configure and maintain our CI/CD pipeline to support the scale of running several application stacks in the cloud that are consumed worldwide
  • Design and develop cloud solutions for EMS Software applications that meet stringent high availability and disaster recovery requirements.
  • Automate the deployment and maintenance of cloud platform technologies
  • Help support production operations, log management, data warehousing, and database operations, including management of Splunk services
  • Design, develop, configure and maintain all monitoring systems (IT, development, service management, Apdex) which support cloud operations
  • Enforce consistency of monitoring, reporting, and alarming systems
  • Help drive process improvements for service management, including: outage/incident management, rollbacks and reporting
  • Research emerging virtualization techniques and advise management
  • Perform capacity management, load and scalability planning
  • Ensure compliance with deployment and operations documentation
  • Assist management in development and optimization of operational cost models
  • Build strategic and tactical plans for continued improvement of cloud architecture and operations
  • Assist in the establishment of 24x7 performance monitoring and response protocols
  • Provide on-call support outside of normal work hours/days when needed
  • Help enable effective information security practices for EMS Software and the Cloud Operations team
  • You’re driven, humble, and autonomous
  • You’re a quick study, a strong communicator, and you’re able to adapt to a fast-paced environment
  • You have a working knowledge of Agile development practices (e.g., Scrum, TDD)
  • You have the mindset of a developer, but are intrigued by the operational aspects of hosting developed solutions
  • You’re an expert in Linux or Windows (IIS, SQL Server)
  • You have at least 2 years of hands-on production experience with Amazon Web Services (AWS), Google Cloud Platform (GCP) or Microsoft Azure. This includes:
  • Configuration of VPCs with multiple Availability Zones. Experience with multi-region cloud deployments is a huge plus
  • Experience setting up, maintaining and monitoring global production environments, QA environments and staging environments, with a strong understanding of the differing needs of such environments. 
  • At least 6 months of experience in a professional production environment
  • At least 6 months of experience managing networking infrastructure and monitoring at an application level
  • You have performance optimization experience, including: troubleshooting and resolving network and server latency issues; performing hardware evaluation/selection tasks; performance vs cost vs time analysis
  • You are devoted to automation and have strong prior automation experience
  • You have at least 1 year of experience with automation or scripting tools (e.g., Go, Python, Shell, PowerShell)
  • You have at least 3 years of experience in an application development role (e.g., Java, C#, Ruby, Go) or an automation development role (e.g., Ansible, Puppet, Chef, Jenkins, Shell, PowerShell)
  • You have at least 1 year of experience with continuous integration/continuous delivery tools (GitLab CI/CD, Terraform, Travis CI, AWS CodePipeline, etc.), and you have hands-on experience building out and maintaining a continuous integration and delivery pipeline
  • You’re detail-oriented, with excellent documentation skills, and you’re someone who can successfully manage multiple priorities
  • You have troubleshooting skills that range from diagnosing hardware/software issues to large scale failures within a complex infrastructure
  • Bachelor’s degree in Computer Science or equivalent work experience
  • Experience with Mongo, MS SQL Server, Windows Installation/Deployment, Splunk, Grafana, Terraform and Prometheus
  • Experience working with Docker, Kubernetes and Go
  • Hands-on experience with performance and information security testing
  • We have current Production and Continuous Integration footprints in Google Cloud (primary) and Azure
  • We have a well-built CI/CD pipeline using GitLab CI/CD that allows us to deploy and stand up customers on demand
  • We leverage Ansible heavily; Splunk (JSON logs) is our lifeblood, and we enjoy operational efficiency and accessibility through Hubot and StackStorm
  • Our front-end applications leverage React and React Native, Redux, Node, C#, and Knockout
  • Our APIs are built with Go, .NET and .NET Core
  • Our back end runs on MS SQL Server
