Cloud Operations Engineer (Site Reliability) (4789)
Looking for a Cloud Operations Engineer (SRE) to support the Cloud Platform team’s production environment. Responsible for lifecycle management of tools and frameworks used to maintain cloud infrastructure/services.
Cloud Operations Engineer (Site Reliability) Responsibilities:
- Manage customer requests
- Support a 24x7 cloud production environment
- Build and deploy continuous integration pipelines
- Perform Linux administration and troubleshooting in a large-scale system
Cloud Operations Engineer (Site Reliability) Required Skills:
- BSCS degree; Master’s preferred
- 3+ years’ experience administering large, complex systems (preferably in a cloud-based environment)
- Strong Linux OS knowledge
- Experience with Chef, Puppet, Ansible, or other configuration management tools
- Scripting abilities in Python, Perl, Bash, or Go
- Proficiency in one or more monitoring tools: Zabbix, Elasticsearch, collectd, statsd, Logstash, Ganglia, or Nagios/OpsView
Desired Skills:
- Strong knowledge of networking services and protocols a plus
- Big Data experience (Hadoop, Spark, Kafka, Storm)
- OpenStack platform experience