Site Reliability Engineer – DevOps

With JP Morgan Chase in Hyderabad - IN

Posted on December 20, 2019

About this job

Job type: Full-time
Role: DevOps, System Administrator

Technologies

cloud, security, d3.js

Job description

As a member of our Production Support team, you’ll immediately put your love of technology into action. Each day, you and your team will be responsible for making sure our platforms, servers and networks are online and secure. You’ll work together, evaluating, selecting, implementing, integrating, maintaining, upgrading, documenting and designing our infrastructure. You’ll find new and creative solutions to troubleshoot and resolve issues. Communication is the key, both in problem solving with your supervisors and collaborating with your coworkers, as well as other teams in the network.

We are looking for a highly motivated individual who can use their software engineering skills to automate or eliminate operational tasks. The candidate will build and implement creative solutions to operational problems, including optimizing existing systems, building infrastructure, managing capacity and resilience, and eliminating work through automation. The candidate will partner with cross-functional teams across the globe. The candidate will be responsible for maintaining product SLIs/SLOs, availability, reliability, tooling and visualization for business, development, and operational teams to consume.

This position is anticipated to require the use of one or more High Security Access (HSA) systems. Users of these systems are subject to enhanced screening which includes both criminal and credit background checks, and/or other enhanced screening at the time of accepting the position and on an annual basis thereafter. The enhanced screening will need to be successfully completed prior to commencing employment or assignment.

Responsibilities:
Develop tools and visualization to understand our customer experience and their product interaction

Run, maintain and improve the service against established Service Level Objectives by applying software engineering principles

Develop solutions to automate manual development and operational tasks

Responsible for the availability, performance, change management, telemetry, and capacity management of their services

Engage with the development team throughout the life cycle to help build for reliability

Take part in root cause analysis and post-mortems to identify and eliminate gaps and improve service

Analyze usage and telemetry data to identify patterns that help predict and prevent failures

Constantly evaluate and test products, especially before and after any change

Manage the split of effort between manual operational work and engineering work

Take part in 24x7x365 support coverage

Qualifications:
Experience managing applications on Windows or Linux platforms

Experience with object-oriented programming languages such as Java, Python or C#, and with shell scripting

Strong experience with CI/CD pipelines and testing frameworks

Experience with Incident, Change and Problem management processes in a large-scale operations environment

Experience with integrating solutions in a multi-vendor environment, including SaaS environments

Experience in performance engineering and monitoring using tools such as AppDynamics, Splunk, Apica, JMeter and Dynatrace

Experience with automation and configuration tools such as Ansible, Puppet, Chef or Evolven

Experience with Agile and full software development life cycle disciplines

Experience with Capacity and Resilience management practices and procedures is beneficial

Knowledge of networking protocols is beneficial.

Industry recognized security certifications (security, networking, etc.) – strongly preferred

Experience with Splunk in one of the following areas: IT operations, compliance, DevOps, network security, system security, or supporting security event management tools (SIEMs)

Working knowledge of Splunk Cloud solution offering – not required but preferred

Good working knowledge of Cloud Engineering. Understanding of private cloud principles and exposure to public cloud offerings such as AWS, Azure, Cloud Foundry or similar technology is preferred

Our CTC Production Support Organization is filled with innovators who love technology as much as you do. Together, you’ll use a disciplined, innovative and cost-effective approach to deliver a wide variety of high-quality products and services. You’ll work in a stable, resilient and secure operating environment where you—and the products you deliver—will thrive.