Manager, Site Reliability Engineering - Public Cloud

With JPMorgan Chase & Co. in Bengaluru - IN

Posted on February 22, 2021

About this job

Job type: Full-time
Role: System Administrator
Industry: Financial Services
Company size: 10k+ people
Company type: Public

Technologies

cloud, amazon-web-services, elasticsearch

Job description

This role requires a wide variety of strengths and capabilities, including:
* Deep understanding of SRE philosophy, technologies, platforms and tools, SLA management, incident resolution, and automation.
* Mastery of application, data, and infrastructure architecture disciplines
* Command of architecture, design, and business processes
* Keen understanding of financial control and budget management
* Expertise in working in partnership with colleagues throughout the firm, and in leading collaborative teams to achieve common goals
* Hands-on experience in managing operations of large-scale internet-centric production environments for application or infrastructure services serving tens to millions of end-users.
* Prior experience in large-scale internet companies/technologies, where uptime and continuous availability were core to the business.
* Work with Architecture to design reusable patterns to deploy to applications, provide governance around adoption, and influence application development teams on roadmaps and designs.
* Have a software-centric mindset, look for automation opportunities to drive down toil and reduce technical debt.
* Apply standards of cloud compliance to application design to achieve reliability
* Understanding of networking and cloud technologies, for example security, load balancing, and network routing protocols.
* Willingness to take on-call rotations and flexibility of working hours.

Responsibilities:
* Leads a global SRE team in implementing SRE frameworks that support multi-cloud environments, and ensures the highest SLA levels through operational excellence
* Leads the development of SRE-related product technology roadmap
* Leads product design, development, and transition to operations
* Leads failure analysis/root cause analysis when required
* Provides support to develop & improve the quality of technical engineering documentation
* Provides support to drive the maturity of the software development lifecycle
* Provides strategic guidance, mentorship, and problem resolution for engineering activities
* Provides quality control of engineering deliverables
* Performs deployment, administration, management, configuration, testing, and integration tasks related to the big data platforms in a cloud environment
* Helps to develop new cloud engineering strategies and implementations for the firm
* Champions a DevOps model so that services are automated and elastic across all platforms
* Supports the management of relationships with technology vendors
* Identifies, recruits, and staffs resources required for product design, development, and test
* Responsible for coaching and mentoring less experienced team members
* Participates in 24x7 SRE on-call rotations and escalation workflows.

Qualifications:
* Bachelor's degree in Computer Science, Information Technology, or equivalent technical field
* 10+ years of Site Reliability experience operating large-scale infrastructure.
* 3+ years of Enterprise Cloud infrastructure experience (AWS, Azure, GCP) in a mission-critical environment
* Proven experience in the area of people management on globally distributed teams.
* In-depth OS experience (RHEL, Ubuntu, Windows Server) with strong debugging, troubleshooting, and problem-solving skills
* Experience programming in one or more of the following languages: Python, Java, PowerShell, shell scripting, Go
* Hands-on experience with cloud-based technologies and tools, especially in deployment, monitoring, and operations, such as Datadog, Prometheus, Splunk, Elasticsearch, Grafana
* Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Terraform, and Jenkins.
* Deep knowledge of Internet protocols and web services technologies such as HTTP, DNS, TCP/UDP, SOAP, JSON, and REST
* Good understanding of networking protocols and cybersecurity best practices in the cloud environment
* Experience architecting and managing Kubernetes clusters.
* AWS/GCP certification is highly desirable

JPMorgan Chase & Co., one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as any mental health or physical disability needs.
