Site Reliability Engineer

With JPMorgan Chase & Co. in Glasgow - GB

Posted on February 22, 2021

About this job

Job type: Full-time
Role: DevOps, System Administrator
Industry: Financial Services
Company size: 10k+ people
Company type: Public


cloud, amazon-web-services, devops

Job description

JPMorgan Chase (JPMC) is a leading global financial services firm and the largest bank in the United States, with total assets of $2.687 trillion. With an annual tech budget of $10B+, we have begun investing significantly in, and building, next-generation core infrastructure, cloud, big data, and AI/ML technology. Our goal is to accelerate the delivery and adoption of the Global Technology Vision and to enable the firm's Global Technology teams to deliver faster and with greater impact for customers and clients.

We have a Sr. Site Reliability Engineer (SRE) position to help the JPMC Big Data team with production support in the public cloud. In this role, you'll work with AI/ML and cloud engineers to build the platform, pipeline, and monitoring systems that ensure the application landscape is designed to take full advantage of JPMC's global cloud solution.

This role requires a wide variety of strengths and capabilities, including:
* Deep understanding of SRE philosophy, technologies, platforms and tools, SLA management, incident resolution, and automation
* Mastery of application, data and infrastructure architecture disciplines
* Expertise in working in partnership with colleagues throughout the firm, and in leading collaborative teams to achieve common goals
* Hands-on experience managing operations of large-scale, internet-centric production environments for application or infrastructure services serving tens to millions of end users.
* Prior experience in big data/cloud technologies, where uptime and continuous availability were core to the business.
* Identify and partner with Infrastructure teams and AD teams to implement automation opportunities to drive down toil and reduce technical debt.
* Apply standards of cloud compliance to application design to achieve reliability
* Understanding of networking and cloud technologies, for example security, load balancing, and network routing protocols.

* Implements SRE frameworks to support global multi-cloud environments, and ensures the highest level of SLA through operational excellence
* Provides failure analysis / root cause analysis when required
* Provides support to develop & improve the quality of technical engineering documentation
* Provides support to drive the maturity of the software development lifecycle
* Provides quality control of engineering deliverables
* Provides technical consultation to product management
* Performs deployment, administration, management, configuration, testing, and integration tasks related to the big data platforms in a cloud environment
* Helps to develop new cloud engineering strategies and implementations for the firm
* Champions a DevOps model so that services are automated and elastic across all platforms
* Helps coach and mentor junior team members.
* Writes operational documentation and maintains a knowledge base of known issues and their solutions
* Participates in 24x7 SRE on-call rotations and escalation workflows.


* Bachelor's degree in Computer Science, Information Technology, or equivalent technical field
* Enterprise Cloud infrastructure experience (AWS, Azure, GCP) in a mission critical environment
* In-depth OS experience (RHEL, Ubuntu, Windows Server) with strong debugging, troubleshooting, and problem-solving skills
* Site reliability engineering experience using one of the following languages: Python, Java, shell scripting, PowerShell, or Go
* Hands-on experience with big data technologies (Hadoop, Spark, Airflow, etc.) and with cloud-based technologies and tools, especially for deployment, monitoring, and operations, such as Datadog, Prometheus, Splunk, Elasticsearch, and Grafana
* Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Terraform, and Jenkins.
* Deep knowledge of Internet protocols and web services technologies such as HTTP, DNS, TCP/UDP, JSON and REST
* Good understanding of networking protocols and cybersecurity best practices in cloud environments
* AWS or EKS certification is highly desirable

J.P. Morgan is a global leader in financial services, providing strategic advice and products to the world's most prominent corporations, governments, wealthy individuals and institutional investors. Our first-class business in a first-class way approach to serving clients drives everything we do. We strive to build trusted, long-term partnerships to help our clients achieve their business objectives.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as any mental health or physical disability needs.
