Site Reliability Engineer

With Lalamove in Kowloon - HK

Posted on December 06, 2018

About this job

Compensation: $45k - 90k
Location options: Visa sponsor, Paid relocation
Job type: Full-time
Experience level: Senior
Role: System Administrator
Industry: B2B, Logistics & Distribution, Transportation
Company size: 201-500 people
Company type: VC Funded



Job description

Do you like it when sites stay up, like all the time? Do you enjoy finding new and creative ways of surfacing poor-quality code in production? Do you think that Netflix’s simian army of chaos is the best thing ever? You may well be exactly who we are looking for!

We at Lalamove are looking for a highly trained SRE who can help us prevent downtime and give our engineers visibility into what may go wrong. You’d be part of the infrastructure team here at Lalamove and work closely with production engineers, security experts and tooling software engineers to achieve the ultimate goal of plentiful 9s. You’d also work on outreach and education within the greater engineering organisation.

What we imagine you’d be doing

  • Plan, set up and manage the monitoring infrastructure
  • Educate developers on what kinds of application-level metrics could be valuable
  • Ensure we have a relevant RED (rate, errors, duration) dashboard for our business
  • Help out on the most challenging root cause analyses
  • Find and fix bugs deep inside the systems we are using
  • Plan and implement reliable recovery processes
  • Ensure our applications are resilient to infrastructure-level failures (chaos testing)

What we’re looking for

  • Humble and able educator
  • Strong programmer in systems languages (Go, C, etc.)
  • Experience with different kinds of monitoring tools, and the ability to tell which ones are good for what
