Site Reliability Engineer, Compute and Storage Group
In this highly visible role, you will ensure that Apple’s world-class Silicon Engineering Group has the infrastructure and tools it needs to engineer and design the world’s most advanced silicon devices and products. You will draw on your deep understanding of building and maintaining Linux compute clusters, storage systems, web infrastructure and applications, database servers, tool and license management, monitoring systems, workflow optimization, and directory services. You will use your strong communication skills to work with internal teams, enabling Apple’s world-class product development.
Description
You will support internal engineering teams by enhancing, maintaining, and performance-tuning compute clusters and by planning their capacity. Your work will directly affect the development, enhancement, and maintenance of compute cluster queuing, storage systems, network interconnects, monitoring, the LAMP stack, and load balancing.
Education Details
MS/BS Degree or equivalent.
Key Qualifications
Typically requires at least 5 years of experience in Linux or UNIX systems administration in a large engineering or R&D environment, and demonstrated skills in the following:
Linux (RHEL/CentOS preferred)
NFS and NAS appliances (NetApp preferred)
Layer 2 / Layer 3 networking (Arista or Cisco preferred)
Scripting in shell, Perl, Python or Ruby
Revision control systems (SVN, Git, Perforce)
Centralized configuration management (Puppet, CFEngine)
Software/tool compilation and installation
FLEXlm and similar licensing systems
Monitoring systems such as Nagios, Zenoss, and GroundWork
LDAP (OpenLDAP, DSEE, Open Directory)
IPAM with DNS (BIND) and DHCP
Must be analytical and possess strong organizational/problem-solving skills
Desired Skills and Experience
See application page for details