Desired Skills and Experience
- Maintain mission-critical IGM computing operations – including a cluster computing environment of over 2000 CPU cores and ~2 PB of storage.
- Administer, validate, and review user and system accounts, access controls, audit logs and system integrity to maximize system security and data confidentiality.
- Design, implement, install, configure, maintain, and support the IGM High Availability/Clustering server environment.
- Provide technical support and monitoring of an in-house software application, third-party software, open source software, and MySQL databases.
- Author, implement, execute, and periodically update System Security, Business Continuity, and Disaster Recovery Plans to be consistent with Columbia Medicine policies and standards.
- Serve as the liaison between IGM and technology vendors for all hardware and software updates, installations, and purchasing.
- Ensure all measures are in place to support relevant CUMC and IGM Security guidelines.
- Serve as primary contact in an on-call rotation that provides 24x7x365 coverage of mission-critical functions.
- Solid computer systems know-how, including building and maintaining systems in a high-performance computing environment.
- Expert Linux systems administration skills with a minimum of 2 years of experience, preferably within a heterogeneous environment consisting of several subsystems
- Thorough understanding of network setup and configuration, with the ability to troubleshoot and solve network bottlenecks
- Experience working in a diverse data center environment consisting of an HPC cluster and high-volume storage systems
- Experience with shell, Python, or Perl scripting, preferably integrated with a version control system (Git or Subversion)
- Relational database management experience (e.g. MySQL, PostgreSQL, Oracle)
- Proven ability to read, understand, and apply technical documentation, and to learn new technologies quickly
- Ability to multitask, prioritize, and work within a team in a quickly evolving environment
- Ability to communicate effectively with team members and customers, both verbally and through documentation