Desired Skills and Experience
- Evaluate projects from across the Hadoop ecosystem, then extend and deploy them to exacting standards (high availability, large data clusters, elastic load tolerance)
- Automate the installation, configuration and monitoring of Hadoop ecosystem components in our open source infrastructure stack, specifically HBase, HDFS, MapReduce, YARN, Oozie, Pig, Hive, Tez, Spark and Kafka
- Dig deep into performance, scalability, capacity and reliability problems to resolve them
- Create application patterns for integrating internal application teams and external vendors into our infrastructure
- Troubleshoot and debug Hadoop ecosystem run-time issues
- Provide developer and operations documentation to educate peer teams
- Experience building out and scaling a Hadoop-based or UNIX-hosted database infrastructure for an enterprise
- 2+ years of experience with Hadoop infrastructure, or a strong and diverse background in distributed cluster management and operations
- Experience writing software in a continuous build and automated deployment environment
- 2+ years of DevOps or System Administration experience using Chef/Puppet/Ansible for system configuration, or quality shell scripting for systems management (error handling, idempotency, configuration management)
- In-depth knowledge of low-level Linux internals, UNIX networking and C system calls for high-performance computing
- Experience with Java, Python or Ruby development, including testing with standard test frameworks, use of dependency management systems, and knowledge of Java garbage collection fundamentals
- Experience with, or exposure to, the open source community (a well-curated blog, accepted upstream contributions or other community presence)
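To illustrate the "quality scripting for systems management" bullet above, here is a minimal sketch of what idempotent, error-handled configuration management looks like in practice. The file path, setting name, and `ensure_line` helper are hypothetical, not part of this role's actual stack; the point is the pattern: re-running the script leaves the system unchanged once it has converged, and a failure mid-write never leaves a half-written config behind.

```python
import os
import tempfile

def ensure_line(path, line):
    """Idempotently ensure `line` is present in the config file at `path`.

    Safe to run repeatedly: the file is only rewritten when the line is
    missing, so a second run is a no-op -- the core idea behind
    configuration-management idempotency.
    """
    try:
        with open(path) as f:
            lines = [l.rstrip("\n") for l in f]
    except FileNotFoundError:
        # Missing file is an expected state, not an error: start empty.
        lines = []

    if line in lines:
        return False  # already converged; report "no change made"

    lines.append(line)
    # Write atomically: build the new content in a temp file in the same
    # directory, then rename over the original, so a crash mid-write can
    # never leave a truncated config on disk.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(lines) + "\n")
    os.replace(tmp, path)
    return True
```

The return value mirrors the "changed / unchanged" reporting convention that Chef, Puppet and Ansible resources all use: the first call against a fresh file reports a change, every subsequent identical call reports none.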