Desired Skills and Experience

  •  Evaluate projects from across the Hadoop ecosystem, then extend and deploy them to exacting standards (high availability, big data clusters, elastic load tolerance)
  •  Develop automation for the installation and monitoring of Hadoop ecosystem components in our open source infrastructure stack, specifically HBase, HDFS, MapReduce, YARN, Oozie, Pig, Hive, Tez, Spark and Kafka
  •  Dig deep into performance, scalability, capacity and reliability problems to resolve them
  •  Create application patterns for integrating internal application teams and external vendors into our infrastructure
  •  Troubleshoot and debug Hadoop ecosystem run-time issues
  •  Provide developer and operations documentation to educate peer teams
  •  Experience building out and scaling a Hadoop-based or UNIX-hosted database infrastructure for an enterprise
  •  2+ years of experience with Hadoop infrastructure, or a strong and diverse background in distributed cluster management and operations
  •  Experience writing software in a continuous build and automated deployment environment
  •  2+ years of DevOps or system administration experience using Chef/Puppet/Ansible for system configuration, or quality shell scripting for systems management (error handling, idempotency, configuration management; a minimal sketch of an idempotent step follows this list)
  •  In-depth knowledge of low-level Linux, UNIX networking and C system calls for high-performance computing (a socket-level sketch follows this list)
  •  Experience with Java, Python or Ruby development (including testing with standard test frameworks and dependency management systems, and knowledge of Java garbage collection fundamentals; a test-framework sketch follows this list)
  •  Experience with or exposure to the open source community (a well-curated blog, upstream-accepted contributions or a community presence)
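
To make the idempotency and error handling mentioned above concrete, here is a minimal sketch in Python. The config path and setting are hypothetical stand-ins, not part of any real deployment; the point is that re-running the script leaves an already-converged system untouched.

    import os
    import shutil
    import sys

    CONF_PATH = "/etc/hadoop/conf/hadoop-env.sh"  # hypothetical path, for illustration
    SETTING = "export HADOOP_HEAPSIZE=4096"       # hypothetical setting

    def ensure_line(path, line):
        """Append `line` to `path` only if it is absent; return True on change.

        Reporting changed-vs-ok mirrors the convergence model of
        Chef/Puppet/Ansible resources.
        """
        if os.path.exists(path):
            with open(path) as fh:
                if any(l.strip() == line for l in fh):
                    return False               # already converged: safe no-op
            shutil.copy2(path, path + ".bak")  # keep a backup before modifying
        with open(path, "a") as fh:
            fh.write(line + "\n")
        return True

    if __name__ == "__main__":
        try:
            print("changed" if ensure_line(CONF_PATH, SETTING) else "ok")
        except OSError as exc:
            sys.stderr.write("error: %s\n" % exc)  # fail loudly, exit non-zero
            sys.exit(1)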
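The low-level networking bullet can be illustrated the same way. The sketch below, again only an assumption-laden illustration with a hypothetical port, touches the same system calls a C program would issue (socket, setsockopt, bind, listen, accept), with SO_REUSEADDR set the way long-running daemons typically need.

    import socket

    # socket()/setsockopt()/bind()/listen() map one-to-one onto the C system calls.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # allow fast restarts
    srv.bind(("127.0.0.1", 9099))  # hypothetical port, chosen only for the example
    srv.listen(128)                # backlog passed straight through to listen(2)
    print("listening on", srv.getsockname())
    conn, addr = srv.accept()      # blocks until a client connects
    try:
        conn.sendall(b"hello\n")
    finally:
        conn.close()
        srv.close()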
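Finally, for the testing expectation: a minimal sketch using Python's standard unittest framework. The word_count helper is hypothetical, standing in for whatever unit the real codebase would exercise.

    import unittest
    from collections import Counter

    def word_count(text):
        """Hypothetical unit under test: count whitespace-separated tokens."""
        return Counter(text.split())

    class WordCountTest(unittest.TestCase):
        def test_counts_repeated_tokens(self):
            self.assertEqual(word_count("a b a")["a"], 2)

        def test_empty_input_yields_no_tokens(self):
            self.assertEqual(word_count(""), Counter())

    if __name__ == "__main__":
        unittest.main()  # also discoverable by any standard test runner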