Desired Skills and Experience
- Takes the lead role in the design, implementation, operation and maintenance of the Big Data Platform.
- Designs, implements and controls the raw data ingestion and ETL layers and related processes.
- Maintains detailed documentation describing the platform's architecture, data model, cluster configuration and application landscape.
- Owns and develops a Docker-based, containerised sandbox platform.
- Defines the platform security policy, data access rules, user roles and authorisation levels.
- Proposes and implements system enhancements that improve scalability, reliability and cluster performance.
- Monitors usage and performance of the Hadoop ecosystem.
- Troubleshoots the Hadoop ecosystem and proactively prevents system outages.
- Trains personnel on platform usage.
- Bachelor’s degree in Computer Science, Information Systems, or an equivalent combination of education, training and work experience.
- 3+ years of experience as a Hadoop cluster administrator, preferably with the Hortonworks HDP distribution.
- 5 years of Linux systems administration experience, including Linux shell scripting.
- 2 years of experience provisioning and configuring cloud infrastructure environments; hands-on Amazon AWS experience.
- Excellent understanding of the Apache Hadoop ecosystem and experience with Hadoop configuration settings and components such as Ambari, YARN, HDFS, Hive, HBase, Pig, Sqoop, Avro, Flume, Oozie, ZooKeeper, Ranger and Kerberos.
- Strong working knowledge of Hadoop configuration requirements and cluster setup for Apache Spark.
- Good working knowledge of Docker and an understanding of container-based distributed applications.
- Working knowledge of both SQL and NoSQL technologies.
- Knowledge of networking and network monitoring tools (TCP/IP, load balancers, firewalls, proxies).
- Fluency in Java and at least one of the following: Python, R, Scala or SQL.
- Knowledge of technical writing principles and practices.
- Experience in setting up and running development, test and production environments.
- Ability to analyse and solve problems effectively.
- Ability to work independently as well as collaboratively; must enjoy working as part of cross-functional teams.
- Advanced knowledge of Hortonworks Hadoop, UNIX/Linux operating systems, and cloud and HDFS storage configurations.
- Hortonworks certification is a strong advantage.
- Experience with data architecture methodologies and principles, distributed computing and advanced analytics systems.
- Experience with data mining, predictive analytics or machine learning platforms is a plus.
- Experience with configuration management tools (e.g. Ansible, Salt, Chef, Puppet).