Desired Skills and Experience
- Selecting and integrating any Big Data tools and frameworks required to provide requested capabilities
- Implementing ETL processes
- Monitoring performance and advising on any necessary infrastructure changes
- Defining data retention policies
- Solid understanding of distributed computing principles
- Developing automation and management capabilities for the Hadoop cluster and all included services
- Ability to troubleshoot and solve any ongoing issues with operating the cluster
- Proficiency with Hadoop v2, MapReduce, HDFS
- Experience building stream-processing systems, using solutions such as Storm or Spark Streaming
- Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
- Experience with Spark
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Experience with Hadoop distributions such as Cloudera, MapR, or Hortonworks
- Knowledge of the Ceph file system
- Demonstrated ability to conceive, manage, and complete software deliverables
- Linux systems administration skills across distributions, especially in cloud or virtualized environments
- Understanding of IP networking and traffic scaling
- Experience with agile development methodologies, rapid application development, and project management
- Proven ability to design and present understandable and practical solutions to complex problems
- Demonstrated leadership skills in a fast-paced, team-driven environment
- Strong verbal and written communication skills, including visual presentation skills
- Demonstrated experience in research data collection, analysis, and presentation
- Experience with intellectual property portfolio management, especially patents and trademarks
- Ability to work effectively across internal and external organizations
- Ability to travel when needed; expected travel is 5-25%
- Ability to promote technologies to large audiences or top-level executives