Desired Skills and Experience
- Practical cloud computing with AWS technologies (EC2, S3, ECS, etc.) in high-performance, data-intensive architectures for ingesting, computing, and managing spatial and non-spatial datasets.
- Strong programming skills in at least one of Python, Scala, Clojure, or C/C++, with the ability to quickly create prototype solutions on Unix/Linux platforms.
- Strong knowledge of Linux, with the ability to build applications from the command line, debug issues, update software on Linux servers, and perform system administration tasks.
- Proficiency in multiple facets of software engineering, including command-line application development, source control, continuous integration, and automated deployment, employing best practices and basic project scaffolding approaches (semantic versioning, security, infrastructure-as-code, etc.).
- Experience gathering and processing raw data at scale from multiple data sources via ETL pipelines, including implementing storage solutions using S3 and RDBMS (Postgres, MySQL, Oracle, SQL Server); RDS preferred.
- Working knowledge of Docker container orchestration strategies and best practices for developing and deploying large-scale data processing pipelines for data discovery.
- Interest and/or experience in data science and building statistical learning models for data analysis.
- Background in data mining and statistical analysis.
- Additional experience with big data and compute platforms such as Hadoop, Spark, and MapReduce is preferred.
- Experience with NoSQL databases preferred.
- Experience in research, the life sciences, or data science is a plus.