Desired Skills and Experience

  • Practical cloud computing with AWS technologies (EC2, S3, ECS, etc.) in high-performance, data-intensive architectures for ingesting, computing, and managing spatial and non-spatial datasets.
  • Strong programming skills in one of the following: Python, Scala, Clojure, or C/C++, with the ability to quickly create prototype solutions on Unix/Linux platforms.
  • Strong knowledge of Linux, with the ability to build applications via the command line, debug issues, update software on Linux servers, and perform system administration tasks.
  • Proficiency in multiple facets of software engineering, including command-line application development, source control, continuous integration, and automated deployment, employing best practices and basic project scaffolding approaches (semantic versioning, security, infrastructure-as-code, etc.).
  • Experience gathering and processing raw data at scale from multiple data sources via ETL pipelines, including implementing storage solutions using S3 and relational databases (Postgres, MySQL, Oracle, SQL Server) – RDS preferred.
  • Working knowledge of Docker container orchestration strategies and best practices for solution development and deployment of large-scale data processing pipelines for data discovery.
  • Interest and/or experience in data science and building statistical learning models for data analysis.
  • Background in data mining and statistical analysis.
  • Additional experience with big data and compute platforms such as Hadoop, Spark, and MapReduce is preferred.
  • Experience with NoSQL databases preferred.
  • Experience in research, life sciences, or in data science is a plus.