Desired Skills and Experience
- ~100 servers, mostly hosted on AWS
- 8 AWS regions, as well as multiple colocated hosting providers
- Hundreds of public IP addresses
- 300+ HTTPS requests per second
- 25+ FTP/SFTP/FTPS logins per second
- 100+ file transfers per second
- 4,000 log entries per second
- 150,000+ metrics
- 99.9% uptime record
- Significant experience working with GNU/Linux servers, including a complete understanding of the command line, /proc, services, etc.
- Comprehensive understanding of networking concepts, including layers, firewalls, DNS, VPN, etc.
- Proficiency with configuration management tools, such as Chef or Puppet, and fluency with at least one major scripting language.
- Experience building distributed, failure-resistant architecture, including disaster recovery, backups, failover, etc.
- Experience with the advanced features of public cloud platforms such as AWS or Azure (we use AWS).
- Familiarity with large-scale monitoring and analysis systems, such as ELK or Splunk (we use ELK).
- Complete understanding of how to build secure infrastructure and an awareness of common server security vulnerabilities.
- Ability to manage a large database at scale (we use MySQL).
- A history of developing and supporting production infrastructure at a scale equal to or greater than ours. (We describe our size earlier in the post.)