Site Reliability Engineer
Remote job description
As a well-rounded site reliability engineer, you should definitely be the type that appreciates diversity in your day and challenges outside of your comfort zone! A typical day in the life of a PacketFabric site reliability engineer might include these types of activities:
- Troubleshoot issues alongside developers, providing systems-level and architectural insight into the issue at hand.
- Extend configuration management systems with new features and assist developers in bringing new services & software to the appropriate devices.
- Work autonomously to solve complex or unintuitive system stability issues.
- Research, investigate, and provide justification for new technologies that would benefit development and systems.
As a well-rounded systems engineer and scripter with a diverse set of skills, you are one of the very best people to troubleshoot, monitor the platform, and stay on top of releases.
More specifics include:
- Experience working across multiple time zones in an environment that relies on remote communication and collaboration tools such as Slack and Zoom.
- Experience with git in a multi-contributor/team environment.
- High degree of drive to improve and automate your environment with minimal guidance.
- Ability to solve immediate problems and plan for future ones.
- Experience in automating tasks through scripting. You should be able to use Python and be familiar with a variety of packages.
- Extensive Ubuntu and systemd knowledge
- Extensive experience with a message queue system like RabbitMQ or Kafka.
- Experience with time-series data stores.
- Experience with Ansible, Salt, Terraform, Chef, Puppet, or CFEngine. Experience with Ansible and Terraform preferred.
- Experience with build pipelines, integration testing, Jenkins, and GitHub Actions.
- Experience administering a wide variety of *nix platforms, including multiple Linux variants.
- Experience with Docker and Kubernetes.
- Solid understanding of web protocols and technologies such as HTTP, HTTP/2, TLS, Server-Sent Events, and CDNs.
- Solid understanding of nginx and SSL.
- Familiarity with Arista/Cisco/Juniper/Nokia platforms.
- Experience with extremely large scale network management and monitoring.
- Experience with Postgres and Grafana.
- Experience with cloud platforms (public and/or self-hosted).
- Experience with PXE-based deployments.
Company: PacketFabric
Job title: Site Reliability Engineer at PacketFabric (allows remote)
Job tags: ansible, kubernetes, gcp, terraform, automation, sysadmin