
Senior Site Reliability Engineer
Posted 1 day ago

This is a fully remote position, open to applicants in the United States.
• Provide support for production systems across platforms such as ESXi, Azure, AWS, and GCP.
• Employ configuration management tools for scalable and repeatable systems management, including Ansible and Puppet.
• Design, develop, and maintain automation frameworks, scripts, and operational tools to enhance scalability, reliability, and operational efficiency across infrastructure and platform services.
• Configure, maintain, patch, and troubleshoot Linux operating systems, with a foundational understanding of Windows operating systems.
• Ensure adherence to security and data handling policies to comply with PCI, HIPAA, and other standards.
• Develop and sustain Infrastructure-as-Code (IaC) solutions utilizing tools such as Terraform, Ansible, and Puppet to facilitate repeatable and standardized deployments.
• Collaborate as a responsible and supportive member of Amwell's technology teams.
• Participate in a 24/7 on-call rotation and scheduled maintenance activities.
• A minimum of 5 years of experience managing Linux-based systems; relevant certifications are a plus.
• Significant experience with Infrastructure-as-Code and configuration management technologies, including Terraform, Ansible, Puppet, or similar automation frameworks.
• Experience creating automation workflows for system provisioning, patch management, monitoring, configuration management, incident response, and operational remediation.
• Familiarity with on-premises and cloud-based virtualization platforms for compute and storage, such as ESXi, Azure, AWS, and GCP.
• Proficiency with the Elasticsearch/Logstash/Kibana (ELK Stack) analytics engine.
• Experience with managing Identity and Authentication solutions, including LDAP, Active Directory, and Multi-Factor Authentication.
• Strong scripting and software development abilities using languages such as Python, Bash, or PowerShell, with a track record of building reusable automation tools and operational integrations in hybrid cloud and on-premises settings.
• Experience in developing monitoring, alerting, and self-healing automation solutions.
• A solid understanding of TCP/IP networking principles.
• Proven experience supporting large-scale production environments with an automation-first operational approach.
• Flexible Personal Time Off (Vacation time).
• 401(k) matching.
• Competitive healthcare, dental, and vision insurance plans.
• Paid Parental Leave (Maternity and Paternity leave).
• Employee Stock Purchase Program.
• Complimentary access to Amwell’s Telehealth Services, SilverCloud, and The Clinic by Cleveland Clinic’s second opinion program.
• Free Subscription to the Calm App.
• Tuition Assistance Program.
• Pet Insurance.