
Staff Cyber Resilience Engineer
Posted 1 day ago

This is a fully remote position, open to applicants in Massachusetts.
• Take ownership of our Recovery Architecture.
• Design and construct our Isolated Recovery Environment — a secured AWS account featuring immutable vaults that disrupt the attacker’s kill chain before it impacts our data.
• Build a thorough threat model of our environment, grounded in a comprehensive understanding of cloud-native attack vectors: IAM privilege escalation, backup deletion, ransomware persistence, and lateral movement across accounts.
• Assess and continuously improve backup configurations to guarantee that backups are recoverable, not merely that they exist.
• Standardize and automate infrastructure processes.
• Spearhead our shift to a fully Infrastructure as Code approach. Each asset (VPCs, IAM roles, security groups) must be defined in Terraform to allow for the complete redeployment of the stack into a pristine account through an automated pipeline.
• Develop automated recovery workflows capable of dismantling a compromised environment and initializing a fresh, secured one from verified code and clean data.
• Create and maintain executable recovery playbooks that specify the precise API calls and CLI commands required to restore the application — these should be tested, versioned, and executable, not mere static documents.
• Validate, test, and lead recovery exercises.
• Create automated scripts (using Python or Go) to perform smoke tests on recovered data and verify its integrity following restoration.
• Facilitate regular hands-on recovery drills that mimic the complete loss of a critical environment and the full restoration into a secondary clean account. Manage the after-action review process and implement improvements.
• Promote engineering standards.
• Serve as the resilience authority for the engineering organization — influencing high-availability architecture decisions, participating in design reviews, and elevating the standards for recoverability.
• Collaborate with the Site Reliability Engineering team on multi-region deployments and high-availability design, ensuring that cyber resilience is integrated into architecture from the outset.
• Advocate for IaC and immutable infrastructure practices across various teams, not limited to your own workstream.
• Over 8 years of experience in complex cloud environments (AWS/GCP/Azure), including a minimum of 3 years in AWS.
• Experience with EKS/Kubernetes is a significant advantage.
• Proficient in Terraform. You should be able to break complex infrastructure into reusable modules that are environment-agnostic.
• Practical knowledge of the Secure Vault pattern: safeguarding data in a separate, highly restricted AWS account with stringent network controls.
• Advanced shell scripting skills and proficiency in either Python or Go to automate restoration tasks that native AWS tools do not cover.
• Familiarity with CI/CD tools (Scalr, GitHub Actions, or similar) to facilitate widespread adoption of recovery pipelines throughout the organization.
• Demonstrated capability to engineer and automate comprehensive restoration workflows.
• 401(k) matching.
• Medical, dental, and vision insurance.
• Life and disability insurance.
• Generous paid time off, including vacation, sick leave, floating and fixed holidays, maternity, and bonding leave.
• Employee Assistance Program (EAP).
• Additional wellbeing resources.
• And much more.