This is a fully remote position, open to applicants in Massachusetts.

📋 Description

• Take ownership of our Recovery Architecture.

• Design and construct our Isolated Recovery Environment — a secured AWS account featuring immutable vaults that disrupt the attacker’s kill chain before it impacts our data.

• Build a thorough threat model of our environment, drawing on a comprehensive understanding of cloud-native attack vectors: IAM privilege escalation, backup deletion, ransomware persistence, and lateral movement across accounts.

• Assess and continuously enhance backup configurations to guarantee that backups are recoverable, not merely that they exist.

• Standardize and automate infrastructure processes.

• Spearhead our shift to a fully Infrastructure as Code approach. Each asset (VPCs, IAM roles, security groups) must be defined in Terraform to allow for the complete redeployment of the stack into a pristine account through an automated pipeline.

• Develop automated recovery workflows capable of dismantling a compromised environment and initializing a fresh, secured one from verified code and clean data.

• Create and maintain recovery playbooks that specify the precise API calls and CLI commands required to restore the application. These should be tested, versioned, and executable, not mere static documents.

• Validate, test, and lead recovery exercises.

• Create automated scripts (using Python or Go) to perform smoke tests on recovered data and verify its integrity following restoration.

• Facilitate regular hands-on recovery drills that mimic the complete loss of a critical environment and the full restoration into a secondary clean account. Manage the after-action review process and implement improvements.

• Promote engineering standards.

• Serve as the resilience authority for the engineering organization — influencing high-availability architecture decisions, participating in design reviews, and elevating the standards for recoverability.

• Collaborate with the Site Reliability Engineering team on multi-region deployments and high-availability design, ensuring that cyber resilience is integrated into architecture from the outset.

• Advocate for IaC and immutable infrastructure practices across various teams, not limited to your own workstream.

⛳️ Requirements

• Over 8 years of experience in complex cloud environments (AWS/GCP/Azure), including a minimum of 3 years in AWS.

• Experience with EKS/Kubernetes is a significant advantage.

• Proficiency in Terraform. You should be able to break complex infrastructure into modules that are environment-agnostic.

• Practical knowledge of the Secure Vault pattern: safeguarding data in a separate, highly restricted AWS account with stringent network controls.

• Advanced shell scripting skills and proficiency in either Python or Go to automate restoration tasks that native AWS tools do not cover.

• Familiarity with CI/CD tools (Scalr, GitHub Actions, or similar) to facilitate widespread adoption of recovery pipelines throughout the organization.

• Demonstrated capability to engineer and automate comprehensive restoration workflows.

🏝️ Benefits

• 401(k) matching.

• Medical, dental, and vision insurance.

• Life and disability insurance.

• Generous paid time off, including vacation, sick leave, floating and fixed holidays, and maternity and bonding leave.

• Employee Assistance Program (EAP).

• Additional wellbeing resources.

• And much more.

Staff Cyber Resilience Engineer
