
Senior Engineer, Site Reliability
Posted Jun 19

This is a fully remote position, open to applicants in California.
Responsibilities:
• Design and develop automated cloud infrastructure for Syniti-hosted SaaS applications on Azure and AWS.
• Build and maintain CI/CD pipelines with GitHub Actions and ArgoCD GitOps workflows across deployment environments (dev, preprod, prod, GovCloud).
• Build and maintain Terraform modules for provisioning infrastructure, including EKS/AKS clusters, Aurora PostgreSQL, Redis, OpenSearch, and S3/Azure Storage.
• Integrate and maintain observability stacks (Prometheus, Grafana, Loki, Mimir, Jaeger) while enforcing structured logging for application, audit, and security events.
• Help improve Istio service mesh configurations, including mTLS policies, AuthorizationPolicies, and SPIFFE-based workload identity.
• Implement and manage supply chain security measures: container image signing (Cosign), SBOM generation (Syft), provenance attestation (SLSA), and vulnerability scanning (Trivy, Inspector, Snyk).
• Collaborate with security teams to meet FedRAMP High, Cyber Essentials+, NIST 800-53, and SecNumCloud control objectives across Azure and AWS.
• Provide L3 incident response and lead root cause analysis for application-tier outages on the Northstar platform.
• Support the automated compliance gate (11 controls: NAC, FIPS, STIG, IRSA, SAST, DAST, SBOM, Sign, Vuln, Audit, Auth) and ensure that application services meet all compliance requirements.
• Manage Kubernetes workload autoscaling (Karpenter, KEDA, HPA) and pod security policies (Kyverno) for production services.
Requirements:
• Over 10 years of experience in SRE, DevOps, or Cloud Engineering.
• More than 5 years of practical experience with Microsoft Azure, including AKS, Storage, Monitor, and IAM/Entra ID.
• At least 3 years of hands-on experience with AWS (EKS, RDS/Aurora, IAM, S3, SQS, Cognito).
• Minimum of 3 years of Terraform module development experience for cloud infrastructure provisioning.
• At least 3 years of experience with CI/CD tools (GitHub Actions, ArgoCD, or similar GitOps platforms).
• Strong background in Kubernetes operations, including cluster management, pod security, autoscaling, and troubleshooting.
• Familiarity with service mesh technologies (Istio, Envoy, or Linkerd) and workload identity (SPIFFE/SPIRE, IRSA, or workload identity federation).
• Proficiency in scripting languages such as Python, Bash, or PowerShell; knowledge of Go or .NET is a plus.
• Experience implementing regional compliance controls (FedRAMP, SOC 2, Cyber Essentials+, or similar).
• Understanding of Zero Trust principles, mTLS, service identity, and network segmentation.
• Experience with observability stacks (Prometheus, Grafana, Loki, or similar) and distributed tracing.
• Knowledge of container supply chain security practices (image signing, SBOM, vulnerability scanning) is advantageous.
Benefits:
• Trust in your talent.
• Growth opportunities.
• Supportive environment.
• Recognition of individual achievements.
• Commitment to inclusion and diversity.