
Senior Cloud DevOps Engineer
Posted May 20

This is a fully remote position, open to applicants in Egypt.
• Design, develop, and implement scalable, secure, and cost-effective AWS infrastructure across various regions, adhering to the AWS Well-Architected Framework.
• Construct, manage, and enhance all cloud infrastructure using Terraform, promoting module reusability, remote state management, and Infrastructure as Code (IaC) best practices across environments.
• Take ownership of and optimize the PostgreSQL database, focusing on performance, scalability, and reliability.
• Deploy, administer, and optimize workloads on Kubernetes clusters utilizing Helm and Kustomize.
• Design, implement, and maintain CI/CD pipelines using GitHub Actions.
• Lead disaster recovery planning, runbook development, and failure scenario modeling, including strategies for database backup and recovery.
• Support engineering teams with their infrastructure and database requirements.
• Operate and enhance monitoring, logging, and alerting systems, including database-specific monitoring (query performance, replication health, connection saturation) to ensure high availability and prompt incident response.
• Participate in the weekly on-call rotation and help improve it.
• 6–8+ years of professional experience in Cloud Engineering, DevOps, or Site Reliability Engineering (SRE), with a proven history of managing highly scalable, high-availability systems in production.
• Extensive, hands-on experience with core AWS services (EKS, ECS, EC2, VPC, IAM, RDS, Amazon Aurora, S3, Route 53, CloudFront, ALB/NLB, etc.) in real production settings.
• Expert-level knowledge of Terraform, including module design, remote state management, and configurations for multi-environment/multi-region setups.
• Strong PostgreSQL expertise in production environments, including query and index performance tuning, sharding strategies (such as application-level sharding or partitioning), replication setup and management (streaming, logical), connection pooling (PgBouncer), vacuum tuning, and planning/executing major version upgrades with minimal downtime.
• Experience managing large-scale PostgreSQL databases (ranging from hundreds of GBs to TBs) under high-traffic conditions, with a comprehensive understanding of how schema design, indexing, and partitioning choices impact performance at scale.
• Significant experience in operating and optimizing Kubernetes clusters (deployments, scaling, RBAC, networking, security policies, cluster upgrades).
• Proven experience in designing and maintaining CI/CD pipelines using GitHub Actions.
• Solid understanding of GitOps principles and tools; hands-on experience with ArgoCD is highly preferred.
• Strong grasp of networking fundamentals (DNS, VPC peering, Transit Gateway, VPN, load balancing) and cloud security best practices.
• Experience with logging, monitoring, and alerting stacks (e.g., ELK, EFK, LGTM, CloudWatch) across multiple environments, including database-specific monitoring.
• Proficiency in Bash and Python for automation and tooling purposes.
• Strong knowledge of Git workflows, including branching strategies and code review practices.
• Experience in designing and implementing multi-region architectures with failover and disaster recovery strategies.
• Flexible work arrangements
• Professional development opportunities