
DevOps/Platform Engineer
Posted May 23

This is a fully remote position, open to applicants in Brazil.
• Design, build, and operate cloud infrastructure, CI/CD pipelines, and developer platforms that support digital innovation initiatives.
• Build and maintain infrastructure for the lifecycle management of AI/ML models: training environments, model serving, and production monitoring.
• Ensure that deploying an AI model into production is as reliable, repeatable, and observable as a traditional software service deployment.
• Implement deployment strategies (blue/green, canary, phased updates, and feature flags) for both traditional services and AI model endpoints.
• Build and maintain a comprehensive observability stack: metrics, logs, traces, and AI-specific monitoring.
• Design and implement identity management and security policies as code.
• Over 6 years of experience in DevOps, SRE, or platform engineering.
• Experience with infrastructure as code: Terraform (primary), with exposure to Pulumi, CloudFormation, or Bicep.
• Proficiency in Kubernetes (EKS, AKS, or GKE): cluster management, Helm charts, operators, auto-scaling, and troubleshooting.
• In-depth experience with CI/CD pipeline design using GitHub Actions, GitLab CI, Azure DevOps Pipelines, or Jenkins, including multi-stage pipelines with automated quality gates.
• Strong cloud infrastructure experience on at least two of AWS, Azure, and GCP, with practical skills in networking, compute, storage, identity, and security services.
• Proficiency in scripting and automation: Python, Bash, PowerShell, and at least one of Go or TypeScript.
• Experience building observability stacks: Prometheus, Grafana, Datadog, ELK, OpenTelemetry, and alerting/incident management systems (PagerDuty, Opsgenie).
• Solid understanding of security engineering: secret management, network security, IAM, container security, and compliance automation.
• Experience with GitOps practices and tools: ArgoCD, Flux, or equivalent.
• Fluent in English, both written and spoken.
• Proven experience in international projects, including collaboration with global and multicultural teams.
• Strong communication skills, stakeholder management, and problem-solving abilities.
• Previous experience mentoring engineers or serving as a technical lead is highly preferred.
• Hands-on MLOps experience (model serving, GPU infrastructure management) and knowledge of chaos engineering tools such as Chaos Monkey.
• A Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field is preferred.
• 100% Remote
Advanced Solutions International, Inc.