Remotery

DevOps Engineer – ML & Data Infrastructure

Posted 7 hours ago

This is a fully remote position, open to applicants in the United States.

📋 Description

• Oversee, configure, and automate cloud infrastructure utilizing tools such as Terraform and Ansible.

• Develop CI/CD pipelines for machine learning models and data workflows, emphasizing automation, version control, rollback, and monitoring with tools like Vertex AI, Jenkins, and Datadog.

• Build and maintain scalable data and feature pipelines for both real-time and batch processing, using BigQuery, Bigtable, Dataflow, Composer, Pub/Sub, and Cloud Run.

• Establish infrastructure for model monitoring and observability, identifying drift, bias, and performance issues through Vertex AI Model Monitoring and custom dashboards.

• Enhance inference performance, focusing on reducing latency and improving cost-efficiency of AI workloads.

• Maintain overall system reliability, scalability, and performance across the ML/Data platform.

• Define and enforce infrastructure best practices for deployment, monitoring, logging, and security.

• Diagnose and resolve complex issues impacting ML/Data pipelines and production systems.

• Ensure adherence to data governance, security, and regulatory standards, particularly in real-money gaming environments.

• Mentor and lead DevOps engineers, assisting in technical decision-making and operational processes.

• Aid in sprint planning, task prioritization, and cross-functional collaboration across infrastructure and platform efforts.

• Perform code reviews, share best practices, and help foster a high-performing engineering culture.

• Work closely with ML, Data, Product, and Security teams to ensure infrastructure strategy aligns with business goals.


⛳️ Requirements

• A minimum of 5 years of experience as a DevOps Engineer, preferably with a focus on ML and Data infrastructure.

• Proven experience in leading projects, mentoring engineers, or managing technical teams.

• Extensive hands-on experience with Google Cloud Platform (GCP), particularly with BigQuery, Dataflow, Vertex AI, Cloud Run, and Pub/Sub.

• Proficiency in Terraform; Ansible skills are a bonus.

• Strong understanding of containerization (Docker) and container orchestration (Kubernetes, GKE).

• Experience in building and maintaining CI/CD pipelines, ideally with Jenkins.

• Comprehensive understanding of monitoring and logging best practices for cloud and data systems.

• Proficiency in scripting languages such as Python, Groovy, or Shell.

• Familiarity with AI orchestration frameworks like LangGraph or LangChain is advantageous.

• Excellent communication, collaboration, and stakeholder management abilities.

• Bonus points if you have experience in gaming, real-time fraud detection, or AI-driven personalization systems.


🏝️ Benefits

• Competitive salary and comprehensive benefits package.

• Opportunities for professional growth and development.

• Collaborative and inclusive work environment.

• Flexible work arrangements to support work-life balance.

