Remotery

Data & AI Operations Specialist

Posted May 20

This is a fully remote position, open to applicants in India.

📋 Description

• Design & Architecture: Oversee the monitoring framework for AI/ML platforms and set up sophisticated dashboards using Grafana and Azure Monitor.

• Environment Governance: Administer Azure Machine Learning (AML) workspace setups, compute targets, and manage the lifecycle of Databricks clusters (including runtime versioning and platform updates).

• Resource Optimization: Manage GPU resource allocation, reserved capacity, and cost-performance optimization to meet FinOps objectives.

• Security Integration: Ensure that all AI services use private endpoints, VNET integration, and RBAC controls to safeguard sensitive citizen information.

• Pipeline Engineering: Own the design, improvement, and repair of Azure Data Factory (ADF) and Synapse pipelines.

• Advanced Troubleshooting: Address intricate bottlenecks related to authentication issues, data format modifications, and ETL performance.

• SOP Leadership: Create detailed Standard Operating Procedures (SOPs) for the L1 NOC team to manage routine monitoring and initial triage effectively.

• Automation: Establish CI/CD pipelines for model training, testing, and deployment to AML endpoints.

• Model Reliability: Set up data drift detection thresholds and automated retraining triggers.

• Recovery Operations: Design self-healing scripts and automated recovery runbooks for essential AI workflows.

• Audit Management: Implement and sustain audit logging for all AI decisions and model outputs, ensuring that logs are directed to the SIEM/vSOC.

• Regulatory Alignment: Perform quarterly AI governance assessments to guarantee adherence to NESA standards and data privacy regulations.


⛳️ Requirements

• AI/ML Platforms: Extensive knowledge of Azure Machine Learning and Databricks.

• Data Integration: Skilled in Azure Data Factory and Synapse.

• Infrastructure-as-Code (IaC): Proficient with Terraform or ARM Templates for consistent deployments.

• Observability: Able to use Dynatrace, Grafana, and Azure Monitor for in-depth diagnostics.

• Containerization: Understanding of AKS, Istio Service Mesh, and KEDA.

• ITIL Mastery: Deep understanding of ITIL-aligned Incident, Change, and Problem management.

• Security Mindset: Familiar with NESA standards and UAE data residency obligations.

• Technical Writing: Able to write complex SOPs, and to deliver Root Cause Analysis (RCA) documents within 48 hours of an incident.

• Certifications: Microsoft Azure Data Scientist Associate or Azure AI Engineer Associate is highly valued.


🏝️ Benefits

• Competitive salary and compensation package.

• Opportunities for professional development and certification.

• Collaborative and supportive work environment.

• Flexible working hours and remote work options.
