This is a fully remote position, open to applicants in the United Arab Emirates (UAE).

📋 Description

• Design, develop, and maintain scalable backend services for a media intelligence platform, emphasizing clean, maintainable, and production-ready systems.

• Take ownership of essential backend components from system design and API contracts to implementation, deployment, monitoring, and iteration.

• Influence architectural decisions related to APIs, processing pipelines, distributed computing, storage, search, observability, cloud infrastructure, and model-serving workflows.

• Create data models and storage patterns for media assets, generated metadata, embeddings, processing jobs, model outputs, search indexes, and audit trails.

• Develop high-throughput media ingestion and processing pipelines capable of handling large volumes of video, audio, image, and text content.

• Construct distributed, event-driven workflows for media processing utilizing queues and pub/sub systems such as SQS, Kafka, Pub/Sub, or similar technologies.

• Implement dependable asynchronous processing patterns that include retries, idempotency, dead-letter queues, backpressure management, and fault-tolerant job execution.

• Lead the development and enhancement of metadata extraction, content analysis, scene detection, transcription, embedding generation, and multimodal AI inference workflows.

• Integrate and optimize AI/ML services within backend workflows, including model APIs, embedding pipelines, OCR, speech-to-text, scene analysis, multimodal inference, batching, caching, and fallback strategies.

• Collaborate with ML engineers, data scientists, or external model providers to benchmark models, evaluate quality/latency trade-offs, and safely implement model upgrades.

• Optimize AI/ML inference workflows for latency, throughput, reliability, and cost across both real-time and batch processing paths.

• Work with model-serving systems like vLLM, Triton, TGI, SageMaker, Vertex AI, or custom inference services to enhance batching, concurrency, warmup behavior, timeout handling, autoscaling, and GPU utilization.

• Assess and apply practical model optimization techniques such as quantization, model distillation, batching, caching, prompt optimization, and routing to smaller or more economical models when suitable.

• Design and maintain vector search and indexing systems employing technologies such as Pinecone, Weaviate, Qdrant, Elastic Vectors, FAISS, pgvector, or similar tools.

• Develop retrieval workflows that facilitate semantic search, similarity matching, duplicate detection, media discovery, and structured metadata search.

• Deploy and manage systems on AWS, GCP, Azure, or similar cloud platforms, covering compute, storage, networking, queues, model-serving infrastructure, and monitoring systems.

• Ensure system reliability through logging, metrics, tracing, alerting, dashboards, operational runbooks, and best practices for incident response.

• Collaborate with product, design, data, and ML teams to deliver media-rich, AI-powered product features.

• Mentor junior and mid-level engineers, assist in technical planning, review designs, and enhance engineering quality across the team.

• Engage in code reviews, documentation, technical planning, and the continuous improvement of engineering practices.

• Maintain code quality through testing, peer review, clear documentation, and sustainable implementation patterns.

⛳️ Requirements

• Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.

• 5-7+ years of backend engineering experience, preferably in building scalable distributed systems, media platforms, data pipelines, or high-throughput backend services.

• Experience owning major backend modules end to end, including architecture, implementation, deployment, monitoring, and production operations.

• 3+ years of experience integrating AI/ML inference systems into backend workflows, including model APIs, embedding pipelines, OCR, speech-to-text, scene detection, or multimodal model outputs.

• Practical experience building AI-powered processing pipelines for image, video, audio, or text analysis.

• Hands-on experience with production model optimization, particularly for image, video, embedding, or multimodal models, including batching, caching, quantization, prompt optimization, routing strategies, latency reduction, and cost optimization.

• Prior experience with vector search, semantic search, media retrieval, or similarity-matching systems is highly preferred.

• Experience mentoring engineers, leading technical discussions, and influencing architectural decisions across backend, infrastructure, and AI/ML workflows.

🏝️ Benefits

• Competitive salary

• Flexible working hours

• Professional development opportunities

• Remote work options

Machine Learning Systems Engineer
