
Machine Learning Systems Engineer
Posted Jun 4

This is a fully remote position, open to applicants in Spain.
• Design, develop, and manage scalable backend services for a media intelligence platform, with an emphasis on creating clean, maintainable, and production-ready systems.
• Take full ownership of crucial backend components, overseeing everything from system architecture and API contracts to implementation, deployment, monitoring, and iterative improvements.
• Influence architectural choices across APIs, processing pipelines, distributed computing, storage solutions, search mechanisms, observability, cloud infrastructure, and model-serving workflows.
• Create data models and storage strategies for media assets, generated metadata, embeddings, processing jobs, model outputs, search indexes, and audit logs.
• Develop high-throughput media ingestion and processing pipelines to handle substantial volumes of video, audio, image, and text content.
• Construct distributed, event-driven workflows for media processing that utilize queues and pub/sub systems like SQS, Kafka, Pub/Sub, or similar technologies.
• Implement dependable asynchronous processing techniques, including retries, idempotency, dead-letter queues, backpressure management, and fault-tolerant job execution.
• Spearhead the development and enhancement of metadata extraction, content analysis, scene detection, transcription, embedding generation, and multimodal AI inference workflows.
• Incorporate and optimize AI/ML services within backend workflows, covering model APIs, embedding pipelines, OCR, speech-to-text, scene analysis, multimodal inference, batching, caching, and fallback strategies.
• Enhance AI/ML inference workflows for latency, throughput, reliability, and cost efficiency across both real-time and batch-processing scenarios.
• Work with model-serving systems such as vLLM, Triton, TGI, SageMaker, Vertex AI, or custom inference services to optimize batching, concurrency, warmup behavior, timeout management, autoscaling, and GPU utilization.
• Design and manage vector search and indexing systems utilizing technologies such as Pinecone, Weaviate, Qdrant, Elastic Vectors, FAISS, pgvector, or comparable tools.
• Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
• 5-7+ years of experience in backend engineering, ideally focusing on scalable distributed systems, media platforms, data pipelines, or high-throughput backend services.
• Previous experience managing significant backend modules from end to end, encompassing architecture, implementation, deployment, monitoring, and production operations.
• Over 3 years of experience integrating AI/ML inference systems into backend workflows, including model APIs, embedding pipelines, OCR, speech-to-text, scene detection, or multimodal model outputs.
• Practical experience in developing AI-driven processing pipelines for image, video, audio, or text analysis.
• Hands-on experience with production model optimization, particularly for image, video, embedding, or multimodal models, focusing on batching, caching, quantization, prompt optimization, routing strategies, latency reduction, and cost efficiency.
• Strong preference for candidates with prior experience in vector search, semantic search, media retrieval, or similarity-matching systems.
• Proven track record of mentoring engineers, leading technical discussions, and influencing architectural decisions across backend, infrastructure, and AI/ML workflows.
• Work remotely from any location across the globe.
• Collaborate with skilled teams from around the world.
• Enjoy opportunities for professional growth and development.
• Experience a flexible work environment.
• Be part of a global fintech revolution.