
Senior Applied ML Engineer – Speech, Audio
Posted 23 hours ago

This is a fully remote position, open to applicants in Egypt.
• Design, refine, and enhance sophisticated machine learning models tailored for Arabic voice applications.
• Engage in the entire development lifecycle, from constructing data pipelines and experimenting with models to optimizing inference and deploying in production.
• Evaluate and benchmark TTS and ASR models utilizing Arabic-specific test sets, assessing metrics like Word Error Rate (WER), naturalness, and dialect coverage.
• Fine-tune generative models for tasks such as voice cloning, zero-shot speaker adaptation, and speech synthesis.
• Develop and maintain Arabic-focused data pipelines covering audio collection and preprocessing, diacritization (Tashkil), and data cleaning and augmentation.
• Optimize model inference for production settings through quantization, KV-cache tuning, and streaming inference techniques.
• Integrate and assess comprehensive speech-to-speech conversational pipelines.
• Conduct experiments grounded in contemporary research papers and translate findings into production-ready solutions.
• Collaborate with engineering and product teams to implement robust and scalable speech systems.
• Over 5 years of experience in Machine Learning, Applied AI, or AI Research.
• Strong programming skills in Python.
• Extensive practical experience with PyTorch and the Hugging Face ecosystem.
• Demonstrated expertise in training and fine-tuning neural models for Text-to-Speech (TTS), Automatic Speech Recognition (ASR), and audio codecs.
• In-depth knowledge of modern speech architectures, including Whisper, Conformer, HiFi-GAN, and diffusion-based models.
• Familiarity with audio processing techniques such as Voice Activity Detection (VAD), speaker diarization, and neural vocoders.
• Proven capability to implement and adapt research findings into effective production experiments.
• Strong grasp of the challenges associated with the Arabic language, including diacritization (Tashkil), dialectal variations, and code-switching.
• Experience with inference optimization strategies such as quantization, streaming inference, and NVIDIA TensorRT.
• Complimentary perks for all employees, including our renowned games room, daily breakfast, fruit, coffee and other hot beverages, soft drinks and juices, company outings, and parties.
• Social insurance coverage.
• Open-door management approach.
• Comprehensive medical insurance.
• Accommodation and transportation allowances.
• A welcoming environment that encourages innovation and efficiency.
• Exciting prospects for career advancement and talent development.
• Encouragement for feedback.
• Recognition and rewards programs.
• Competitive salaries and incentives.
• A friendly work atmosphere.
• Flexible and comfortable working schedule.
• Engaging committees.
• Monetary rewards.
• A team of fun, intelligent, and creative individuals.
• Career opportunities within a growing team.
• Paid vacation days.
• Social benefits.