
Backend AI, Data Pipeline Engineer
Posted Jun 3

This is a fully remote position, open to applicants in Pakistan.
• Take ownership of the complete data processing infrastructure that fuels Yuzee's advanced course and job matching platform.
• Design and sustain scalable, event-driven pipelines that handle tens of thousands of records daily, produce semantic embeddings, and support a growing knowledge graph.
• Create and manage semantic embeddings using Amazon Bedrock (Titan v2), index them in MongoDB Atlas Vector Search, and fine-tune similarity thresholds.
• Develop and oversee a knowledge graph that connects jobs, courses, skills, and industries using FP-Growth association rules and archetype-to-SOC code mapping.
• Construct and enhance a two-stage discovery and matching API on AWS Lambda.
• Maintain and optimize daily job scrapers across various sources and create institution data scrapers with robust HTML cleaning pipelines.
• A minimum of 1 year of backend engineering experience concentrating on data pipelines, ML infrastructure, or search systems.
• Practical experience with AWS serverless and container services, including Lambda, ECS Fargate, EventBridge, and Step Functions.
• Strong Python skills, including Pandas, asynchronous processing, bulk database operations, and text cleaning.
• Familiarity with vector databases and semantic similarity search; experience with MongoDB Atlas Vector Search is highly advantageous.
• A cost-conscious approach to infrastructure: you weigh per-record compute costs, free tiers, Spot resilience, and right-sizing.
• Capability to document and articulate complex architecture clearly to both technical and non-technical audiences.
• Nice to have: Experience with knowledge graphs or association rule mining techniques (FP-Growth, Apriori).
• Nice to have: Experience using LLMs for re-ranking or eligibility assessment based on vector retrieval results.
• Background in edtech, jobtech, or recommendation/matching systems.
• A degree or proven experience in a relevant field.
• Fully remote / work-from-home position.
• Flexible working hours aligned with the team’s expected schedule and business needs.
• Opportunity to engage in real backend, data, and AI infrastructure initiatives.
• Exposure to practical engineering challenges in scraping, pipelines, retrieval, and cloud systems.
• Continuous growth and development opportunities within a rapidly evolving technology landscape.
• Potential to build long-term value and advance with the company based on performance, including increased responsibilities over time.
• Some flexibility in working hours, contingent on team requirements and deliverables.
• Hands-on experience working on significant backend, data pipeline, and AI-related systems.
• Opportunity to contribute to a developing platform with genuine product and engineering challenges.
• Professional growth in a dynamic, fast-paced environment.
• Strong potential for long-term advancement based on performance, irrespective of location.