
Backend Engineer, AI & Data Pipeline
Posted 1 day ago

Responsibilities
• Design and maintain three distinct processing pipelines: scheduled job ingestion, event-driven course processing, and a periodic knowledge graph builder, each with its own trigger logic and cost controls.
• Generate and manage semantic embeddings with Amazon Bedrock (Titan v2), index them in MongoDB Atlas Vector Search, and tune similarity thresholds to keep match precision high (see Sketch 1 after this list).
• Build and maintain a knowledge graph linking jobs, courses, skills, and industries through FP-Growth association rules and archetype-to-SOC-code mapping (Sketch 2).
• Build and evolve a two-stage discovery and matching API on AWS Lambda: vector retrieval first, then deep eligibility scoring with LLM re-ranking (Sketch 3).
• Optimize Fargate Spot usage and design resumable processing loops that survive interruptions, keeping infrastructure costs manageable as data volume grows (Sketch 4).
• Maintain and extend daily job scrapers across multiple sources, and build institution data scrapers with robust HTML cleaning pipelines (Sketch 5).
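Sketch 1 — the embedding and retrieval flow from the second bullet. A minimal sketch, assuming boto3 and pymongo; the connection string, database and collection names (careers.jobs), and the Atlas index name job_embedding_index are placeholders, and the min_score floor stands in for the similarity threshold the role tunes:
```python
import json

import boto3
from pymongo import MongoClient

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
jobs = MongoClient("mongodb+srv://<cluster-uri>")["careers"]["jobs"]  # hypothetical names

def embed(text: str) -> list[float]:
    """Titan Text Embeddings v2 via Bedrock."""
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text, "dimensions": 1024}),
    )
    return json.loads(resp["body"].read())["embedding"]

def vector_candidates(query: str, k: int = 50, min_score: float = 0.75) -> list[dict]:
    """Stage one: approximate nearest-neighbour retrieval with a tunable score floor."""
    pipeline = [
        {"$vectorSearch": {
            "index": "job_embedding_index",   # hypothetical Atlas index name
            "path": "embedding",
            "queryVector": embed(query),
            "numCandidates": k * 10,          # oversample for recall
            "limit": k,
        }},
        {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
        {"$match": {"score": {"$gte": min_score}}},  # the similarity threshold to tune
    ]
    return list(jobs.aggregate(pipeline))
```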
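Sketch 2 — FP-Growth association rules of the kind the knowledge graph bullet describes. A toy sketch using mlxtend (one common Python implementation; the posting does not name a library), with three invented skill transactions:
```python
import pandas as pd
from mlxtend.frequent_patterns import fpgrowth, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Invented transactions: the skill set attached to each job posting.
transactions = [
    ["python", "aws", "mongodb"],
    ["python", "pandas", "etl"],
    ["python", "aws", "lambda"],
]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

# Frequent skill itemsets, then rules such as {aws} -> {python}.
itemsets = fpgrowth(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```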
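Sketch 3 — stage two of the matching API: LLM re-ranking of the candidates returned by Sketch 1. A sketch assuming a Bedrock chat model (Claude Haiku here, purely as an example) and a deliberately simple prompt; parsing the model's reply as JSON is the fragile part a production scorer would harden:
```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def rerank(profile: str, candidates: list[dict]) -> list[dict]:
    """Ask an LLM to score eligibility for each vector-retrieved candidate."""
    prompt = (
        "Score 0-10 how well this candidate fits each job. "
        'Reply with a JSON list of {"id": ..., "score": ...} objects only.\n\n'
        f"Candidate profile:\n{profile}\n\nJobs:\n"
        + "\n".join(f'{c["_id"]}: {c["title"]}' for c in candidates)
    )
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model id
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    reply = json.loads(resp["body"].read())["content"][0]["text"]
    scores = {str(s["id"]): s["score"] for s in json.loads(reply)}
    return sorted(candidates, key=lambda c: scores.get(str(c["_id"]), 0), reverse=True)
```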
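Sketch 4 — a resumable loop for Fargate Spot, which receives SIGTERM roughly two minutes before a Spot task is reclaimed. The checkpoints collection and per-record checkpoint granularity are illustrative choices:
```python
import signal

from pymongo import MongoClient

db = MongoClient("mongodb+srv://<cluster-uri>")["careers"]  # hypothetical names
jobs, checkpoints = db["jobs"], db["checkpoints"]

stopping = False

def on_sigterm(signum, frame):
    global stopping
    stopping = True  # finish the current record, then stop

# Fargate Spot sends SIGTERM ahead of reclaiming the task.
signal.signal(signal.SIGTERM, on_sigterm)

def process(doc: dict) -> None:
    ...  # hypothetical per-record work, e.g. embedding generation

def run() -> None:
    state = checkpoints.find_one({"_id": "embed-jobs"}) or {}
    query = {"_id": {"$gt": state["last_id"]}} if state.get("last_id") else {}
    for doc in jobs.find(query).sort("_id", 1):
        if stopping:
            break  # exit cleanly; the replacement task resumes from the checkpoint
        process(doc)
        checkpoints.update_one(
            {"_id": "embed-jobs"}, {"$set": {"last_id": doc["_id"]}}, upsert=True
        )
```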
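Sketch 5 — the sort of HTML cleaning step the scraper bullet implies, sketched with BeautifulSoup; the tags stripped here are an assumption about what counts as boilerplate:
```python
from bs4 import BeautifulSoup

def clean_html(raw: str) -> str:
    """Strip markup, scripts, and boilerplate whitespace from a scraped page."""
    soup = BeautifulSoup(raw, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()  # drop non-content elements entirely
    text = soup.get_text(separator=" ")
    return " ".join(text.split())  # collapse runs of whitespace
```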
Requirements
• 1+ years of backend engineering experience with a focus on data pipelines, ML infrastructure, or search systems.
• Practical experience with AWS serverless and container services: Lambda, ECS Fargate, EventBridge, and Step Functions.
• Proficiency in Python, including Pandas, asynchronous processing, bulk database operations, and text cleaning.
• Familiarity with vector databases and semantic similarity search; experience with MongoDB Atlas Vector Search is a strong plus.
• A cost-aware infrastructure mindset: you think in per-record compute costs, free tiers, Spot resilience, and right-sizing.
• Ability to document and clearly communicate complex architecture to both technical and non-technical stakeholders.
• Nice to have: Experience with knowledge graphs or association rule mining (FP-Growth, Apriori).
• Nice to have: Experience utilizing LLMs for re-ranking or eligibility assessment based on vector retrieval results.
• A degree, or equivalent proven experience.
What we offer
• Fully remote / work-from-home position.
• Some flexibility in working hours, subject to team needs and deliverables.
• Hands-on work on impactful backend, data pipeline, and AI systems.
• Opportunity to contribute to a growing platform with genuine product and engineering challenges.
• Professional growth in a dynamic, fast-paced environment.
• Strong potential for long-term progression based on performance, irrespective of location.