
Senior AI Engineer – Vision
Posted May 19

This is a fully remote position, open to applicants in Latin America.
• Unlocking Visual Data: Developing pipelines capable of interpreting intricate documents, including layout, charts, and visual context, using Vision-Language Models (GPT-4V, Claude 3.5) and Layout Analysis.
• Orchestrating Intelligence: Taking ownership of the application logic layer. You will utilize LangChain or LangGraph to construct agents and chains that query our data, reason with it, and generate responses.
• Native PDF Handling: Navigating the complexities of PDF processing (PyMuPDF, layout parsing) to maintain structural integrity before AI analysis.
• Prompt Engineering & Logic: Designing intricate prompts and control flows to guarantee accurate interpretation of financial charts and layouts by the models, minimizing hallucinations.
• Cost & Scale: Implementing a cost-optimization approach (batch processing, model selection) to ensure our vision and orchestration layers remain economically feasible.
• LLM Orchestration (Must Have): Extensive experience with LangChain, LangGraph, or comparable frameworks. You know how to manage context windows, tool invocation, and agent-based workflows.
• Multimodal AI Experience: Practical experience in integrating cutting-edge vision models (GPT-4V, Claude 3.5 Sonnet) and embedding models (CLIP).
• Document Intelligence Specialist: Acquainted with specialized models (e.g., Donut, Pix2Struct) and tools such as Unstructured.io or Docling.
• PDF Processing Mastery: Expert-level familiarity with tools like PyMuPDF or pdfplumber for native element extraction.
• Python ML Stack: Strong skills in PyTorch or TensorFlow.
• 18 days of PTO per year, observance of local holidays, and an annual break between Christmas and New Year's.
• A monthly wellness stipend and snack boxes delivered to your home.