This is a fully remote position, open to applicants in the United States.

📋 Description

• The Forward Deployment Engineer (FDE) is responsible for the on-site deployment, integration, and scaling of our enterprise Generative AI solutions.

• This position involves direct collaboration with customer engineering teams to operationalize Large Language Models (LLMs) and retrieval systems across various multi-cloud platforms (AWS, Azure, GCP).

• You will serve as a bridge between AI research and production-grade cloud infrastructure.

• You will collaborate with cross-functional teams and business partners, applying your analytical skills to shape current and future strategy, deliver business value, and communicate results clearly.

⛳️ Requirements

• AI Solution Deployment: Expertise in deploying, fine-tuning, and optimizing large-scale Gen AI models and LLM orchestration frameworks within customer cloud environments.

• Infrastructure Engineering: Ability to design scalable infrastructure for AI workloads by utilizing GPU/TPU orchestration, high-performance storage, and low-latency networking.

• Data & Retrieval Pipelines: Proficiency in designing and implementing high-throughput data ingestion pipelines and Vector Database architectures for Retrieval-Augmented Generation (RAG).

• Multi-Cloud Management: Skills in creating cloud-agnostic, resilient deployments across AWS, Azure, and GCP using Infrastructure as Code (IaC).

• Technical Advocacy: Serve as the primary technical consultant, assisting enterprise clients with AI safety, prompt engineering patterns, and inference cost optimization.

• Product Collaboration: Relay deployment insights and edge cases to the core AI research and platform engineering teams to improve product robustness.

Technical Requirements

• AI Frameworks: Hands-on experience with LLM orchestration tools (LangChain, LlamaIndex, AutoGen) and deep learning frameworks (PyTorch, Hugging Face).

• Vector Databases: Proven experience in setting up and querying vector stores (Milvus, Pinecone, Qdrant, Chroma, or pgvector).

• Model Operations (LLMOps): Proficiency in model serving frameworks (vLLM, TGI, Triton Inference Server) and evaluation tools.

• Cloud & Containers: Advanced knowledge of cloud AI primitives (AWS Bedrock/SageMaker, Azure OpenAI, GCP Vertex AI) and Kubernetes (K8s) for GPU workloads.

• IaC & Automation: Mastery of Terraform or OpenTofu for provisioning complex multi-cloud compute environments.

• Programming: Strong coding skills in Python (preferred) or Go, with a focus on writing clean, concurrent code.

Soft Skills

• AI Consultation: Capability to manage customer expectations regarding LLM non-determinism, hallucinations, and performance trade-offs.

• Rapid Adaptability: Enthusiasm for staying current with the latest developments in the Generative AI field.

• Critical Debugging: Exceptional ability to identify errors across complex software layers, from GPU drivers to prompt engineering logic.

• Mobility: Willingness to travel to client sites to lead high-stakes, on-site deployment sprints.

🏝️ Benefits

• This role presents an outstanding opportunity for significant career growth in a dynamic and challenging entrepreneurial environment, characterized by a high level of individual responsibility.

• Tiger Analytics is committed to providing equal employment opportunities to all applicants and employees, without discrimination based on race, color, religion, age, sex, sexual orientation, gender identity/expression, pregnancy, national origin, ancestry, marital status, protected veteran status, disability status, or any other basis protected by federal, state, or local law.

Forward Deployment Engineer, Generative AI
