
Software Engineer, AI Platform
Posted 6 days ago

This is a fully remote position, open to applicants in Greece.
• Develop and sustain the services, data models, and APIs that drive the platform, ensuring they are accurate, testable, and scalable.
• Work on systems that manage intricate, multi-step interactions between AI agents and external systems, improving their reliability and throughput.
• Create systems that evaluate agent outputs, integrating deterministic checks with model-assisted evaluation to ensure scoring is reliable, explainable, and reproducible.
• Construct pipelines that produce, transform, and conduct quality checks on large volumes of structured data and benchmark content.
• Implement the tests, instrumentation, and safeguards necessary to trust outputs from systems that are intrinsically non-deterministic.
• Over 4 years of experience in building and delivering production software, with a strong command of Python.
• Solid understanding of software engineering principles: system and API design, data modeling, concurrency/async programming, testing strategies, debugging, and code reviews; you can manage a significant service from start to finish.
• Experience in designing and managing distributed or service-oriented systems (queues, workers, APIs) rather than merely making calls to them.
• Proficiency in designing schemas and working with relational databases, including the associated migrations and performance considerations.
• Familiarity with LLM APIs, including orchestration, structured outputs, and managing non-determinism; while effective use of LLMs is expected, this is not a prompt-engineering position.
• Capability to reason about the reliability of probabilistic systems: how to test, measure, and trust outputs that are not byte-for-byte deterministic.
• High standards for quality: you write tests, types, and documentation as a default practice and keep changes small and reviewable.
• **Bonus points for:**
  - Experience in constructing agentic or multi-agent systems, tool-use, or orchestration frameworks.
  - Background in the evaluation and benchmarking of ML or LLM systems (rubrics, golden datasets, model-as-judge, inter-rater reliability).
  - Experience with distributed task queues and asynchronous workloads.
  - Familiarity with modern Python tools and typed codebases (e.g., type checkers, linters, Pydantic, FastAPI).
  - Experience in retrieval/search and managing data ingestion pipelines.
  - Some familiarity with infrastructure aspects (Docker, CI/CD) to enable deployment of your work.
• Competitive salary.
• Training budget to enhance your skills through top tech partners like Microsoft, AWS, Salesforce, and Databricks — whether it's certifications or courses, we’ve got you covered.
• Private insurance, top-tier tech equipment, and the opportunity to collaborate with an exceptional team.
Credo AI