Remotery

AI Evaluation Engineer – Data Analysis, Multi-Agent Systems

Posted May 14

This is a fully remote position, open to applicants in Colombia.

📋 Description

• Design and create **benchmark tasks for multi-agent systems** that focus on intricate data analysis workflows.

• Develop or compile **realistic datasets** (including CSV, JSON, logs, reports, financial, or operational data).

• Construct tasks that necessitate:

  - Cross-referencing multiple data sources.

  - Anomaly detection and identification of contradictions.

  - Statistical analysis and interpretation.

• Define **task decomposition strategies** across specialized sub-agents (such as financial, technical, and operational analysis).

• Create **verification logic** to ensure accurate analytical outputs (rather than generic summaries).

• Implement evaluation pipelines utilizing **Python and SQL**.

• Develop reproducible environments using **Docker**.

• Assess task performance and refine tasks for **clarity, difficulty, and scoring precision**.


⛳️ Requirements

• Over 5 years of experience in **data analysis or analytics-focused roles**.

• Strong expertise in **Python (pandas, NumPy)** and **SQL**.

• Experience in handling **real-world, complex datasets** (such as CSV, JSON, logs, reports).

• Capability to design **analytical challenges with clear and verifiable outcomes**.

• Comprehensive knowledge of **statistics** (including distributions, correlations, and outliers).

• Familiarity with **AI benchmarks or evaluation environments** (for instance, SWE-bench or similar).

• Practical experience with **Docker** (including Dockerfiles, image builds, and debugging).

**Nice to Have**

• Experience in **financial analysis, operations analytics, or risk analysis**.

• Exposure to **data pipelines or ETL workflows**.

• Experience with **data quality validation or anomaly detection systems**.

• Familiarity with **AI/ML data workflows or evaluation frameworks**.


🏝️ Benefits

• Competitive salary and performance bonuses.

• Flexible working hours and remote work options.

• Opportunities for professional development and continuing education.

• Collaborative and innovative work environment.

• Health and wellness benefits.
