This is a fully remote position, open to applicants in Florida.

• Assess end-to-end, AI-generated coding interactions.

• Evaluate whether the outputs are:

• Useful

• Correct at a high level

• Consistent with the thought process of a strong engineer

• Analyze the quality of reasoning and explanations, not just the code itself.

• Differentiate between levels of response quality (e.g., what distinguishes a score of 2 from a score of 4).

• Offer clear and opinionated feedback regarding:

• What worked well

• What did not meet expectations

• What seemed “off” or misleading

• Help define what excellence looks like when interacting with tools such as Cursor.

• Staff or Principal-level engineer (or equivalent experience).

• Strong expertise in one of the following:

• TypeScript / JavaScript

• Python

• Practical experience with:

• OpenAI Codex

• Claude Code

• Cursor

• In-depth knowledge of modern AI-assisted development workflows.

• Able to evaluate code without needing to execute it or thoroughly review every line.

• Comfortable providing direct and opinionated feedback.

• High standards for what constitutes “good engineering.”

• Nice to Have:

• Experience with Cursor or similar AI-first IDEs.

• Previous exposure to prompt design or evaluation workflows.

• Experience mentoring senior engineers or establishing engineering standards.

• Competitive salary and performance-based bonuses.

• Flexible work hours and remote work options.

• Professional development opportunities.

• Collaborative and innovative work environment.

Senior Engineer – AI Evaluator
