
AI Product Engineer
Posted 6 days ago

This is a fully remote position, open to applicants in Canada.
• Develop agents that analyze incidents: they identify anomalies, answer the question "why is production down?", and use ClickStack as their foundation.
• Create skills, not merely prompts. Construct a repository of reusable skills that encapsulates our team's debugging processes, root cause analysis, ClickHouse query writing, and incident response procedures, enabling agents to adopt the appropriate playbook without starting from scratch.
• Take full ownership of the agent stack from start to finish. This includes context engineering, tool design, evaluations, tracing, and cost management. You are accountable for the agent's performance in production.
• Enhance ClickStack to be an exceptional platform for running AI workloads. Design and build the MCP servers, SDKs, and integrations that enable customers' agents to read telemetry, take actions, and maintain observability.
• Operate transparently. Work collaboratively with OSS contributors and customers, troubleshoot their issues alongside them, and integrate your findings back into the product.
• Tackle the hard problems: latency, cost, context window limits, evaluation coverage, and inaccuracies on real telemetry.
• A minimum of 5 years in software engineering, including 1–2 years working with LLM-powered systems or agents in production environments.
• Strong backend development skills in TypeScript/Node.js and/or Python. Comfortable working in both languages, even if one is your primary focus.
• Practical experience in building agents, including multi-step tool usage, planning, memory management, and error recovery. You've successfully deployed them and navigated their failure modes.
• Experience designing skills (such as Anthropic-style skills or other Markdown-based workflow encodings), with a clear understanding of when to use a skill, a tool, or a combination of both.
• Familiarity with MCP, including server building, tool design, and considerations for authentication, scoping, and observability within agentic systems.
• Strong evaluation practices, including the use of golden sets, LLM-as-judge methodologies, and regression detection.
• Proficient in SQL — able to write ClickHouse queries directly.
• Comfortable with Docker and Kubernetes.
• Active involvement in open source projects and the developer community.
• Flexible work environment - ClickHouse is a globally distributed company that supports remote work. We currently operate in over 20 countries.
• Healthcare - Employer contributions towards your healthcare costs.
• Equity in the company - Every new team member receives stock options upon joining.
• Time off - Flexible time-off policies in the US and generous entitlements in other countries.
• A $500 allowance for home office setup for remote employees.
• Global Gatherings - We believe in the importance of in-person connections and provide opportunities for engagement with colleagues during company-wide offsites.