DevOps Engineer - AI Model Evaluator

Obsidian | Helsinki, Finland | Posted July 01, 2026

Position Overview

About the Role

Mercor is partnering with a leading AI research lab to support a Frontier Code Agents project.
Contributors help evaluate and improve frontier AI coding models through structured technical assessments.
The work focuses on realistic infrastructure engineering workflows and model evaluation.
Spots are limited and filling quickly on a first-come, first-served basis.

What You'll Do

Use frontier AI coding agents to complete and evaluate complex infrastructure engineering tasks.
Review model-generated implementations involving cloud platforms, Kubernetes, CI/CD systems, observability, and infrastructure automation.
Identify bugs, edge cases, reliability issues, and failure modes.
Compare outputs from multiple frontier models and assess their strengths and weaknesses.
Apply professional engineering judgment to realistic infrastructure engineering scenarios.
<...