Staff Research Engineer, Model Efficiency at Cohere
Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.
Large Language Models (LLMs) continue to push the boundaries of what AI systems can do, but inference is still the bottleneck. The Model Efficiency team is responsible for pushing the limits of LLM inference efficiency across our foundation models. We explore and ship breakthroughs across the model execution stack, including: