Position Overview:
Join Cohere as a Data Pipeline Specialist focused on pre-training data. In this role, you will strengthen cutting-edge language models through innovative data strategies.
In this full-time position, you'll be a Member of Technical Staff dedicated to optimizing the data pipelines that underpin model training. Drawing on your experience with Python and large-scale datasets, you'll run data ablations and collaborate with top-tier engineers and researchers. Your work will directly impact advancements in natural language processing, empowering AI systems to better serve humanity.
Key Responsibilities:
• Assess data quality through rigorous data ablation procedures
• Build robust data modeling techniques for optimal training
• Research and implement advanced data curation strategies
• Work closely with cross-functional teams on language model data needs
Requirements:
• Strong software engineering expertise with Python
• Familiarity with data attributio...