OneForma | Remote, New Zealand | Posted May 22, 2026
Position Overview
- Language: en_NZ (resource must be from New Zealand with expert-level English)
- The task scope has two parts:
- Constructing adversarial prompts to test the LLM
- Adversarial Prompt Engineering
- Design targeted attack prompts across risk categories: harmful content, bias, misinformation, privacy violations, policy circumvention, etc.
- Apply diverse attack strategies, including role-playing, multi-turn manipulation, chain-of-thought techniques, and more
- Conduct systematic adversarial attacks to identify security vulnerabilities, safety gaps, and operational risks in GenAI systems
- Execute multi-vector attack scenarios including prompt injection, jailbreaking, context manipulation, and edge case exploitation
- Native linguists need to review system responses and perform fix verification
- Review system responses to identify harmful content, safety violations, and policy breaches
- Classify...