AI Red-Teamer (English)
Job Description
The global reach of AI necessitates robust safety across all languages. As an AI Red-Teamer (English), you will be on the front lines, expertly probing AI systems in English to uncover subtle vulnerabilities, prevent harmful content generation, and fortify their defenses against sophisticated attacks.
Key Responsibilities
Craft and execute advanced adversarial prompts and 'jailbreaks' specifically designed to bypass English-language safety filters in AI models.
Identify and document instances where AI generates toxic, biased, or otherwise undesirable content in English.
Develop complex, multi-turn conversational attacks to test AI's coherence, factual accuracy, and resistance to manipulation.
Provide precise, actionable feedback on vulnerabilities in the AI's English-language comprehension and generation.
Categorize and analyze different types of safety failures, such as misinformation, hate speech, or privacy breaches in English outputs.
Stay updated on emerging red-teaming techniques and adversarial prompting strategies for English language models.
Ideal Qualifications
Native-level fluency in English and an exceptional command of its nuances, slang, and cultural contexts.
Demonstrated experience in prompt engineering and interacting extensively with large language models (e.g., GPT-4, Claude, Gemini).
A creative, adversarial mindset with a proven ability to think 'outside the box' to find system weaknesses.
Strong analytical skills to dissect AI responses and identify subtle failures.
Excellent written communication for detailed bug reporting and vulnerability analysis.
Background in cybersecurity, ethical hacking, or content moderation is a plus.
Project Timeline
Start Date: Within 2 weeks
Duration: 6 months (renewable)
Commitment: Flexible, 20-40 hours/week
Help secure English-language AI: become our Red-Teamer!