LLM / Prompt Engineering Specialist
Job Description
About this role
Prompt engineering and LLM application design have become disciplines in their own right, with patterns that work, patterns that fail silently, and patterns that look fine but waste tokens. As an LLM / Prompt Engineering Specialist for AI training, you will help AI critique its own kind: prompts, system messages, and orchestration code across OpenAI, Anthropic, Google, and open-source models.
Key Responsibilities
• Generate and evaluate instruction-response pairs covering prompt structure, few-shot design, and chain-of-thought.
• Review AI-generated LLM application code (OpenAI SDK, Anthropic SDK, Vertex, Bedrock).
• Provide feedback on tool/function calling, structured outputs, and JSON-mode reliability.
• Validate AI handling of context windows, token budgeting, and caching strategies.
• Evaluate AI-generated evaluation harnesses, regression tests, and offline evals.
• Identify subtle issues in prompt injection defense, jailbreak resilience, and grounding.
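To give a flavor of the JSON-mode reliability work above, here is a minimal sketch. The `extract_json` helper is hypothetical, not part of any provider SDK: models sometimes wrap "JSON mode" output in a Markdown fence, and a reviewer would flag harnesses that pass the raw reply straight to `json.loads`.

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Parse a model reply that should be JSON, tolerating a Markdown fence.

    Models occasionally wrap JSON-mode output in ```json fences; stripping
    the fence before parsing avoids a common silent failure mode.
    """
    text = raw.strip()
    fence = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    return json.loads(text)
```

Spotting where a helper like this is missing, or where it masks a deeper prompt problem, is typical of the reviews this role performs.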
Ideal Qualifications
• 3+ years building production LLM applications with at least two major providers.
• Deep familiarity with prompt engineering patterns and failure modes.
• Strong grasp of tool use, structured outputs, and agent loops.
• Experience with eval frameworks (lm-evaluation-harness, OpenAI Evals, custom harnesses).
• Comfort with Python or TypeScript SDKs for LLM development.
• Familiarity with fine-tuning and RAG architectures is a plus.
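The eval-framework experience listed above often reduces to loops like the following minimal offline harness. `run_eval` and its signature are illustrative assumptions, not the API of any framework named here:

```python
from typing import Callable, Iterable, Tuple

def run_eval(predict: Callable[[str], str],
             cases: Iterable[Tuple[str, str]]) -> float:
    """Exact-match accuracy of predict() over (prompt, expected) pairs.

    Production harnesses add model-graded rubrics, retries, and logging,
    but prompt regression tests often start this simply.
    """
    cases = list(cases)
    hits = sum(predict(prompt).strip() == expected
               for prompt, expected in cases)
    return hits / len(cases)
```

Candidates comfortable reasoning about why a score moved between prompt revisions, not just computing it, will feel at home in this role.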
Project Timeline
• Start Date: Immediate
• Duration: Ongoing
• Commitment: Flexible, 10-25 hours/week
Contract & Payment Terms
• Independent contractor agreement
• Remote work — anywhere in eligible locations
• Weekly payment via Stripe or bank transfer
• Flexible hours
Help AI evaluate AI — apply now!