Build AI applications with fine-tuning, RAG, and model optimization expertise
LLM Engineers design, deploy, and optimize large language model systems at scale. Working alongside ML teams at companies like OpenAI, Anthropic, Google, and Stripe, they focus on fine-tuning models, implementing retrieval-augmented generation (RAG), optimizing prompts, managing vector databases, and evaluating model performance. Day-to-day involves writing Python code, working with APIs (OpenAI, Anthropic, Hugging Face), conducting experiments with prompt engineering, and ensuring production systems handle millions of inferences reliably.
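To make the RAG work described above concrete, here is a minimal, illustrative sketch of the retrieval step — toy 3-dimensional embeddings and hypothetical document names, not a production pipeline, which would use a vector database and real model embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, docs, k=2):
    """Rank (text, embedding) pairs by similarity to the query; return top-k texts."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy corpus; in production the embeddings come from a model and live in
# a vector store (Pinecone, Weaviate, Milvus).
docs = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.8, 0.1]),
    ("api rate limits", [0.0, 0.2, 0.9]),
]
print(retrieve([0.85, 0.15, 0.0], docs, k=1))  # ['refund policy']
```

The retrieved texts are then injected into the prompt as context before the model call — that is the "augmented generation" half of RAG.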
LLM Engineer with 4+ years scaling language model systems at OpenAI and Anthropic. Expert in RAG architecture, fine-tuning, and prompt optimization, delivering 40% latency reduction and 35% cost savings across production systems. Passionate about building next-generation AI applications with measurable business impact.
Recruiters want three things: First, hands-on experience with production LLMs (GPT, Claude, Llama) and their APIs—not just theory. Second, quantified impact: latency improvements, cost reductions, or accuracy gains backed by numbers and percentages. Third, full-stack capability—from fine-tuning to vector databases to prompt optimization. They also value evaluation rigor: candidates who measure performance rather than guess. Experience with scaling (millions of inferences), monitoring, and A/B testing separates senior candidates. Research the company's model preferences beforehand—candidates targeting OpenAI should highlight GPT experience, while those targeting Anthropic should emphasize Claude.
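Since latency figures come up in both CV bullets and monitoring dashboards, here is a minimal sketch of how a p95 latency number might be computed from raw inference timings — nearest-rank method, illustrative sample values:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are <= it."""
    s = sorted(latencies_ms)
    idx = max(0, math.ceil(p / 100 * len(s)) - 1)
    return s[idx]

# Illustrative per-request inference latencies in milliseconds.
samples = [120, 90, 300, 110, 95, 250, 100, 105, 115, 98]
print(percentile(samples, 50))  # median: 105
print(percentile(samples, 95))  # tail latency: 300
```

Reporting p95 or p99 rather than the mean is what distinguishes "optimized latency" from a measured claim: tail latency is what users of a high-volume inference service actually feel.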
Core technical skills: fine-tuning frameworks (LoRA, QLoRA), RAG systems, vector databases (Pinecone, Weaviate, Milvus), prompt engineering, and model evaluation. APIs: OpenAI, Anthropic, Hugging Face Inference. Languages: Python (essential), with PyTorch and TensorFlow experience. Tools: LangChain, LlamaIndex, ONNX, and bitsandbytes for quantization. Soft skills matter too—communication (explaining model performance to non-technical stakeholders), documentation, and experimentation rigor. Don't list outdated skills; focus on models from 2023 onward (GPT-3.5, GPT-4, Claude, Llama 2). Include the specific model sizes you've worked with (fine-tuning a 7B model vs. a 70B model requires different expertise).
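To illustrate the idea behind the quantization tooling listed above, here is a pure-Python sketch of symmetric int8 quantization — the concept libraries like bitsandbytes implement efficiently, not that library's actual API:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats into [-127, 127] with one scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]

# Illustrative weight values.
weights = [0.5, -1.27, 0.02]
quantized, scale = quantize_int8(weights)
print(quantized)  # [50, -127, 2]
```

Storing one byte per weight instead of four (or two) is why quantization cuts memory and serving cost, at the price of the small rounding error visible after a dequantize round-trip.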
Mistake 1: Vague bullets like 'worked on LLM projects' with no metrics—always include numbers. Mistake 2: Claiming expertise in 20 tools when depth matters more; pick the 8-10 you genuinely know. Mistake 3: Omitting evaluation rigor—recruiters want to see that you test, measure, and iterate, not just deploy. Mistake 4: Forgetting business impact. 'Optimized latency' is weak; 'reduced inference latency from 2.1s to 0.8s, improving user adoption by 22%' is strong. Mistake 5: Listing only recent work—include earlier roles where you built your ML foundations. Mistake 6: No mention of production experience; side projects matter less than serving real users at scale.
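The evaluation rigor in Mistake 3 can start as something very small: a scripted metric you rerun on every prompt or model change. A minimal sketch, using exact-match accuracy over hypothetical model outputs and gold answers:

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions matching the reference exactly,
    after stripping whitespace and lowercasing."""
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)

# Hypothetical model outputs vs. gold answers.
preds = ["Paris", "4", "blue "]
refs = ["paris", "5", "Blue"]
print(round(exact_match_accuracy(preds, refs), 2))  # 0.67
```

Even a toy harness like this turns "it seems better" into a number you can track across prompt versions — exactly the test-measure-iterate habit recruiters look for.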
Use a clean, single-column layout with clear date formatting (Month Year). Lead each bullet with action verbs: Architected, Optimized, Deployed, Fine-tuned, Evaluated. Keep bullets to 12-20 words—recruiters scan in 6 seconds. Quantify everything: numbers, percentages, currency, latency (ms), or accuracy metrics. Group related experiences together logically by tech stack, not chronology. Use consistent date ranges and avoid employment gaps without context. Include a 2-3 sentence professional summary highlighting your strongest achievement. For skills, use 2-3 columns to save space. Add relevant GitHub links or portfolio projects showing LLM work. Bold company names and your job title for scannability.
Figures in USD. Ranges reflect mid-level experience (3–7 years). Senior roles and major metro areas typically sit at the top of these bands.
Target OpenAI, Anthropic, Google DeepMind, and Stripe—they hire LLM Engineers aggressively. For OpenAI roles, emphasize scale: processing millions of API calls, optimizing embedding indexes with 50M+ vectors. Anthropic values safety-focused work and careful evaluation; highlight your model evaluation frameworks and testing rigor. Google prefers candidates experienced with Gemini or PaLM APIs plus ML infrastructure; showcase infrastructure work. Stripe wants LLM solutions that improve fraud detection, customer support, or content moderation—quantify business impact. Tailor your CV to show relevant API experience and the specific model versions you've worked with.