Leadership
How do you upskill a QA team in AI testing when most of them have no ML background?
You structure learning around three tiers: conceptual literacy (enough to understand AI behaviour without ML maths), practical skills (prompt engineering, eval writing, AI tool use), and specialist depth (for 1–2 engineers who go deep on ML testing). You deliver through doing — real projects, not theory courses.
Why it exists:
QA engineers don't need to be ML engineers to test AI features effectively. They need enough understanding to design good tests, interpret results, and spot failure modes. Over-investing in ML theory for the whole team is expensive and demoralising; under-investing leaves the team unable to do their jobs.
Walked-through example:
``text
QA AI literacy upskill programme — 3 tiers:
Tier 1 — Conceptual literacy (whole team, 4 hours):
Topics: what is an LLM, what is a token, what is temperature, what is hallucination,
what is RAG, why AI is non-deterministic.
Format: 2 × 2-hour workshops with live demos.
Goal: enough to understand what they're testing.
Tier 2 — Practical skills (whole team, 8 hours over 4 weeks):
Module 1: Writing effective prompts for test planning (hands-on, 2h)
Module 2: Writing golden-set evaluations (hands-on with a real feature, 2h)
Module 3: Testing AI features — safety, bias, hallucination (case studies, 2h)
Module 4: Integrating LLM API calls into Playwright test scripts (lab, 2h)
Format: do-first sessions — each module produces a real work output.
Tier 3 — Specialist depth (1–2 senior engineers, self-directed):
Topics: vector databases, embedding models, fine-tuning risks, eval frameworks (LangSmith).
Format: book club + 1:1 mentoring with a ML engineer.
Timeline: 3-month deep-dive, quarterly review.
``
Real-world QA use case:
A QA manager runs this programme for her 8-person team before the company ships its first AI feature. After the Tier 2 workshops, three engineers independently identify prompt injection vulnerabilities in a new feature — they knew what to look for because they'd practised it in Module 3. The team goes from "nervous about AI" to "confident AI testers" in 6 weeks.
Rule of thumb: literacy before tools, tools before depth — a team that understands what they're testing will adopt tools faster than one given tools without understanding.
💡 Plain English: Learning to drive: you don't need to understand how the engine works to drive safely, but you do need to understand what a red light means and how the car behaves in the rain. Theory serves the practice.