How to Interview an AI Engineer When You're Not One

You're a VP of Engineering, a CTO, or a hiring manager at a company that just greenlighted an AI initiative. You need to hire someone who can build it. The problem: you're not an AI engineer yourself. You can tell a good frontend developer from a bad one because you've managed frontend teams for a decade. But LLM pipelines? RAG architectures? Fine-tuning vs. prompt engineering? You're evaluating a discipline you didn't grow up in.

The resume in front of you says "5+ years of machine learning experience." The interview is tomorrow. Here's how to not waste it.

What not to ask

Skip the textbook questions. "Explain the difference between supervised and unsupervised learning" tells you whether someone passed a Coursera course, not whether they can ship production AI.

Skip LeetCode-style algorithm problems. Being able to invert a binary tree in 20 minutes tells you very little about whether someone can build a reliable retrieval pipeline that keeps an LLM from hallucinating.

Skip "tell me about a time you used [specific framework]." Frameworks change every six months. The person who built their last project on LangChain might build your project on something else entirely. What matters is whether they understood WHY they chose the tool, not whether they can recite its API.

What to ask instead

The goal is to figure out one thing: has this person shipped production AI, or have they only experimented with it? That distinction is everything.

Here are five questions that separate the two:

1. "Walk me through the last AI system you put into production. What broke first?"

Everyone can describe what they built. The tell is whether they can describe what went wrong. Engineers who've shipped production AI have war stories: the model that performed great in testing and collapsed on real data. The embedding that drifted after three months. The LLM that started hallucinating when the context window filled up. If the candidate can only describe successes, they're describing demos, not production.

Good answer sounds like: "We deployed the recommendation engine in March. Within two weeks, we found that the embeddings were stale because our data pipeline wasn't refreshing nightly the way we assumed. We added a staleness check and a fallback to the previous model version."

Red flag: "It worked great. We got 94% accuracy." (No mention of monitoring, drift, or failure modes.)
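
If you want to picture what that good answer looks like in practice, here is a minimal Python sketch of a staleness check with a fallback. The 26-hour limit, the version labels, and the function names are all assumptions for illustration, not details from any real system.

```python
from datetime import datetime, timedelta, timezone

# Assumption: embeddings should refresh nightly, so anything older than
# 26 hours (24h plus a grace period) is treated as stale.
MAX_EMBEDDING_AGE = timedelta(hours=26)


def embeddings_are_stale(last_refreshed: datetime, now: datetime) -> bool:
    """Return True when the last refresh is older than the allowed age."""
    return now - last_refreshed > MAX_EMBEDDING_AGE


def choose_model_version(last_refreshed: datetime, now: datetime,
                         current: str = "v2", previous: str = "v1") -> str:
    """Serve the current version unless its embeddings are stale."""
    if embeddings_are_stale(last_refreshed, now):
        return previous  # fall back to the last known-good version
    return current


now = datetime(2024, 3, 15, 9, 0, tzinfo=timezone.utc)
fresh = now - timedelta(hours=8)
stale = now - timedelta(days=3)
print(choose_model_version(fresh, now))  # v2
print(choose_model_version(stale, now))  # v1
```

A candidate who has shipped will usually describe something of this shape without being prompted: a measurable condition, and a defined behavior when the condition fails.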

2. "If I gave you a new dataset tomorrow and asked you to build a prediction model, what's the first thing you'd do?"

You're screening for process, not technique. A junior AI person says "I'd try a neural network." A production engineer says "I'd look at the data quality, check for missing values and bias, figure out what the business actually needs to predict, and decide whether this even needs ML or whether a rules-based approach would be simpler and more reliable."

Good answer starts with: data exploration, problem framing, and questioning whether AI is the right tool.

Red flag: jumps straight to model architecture without asking about the data.
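
For a sense of what "look at the data first" means concretely, a first pass can be as small as the sketch below. It assumes the dataset arrives as a list of dictionaries and that an empty string or None counts as missing; the column names and the `profile_rows` helper are hypothetical.

```python
from collections import Counter


def profile_rows(rows, label_key):
    """Summarize the missing-value rate per column and the label distribution."""
    if not rows:
        raise ValueError("no rows to profile")
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    missing_rate = {
        col: sum(1 for row in rows if row.get(col) in (None, "")) / total
        for col in columns
    }
    label_counts = Counter(
        row[label_key] for row in rows if row.get(label_key) not in (None, "")
    )
    return {
        "rows": total,
        "missing_rate": missing_rate,
        "label_counts": dict(label_counts),
    }


# Tiny made-up sample: one missing age, one missing plan, imbalanced labels.
rows = [
    {"age": 34, "plan": "pro", "churned": "no"},
    {"age": None, "plan": "free", "churned": "no"},
    {"age": 51, "plan": "", "churned": "yes"},
    {"age": 29, "plan": "free", "churned": "no"},
]
report = profile_rows(rows, label_key="churned")
print(report)
```

Nothing here involves a model. That is the point: the missing-value rates and the label imbalance shape every decision that comes afterward.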

3. "How would you explain your last project to a product manager who doesn't have a technical background?"

This tests communication, not intelligence. AI engineers who can only explain their work to other AI engineers are a management burden. You need someone who can sit in a product review and turn "we reduced the perplexity score by 18%" into a statement about user impact, such as "the chatbot gives wrong answers 40% less often now," and who can say how that second number was measured rather than assuming it follows from the first.

Good answer: uses business outcomes, not metrics. Talks about what the system does for users, not how it works internally.

Red flag: defaults to jargon and can't simplify without getting frustrated.

4. "What's a project where you decided NOT to use AI, and why?"

This is the single best question for separating real engineers from AI enthusiasts. Someone who's built production systems knows that AI is expensive, hard to maintain, and often worse than a simpler solution. They've been in a room where someone wanted to use a neural network and they pushed back with "a SQL query and a threshold would solve this in two hours."

Good answer: gives a specific example and explains the tradeoff (cost, latency, reliability, maintenance burden).

Red flag: can't think of one. Every problem they've encountered apparently needed AI.
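
To make the "SQL query and a threshold" alternative concrete, here is a rough sketch using Python's built-in sqlite3 module. The `orders` table, the sample rows, and the 1,000 spending threshold are invented for illustration.

```python
import sqlite3

# Hypothetical business rule: flag any customer whose total spend exceeds this.
SPEND_THRESHOLD = 1000.0


def flag_high_spenders(conn, threshold=SPEND_THRESHOLD):
    """Return customer ids whose summed order amounts exceed the threshold."""
    query = (
        "SELECT customer_id FROM orders "
        "GROUP BY customer_id HAVING SUM(amount) > ? "
        "ORDER BY customer_id"
    )
    return [row[0] for row in conn.execute(query, (threshold,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 40.0), ("a", 55.0), ("b", 900.0), ("b", 1200.0), ("c", 20.0)],
)
print(flag_high_spenders(conn))  # ['b']
```

There is no training data, no retraining, and no drift to monitor. When a rule like this meets the business need, an experienced engineer will say so.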

5. "What happens to your model six months after deployment? Who owns it?"

Production AI rots. Data distributions shift. User behavior changes. Upstream APIs get deprecated. A model that worked in January can silently degrade by July. The question is whether the candidate thinks about this.

Good answer: talks about monitoring, retraining schedules, alerting on performance drift, data pipeline ownership, and who gets paged when the model starts producing bad output.

Red flag: "We trained it and deployed it." No mention of what happens after.
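
The simplest version of "alerting on performance drift" might look like the sketch below: compare recent accuracy on labeled samples against the accuracy measured at launch, and raise an alert when the gap gets too wide. The 0.94 baseline echoes the example earlier in this article; the 5-point tolerance and the function names are assumptions.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def check_for_drift(baseline_accuracy, recent_predictions, recent_labels,
                    max_drop=0.05):
    """Return an alert message if recent accuracy fell too far below baseline."""
    recent = accuracy(recent_predictions, recent_labels)
    if baseline_accuracy - recent > max_drop:
        return (f"ALERT: accuracy {recent:.2f} is below "
                f"baseline {baseline_accuracy:.2f}")
    return None  # within tolerance, nobody gets paged


labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
degraded = [1, 0, 1, 1, 0, 1, 0, 0, 0, 0]  # 8 of 10 correct
print(check_for_drift(0.94, degraded, labels))
print(check_for_drift(0.94, labels, labels))  # None
```

Real monitoring is more involved (labels often arrive late, and input distributions need watching too), but a candidate with production experience should be able to describe at least this much, plus who receives the alert.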

Red flags that don't require technical knowledge

Even without AI expertise, you can spot these in any interview:

Can't name a failure. Everyone who's shipped real systems has broken something. If every project was a success, the projects weren't real.
Talks about research, never production. "I published a paper on..." is great for an R&D lab. You're hiring for a production team. Ask: "Was that paper's approach ever deployed? What happened?"
Knows one framework deeply but can't explain tradeoffs. "I use PyTorch for everything" is not a technical opinion. It's a comfort zone.
Can't estimate timelines. Ask: "How long would it take to build a basic RAG pipeline for our internal docs?" A production engineer gives a range ("two to four weeks for a working prototype, another month to harden it"). A non-production person says "it depends" and can't narrow it.

Green flags that don't require technical knowledge

They ask YOU questions about your data before discussing solutions. The best AI engineers interview the problem before they interview for the job.
They've turned down or simplified an AI project. Restraint is a signal of experience.
They talk about infrastructure and data as much as models. As a rule of thumb, production AI is roughly 20% model and 80% everything around the model. If the candidate only talks about the 20%, they've only done the fun part.
They have opinions about testing. "How do you test an AI system?" is a question most AI candidates have never been asked. The ones who have good answers have shipped.
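
One answer you might hear to the testing question is a "golden set": a fixed list of questions with known-good answers that the system must keep passing before each release. The sketch below assumes a substring match is good enough and uses a canned stand-in for the model; `fake_model`, the questions, and the 90% release bar are all hypothetical.

```python
def fake_model(question: str) -> str:
    """Stand-in for a real model call; hypothetical, for illustration only."""
    canned = {
        "What is our refund window?": "Refunds are accepted within 30 days.",
        "Do you ship to Canada?": "Yes, we ship to Canada.",
    }
    return canned.get(question, "I don't know.")


# Each case pairs a question with text the answer must contain.
GOLDEN_SET = [
    ("What is our refund window?", "30 days"),
    ("Do you ship to Canada?", "Yes"),
    ("Who is the CEO?", "I don't know"),  # the model should admit ignorance
]


def pass_rate(model, cases):
    """Fraction of cases where the model's answer contains the expected text."""
    passed = sum(
        1 for question, expected in cases
        if expected.lower() in model(question).lower()
    )
    return passed / len(cases)


rate = pass_rate(fake_model, GOLDEN_SET)
assert rate >= 0.9, f"pass rate {rate:.0%} is below the release bar"
print(f"pass rate: {rate:.0%}")  # pass rate: 100%
```

The details vary widely between teams. What you are listening for is that the candidate has some repeatable check that runs before a change reaches users.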

Or skip the guesswork

I've been running our Adobe AI staffing practice since 2018. This is what my team does hundreds of times a year. If you'd rather not run a 5-question framework yourself, send us the role. We screen for production experience, not keyword matches, and we deliver screened resumes in 48 hours through our AI staffing team.

Contact Focus GTS. Or explore Navigator if you want the specialist embedded in your team on a subscription basis.

The interview is tomorrow. At least now you know what to ask.