Hiring an 'Agent Engineer': JD and Skills Profile

The role exists, sort of. This piece covers the skills, the interview signals, and the JD template, including the parts that should differ between platform teams and product teams.

The skills profile

Four skills define the agent engineer: strong software engineering (the agent is a software system, not just a prompt, so engineering rigour matters); evals fluency (knows how to design test suites, score outputs, and detect regressions; this is the rare skill); prompt engineering (writes and refines prompts; less rare than evals fluency but still uncommon at depth); and production operations (knows what observability, on-call, and SLOs mean; otherwise the agent ships without them).
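To make "design test suites, score outputs, detect regressions" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names EvalCase, score_output, run_suite, and find_regressions are hypothetical, the substring scorer stands in for whatever grading a real suite would use, and the two stub agents replace real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalCase:
    """One test case: a prompt plus a substring the output must contain."""
    name: str
    prompt: str
    expected: str


def score_output(output: str, case: EvalCase) -> float:
    """Toy scorer (assumption): 1.0 if the expected substring appears, else 0.0."""
    return 1.0 if case.expected.lower() in output.lower() else 0.0


def run_suite(agent: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, float]:
    """Run every case through the agent and return a score per case name."""
    return {case.name: score_output(agent(case.prompt), case) for case in cases}


def find_regressions(
    baseline: Dict[str, float], current: Dict[str, float], tolerance: float = 0.0
) -> List[str]:
    """Names of cases whose score dropped by more than `tolerance`."""
    return sorted(
        name
        for name, old_score in baseline.items()
        if current.get(name, 0.0) < old_score - tolerance
    )


# Hypothetical cases for a ticket-routing agent.
CASES = [
    EvalCase("refund", "Customer wants money back for order 17", "refund"),
    EvalCase("outage", "Site is down for everyone", "incident"),
]


def agent_v1(prompt: str) -> str:
    """Stub for the baseline agent (stands in for a real model call)."""
    return "Route to incident queue" if "down" in prompt else "Route to refund queue"


def agent_v2(prompt: str) -> str:
    """Stub for a changed prompt that no longer recognises outages."""
    return "Route to refund queue"


baseline_scores = run_suite(agent_v1, CASES)
current_scores = run_suite(agent_v2, CASES)
regressions = find_regressions(baseline_scores, current_scores)
print(regressions)
```

A candidate with real evals experience will usually go well beyond this, for example into graded rubrics, sampling variance, and how large a suite has to be before a score change means anything.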

Interview signals

Four signals separate real candidates from pretenders. Three are questions: “Tell me about an eval suite you built” (real candidates have stories, pretenders have generalities); “How do you debug a prompt that started failing in production?” (real candidates describe a methodology, pretenders describe symptoms); “What is the cost trade-off between Haiku and Opus for triage?” (real candidates have an opinion with reasons, pretenders quote vendor pages). The fourth is a take-home exercise: design an agent for X workflow.
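An "opinion with reasons" on the cost question usually rests on arithmetic like the following sketch. It is illustrative only: the prices are placeholders, not vendor prices, the token counts and escalation rate are made up, and the names ModelPricing, cost_per_call, and tiered_cost_per_ticket are hypothetical. It assumes a tiered setup in which every ticket goes to the smaller model and a fraction is escalated to the larger one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Per-token pricing for one model, in USD per million tokens."""
    name: str
    usd_per_million_input_tokens: float
    usd_per_million_output_tokens: float


def cost_per_call(pricing: ModelPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single call with the given token counts."""
    total = (
        input_tokens * pricing.usd_per_million_input_tokens
        + output_tokens * pricing.usd_per_million_output_tokens
    )
    return total / 1_000_000


def tiered_cost_per_ticket(
    small: ModelPricing,
    large: ModelPricing,
    input_tokens: int,
    output_tokens: int,
    escalation_rate: float,
) -> float:
    """Expected cost when every ticket hits the small model and a fraction
    `escalation_rate` is re-run on the large model."""
    return cost_per_call(small, input_tokens, output_tokens) + escalation_rate * cost_per_call(
        large, input_tokens, output_tokens
    )


# PLACEHOLDER prices (assumption): substitute current figures from the vendor.
SMALL = ModelPricing("small-model", 1.0, 5.0)
LARGE = ModelPricing("large-model", 10.0, 50.0)

# Made-up triage workload: 2,000 input tokens and 200 output tokens per ticket.
small_only = cost_per_call(SMALL, 2000, 200)
large_only = cost_per_call(LARGE, 2000, 200)
tiered = tiered_cost_per_ticket(SMALL, LARGE, 2000, 200, escalation_rate=0.2)

print(f"small only: ${small_only:.4f} per ticket")
print(f"large only: ${large_only:.4f} per ticket")
print(f"tiered:     ${tiered:.4f} per ticket")
```

The arithmetic is only half of the answer. A strong candidate also says what triage accuracy each option delivers on their own eval suite, because a cheap misroute can cost more than the tokens saved.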

JD for platform team

The platform-team JD is infra-flavoured. Focus: build the agent platform that other teams use; think infrastructure engineer with an LLM specialty. Skills: distributed systems, observability, and API design, with LLMs as one component among many. Compensation: senior software engineer level with an LLM premium of 10-15%.

JD for product team

The product-team JD is workflow-flavoured. Focus: build agents for specific workflows; think feature engineer with an LLM specialty. Skills: domain expertise, prompt engineering, and eval design, with LLMs as the core tool. Compensation: senior software engineer level with a domain premium based on the product area.

Retaining them

Three things keep agent engineers engaged. Give them ownership of the eval suite (without owning quality, they cannot improve it); give them visibility into the impact (their work changes MTTR, so show them the chart); and give them time for craft (prompt and eval engineering reward iteration, and under-resourcing produces mediocrity).