The next prompt-injection fight may not start in a chat window. It may start inside a resume.
A new ACL 2026 paper posted to arXiv studies what happens when job applicants add subtle self-promotional text to resumes processed by large language models. The text does not add real qualifications. It is designed to steer the model's evaluation anyway.
That is a sharp shift from the usual prompt-injection story. Most teams still think about hidden web-page instructions, poisoned documents, or malicious emails. Hiring systems are different. They involve high-stakes decisions, adversarial incentives, and a pile of semi-structured text that employers are already tempted to automate.
The researchers found that these injections can reliably improve rankings when the applicant pool is fairly homogeneous and only a small number of candidates use the tactic. That is exactly the uncomfortable scenario for many screening workflows: dozens or hundreds of people with similar credentials, with an LLM asked to separate them quickly.
The effect is not magic. The paper also finds that prompt injection loses power when many candidates use it, and it is less effective when candidate quality varies widely. But the important point is not that every manipulated resume wins. It is that lower-quality candidates can sometimes outrank stronger ones when the system is close to indifferent.
For companies, the lesson is blunt: an LLM resume screener is not just a classifier. It is a target. If applicants know or suspect that a model is ranking them, some will optimize for the model. That means hidden instructions, flattering phrasing, keyword stuffing, and model-specific tricks become part of the labor market.
The fix is not to ban AI from recruiting. It is to stop treating model output as neutral infrastructure. Teams using LLMs in hiring should strip or quarantine suspicious instruction-like text, compare rankings across model and prompt variants, log why candidates moved up or down, and keep humans responsible for decisions that materially affect applicants.
Vendors also need to prove more than accuracy. They should be tested against adversarial resumes, not only clean benchmark sets. A hiring product that performs well on ordinary applications but fails when candidates insert model-facing persuasion is not ready for unsupervised ranking.
This paper matters because it shows how quickly AI adoption creates a second system around the first one. Once a model becomes a gatekeeper, people learn to speak to the gatekeeper. In hiring, that does not just create noise. It can create fairness, compliance, and trust problems before anyone notices the workflow has been gamed.