You Can't Delegate Accountability to an Agent

AI can support decisions and speed up work, but accountability for outcomes still belongs to people and teams.

Robbin Schuurman

Updated June 19, 2026

The Future of AI-Powered Product Development

Article 2 of 9 in the series.

Three sentences are showing up in product teams right now. "The agent handled that." "Copilot manages our backlog now." "The coding agent shipped it." They sound modern, efficient, and forward-looking. They are actually accountability drift, dressed up in new vocabulary.

In the first article of this series, I argued that the Product Owner is about to have, for the first time in Scrum's thirty-year history, the time to do the job the Scrum Guide describes. I'm not talking about Backlog Management, but real Value Maximization. As AI agents take on more of the work, the three accountabilities in Scrum do not get smaller. They get sharper, more consequential, and more exposed. Who is accountable for what the agent just did?

This is a visionary piece, not a predictive one. I cannot tell you exactly how accountability will evolve over the next five years. I can share how a room full of experienced Professional Scrum Trainers thinks it must evolve, and why the answer matters for your team this Sprint. The full list of contributors from the Amsterdam Face-to-Face Event sits at the foot of the article.

The Scrum Guide Got This Right in 2020

The Scrum Guide is unusually clear on one point, and it has aged remarkably well. It names three accountabilities: the Product Owner, the Scrum Master, and the Developers. Three accountabilities. Not three roles. Not three job titles. Accountabilities.

Make no mistake: the choice of word was deliberate, careful, and, in hindsight, future-ready. The Scrum Guide has always insisted that someone must answer for the outcome. AI does not change that commitment. AI raises the stakes, because AI-powered teams are less 'in control' of the work than teams are today.

Responsibility Is Not Accountability

Responsibility is who does the task. Accountability is who owns the outcome and must answer when it is good, when it is bad, or when it is challenged. The distinction is old, boring, and load-bearing.

An agent can take on responsibility. It can draft a Product Backlog item, refine acceptance criteria, generate code, or write the release notes. All of that is responsibility: task execution. What an agent cannot do is stand in a Sprint Review and defend the judgment call behind the work, stand in front of a customer whose trust has been broken, or stand in front of an auditor whose compliance question has just landed. Those are accountability moments. They require a named human, every time.

The Three Accountabilities, Re-Read for an AI-Powered World

The three Scrum accountabilities hold up cleanly when you re-read them with agents in the room.

Product Owner, accountable for the product's value. An agent can draft items, generate variants, and score them against a Definition of Value (Article 1 of this series). Only the Product Owner is accountable for whether the product creates real, adopted, and proven value. That means truly understanding and empathizing with customers and users. It also means ensuring that the product does not cross ethical boundaries.

Scrum Master, accountable for the team's effectiveness. An agent can run retrospective analytics, summarize sentiment, and flag patterns. Only the Scrum Master is accountable for whether the team (including its AI agents) becomes better, stronger, and more capable over time, and for whether it works within ethical boundaries, with courage, commitment, openness, focus, and respect.

Developers, accountable for the Increment. An agent can write code, scaffold tests, and generate documentation. Only the Developers are accountable for whether the Sprint's Increment meets the agreed-upon quality standards, is built in a scalable and maintainable way, and is worth releasing.

Re-read carefully, these accountabilities do not shrink in an AI-powered world. They concentrate. The humans who hold them are not being replaced. They are being held to a higher standard of judgment, defense, and ownership.

One shape of team this concentration makes plausible is worth naming. In the next five years, it is not unreasonable to expect teams composed of several Product Managers and a single Developer, supported by a bank of specialized agents, rather than the classical three-to-nine-Developer team of today. That is not an argument against Developers. It is an argument that the accountability for what to build, and whether it created value, is becoming work that scales with people, while the work of producing the Increment is becoming work that scales with agents. Teams that redesign with that in mind will look different from teams that do not.

The Accountability Test

Put a short test on the wall, next to the Definition of Value from the first article. For any agent-produced artifact, ask three questions.

Explain. Can a named human on the team explain what the agent produced, and why?
Defend. Can that human defend the choice to a customer, an auditor, or a teammate who disagrees?
Redo. Can that human redo the work if the agent disappears tomorrow?

If any answer is no, accountability is drifting. The work is happening. The ownership is not. That gap is where trust, quality, and compliance quietly fail.
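
For teams that track agent-produced work in tooling, the three questions can be written down as a small check. The following is a minimal Python sketch, not a prescribed implementation: the `AgentArtifact` class, its field names, and the `accountability_gaps` function are hypothetical names invented for this illustration, and the answers are assumed to be recorded honestly by the named human.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentArtifact:
    """An agent-produced artifact plus the answers to the three questions.

    All names here are illustrative assumptions, not part of any Scrum tool.
    """
    title: str
    accountable_human: Optional[str]  # a named person, or None if nobody signed off
    can_explain: bool = False
    can_defend: bool = False
    can_redo: bool = False


def accountability_gaps(artifact: AgentArtifact) -> List[str]:
    """Return the questions that fail. An empty list means the test passes."""
    if not artifact.accountable_human:
        # Without a named human, none of the three questions can be answered.
        return ["Explain", "Defend", "Redo"]
    answers = {
        "Explain": artifact.can_explain,
        "Defend": artifact.can_defend,
        "Redo": artifact.can_redo,
    }
    return [question for question, ok in answers.items() if not ok]


# Hypothetical example: the human can explain and redo the item,
# but cannot yet defend the design choice.
item = AgentArtifact(
    title="Access controls for the enterprise tier",
    accountable_human="Dana",
    can_explain=True,
    can_redo=True,
)
print(accountability_gaps(item))  # ['Defend']
```

Any non-empty result is the drift described above: the work exists, but the ownership has a gap that is worth closing before the artifact enters a Sprint.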

"The AI Did It" Is Not Going to Work

I want to be direct about a pattern I am seeing more often. Executives, and occasionally Product Owners, are starting to speak about AI as a deflection mechanism. The phrase "the AI did it" is creeping into incident reviews, customer complaints, and vendor discussions.

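Legally, this does not hold. The EU AI Act places its obligations, including human oversight of high-risk systems, on the organizations that provide and deploy AI, not on the AI itself. The NIST AI Risk Management Framework, although voluntary, likewise treats human accountability as a core part of trustworthy AI. Ethically, it does not hold either. An agent is a tool. A hammer does not miss the nail. A person swinging a hammer does.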

And in front of customers, it absolutely does not hold. A customer whose data was exposed, whose money was lost, or whose trust was broken does not want to hear that an agent handled it. They want to hear that a human on your team owns what happened and is going to make it right.

What This Looks Like in Practice

Picture a SaaS team using an agent to generate Product Backlog items from customer feedback. The agent produces a well-written item about access controls for a new enterprise tier. The item passes the team's Definition of Done. The work ships. Three weeks later, a prospective customer asks a direct question in a procurement call: why did you choose to allow a certain permission pattern? Nobody on the team can answer. The agent chose it. No human ever deeply understood the choice.

The team loses the deal. More importantly, they lose the thread.

The Sprint after, they run the Accountability Test on every agent-produced artifact before it enters a Sprint. A Developer explains the item, defends the design choice, and confirms they could redo it without the agent. The Product Owner signs off on the value assumption. The Scrum Master flags, at the retrospective, that the team had grown too comfortable letting the agent decide. Refinement slows down by about ten minutes per item. Customer trust goes back up, and the deal cycle follows.

That is the difference holding accountability makes.

Four Things You Can Do in Your Next Sprint

Tag every agent-produced artifact with the accountable human. Item, code change, test, documentation. If an agent generated it, a named human signs off. Simple, visible, and surprisingly hard to argue against.
Re-read the three accountabilities as a team, with AI in the room. Twenty minutes in your next retrospective. Walk each accountability. Ask: where is the agent taking responsibility, and where is the human still holding accountability? The gaps show themselves.
Apply the Accountability Test to one recent agent-produced item. Pick a Product Backlog item, a code change, or a release note produced with agent assistance. Run Explain, Defend, Redo. You will likely find one or two places where no human can fully do all three. Sharpen those first.
Write a rollback policy, and write the guardrails. When an agent's work fails on value, on effectiveness, or on Increment quality, who decides to pull it? Who communicates to customers? Who documents the learning? Name the humans in advance. Do not discover them mid-incident. And when the agent is one that can adjust its own behavior over time, the team's rollback policy needs a sibling: a set of explicit guardrails, reviewed on a cadence, that define the range inside which the agent is allowed to operate autonomously and the points at which a human must approve before it moves on. Self-improving systems without guardrails are not an efficiency gain. They are a governance incident waiting for its date.
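
The rollback policy and the guardrails from the last item can be made explicit enough to review. The sketch below is one possible shape in Python, offered under stated assumptions: the `RollbackPolicy` and `Guardrails` classes, the action names, the people's names, and the 14-day review cadence are all hypothetical. Blocking any action that is not listed is a design choice of this sketch, not something the article prescribes.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import FrozenSet, List


@dataclass(frozen=True)
class RollbackPolicy:
    """Named humans for each rollback question (hypothetical structure)."""
    decides_rollback: str
    communicates_to_customers: str
    documents_learning: str

    def unnamed(self) -> List[str]:
        """Return the questions that still have no named human."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


@dataclass(frozen=True)
class Guardrails:
    """The range in which an agent may act alone (hypothetical structure)."""
    autonomous_actions: FrozenSet[str]
    approval_required: FrozenSet[str]
    approver: str
    last_reviewed: date
    review_every_days: int

    def decide(self, action: str) -> str:
        if action in self.approval_required:
            return f"needs approval from {self.approver}"
        if action in self.autonomous_actions:
            return "autonomous"
        # Assumption of this sketch: anything not listed is denied by default.
        return "blocked"

    def review_overdue(self, today: date) -> bool:
        """True when the guardrails have not been reviewed within the cadence."""
        return (today - self.last_reviewed).days > self.review_every_days


policy = RollbackPolicy("Sam", "Priya", "")
rails = Guardrails(
    autonomous_actions=frozenset({"draft_backlog_item", "generate_tests"}),
    approval_required=frozenset({"change_permissions", "deploy_to_production"}),
    approver="Sam",
    last_reviewed=date(2026, 6, 1),
    review_every_days=14,
)

print(policy.unnamed())                        # ['documents_learning']
print(rails.decide("generate_tests"))          # autonomous
print(rails.decide("deploy_to_production"))    # needs approval from Sam
print(rails.decide("delete_customer_data"))    # blocked
print(rails.review_overdue(date(2026, 6, 19))) # True
```

The useful part is less the code than the fact that every blank name and every overdue review becomes visible before an incident rather than during one.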

The Turn

For years, the prevailing story about AI in product development has been one of replacement. Agents will do the backlog. Agents will do the code. Agents will do the review. I do not believe that story, and neither does the room that spent several days debating this in Amsterdam.

The more accurate story is concentration. As agents do more of the execution, accountability does not spread out. It concentrates onto a smaller number of human judgment points. The three accountabilities in Scrum are exactly those judgment points. They are, in an AI-powered world, the sharpest, most exposed, and most non-negotiable parts of the team's operating model.

Over to You

Think about the last agent-produced artifact your team shipped. Who was accountable for it? Could that person pass the Accountability Test today? If not, that is worth a conversation in your next retrospective.

The next article in the series goes inside the Scrum event where accountability is most often quietly abdicated: refinement. When an agent can generate five viable solutions in the time it takes to read one Product Backlog item, refinement stops being a task-decomposition exercise. It becomes an option-framing exercise. See you there.

Contributors

This article was created based on the Scrum.org PST Face-to-Face Event #137 in Amsterdam. It would not have been possible without the discussions with: Dave West, Merel van de Wiel-Riedeman, Tommi Kemppi, Sjoerd Nijland, Jesse Houwing, Robbin Schuurman, Martijn Magermans, Guus Verweij, Steven Deneir, Gregor Stuhldreier, Paul Kuijten, Mehdi Hoseini, Simon Kneafsy, Vivien Colas, Jeroen de Jong, Kate Hobler, Olivier Ledru, Roderick Schoon, Stephan Vlieland, Tiffanie Newton, and Karel Smutný. The arguments here are mine. The thinking is ours.
