One of the Agile Manifesto’s twelve principles states, “At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.” Many Agile teams hold biweekly retrospectives that result in concrete actions to be carried out in the next time period (sprint). A team can tune and adjust in many ways, such as improving the workflow, automating tasks, and increasing team cooperation. Is it a good habit for retrospectives to focus on the same type of improvement every time, or should the team vary the type of improvement? In this blog, I look into the effect of multiple consecutive actions that affect the flow of work, using a simulation. The simulation is inspired by the getKanban Board Game, a physical game designed to teach the concepts and mechanics of Kanban for software development in a class or workshop setting.
An Experiment
Ideally, an experiment would compare two equivalent teams. The first team would perform consecutive actions to improve the workflow, and the other team would only make one adjustment to improve the workflow, then focus subsequent improvements in other areas. After a set period, the workflow would be measured and verified for each team simultaneously to see which achieved better results. But such an experiment is difficult to perform, so in this blog, I will use a simulation.
Simulation
For this simulation, a team consists of three specialists: a designer, a developer, and a tester. The team uses a kanban process to achieve flow. See the picture below for the starting situation. At the beginning of each workday, the team determines how much work it will complete that day, and the average cycle time is measured throughout the simulation. The initial work in progress (WIP) limits are set to 3 for each column, indicated by the red 3s. The average amount of work done by the team and the average effort of one work item are such that, on average, it takes one card about 5.5 days to complete.
At the end of each workday, cards are pulled into the next columns (if allowed by the WIP limits). The policy is to always pull in as much work as allowed, so the columns are maximally filled. Furthermore, the backlog is assumed to always have enough user stories ready to be pulled into the “design” column. This resembles developing a new product when the backlog is filled with more than enough stories.
The system starts with a clean board and all columns empty. After letting the system run for 75 simulated workdays, we trigger a policy change and increase the WIP limit of the “design” column from 3 to 5. After this policy change, the system runs for another 100 workdays. From the chart showing the average cycle time, we will be able to study the effect of adjustments that change WIP limits.
Note: The simulation assumes a simple uniform distribution for the amount of work done by the team and for the effort assigned to a work item. I assume this is OK for the purpose of this blog. A consequence is that the result probably can’t be scaled. For instance, the results do not apply to a situation in which each column in the picture above represents a single Scrum team, since that would call for a more complex probability distribution than the uniform one.
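To make the mechanics above more concrete, here is a minimal Python sketch of such a board. It is not the code behind the chart in this blog. The names (`Card`, `simulate`, `average_cycle_time`) and the uniform ranges for daily capacity and effort per column are my own assumptions and are not calibrated to the 5.5 days mentioned above. The sketch also models only the three working columns and leaves out the “ready” column, so its numbers will differ from the ones reported below.

```python
import random
from dataclasses import dataclass

STAGES = ["design", "development", "test"]  # assumed working columns


@dataclass(eq=False)  # compare cards by identity, not by field values
class Card:
    start_day: int    # day the card was pulled into "design"
    remaining: float  # effort left in the card's current column


def simulate(days=175, change_day=75, wip=3, new_design_wip=5, seed=1):
    """Return (finish_day, cycle_time) for every card that left the board."""
    rng = random.Random(seed)
    limits = {stage: wip for stage in STAGES}
    board = {stage: [] for stage in STAGES}
    finished = []

    def new_effort():
        # Assumed range: effort a card needs in one column.
        return rng.uniform(1.0, 3.0)

    for day in range(days):
        if day == change_day:
            limits["design"] = new_design_wip  # the policy change

        # Each specialist has a uniformly drawn capacity for the day
        # (assumed range) and spends it on their column, oldest card first.
        for stage in STAGES:
            capacity = rng.uniform(0.0, 2.0)
            for card in board[stage]:
                spent = min(capacity, card.remaining)
                card.remaining -= spent
                capacity -= spent

        # End of day: pull finished cards downstream, last column first,
        # so that slots freed today can be refilled today.
        for i in reversed(range(len(STAGES))):
            stage = STAGES[i]
            for card in [c for c in board[stage] if c.remaining <= 0]:
                if i == len(STAGES) - 1:
                    board[stage].remove(card)
                    finished.append((day, day - card.start_day))
                elif len(board[STAGES[i + 1]]) < limits[STAGES[i + 1]]:
                    board[stage].remove(card)
                    card.remaining = new_effort()
                    board[STAGES[i + 1]].append(card)

        # The backlog never runs dry: fill "design" up to its WIP limit.
        while len(board["design"]) < limits["design"]:
            board["design"].append(Card(day, new_effort()))

    return finished


def average_cycle_time(finished, first_day, last_day):
    """Average cycle time of cards finished in [first_day, last_day)."""
    times = [ct for day, ct in finished if first_day <= day < last_day]
    return sum(times) / len(times) if times else float("nan")


if __name__ == "__main__":
    results = simulate()
    print("before the change:", average_cycle_time(results, 40, 75))
    print("after the change: ", average_cycle_time(results, 145, 175))
```

Cards that are finished but blocked by a full downstream column stay where they are and keep counting toward their column’s WIP limit, which is what lets a change in one column ripple through the rest of the board.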
Results
The picture below shows the result of running the experiment. After the start, it takes the system a little over 40 workdays to reach a stable state with an average cycle time of about 24* days. This is the cycle time one would expect. Remember that the “ready” column also has a limit of 3, in addition to the columns in which work gets done. So, one would expect a cycle time of around 4 times 5.5, which equals 22 days, close to 24. At day 75 the WIP limit is changed. As can be inferred from the picture, the cycle time starts to rise only at day 100: it takes about one cycle time (24 days) for the system to respond. The new stable state is reached at day 145, with an average cycle time of around 30** days. It takes 70 days (!) to reach the new equilibrium.
The chart shows the following interesting features:
- It takes roughly two times the (new) average cycle time to reach the equilibrium state.
- The response time (when one begins to notice an effect of the policy change) is about the length of the average cycle time.
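As a quick sanity check, the two features above can be compared with the numbers quoted in this section. The values below are the ones read from the chart as reported in this blog; the variable names are only illustrative.

```python
# Numbers as reported in this blog (read from the chart).
change_day = 75        # WIP limit of "design" raised from 3 to 5
response_day = 100     # cycle time starts to rise
stable_day = 145       # new stable state reached
old_cycle_time = 24    # average cycle time before the change, in days
new_cycle_time = 30    # average cycle time after the change, in days

response_delay = response_day - change_day  # 25 days
settling_time = stable_day - change_day     # 70 days

# Response delay relative to the old average cycle time: about 1.04
print(response_delay / old_cycle_time)

# Settling time relative to the new average cycle time: about 2.33
print(settling_time / new_cycle_time)

# Expected cycle time for four columns of about 5.5 days each: 22.0
print(4 * 5.5)
```

So “about one cycle time” fits the response delay well, while “roughly two times” is on the low side for the settling time in this particular run.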
Conclusion
In this blog, we have seen that when a team makes adjustments that affect the flow, the system needs time to get to its new stable state. Until this state has been reached, any new tuning of the flow is questionable. The simulation shows that the time it takes to reach the new stable state is about two times the average cycle time. For Scrum teams that have two-week sprints, the system may need about two months before new tuning of flow is effective. Meanwhile, the team can very well focus on other improvements, e.g. retrospectives that focus on the team aspect or on collaboration with the team’s environment. Moreover, after making a flow-affecting change, don’t expect to see any change in measurements such as cycle time until about one average cycle time has passed.
To summarize, after making flow-affecting changes (e.g. increasing or decreasing WIP limits):
- Let the system run for at least the duration of the average cycle time so it has time to respond to the change.
- After it responds, notice the effect of the change.
- If the effect is positive, let the system run for another duration of the average cycle time, to get to the new stable state.
- If the effect is negative, do something else, e.g. go back to the old state, and remember that the system needs to respond to this as well!