Backlog ordering done right!
Various methods exist to help product owners decide which backlog item to start first. Blogs by Maurits Rijk [Rij2011] and Jeff Sutherland [Sut2011] have shown that getting this (more or less) right pays off.
These approaches to ordering backlog items all assume that items, once picked up by the team, are finished, according to the motto ‘Stop starting, start finishing’. A well-known example of such an ordering algorithm is Weighted Shortest Job First (WSJF).
For items that may be interrupted, this does not give the best possible schedule. Items that are usually interrupted by other items include story map slices, (large) epics, themes, Marketable Features, and possibly more.
In this blog I’ll show which scheduling works better for such items and how it works.
Weighted Shortest Job First (WSJF)
In WSJF, the scheduling of work, i.e. of product backlog items, is based on both the effort and the (business) value of the item. The effort may be stated in duration, story points, or hours of work. The business value may be calculated using Cost of Delay or as prescribed by SAFe.
When effort and value are known for the backlog items, each item can be represented by a dot. See the picture to the right.
The proper scheduling is obtained by sweeping the dashed line from the bottom right to the upper left (like a windshield wiper).
In practice neither the value nor the effort is known precisely; both are estimated. This means that product owners will treat dots that are ‘close’ to each other the same. The picture to the left shows this process. All green sectors have the same ROI (business value divided by effort) and roughly the same WSJF value.
Product owners will probably schedule items as follows: first the green cells from left to right, then the next ‘row’ of cells from left to right.
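The basic WSJF ordering described in this section can be sketched in a few lines of Python. This is only an illustration: the `BacklogItem` class, the item names and the numbers are made up for the example, and the score is simply value divided by effort (the ROI mentioned above).

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    value: float   # estimated business value (e.g. Cost of Delay); hypothetical numbers
    effort: float  # estimated effort (e.g. story points or sprints); hypothetical numbers


def wsjf_score(item: BacklogItem) -> float:
    """WSJF score: business value divided by effort."""
    return item.value / item.effort


def order_by_wsjf(items):
    """Highest value per unit of effort first."""
    return sorted(items, key=wsjf_score, reverse=True)


backlog = [
    BacklogItem("A", value=8, effort=4),  # score 2.0
    BacklogItem("B", value=3, effort=1),  # score 3.0
    BacklogItem("C", value=5, effort=5),  # score 1.0
]
ordered = [item.name for item in order_by_wsjf(backlog)]
print(ordered)  # ['B', 'A', 'C']
```

Because the estimates are rough, items whose scores are close together can be treated as equal in practice, which is what the colored sectors express.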
Other Scheduling Rules
It has been known since at least the 1950s (and probably earlier) that WSJF is the optimal scheduling mechanism if both value and size are known. The additional condition is that preemption, i.e. interruption of the work, is not allowed.
If any of these three conditions (known value, known size, no preemption) does not hold, WSJF is not the best mechanism and other scheduling rules perform better. Other mechanisms are (for a more comprehensive overview and background see e.g. Table 3.1, page 146 in [Kle1976]):
No preemption allowed
- no value, no effort: FIFO
- only effort: SJF / SEPT
- only value: on value
- effort & value: WSJF / SEPT/C
- Story map slices: WSJF (no preemption)
FIFO = First in, First out
SEPT = Shortest Expected Processing Time
SJF = Shortest Job First
C = Cost
Examples: (a) User stories on the sprint backlog: WSJF. (b) Production incidents: FIFO or SJF. (c) Story map slices that represent a minimal marketable feature (or Feature for short). A Feature from which a single user story is left out delivers no business value (that’s why it is a minimal marketable feature), and starting such a slice also means completing it before starting anything else. These are scheduled using WSJF. (d) User stories that are part of a Feature; they represent no value by themselves, but all are necessary to complete the Feature they belong to. Schedule these according to SJF.
Preemption allowed
- no value: SIRPT (SIJF)
- effort & value: SIRPT/C or WSIJF (preemption)
SIRPT = Shortest Imminent Remaining Processing Time
SIRPT/C = Shortest Imminent Remaining Processing Time, weighted by Cost
SIJF = Shortest Imminent Job First
WSIJF = Weighted Shortest Imminent Job First
The ‘official’ naming for WSIJF is SIRPT/C. In this blog I’ll use Weighted Shortest Imminent Job First, or WSIJF.
Examples: (a) Story map slices that contain more than one Feature (minimal marketable feature). We call these Feature Sets. These are scheduled using WSIJF. (b) (Large) Epics that consist of more than 1 Feature Set, or epics that are located at the top-right of the windshield-wiper diagram. The latter are usually split into smaller ones that contain most of the value for less effort. Use WSIJF.
Summarized by type of work item:
- User Story (e.g. on sprint backlog and not part of a Feature): WSJF
- User Story (part of a Feature): SJF
- Feature: WSJF
- Feature Set: WSIJF
- Epics, Story Maps: WSIJF
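For teams that keep their backlog in a tool or script, the list above can be captured as a simple lookup. The sketch below is just one way to write it down; the keys (`"feature_set"` and so on) and the function names are my own labels for this example, not part of any framework.

```python
# Scheduling rule per type of work item, taken from the list above.
# The dictionary keys are hypothetical labels chosen for this example.
RULE_BY_ITEM_TYPE = {
    "user_story_standalone": "WSJF",   # on the sprint backlog, not part of a Feature
    "user_story_in_feature": "SJF",    # no value by itself, needed to finish the Feature
    "feature": "WSJF",                 # minimal marketable feature
    "feature_set": "WSIJF",            # more than one Feature
    "epic": "WSIJF",
    "story_map": "WSIJF",
}


def scheduling_rule(item_type: str) -> str:
    """Return the scheduling rule for a type of work item."""
    try:
        return RULE_BY_ITEM_TYPE[item_type]
    except KeyError:
        raise ValueError(f"unknown item type: {item_type!r}") from None


def allows_preemption(item_type: str) -> bool:
    """In this overview only the WSIJF items may be interrupted by other work."""
    return scheduling_rule(item_type) == "WSIJF"


print(scheduling_rule("feature"))       # WSJF
print(allows_preemption("feature_set"))  # True
```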
Weighted Shortest Imminent Job First (WSIJF)
Mathematically, WSIJF is not as simple to calculate as is WSJF. Perhaps in another blog I’ll explain this formula too, but in this blog I’ll just describe what WSIJF does in words and show how it affects the diagram with colored sections.
WSIJF: work that is very likely to finish in the next periods gets high priority.
What does this mean?
Remember that WSIJF only applies to work that is allowed to be preempted in favour of other work. Preemption happens at certain points in time. Familiar examples are Sprints, Releases (Go live events), or Product Increments as used in the SAFe framework.
The priority calculation takes into account:
- the probability (or chance) that the work is completed in the next periods,
- if completed in the next periods, the expected duration, and
- the amount of time already spent.
Example. Consider a Scrum team that has a cadence of 2-week sprints, with 3 sprints remaining until the next release. For every item on the backlog, determine the chance of completing it in the next sprint and divide that chance by the expected duration, given that the item is completed in that sprint. Do the same for completing the item within the next 2 and within the next 3 sprints. For each item you’ll get 3 numbers. The value multiplied by the maximum of these is the priority of the backlog item, just as WSJF multiplies the value by 1/effort.
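The size part of this calculation, i.e. the three numbers per item and their maximum, can be sketched as follows. The chances and expected durations are invented for the illustration, and the function name `imminent_index` is my own. How the value is combined with this number is left out of the sketch.

```python
def imminent_index(completion_chances, expected_durations):
    """Largest chance/duration ratio over the look-ahead horizons.

    completion_chances[i]  : chance to complete the item within horizon i
    expected_durations[i]  : expected duration (in sprints), given that the
                             item is completed within horizon i
    """
    ratios = [p / d for p, d in zip(completion_chances, expected_durations)]
    return max(ratios)


# Hypothetical numbers for one backlog item, for horizons of 1, 2 and 3 sprints.
chances = [0.10, 0.45, 0.80]
durations = [0.9, 1.6, 2.1]

index = imminent_index(chances, durations)
print(round(index, 3))
```

For this made-up item the longest horizon gives the largest ratio, so the 3-sprint number determines its priority.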
Qualitatively, the effect of WSIJF is that items with large effort get less priority and items with smaller effort get larger priority. This is depicted in the diagram to the right.
Example: Quantifying WSIJF
In the previous paragraph I described the basics of WSIJF and only qualitatively indicated its effect. To make this concrete, let’s consider large epics that have been estimated using T-shirt sizes. Since WSIJF affects the sizing part and, to a lesser extent, the value part, I’ll not consider the value in this case. Value also plays a subtle role, but I’ll not discuss it in this blog.
Teams are free to define T-shirt sizes as they like. In this blog, the following 5 T-shirt sizes are used:
- XS ~ < 1 Sprint
- S ~ 1 – 2 Sprints
- M ~ 3 – 4 Sprints
- L ~ 5 – 8 Sprints
- XL ~ > 8 Sprints
Items of size XL take more than 8 sprints, so with 2-week sprints typically 4 months or more. These are very large items.
Of course, estimates are just what they are: estimates. Items may take fewer or more sprints to complete. In fact, T-shirt sizes correspond to probability distributions: an ‘M’-sized item may complete in fewer than 3 sprints or may take longer than 4 sprints. For these distributions I’ll take:
- XS ~ < 1 Sprint (85% probability to complete within 1 Sprint)
- S ~ 1 – 2 Sprints (85% probability to complete within 3 Sprints)
- M ~ 3 – 4 Sprints (85% probability to complete within 6 Sprints)
- L ~ 5 – 8 Sprints (85% probability to complete within 11 Sprints)
- XL ~ > 8 Sprints (85% probability to complete within 16 Sprints)
As can be seen from the picture, the larger the size of the item the more uncertainty in completing it in the next period.
Note: for the probability distribution, the Wald or Inverse Gaussian distribution has been used.
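To give an idea of how such a distribution turns into a completion probability, here is a minimal sketch of the cumulative distribution function of the Wald (Inverse Gaussian) distribution. The formula is the standard closed form. The `mean` and `shape` values are assumptions picked for the illustration, since the parameters behind the T-shirt sizes are not listed in this blog.

```python
import math


def _std_normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def wald_cdf(x: float, mean: float, shape: float) -> float:
    """Probability that a Wald (Inverse Gaussian) duration is at most x."""
    if x <= 0:
        return 0.0
    a = math.sqrt(shape / x)
    return (_std_normal_cdf(a * (x / mean - 1.0))
            + math.exp(2.0 * shape / mean) * _std_normal_cdf(-a * (x / mean + 1.0)))


# Hypothetical parameters, in sprints; NOT the values used for the table.
mean, shape = 4.5, 20.0

for sprints in (2, 4, 6, 8):
    print(sprints, round(wald_cdf(sprints, mean, shape), 3))
```

With parameters fitted to a T-shirt size, the same function gives the probability to complete an item within any number of sprints.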
Based on these distributions, we can calculate the priorities according to WSIJF. These are summarized in the following table:
Column 2 specifies the probability to complete an item in the next period, here the next 4 sprints. In the case of an ‘M’ this is 50%.
Column 3 shows the expected duration, given that the item is completed within that period. For an ‘M’ sized item this is 3.22 Sprints.
Column 4 contains the calculated priority as ‘value of column 2’ divided by ‘value of column 3’.
The last column shows the value as calculated using SJF.
The table shows that items of size ‘S’ have the same priority value in both the SIJF and SJF schemes. Items larger than ‘S’ get a much lower priority than they would under SJF.
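The way columns 2, 3 and 4 hang together can be sketched numerically. The code below integrates the Wald density up to the end of the period to get the completion probability and the expected duration given completion, and divides the two. The `mean` and `shape` parameters are again assumptions for the illustration, so the printed numbers are not the ones from the table. The only figures taken from this blog are the 50% and 3.22 Sprints of the ‘M’ item.

```python
import math


def wald_pdf(x: float, mean: float, shape: float) -> float:
    """Density of the Wald (Inverse Gaussian) distribution."""
    if x <= 0:
        return 0.0
    return (math.sqrt(shape / (2.0 * math.pi * x ** 3))
            * math.exp(-shape * (x - mean) ** 2 / (2.0 * mean ** 2 * x)))


def table_row(horizon: float, mean: float, shape: float, steps: int = 4000):
    """Return (column 2, column 3, column 4) for one item.

    Uses a simple midpoint rule on the interval (0, horizon].
    """
    dx = horizon / steps
    probability = 0.0  # chance to complete within the horizon
    weighted = 0.0     # integral of x * density over the same interval
    for i in range(steps):
        x = (i + 0.5) * dx
        density = wald_pdf(x, mean, shape)
        probability += density * dx
        weighted += x * density * dx
    expected = weighted / probability  # expected duration, given completion
    return probability, expected, probability / expected


# Hypothetical distribution parameters (in sprints), period of 4 sprints.
probability, expected, priority = table_row(horizon=4, mean=4.5, shape=20.0)
print(round(probability, 2), round(expected, 2), round(priority, 3))

# The 'M' item from the text: 50% chance, 3.22 sprints expected duration.
m_priority = 0.50 / 3.22
print(round(m_priority, 3))  # 0.155
```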
Note: there are slight modifications to the table when considering various period lengths and taking into account the time already spent on items. This additional complexity I’ll leave for a future blog.
In practice product owners only have the estimated effort and value at hand. When ordering the backlog according to the colored sections shown earlier in this blog, it is easiest to use a modified version of this picture:
Schedule the work items according to the diagram above, using the original value and effort estimates: green cells from left to right, then the next row from left to right.
The most commonly used backlog prioritization mechanisms are based on some variation of ROI (value divided by effort). While this is the optimal scheduling for items for which preemption is not allowed, it is not the best way to schedule items that are allowed to be preempted.
As a guideline:
- Use WSJF (Weighted Shortest Job First) for (smaller) work items where preemption is not allowed, such as individual user stories with (real) business value on the sprint backlog and Features (minimal marketable features, e.g. slices in a story map).
- Use SJF (Shortest Job First) for user stories within a Feature.
- Use WSIJF (Weighted Shortest Imminent Job First) for larger epics and collections of Features (Feature Set), according to the table above, or more qualitatively using the modified sector chart.
[Kle1976] Queueing Systems, Vol. 2: Computer Applications, Version 2, Leonard Kleinrock, 1976
[Rij2011] A simulation to show the importance of backlog prioritisation, Maurits Rijk, June 2011, https://maurits.wordpress.com/2011/06/08/a-simulation-to-show-the-importance-of-backlog-prioritization/
[Sut2011] Why a Good Product Owner Will Increase Revenue at Least 20%, Jeff Sutherland, June 2011, https://www.scruminc.com/why-product-owner-will-increase-revenue/