Mike Cohn is quite explicit when he says Story Points reflect the effort required to get a Story to “Done”:
In a class a few years back, I was given a wonderful example of this. Suppose a team consists of a little kid and a brain surgeon. Their product backlog includes two items: lick 1,000 stamps and perform a simple brain surgery — snip and done. These items are chosen to presumably take the same amount of time. If you disagree, simply adjust the number of stamps in the example. Despite their vastly different complexities, the two items should be given the same number of story points — each is expected to take the same amount of time.
– Mike Cohn
Mike also reminds us to estimate size and derive time.
Collecting Size vs Time data: Gather Cycle Time
Collecting cycle time is key to understanding whether size estimates are accurate. It’s as simple as putting a “dot” on the Product Backlog Item (PBI) for every day it’s “In-Progress” when the team gathers for its Daily Scrum.
Red dots indicate waiting time, while black dots indicate the team’s effort. Both help us understand the timing of a PBI: the latest point in a Sprint at which it can start, and how often that type of work is typically blocked or parked.
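As a minimal sketch of how the dots could be tallied once they are copied off the board: the list below is made-up data, and the function name `summarise_dots` is purely illustrative, not part of any tool.

```python
from collections import Counter

# Hypothetical log for one PBI: one entry per Daily Scrum while it is "In-Progress".
# "black" = the team worked on it that day, "red" = it was waiting or blocked.
dots = ["black", "black", "red", "black", "red", "red", "black"]


def summarise_dots(dots):
    """Return cycle time, effort days and waiting days for one PBI."""
    counts = Counter(dots)
    return {
        "cycle_time_days": len(dots),     # every dot is one day in progress
        "effort_days": counts["black"],   # days the team actively worked
        "waiting_days": counts["red"],    # days the item was blocked or parked
    }


summary = summarise_dots(dots)
print(summary)
```

Keeping effort and waiting days separate makes it easy to see later whether a long cycle time came from the work itself or from being blocked.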
Plotting Size versus Cycle Time
When teams estimate size and collect cycle time over several Sprints, we end up with PBIs of varying sizes, each with a record of the time it took to go from “In-Progress” to “Done”. This helps teams answer the stakeholders’ golden question: “When will the work be done?”
If we plot size against cycle time, we also get a picture of the range of times for each size.
The spread of times around the average is what we call the variance. With both the average and the variance we can understand:
- Are our estimations accurate?
- Are we differentiating enough between our size categories?
- Are we sizing in a consistent way?
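The average and spread per size can be worked out in a few lines. This is only a sketch: the cycle times in `history` are invented for illustration (the “L” values are chosen to match the 7-day average and 5 to 9 day range used in the Team Avatar example), and the sample standard deviation is just one reasonable way to put a number on the spread.

```python
from statistics import mean, stdev

# Hypothetical cycle times in days, grouped by estimated size.
history = {
    "S": [1, 2, 2, 3],
    "M": [4, 5, 4, 5],
    "L": [5, 7, 7, 9],
}


def baseline(cycle_times):
    """Summarise the cycle times recorded for one size category."""
    return {
        "mean": mean(cycle_times),
        "min": min(cycle_times),
        "max": max(cycle_times),
        "stdev": stdev(cycle_times),  # sample standard deviation, needs 2+ items
    }


baselines = {size: baseline(times) for size, times in history.items()}

for size, stats in baselines.items():
    print(f"{size}: mean {stats['mean']} days, "
          f"range {stats['min']}-{stats['max']} days, "
          f"stdev {stats['stdev']:.1f}")
```

A wide range or a large standard deviation for one size suggests that the team is sizing that kind of work inconsistently.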
A graph of size and variance also helps a team tell customers when an item is likely to be done once work on it starts.
The Product Owner asks Team Avatar to deliver a new search feature. The Team look at their historical data and note that the last time they delivered something related to search, they classified it as an “L”.
“L” size items currently take Team Avatar 7 days on average to complete, but the team could finish one in as few as 5 days or as many as 9.
Using Cycle Time, Sequencing and Predicting Impediments
Estimating size and deriving time also has the advantage of helping a Scrum Team to sequence its work. In the current example, and assuming a 10-day Sprint, an “L” sized item must start no later than day 2 if the team wants to guarantee that it will finish by the end of the Sprint, because it could take up to 9 days to complete.
Furthermore, if an “L” takes the team 7 days on average to get to “Done”, the Scrum Master should know by day 7 whether the team is on track or whether some impediment is slowing it down (whether the team realise it or not).
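The “day 2” figure follows from simple arithmetic if we assume a 10-day Sprint, which the example implies but does not state. A small sketch, with the hypothetical helper `latest_start_day`:

```python
def latest_start_day(sprint_length_days, worst_case_cycle_time):
    """Latest Sprint day (1-based) an item can start and still finish in the Sprint.

    Returns None when the worst case does not fit in the Sprint at all.
    """
    latest = sprint_length_days - worst_case_cycle_time + 1
    return latest if latest >= 1 else None


# Assumed 10-day Sprint; an "L" can take up to 9 days.
print(latest_start_day(10, 9))
```

An item started on day 2 that takes 9 days occupies days 2 through 10, so it lands on the last day of the Sprint with no slack.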
Looking at Cycle Time in the Retrospective
Cycle time data is vital in a Retrospective.
The team ends up sizing the search item as an “S”, but it takes 6 days of cycle time to complete. During the Sprint, they found some tricky problems, did a workaround, and delivered it by the end of the Sprint.
When the team looks at its baseline, they see that “L” PBIs have a cycle time of 7 days +/- 2 days. The team discuss the data and agree that the story should have been an “L”.
The team now know that when they are asked to deliver something similar in the future, if they think the PBI is likely to contain a similar problem, then the story is not an “S” but more likely an “L”.
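The same comparison can be sketched in code. Only the “L” baseline (7 days +/- 2) comes from the example; the “S” and “M” baselines below are assumptions added so there is something to compare against, and `sizes_consistent_with` is a hypothetical helper name.

```python
# Baselines as (average days, +/- spread in days).
baselines = {
    "S": (2, 1),  # assumed for illustration
    "M": (4, 1),  # assumed for illustration
    "L": (7, 2),  # from the example: 7 days +/- 2 days
}


def sizes_consistent_with(actual_days, baselines):
    """Return the sizes whose baseline range contains the actual cycle time."""
    return [
        size
        for size, (average, spread) in baselines.items()
        if average - spread <= actual_days <= average + spread
    ]


# The search item was sized "S" but took 6 days.
print(sizes_consistent_with(6, baselines))
```

If the estimated size is not in the returned list, that item is a good candidate for discussion in the Retrospective.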
Clustering items on a wall by their cycle time is a useful way of looking for patterns amongst PBIs and their sizes. I did this recently with one of the teams I was coaching and found that, while they used a wide range of sizes for different kinds of work, there was no real difference in cycle time between “XXS”, “XS”, “S” or even “M” sized stories.
What the PBIs had in common, though, was that they were a certain type of electronic form that the team worked on with a specific tool. We also determined that there were two clusters of work: one that involved updates and one that involved brand new forms.
Based on the clusters, updates took 1-2 days, while new forms took 4-5 days. The Scrum Master and team agreed to call the updates “S” and the new forms “M”. They then used this as their baseline for their relative sizing/estimation sessions.
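The clustering here was done by eye on a wall. For a team that keeps its cycle times in a spreadsheet, a rough equivalent is to sort the values and start a new group wherever there is a gap. The sketch below is a swapped-in technique, not the method used with the team; the data and the `max_gap` threshold are assumptions.

```python
def cluster_by_gap(cycle_times, max_gap=1):
    """Group sorted cycle times, starting a new cluster when the gap exceeds max_gap."""
    if not cycle_times:
        return []
    ordered = sorted(cycle_times)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > max_gap:
            clusters.append([value])    # gap too large: start a new cluster
        else:
            clusters[-1].append(value)  # close enough: same cluster
    return clusters


# Hypothetical cycle times (days) for form work, regardless of estimated size.
observed = [1, 2, 5, 4, 1, 2, 4, 5, 2]
clusters = cluster_by_gap(observed)

for label, cluster in zip(["S", "M"], clusters):
    print(f"{label}: {min(cluster)}-{max(cluster)} days ({len(cluster)} items)")
```

With this data the two groups come out at 1-2 days and 4-5 days, matching the updates and new forms described above.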
With another team, when we plotted size by cycle time, we found that there was significant overlap between “S” and “M” PBIs.
The team were not differentiating enough between the two sizes. Some PBIs that were sized “S” could have been “M”.
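One simple way to quantify this kind of overlap is to compare the observed ranges of the two sizes. The cycle times below are invented, and `range_overlap` is a hypothetical helper; treat it as a sketch rather than a statistical test.

```python
def range_overlap(a, b):
    """Return the (low, high) range shared by two lists of cycle times, or None."""
    low = max(min(a), min(b))
    high = min(max(a), max(b))
    return (low, high) if low <= high else None


# Hypothetical cycle times in days.
small = [1, 2, 3, 4, 3]
medium = [3, 4, 5, 4, 6]

overlap = range_overlap(small, medium)

if overlap is not None:
    low, high = overlap
    # Share of "S" items whose cycle time also falls inside the "M" range.
    share = sum(low <= days <= high for days in small) / len(small)
    print(f"Overlap {low}-{high} days; {share:.0%} of S items fall inside it")
```

A large share of “S” items inside the overlap is a sign that the two categories are not really distinct for this team.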
Cycle time helps me to understand the flow of work for different types of PBIs, and this helps the teams I coach to improve their sizing and make it more consistent and accurate. It’s not something the Scrum Guide tells you to do, but I find it’s one of my most useful tools to help improve an agile team’s performance.