Agile healthchecks – 5 points we learned

When I started leading a new development group, I set up a team healthcheck. I wrote an article at the time about why I felt this was a good idea. That was a year ago. So did it work, and what have we learned? If you just want the key learnings, I’ve summarised them at the end.

 This article has an associated video sketch. 

Why have a healthcheck?

Team healthchecks are quite widely used among agile teams. Rather than checking whether teams are following agile process (“doing agile”), they look at whether teams feel agile is working (“being agile”). The idea was popularised by Henrik Kniberg back in 2014 in the “Squad Health Check model”. If you haven’t seen the original article, it’s still worth reading.

For my own group the focus is to move continuously towards autonomous, self-managing teams. There is value in a mechanism for the teams to assess their own progress and feed back to the business. This is separate from the Sprint Retrospective. Sprint Retrospectives are great for identifying actions that the team can take to improve.

Sprint Retrospective provides a formal opportunity to focus on inspection and adaptation

The Scrum Guide

The Healthcheck looks at a bigger picture.  The team uses it to build up self-awareness about what’s working and what’s not.   Leaders and coaches use it to look for patterns and warning signs.

[Figure: Chance of success. Kniberg’s opinion on the likely success rate.]

Of course like any tool this could be misused.
It could be used to judge or rate teams.
It could be a weapon or a threat.

The approach depends on the company culture.
Without an open culture already in place the teams will never openly discuss their problems. 

Kniberg agrees it won’t be effective in every organization, but:

 there IS a potential gain, and there are ways to avoid the disaster scenario.

Kniberg – “Squad Health Check model”

What did we do differently?

Kniberg had each individual score red/amber/green using a card system similar to planning poker. The team then agrees which colour to use to represent the whole team. In Kniberg’s system, Green is “awesome” and Red is “crappy”.

We found this didn’t really give enough options.  “Awesome” and “Crappy” are fairly extreme. With only three levels, it becomes a big step for a team to choose an extreme or to “move” between states.  You risk ending up with a very complacent approach with a lot of amber indicators.

Instead we used a numerical scale rating the questions 1-5.  That means people can choose less extreme options. We also decided to abandon the “planning poker” approach in favour of an online survey. Kniberg very clearly disagrees.

We’ve found that online surveys suck for this type of thing.

Kniberg – “Squad Health Check model”

We felt there were advantages and disadvantages either way. Kniberg makes the very valid point that an online survey cuts out the discussion.  But the team are already having the discussion through a Sprint Retrospective. The main concern was avoiding groupthink and pressure. People may be unwilling to show opinions openly, and a “team” score may be very influenced by a vocal few.

The questions

We have used a set of questions which are based largely on the original set used in Kniberg’s article.  We took this as a starting point, although we fully expected to be changing the question set.

These questions are just a starting point, a default selection.

Kniberg – “Squad Health Check model”

We modified the language a little. In particular, we ask for agreement with a single clear statement, while Kniberg gives two rather extreme end points. For example Kniberg says:

Suitable Process:

  • Green: Our way of working fits us perfectly
  • Red: Our way of working sucks

As mentioned above, “green” and “red” are pretty extreme and lead to a lot of “Amber”.  I can’t imagine many teams agreeing with either of the statements above. Maybe that’s a British thing.  Instead we ask how much people agree with a statement on a 5 point scale.  The equivalent question for us is:

“Process: Our way of working is modern, world class and suits the team well”

We dropped one question because we couldn’t make it work in this form.

Fun:

  • Green: We love going to work, and have great fun working together
  • Red: Boooooooring.

We added one question because we felt it was too important to leave out.

Teamwork: We work together well as a team with great collaboration and alignment

The full list is in the next section.

The meta-questions

Increasingly as we used the healthcheck we found that the questions grouped naturally into categories.  In some ways the actual questions felt less important than the groupings. As this is about motivation, it felt natural to use the groupings proposed by Pink in his book “Drive”.

Autonomy .. Mastery .. Purpose – these are the building blocks of new way of doing things.

Daniel Pink, “Drive: The Surprising Truth About What Motivates Us”

Autonomy – the urge to direct our own lives

These questions relate to how the team works internally and with others

  • Teamwork: We work together well as a team with great collaboration and alignment
  • Players: We are in control of our own destiny. We decide what to build and how to build it.
  • Support: We get great support and help from the rest of the business when we ask for it

Purpose – the yearning to do what we do in the service of something larger than ourselves

These questions relate to the business and the sense of direction

  • Mission: We know exactly why we are here and we’re really excited about it!
  • Value: We deliver great stuff! We’re proud of it and our customers are really happy.

Mastery – the desire to get better and better at something that matters

These questions relate to how we work and whether we are making something we are proud of.

  • Learning: We are learning lots of interesting things and improving all the time
  • Process: Our way of working is modern, world class and suits the team well
  • Code: We are proud of our code.  It’s clean, easy to read and has great test coverage
  • Speed: We get things done quickly with no unnecessary delays and interruptions
  • Releases: Getting code reviewed, tested and released to customers is simple, safe and painless
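As a rough illustration of how the grouping can be applied, the sketch below rolls one person's answers up into the three categories. The question names and groupings come from the lists above; the function name, the data layout and the example answers are assumptions made up for this sketch, not a description of the survey tool we actually use.

```python
from statistics import mean

# Question keys grouped under Pink's three categories, as listed above.
CATEGORIES = {
    "Autonomy": ["Teamwork", "Players", "Support"],
    "Purpose": ["Mission", "Value"],
    "Mastery": ["Learning", "Process", "Code", "Speed", "Releases"],
}


def category_scores(response):
    """Average one respondent's 1-5 answers within each category.

    Questions the respondent skipped are ignored; a category with no
    answers at all is left out of the result.
    """
    scores = {}
    for category, questions in CATEGORIES.items():
        answered = [response[q] for q in questions if q in response]
        if answered:
            scores[category] = mean(answered)
    return scores


# Invented answers, for illustration only.
example = {
    "Teamwork": 4, "Players": 3, "Support": 2,
    "Mission": 4, "Value": 5,
    "Learning": 3, "Process": 3, "Code": 2, "Speed": 3, "Releases": 4,
}
print(category_scores(example))
```

A plain mean gives every question in a category equal weight. Mastery has five questions and Purpose only two, so a single Purpose answer moves its category score more than a single Mastery answer does.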

What does the data show?

Well, the good news is that the data shows a positive overall trend. Since I want us to be making things better, that’s a relief. When averaged across all questions, individuals and teams, we’ve seen an increase of half a point over a year. That might not seem much, but with individual scores between 1 and 5, an increase of 0.5 points represents significant progress on our agile journey.

By capturing numerical data in a consistent format over time, we can look in a bit more detail. Kniberg looked at current data and change from previous state. However, we find that there is more value in looking at longer term trends. We capture all data anonymised but categorised by team, which lets us slice the data in different ways.
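To make the idea of slicing concrete, here is a minimal sketch of anonymised answers tagged with survey round, team and question, plus a helper that averages over any combination of those fields. The field names, the `average_by` helper and all of the numbers are invented for illustration; our real data is not shown here.

```python
from collections import defaultdict
from statistics import mean

# Each anonymised answer keeps the survey round, team and question,
# but nothing that identifies the person. All values are invented.
answers = [
    {"survey": 1, "team": "A", "question": "Code", "score": 2},
    {"survey": 1, "team": "A", "question": "Speed", "score": 3},
    {"survey": 1, "team": "B", "question": "Code", "score": 4},
    {"survey": 2, "team": "A", "question": "Code", "score": 3},
    {"survey": 2, "team": "B", "question": "Code", "score": 4},
    {"survey": 2, "team": "B", "question": "Speed", "score": 5},
]


def average_by(rows, *keys):
    """Mean score for each distinct combination of the given fields."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["score"])
    return {group: mean(scores) for group, scores in groups.items()}


# Overall average per survey round, then the same broken down by team.
print(average_by(answers, "survey"))
print(average_by(answers, "survey", "team"))
```

Because the raw answers are kept rather than only the averages, the same data can later be cut by question, by team or by both without running the survey again.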

I give some real example data below. I have removed the actual numbers and will point out that the graphs do not show the full range from 1 to 5.

Top level picture

Let us look at the top level. If we group the questions into the categories of Autonomy, Purpose and Mastery we get the chart below showing the trends.

Firstly, it’s clear there is a lot of variation, even averaged across the teams. This is where I feel Kniberg’s focus only on change from one survey to the next would leave us swamped by the “noise” in the data.

However, it’s also clear that there are trends which are statistically significant beyond the variation. These trends are responding to the focus of the agile change that we are bringing.
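One simple way to separate a trend from survey-to-survey noise is to fit a straight line through the averages and look at its slope, rather than at the latest change alone. The sketch below does this with an ordinary least-squares fit. It is an illustration under assumed numbers, not the analysis behind our charts, and it does not test for statistical significance.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their position (change per survey)."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Invented category averages over six consecutive surveys.
series = [3.0, 3.4, 3.1, 3.5, 3.3, 3.6]

# Change from each survey to the next: the sign flips back and forth.
step_changes = [round(b - a, 2) for a, b in zip(series, series[1:])]

print("step changes:", step_changes)
print("fitted slope per survey:", round(slope(series), 3))
```

In this made-up series the individual step changes alternate between rises and falls, while the fitted slope is small but consistently positive. That is the pattern described above: single comparisons are dominated by variation, while the longer-term direction is visible.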

The Mastery line is lowest. Partly this represents the existence of real technical debt and partly this shows that the associated issues have a high impact. If you have just been wrestling with code or tools problems, you respond with a lower score. As the teams self-organise better they address the issues and we see the technical environment improving.

The Autonomy line also started low and has been continuously improving. You would expect (and hope) that a shift to more agile ways of working would result in an increase in perceived autonomy. It’s not all plain sailing, as the variation shows. One factor that we realised early on is that people do not answer the questions in the same way over time. As teams become more agile, their expectations for autonomy increase, so the same situation can earn a lower score later on.

Finally the Purpose line shows much less change. In fact it seems to show some drop and then a rise. This is a response that we might expect for any change program. While purpose came previously from the outside – directive management – we now expect the team to take more ownership.

Detailed analysis

The data is also useful at a much more detailed level. Let’s look at a specific team (for obvious reasons I won’t go into too much detail in this case study). Mid year, this team was showing the trends below. Something is clearly wrong.

The chart shows a team declining on all of the measures.  Unlike other teams we see mastery and purpose declining together.  This looks suspiciously like a team ground down by technical debt.  Technical debt is dragging down not only the Mastery line (the team’s assessment of what they are working on) but also the Purpose line (the team’s assessment of the value of the work).

Entire engineering organizations can be brought to a stand-still under the debt load

Ward Cunningham

As a tool, this helps us understand the team’s situation and how the team perceives it. This team wasn’t solving the issue on its own and needed an intervention. With leaders working with the team to address the problem, the chart below shows the resulting recovery.

Now we see a virtuous circle. Team autonomy allows them to decide what matters and drives an increase in mastery. Addressing the issues around the product then drives a sense of purpose. The final end point becomes comparable to the other teams.

Key learnings

So what are our key learnings from running a healthcheck for a year so far?

  1. The healthcheck has been some work to manage but proved a valuable tool for both teams and leaders
  2. Individual values are strongly affected by variability. Single data points are almost worthless, while trends are more valuable.
  3. We have been able to look for teams or questions which have unexpected results and address real underlying issues
  4. The big picture (autonomy, mastery, purpose or similar) may be more meaningful (and easier to communicate) than individual questions.
  5. Don’t expect dramatic increases in numbers – expectations increase along with improvement.
