Learning from our mistakes

Flickr CC image from David C Foster

Michael Krigsman has a good story today on ZDNet about transparency and learning. He analyzes the Amazon S3 team’s After Action Review (AAR) process following a disruption in their service. This reminds me of the importance of learning from failures and mistakes, rather than forgetting or covering them up. In fact, there is a whole community dedicated to learning from mistakes, The Mistake Bank. Here is a quick recap of the useful practices Amazon deployed when they had a breakdown in their services. I’ve edited out some of the text so as not to infringe ZDNet’s copyright, so click into the story for the full details.

Amazon’s S3 post-mortem demonstrates maturity | IT Project Failures | ZDNet.com
THE PROJECT FAILURES ANALYSIS

In analyzing the failure, Amazon asked four questions:

What happened? The first step to a successful post-mortem is establishing a clear understanding of what went wrong. You can’t analyze what you don’t understand.

Why did it happen? After determining the facts, the post-mortem team should assess why failure occurred….

How did we respond and recover? … A useful post-mortem depends on the analysis team gaining a reasonable level of honesty, insight, and cooperation from the organization.

How can we prevent similar unexpected issues from having system-wide impact? … Planning must also consider the business process and management responses the team initiates when a failure occurs. A complete post-mortem addresses both technical and management issues.

Amazon’s technical failure disrupted its customers’ business and hurt the company’s credibility. However, their open and transparent response to the failure and its aftermath demonstrates a level of organizational maturity rarely found among Enterprise 2.0 companies.

Pulling our mistakes out and looking at them, alone and with the aid of colleagues, is a simple and effective learning practice. But it takes both a personal commitment to looking productively at our warts (rather than simple self-flagellation or guilt) and an organizational culture that values learning along with success. And we all know it… we learn more from our failures than our successes. 😉

Here are a few resources for learning from mistakes and failures (some repeated from embedded links above, but I want to make it easy to scan for the resources!):

Have any to add? Knowledge sharing in action!

Photo Credit: Flickr/CC