Cultural Change: The most fundamental task. The most difficult task?  

Posted by Dino

It is one thing to talk about "best practices". It is quite another to have them implemented and working effectively within a team or organisation. Since the end of June, when I took on a team to shape and mould into an Operations team, perhaps the most striking problem has been some members' sheer resistance to adopting new working practices. This must be a problem in any organisation attempting to improve.

In particular, I want the team to make notes on their incident investigations. There are a multitude of reasons for this, such as allowing other engineers to review and continue the work if necessary and allowing the notes to be reviewed retrospectively if similar incidents occur in future.

My first and default method for getting the engineers to follow this practice was to tell the team quite simply what my expectations were and what we should be doing with respect to taking notes. This was enough for one of the team of 5 to take it all on board and start working as expected.

I then worked through some problems and showed how this could be of benefit. No further engineers were swayed to this new way of working.

The next step was to organise a "training" session. For this, I invited the users of the Operations team - project manager, developers and others who would be using the services of the Operations team. I asked them to tell me what they thought would make a great Operations team. They came up with suggestions like "knowledge sharing in the team" and "clear idea of where an investigation is and the process". I then went through how I would investigate a problem using this note-taking working practice and how this satisfied their requirements. Almost all the users liked what they saw and approved.

The training had an interesting effect - one of the engineers requested to change teams soon afterwards, leaving a team of 4. The others became more convinced of the usefulness, but after an initial stab at trying the new method, they soon reverted to their old ways.

After a major incident, a retrospective was held and some of the same themes re-emerged: the need to share knowledge, the need for more logging/notes on the investigation. These themes came from the team members themselves, yet behaviours still have not changed and the working practices have not been adopted.

During and after other incidents, users have sent emails relating to these same points and themes. The team members have seen these mails, yet still continue to work in the same way.

Changing working practices and creating a working environment where this is possible has now become the major and most fundamental issue. Everything else, such as what kinds of things are "best practice" or IT Service processes, is a secondary issue.


Giving customers visibility of issue progression - Skype example  

Posted by Dino

A week ago there was a massive outage at Skype - none of their 9 million users could use their service for 2+ days. You can imagine that if Skype is one of the central ways in which you speak with your friends, you would have been very frustrated - the frustration you would feel with an outage to your mobile phone network for a few days.

What is interesting is that they used their blog to keep their users updated on progress: http://heartbeat.skype.com/

If you look at the entries for the month at
http://heartbeat.skype.com/2007/08/
you can see the entries they made throughout the incident to keep their users posted on what was happening. I've copied edited-down snippets here, and I really recommend going through these updates pretending to be one of the frustrated Skype users wanting their service working. I have further comments below.

Problems with Skype login
By Joosep on August 16, 2007.
UPDATED 14:02 GMT: Some of you may be having problems logging into Skype. Our engineering team has determined that it's a software issue. We expect this to be resolved within 12 to 24 hours...

Thanks for your support
By Villu Arak on August 16, 2007.
We'd like to thank everyone who has taken the time to send us their thoughts...

The latest on the Skype sign-on issue
By Villu Arak on August 16, 2007.
... we wanted to dispel some of the concerns ... The Skype system has not crashed or been victim of a cyber attack...

Further on the sign-on issue
By Villu Arak on August 17, 2007.
...We feel that we are on the right track to bring back services to normal. (Updated at 2:15am GMT)

Where we are at 0400 GMT
By Sten on August 17, 2007.
...We're fixing issues in our networking software and monitoring the clients getting online with increased success...

Looking slightly better at 0700 GMT
By Sten on August 17, 2007.
...even though it is too early to call out anything definite yet we are now seeing signs of improvement in our sign-on performance...

Where we are at 1100 GMT
By Villu Arak on August 17, 2007.
...We're on the road to recovery. Skype is stabilizing... Neither Wednesday's planned maintenance of our web-based payment services nor any form of attack was related to the current sign-on issues in any way.

Update at midnight GMT
By Villu Arak on August 18, 2007.
...Skype presence and chat may still take a few more hours to be fully operational....

The words we've all been waiting for
By Villu Arak on August 18, 2007.
Take a deep breath. Skype is back to normal.

What happened on August 16
By Villu Arak on August 20, 2007.
...The disruption was triggered by a massive restart of our users' computers across the globe within a very short timeframe as they re-booted after receiving a routine set of patches through Windows Update...

Now, as you were waiting for your Skype to start working again... How did reading those updates make you feel (compared to not having anywhere to look to see what was going on)? What if they had been updating the blog every few minutes as they worked on the problem, rather than every few hours, and had added technical detail which you may or may not understand - would that have made you feel more or less happy that the problem was being investigated and resolved? Then compare that with the service from, for example, a bureaucratic government organisation or even your lawyer during the process of buying/selling a house. There is no place to go and see what is happening with your issue, and you feel you are banging your head against the wall - constantly chasing for updates through phone calls or other means.

This Skype example gives a glimpse of what is possible through using a ticketing/bug tracking system when engineers working on the problem update those tickets with notes. The dramatic increase in visibility of an issue being progressed gives greater confidence to customers, reducing their anxiety.
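
To make the idea concrete, here is a minimal sketch of a ticket that collects timestamped notes and produces a customer-facing timeline from them. It is not the data model of any particular ticketing product; the names Ticket, Note, add_note, customer_timeline and the customer_visible flag are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Note:
    """One timestamped update written by an engineer (hypothetical model)."""
    author: str
    text: str
    written_at: datetime
    customer_visible: bool = True


@dataclass
class Ticket:
    """A hypothetical incident ticket that accumulates notes over time."""
    title: str
    notes: List[Note] = field(default_factory=list)

    def add_note(self, author: str, text: str,
                 written_at: Optional[datetime] = None,
                 customer_visible: bool = True) -> None:
        # Default to the current time when no timestamp is supplied.
        when = written_at or datetime.now(timezone.utc)
        self.notes.append(Note(author, text, when, customer_visible))

    def customer_timeline(self) -> List[str]:
        """Return the customer-visible notes, oldest first, as display lines."""
        visible = sorted(
            (n for n in self.notes if n.customer_visible),
            key=lambda n: n.written_at,
        )
        return [f"{n.written_at:%Y-%m-%d %H:%M} - {n.text}" for n in visible]
```

An engineer would call add_note each time something is learned or tried, marking purely internal detail with customer_visible=False. The customer then reads customer_timeline() instead of chasing for updates, and another engineer picking up the ticket reads the full notes list.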
