And yet I still think that all but a very few 360-degree surveys are, at best, a waste of everyone's time and, at worst, actively damaging to both the individual and the organization. We could stop using all of them, right now, and our organizations would be the stronger for it.
My problem with 360s is not the quality of the feedback given to the leader. On the contrary, I've seen some extraordinary coaches use 360 results as the jumping-off point for insightful and practical feedback sessions. Nor is my problem that most 360 feedback focuses predominantly on the gaps between what the leader thinks are his strengths and what everyone else thinks. We know from a wealth of applied psychological research that the people whose self-ratings match up most closely with others' ratings of them are those who are clinically depressed. (The best leaders always slightly inflate their scores, a finding called "benevolent distortion.") Nor, finally, do I much care that most 360 surveys are built on a logical non sequitur: namely, that since a particular group of exemplary leaders collectively possesses all the competencies measured by the 360, the best individual leader must be she who possesses all of them.
No, my beef with 360 surveys is more basic, more fundamental. It's the data itself. The data generated by a 360 survey is bad. It's always bad. And since the data is bad, no matter how well-intentioned your coaching, how insightful your feedback, or how coherent your leadership model, you are likely leading your leaders astray.
What do I mean by "bad"? Well, think about the most recent 360 survey you participated in, or pull it out of the drawer if you have it handy, and look at it. Virtually all 360s are built the same way: they measure a set of competencies by breaking each competency down into behaviors, and then various colleagues (your peers, your boss, your direct reports) rate you on those behaviors. For example, to measure the leadership competency "vision," your evaluators score a list of behavioral statements such as "Marcus sets a clear vision for our team" and "Marcus shows how our team's work fits the vision of the entire company."
On the surface, breaking a complex competency such as "vision" down into specific behaviors and then rating me on those behaviors makes sense. But probe a little deeper and you realize that in doing so we ruin the survey.
Why? Because your rating reveals more about you than it does about me. If you rate me high on setting a clear vision for our team, all we learn is that I am clearer on that vision than you are; if you rate me low, all we learn is that you are clearer on it than I am. Either way, the rating measures me only relative to you.
This applies to any question on which you rate my behavior. Rate me on "Marcus makes decisions quickly" and your rating simply reveals whether I make decisions more quickly than you do. Rate me on "Marcus is a good listener" and we learn only whether I am a better listener than you. All of these questions are akin to you rating me on height: whether you perceive me as short or tall depends on how short or tall you are.
The bottom line is that, when it comes to rating my behavior, you are not objective. You are, in statistical parlance, unreliable. You give us bad data.
"Well, that's alright," you may say, "because I am not the only rater. There are others rating you, Marcus, and whatever objectivity I may lack is compensated for by all those others."
Again, this sounds right, but it still doesn't hold up. Each individual rater is equally unreliable, which means each rater yields bad data. Averaging many raters can cancel out random noise, but a rater's distortion is not random noise: it is systematic, rooted in who the rater is, so it survives the averaging. When you add together many sources of bad data, you do not get good data. You get lots of bad data.
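To make that concrete, here is a toy simulation, a sketch rather than a formal measurement model, in which every number and name is invented for illustration. Each rater scores the leader relative to his or her own standard, and the rater pool's standards happen to run high. Piling on more raters tightens the average without moving it toward the truth.

```python
import random

random.seed(42)

TRUE_SCORE = 5.0  # the leader's "real" clarity of vision, on a 1-10 scale

def rating(rater_standard, noise=0.5):
    """One rater's score of the leader, relative to the rater's own standard.
    A rater with a high personal standard rates the same leader lower."""
    return TRUE_SCORE - (rater_standard - 5.0) + random.gauss(0, noise)

# Raters drawn from a non-representative pool whose own standards run high,
# e.g. a team that is itself unusually clear on the vision.
rater_standards = [random.gauss(6.5, 1.0) for _ in range(1_000)]

avg = sum(rating(s) for s in rater_standards) / len(rater_standards)
print(f"true score: {TRUE_SCORE:.2f}")
print(f"average of 1,000 ratings: {avg:.2f}")  # ~3.5: precise, and wrong
# More raters shrink the random noise; the shared bias stays put.
```

More raters buy you precision, not accuracy: the average converges tightly, just not to the truth.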
The only way to avoid this effect is to ensure that your group of raters is a perfectly representative sample of the population your measurement is meant to generalize to. This is what polls do. They select a sample, usually a little over 1,000 people, that is nationally representative of ages, races, regions, genders, and political affiliations. This carefully selected sample then proves to be a far more reliable measure of national opinion than an unrepresentative group ten times its size.
But the raters of your 360 survey are not a sample carefully selected to be representative of anything. Nor are they a random sample. They are simply the non-random group of people who happen to work with you or report to you; in statistics, this is called a biased, or "convenience," sample. Add up all their ratings and you do not get an accurate, objective measure of your leadership behaviors. You get gossip, quantified.
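The contrast with polling can be sketched the same way, again with invented numbers: a small sample drawn at random from the whole population lands near the true mean, while a group ten times larger drawn from one unrepresentative corner of it does not.

```python
import random
from statistics import mean

random.seed(0)

# A hypothetical population of one million opinions, true mean about 5.0.
population = [random.gauss(5.0, 2.0) for _ in range(1_000_000)]

# A representative sample: 1,000 people drawn at random from everyone.
representative = random.sample(population, 1_000)

# A non-random group ten times as large, drawn only from the top quartile:
# a stand-in for "the people who happen to work with you."
cutoff = sorted(population)[int(0.75 * len(population))]
biased_group = [x for x in population if x >= cutoff][:10_000]

print(f"population mean:        {mean(population):.2f}")      # ~5.00
print(f"representative (1,000): {mean(representative):.2f}")  # ~5.00
print(f"non-random (10,000):    {mean(biased_group):.2f}")    # ~7.5, way off
```

Size does not rescue a biased sample; ten thousand raters from the wrong pool are still the wrong pool.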
Thankfully, the solution to this problem is simple. Although you are not a reliable rater of my behavior, you are an extremely reliable rater of your own feelings and emotions. This means that, although you cannot be trusted to rate me on "Marcus sets a clear vision for our team," you can be relied upon to rate yourself on a statement such as "I know what the vision of my team is." Likewise, while your ratings of me on "Marcus is a good listener" are bad data, your ratings of yourself on "I feel like my opinions are heard" are good data. This is true of any statement crafted so that it asks you to rate you on you.
So, to create a reliable 360 survey, all you need do is cut out the statements that ask the rater to evaluate others on their behaviors and replace them with statements that ask the rater to report his own feelings.
Doing this will transform your 360 survey into a tool everyone can trust. But until then, it's just blather.