Do you think all the responses in that survey of yours are equal in value?
Think again. Read on to find out how survey responses fail and how they affect your data analysis.
Check out the picture above. The picture is a mock-up based on real survey data that my company, Acorn Analytics, was hired to evaluate. I wasn’t actually looking for comment data; I was making my way to the last row of an Excel document of sample data, where a colleague had added some statistics we needed to review. However, the “Nobody fuc..” caught my eye. Naturally, I decided to investigate.
All the Ways This Survey Failed
Here’s the backstory. A company we’ll refer to as Acme sent out an annual engagement survey to its employees. Most of the survey was numerical data: for instance, Likert scale items, where a respondent rates a statement from strongly negative to strongly positive on a “scale of 1 to 10”. Luckily, in this particular survey employees were given multiple opportunities throughout to provide “additional comments” in an open-ended format.
Reading through the rest of the responses by this one person, it was apparent that the data was not very valuable. In both the open-ended responses and the numerical data, the pattern of responses suggested a troubling lack of thoughtfulness. Here are more comments from this same individual:
“That’s OK. Nobody fucking cares.”
“For question 7, you’ve meant reading online comic strips, right?”
“And that’s great, because they can concentrate on making the motherfucking code.”
“And I don’t fucking care.”
“I could use a raise.”
Based on these comments alone, how would you rate the value of the rest of this person’s responses? How representative do you think this person’s responses will be of everyone else’s? Should a company include this individual’s data when making an important strategic decision?
I’m not claiming that this person’s comments are completely without value. If there were more people who responded in a similar way then the argument could be made that this person’s opinion is representative of a sub-culture in your organization.
If this individual had been actually providing some concrete or actionable answers, they could have dropped as many f bombs as they could write and the response would have been included. What was clear was that this was a bad response.
Cursing and providing a negative response isn’t what devalues a respondent. A bad response is one that, at best, is partially suspect. At its worst, a bad survey response only serves to mask the truth for the sake of disruption or attention.
Treating Surveys As Sacred
The data in engagement surveys is incredibly valuable. It is one of the few opportunities a company has to get the truth from a very important group of stakeholders. Your employees are on the front lines, so they know what’s really happening in the trenches. Your employees are also the ones responsible for executing whatever plans you come up with at the conference room table. You cannot run the company without their support. Your employees literally ARE your company.
At the same time, let’s acknowledge that not everyone in your company thinks that surveys are important. Many, in fact, will think they are a waste of time, and that their time is better spent writing code or selling to customers.
This is why, for every thousand responses you receive in a company-wide employee survey, you should expect at least one that is little better than trolling and is better omitted from the aggregate statistics. To get a more accurate picture, you must cull these extreme outliers.
In most cases this person’s feedback will still be represented elsewhere. For instance, we could look at issues within a particular department with a further survey. Or we could use this response to highlight a sentiment that is trending in certain parts of the organization. What we probably won’t do is include this person when we roll up the results at a higher level, using metrics like the total mean score.
Finding the Bad Apples
Now that we’ve established that bad responses do exist, what do we do about them? How do we go about finding them? Sometimes a bad response is easy to spot, like in this case. I happened to glance at an employee survey the other day and my eye caught just a truncated fragment of a comment in a data point.
However, you won’t have the time or patience to go through hundreds or thousands of comments individually. You certainly don’t have time to analyze the Likert scale responses by hand in order to figure out whom to reject. Fortunately, computers are awesome at stuff like this. Plus, when you have a starting point like this individual from Acme, you can conceptualize a model and set the computers loose to round up some suspects for you to interrogate.
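To make that idea concrete, here is a minimal Python sketch of what "rounding up suspects" could look like. It is not the model we used for Acme. The word list, the threshold, and the function name are all assumptions made up for illustration, and it only flags a respondent for a manual look. As noted earlier, cursing on its own is not a reason to reject anyone.

```python
from statistics import pstdev

# Hypothetical word list and threshold; tune both for your own survey.
PROFANITY = {"fuck", "fucking", "motherfucking"}
MIN_SPREAD = 0.5  # a standard deviation below this suggests "straight-lined" ratings


def flag_suspect(ratings, comments):
    """Return the reasons (possibly none) this respondent deserves a manual look."""
    reasons = []
    # Signal 1: nearly identical answers to every Likert item.
    if len(ratings) > 1 and pstdev(ratings) < MIN_SPREAD:
        reasons.append("straight-lined ratings")
    # Signal 2: profanity in open-ended comments (a prompt to read, not a verdict).
    words = {w.strip(".,!?\"'").lower() for c in comments for w in c.split()}
    if words & PROFANITY:
        reasons.append("profanity in comments")
    return reasons


print(flag_suspect([5, 5, 5, 5, 5], ["Nobody fucking cares."]))
print(flag_suspect([2, 7, 9, 4], ["I could use a raise."]))
```

The first call returns both reasons and the second returns an empty list. Anyone who comes back with a non-empty list goes on the pile to be read by a human.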
There are a lot of ways to approach filtering out the noise, and you’ll probably use a combination of techniques. (If you are interested in a review of such things, let me know with a comment.) The point of this article is that you do something to clean up your data. Don’t let the bad apples muddy up your results.
Keep Trolls Under the Bridge
If you are tempted to skip this step, ask yourself: if accuracy isn’t important, then why are you even conducting a survey in the first place?
If you are in charge of a dataset like this employee engagement survey, and you are not finding at least one person to reject out of every thousand, you are doing something wrong. On your next survey, my advice is that you try to find at least one response to throw out. Explore statistically why this person could be skewing your data, and decide how to handle it for the entire analysis.
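One simple way to explore how a single person skews the data is to recompute a summary statistic without them and look at the difference. The Python sketch below does this for the overall mean score. The function name and the tiny made-up dataset are assumptions for illustration only, not Acme's data.

```python
from statistics import mean


def mean_shift_without(scores, respondent):
    """How much the overall mean changes when one respondent is dropped.

    `scores` maps a respondent ID to that person's list of Likert ratings.
    """
    all_scores = [s for ratings in scores.values() for s in ratings]
    kept = [s for name, ratings in scores.items()
            if name != respondent for s in ratings]
    return mean(kept) - mean(all_scores)


# Made-up example: three respondents, two items each.
scores = {"a": [8, 7], "b": [7, 9], "suspect": [1, 1]}
print(mean_shift_without(scores, "suspect"))  # 2.25
```

In a group this small, one respondent moves the mean by more than two points. Across a thousand responses the shift will be far smaller, which is why it is worth running the same check at the department or team level, where a single bad response carries more weight.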
If you do this, your work will be rewarded. Not only will your dataset be cleaner, but you will understand the nuances of the data in a more sophisticated way. This makes you and the leaders of your organization that much more confident in how you use this data to make decisions.