Self-Reported vs. Actual Data

Last Updated on September 11, 2023

Written 2021.

I started thinking about this mental model in the context of my study and work in statistics and data engineering/data analytics. When you are trying to find something out, such as how people behave, you can get data in two ways: by surveying people in some way (“self-reporting”), or by looking at actual data on what they do.

From the medical scientists:

Self-reporting is a common approach for gathering data … This method requires participants to respond to the researcher’s questions without his/her interference. Examples of self-reporting include questionnaires, surveys, or interviews. However, relative to other sources of information, such as medical records or laboratory measurements, self-reported data are often argued to be unreliable and threatened by self-reporting bias.

The arrival of the internet means we can get even more actual data on people’s behavior, instead of relying on self-reported data.

Everybody Lies

We often lie, consciously or unconsciously, when self-reporting.

Everybody lies. People lie about how many drinks they had on the way home. They lie about how often they go to the gym, how much those new shoes cost, whether they read that book. They call in sick when they’re not. They say they’ll be in touch when they won’t. They say it’s not about you when it is. They say they love you when they don’t. They say they’re happy while in the dumps. They say they like women when they really like men. People lie to friends. They lie to bosses. They lie to kids. They lie to parents. They lie to doctors. They lie to husbands. They lie to wives. They lie to themselves. And they damn sure lie to surveys.

Here’s my brief survey for you:

Have you ever cheated in an exam?

Have you ever fantasized about killing someone?

Were you tempted to lie?

Many people underreport embarrassing behaviors and thoughts on surveys. They want to look good, even though most surveys are anonymous. 

Everybody Lies

The more impersonal the conditions, the more honest people will be. For eliciting truthful answers, internet surveys are better than phone surveys, which are better than in-person surveys. People will admit more if they are alone than if others are in the room with them. However, on sensitive topics, every survey method will elicit substantial misreporting. People have no incentive to tell surveys the truth.

How, therefore, can we learn what our fellow humans are really thinking and doing? Big data. Certain online sources get people to admit things they would not admit anywhere else. They serve as a digital truth serum. 

Everybody Lies

Survey estimates of normative behavior—like voting, exercising, and church attendance—often include substantial measurement error as respondents report higher rates of these behaviors than is warranted.

Lies, Damned Lies, and Survey Self-Reports? Identity as a Cause of Measurement Bias
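
From the data analytics side, the gap described above can be measured when you have both what people said and a record of what they did. Here is a minimal sketch, assuming a hypothetical dataset where each respondent's survey answer about voting can be matched to a validated record. The field names (`said_voted`, `record_voted`) and all of the values are made up for illustration and do not come from the studies quoted here.

```python
# Hypothetical matched records: survey answer vs. validated record.
# Every value below is invented for illustration only.
respondents = [
    {"id": 1, "said_voted": True,  "record_voted": True},
    {"id": 2, "said_voted": True,  "record_voted": False},
    {"id": 3, "said_voted": False, "record_voted": False},
    {"id": 4, "said_voted": True,  "record_voted": True},
    {"id": 5, "said_voted": True,  "record_voted": False},
]


def rate(rows, key):
    """Share of rows where the given boolean field is True."""
    return sum(1 for r in rows if r[key]) / len(rows)


def overreport_summary(rows):
    """Compare the self-reported rate with the rate in the records."""
    self_reported = rate(rows, "said_voted")
    actual = rate(rows, "record_voted")

    # Over-reporters: the record says "no" but the survey answer says "yes".
    non_voters = [r for r in rows if not r["record_voted"]]
    overreporters = [r for r in non_voters if r["said_voted"]]

    return {
        "self_reported_rate": self_reported,
        "actual_rate": actual,
        "gap": self_reported - actual,
        "overreport_share_among_nonvoters": (
            len(overreporters) / len(non_voters) if non_voters else 0.0
        ),
    }


summary = overreport_summary(respondents)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this toy data the self-reported rate is double the recorded rate. Real "actual" data has its own errors (bad record matching, missing rows), so treat the gap as an estimate rather than an exact measure of lying.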

The Hawthorne Effect

Self-reporting bias has some similarities to the observer effect, sometimes known as the Hawthorne effect, which I wrote about in Mental Models Weekly, my newsletter.