In statistics, we need samples to work with. Most of the time, the easiest way to collect samples is to ask people. But they tend to lie, especially if we ask about their body parts. Right, guys?
Example
We collect data from 200 randomly selected people by measuring their height. Here is the histogram for the data:
Let me show you what happens if we ask people instead of measuring them:
Do you spot the difference? There are bumps at the values ending in 5 or 0.
They lied!
The two histograms together show that people rounded their height to the nearest 5 or 10.
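To make the effect concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not the data behind the histograms above: heights are drawn from a normal distribution with a mean of 170 cm and a standard deviation of 10 cm, and each person rounds to the nearest multiple of 5 with probability `p_round`. The helper names are hypothetical.

```python
import random


def simulate_heights(n=200, mean_cm=170.0, sd_cm=10.0, seed=0):
    """Draw n 'measured' heights in whole centimetres (assumed normal)."""
    rng = random.Random(seed)
    return [round(rng.gauss(mean_cm, sd_cm)) for _ in range(n)]


def self_report(heights, p_round=0.5, seed=1):
    """Simulate asking: each person rounds to the nearest multiple of 5
    with probability p_round, otherwise reports the measured value."""
    rng = random.Random(seed)
    return [5 * round(h / 5) if rng.random() < p_round else h for h in heights]


def share_ending_in_0_or_5(values):
    """Fraction of values that are multiples of 5."""
    return sum(v % 5 == 0 for v in values) / len(values)


measured = simulate_heights()
reported = self_report(measured)

print("measured:", share_ending_in_0_or_5(measured))
print("reported:", share_ending_in_0_or_5(reported))
```

Without rounding, roughly one value in five should end in 0 or 5. A clearly higher share in self-reported data is a hint that people rounded, which is exactly what the bumps in the second histogram show.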
Why do people lie?
The example above is a typical case of reporting bias.
Reporting bias is the tendency to underreport or overreport information, which makes the collected data inaccurate. It usually happens when data is collected through surveys.
In this context, reporting bias means that people tend to report a height that differs from their actual height.
There are several reasons for reporting bias:
Convenience. This is the simplest answer. If you are 174 cm tall, it's easier to say "I'm 175". Rounding to the nearest 5 or 10 is easy, so people usually do it in similar cases.
Estimation Errors. If you are not exactly sure about your height, you may just give a rough estimate. Again, a rounded number sounds logical in this case.
Social reasons. In many cultures, there is social pressure associated with height. This pressure makes people report being taller than they actually are.
Psychology. Some people feel uncomfortable when facing such questions, so they just give a short (usually inaccurate) answer.
This bias can be relevant in many fields, such as medicine, psychology, social sciences, and market research. It is also a big issue in finance: companies may report inflated earnings so that they look better on paper to investors. Depending on the field, other reasons may apply as well.
How can we get accurate samples?
Measure, don't ask. As the example above showed, if we measure people instead of asking them, the data will be more accurate. Measuring removes many of the causes of reporting bias.
Ensure anonymity. If we reassure people that they will remain anonymous after answering the questions, they may be more open and honest.
Ask the same questions differently. Changing the phrasing or context of a question can make it less direct. Also, by asking the same question with different wording, we can cross-validate responses. It's hard to do in cases like asking about height, but in psychology, it's a well-established technique.
Ensure neutrality. As social pressure can influence the answers, try to be neutral both when asking and when receiving answers.
Consider cultural and language differences. The same wording may mean totally different things in different languages. Be careful with those and consider your respondents.
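The cross-validation idea from "Ask the same questions differently" can be sketched in a few lines. This is only an illustration under stated assumptions: each respondent answered the same question twice with different wording, both answers are on the same numeric scale, and the respondent IDs, the answers, and the `flag_inconsistent` helper are all made up for the example.

```python
def flag_inconsistent(responses, tolerance=1.0):
    """Return the IDs of respondents whose two answers to the same
    question differ by more than `tolerance`."""
    return [
        rid
        for rid, (first, second) in responses.items()
        if abs(first - second) > tolerance
    ]


# Hypothetical answers on a 1-5 scale: (first wording, second wording)
responses = {
    "r1": (4, 4),
    "r2": (5, 2),
    "r3": (3, 4),
}

print(flag_inconsistent(responses))
```

A flagged respondent is not necessarily lying. The mismatch only tells us that at least one of the two answers deserves a closer look.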
Conclusion
Always be careful with self-reported data! If you are working with it, always consider that reporting bias may be lurking in it! And if you are reading about stuff online or hearing something in the news where they say "We asked people", be careful! They may lie!