Last Updated on March 16, 2021
I’ve worked heavily with data ever since my first professional job.
Here are a few important things I’ve learned along the way. They might seem like common knowledge, but in practice these principles are followed less often than you’d hope.
Never forget GIGO
GIGO stands for garbage in, garbage out. (Sometimes it’s RIRO – rubbish in, rubbish out.)
This is all too often forgotten. There is so much dirty or biased data out there, and the quality of your conclusions is only as good as the quality of your inputs.
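A cheap defence against garbage in is to check the inputs before analysing them. The sketch below is a minimal illustration in Python, not a full validation framework. The function name `find_dirty_rows`, the field names, and the sample records are all made up for the example.

```python
def find_dirty_rows(rows, required, valid_ranges):
    """Return (row_index, reason) pairs for rows that fail basic checks."""
    problems = []
    for i, row in enumerate(rows):
        # A required field that is absent, None, or an empty string counts as missing.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
        # Numeric fields must fall inside a plausible range.
        for field, (low, high) in valid_ranges.items():
            value = row.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                problems.append((i, f"{field} out of range: {value}"))
    return problems


# Hypothetical records, invented for this example.
rows = [
    {"name": "Ann", "age": 34},
    {"name": "", "age": 29},
    {"name": "Bob", "age": 290},
]
problems = find_dirty_rows(
    rows,
    required=["name", "age"],
    valid_ranges={"age": (0, 120)},
)
for index, reason in problems:
    print(f"row {index}: {reason}")
```

Checks like these won’t catch biased data, but they do stop the most obvious garbage from flowing straight into your conclusions.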
Big data is not necessarily better
Like many things, bigger is not necessarily better.
Big Data has tended to come with its share of Big Hype. So long as we’re realistic about its potential, and recognize that our data is only as useful as the human intelligence we bring to it, minus the human biases with which we burden it, Big Data should, indeed, pay significant dividends.
Nate Silver
Beware big data bias: the tendency to assume that more data will always lead to a better outcome.
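A small simulation shows why. The sketch below uses synthetic numbers only (a made-up population with a mean of about 50) and compares a very large sample that systematically misses part of the population with a small sample drawn at random. The cut-off of 45 and the sample sizes are arbitrary assumptions chosen for illustration.

```python
import random
from statistics import fmean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic population: 200,000 values centred on 50.
population = [rng.gauss(50, 10) for _ in range(200_000)]
true_mean = fmean(population)

# "Big" sample: huge, but it only ever captures values above 45.
big_biased = [x for x in population if x > 45]

# "Small" sample: just 500 values, but drawn at random.
small_random = rng.sample(population, 500)

big_error = abs(fmean(big_biased) - true_mean)
small_error = abs(fmean(small_random) - true_mean)

print(f"big biased sample:   n={len(big_biased)}, error={big_error:.2f}")
print(f"small random sample: n={len(small_random)}, error={small_error:.2f}")
```

The biased sample is hundreds of times larger, yet its estimate of the mean is further from the truth. Collecting more of the same skewed data does not remove the skew.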
Use the null hypothesis
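Don’t lie with data. Start from the assumption that there is no effect, and let the data persuade you otherwise, rather than starting from the conclusion you want. I still remember the first time I was asked to create statistical reports to meet a preconceived analytical outcome. I was naive, but it came as a shock.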
But men may construe things after their fashion, / Clean from the purpose of the things themselves.
William Shakespeare, Julius Caesar (spoken by Cicero)
Torture the data, and it will confess to anything.
Ronald Coase
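One way to make the null hypothesis concrete is a permutation test: assume the two groups do not differ, shuffle the group labels many times, and count how often chance alone produces a gap at least as large as the one observed. The sketch below uses only the Python standard library. The function name `permutation_test`, its parameters, and the sample numbers are illustrative assumptions, not taken from any real project.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an approximate p-value: the share of random relabellings whose
    gap in means is at least as large as the observed gap.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels are interchangeable.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements for two hypothetical groups.
control = [12, 15, 11, 14, 13, 12, 16, 14]
variant = [13, 14, 12, 15, 14, 13, 15, 16]
p_value = permutation_test(control, variant, n_permutations=2_000)
print(f"p-value: {p_value:.3f}")
```

A small p-value means the observed gap would be rare if the null hypothesis were true. The point is to let the test decide, rather than adjusting the analysis until it confesses to the answer someone wanted.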
Data is useless without context
(I stole this one from Nate Silver).
Context matters. How data is presented also has a huge impact. Give the right context to support an accurate interpretation of the data.
Protect personal data
As an industry we have a long way to go with data security.
The seven principles of the GDPR (General Data Protection Regulation) are a good start:
- Lawfulness, fairness and transparency
- Purpose limitation
- Data minimisation
- Accuracy
- Storage limitation
- Integrity and confidentiality
- Accountability