Last Updated on March 16, 2021

I’ve worked heavily with data since my very first professional job.

Here are a few important things I’ve learned along the way. They might seem like common knowledge, but in reality these principles aren’t practiced as commonly as you’d hope.

Never forget GIGO

GIGO stands for garbage in, garbage out. (Sometimes it’s RIRO – rubbish in, rubbish out.)

This is all too often forgotten. There is so much dirty or biased data out there, and the quality of your conclusions is only as good as the quality of your inputs.
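To make the point concrete, here is a minimal sketch in Python. The readings are made up, and the `-999` "missing value" sentinel is an assumption for illustration, not something from a real dataset. It shows how a single garbage value can wreck a simple average.

```python
from statistics import mean

# Hypothetical temperature readings in degrees Celsius. The -999 is an
# assumed "missing value" sentinel that slipped into the data unnoticed.
readings = [21.5, 22.0, -999, 21.0, 22.5]


def clean(values, sentinel=-999):
    """Drop entries equal to the (assumed) missing-value sentinel."""
    return [v for v in values if v != sentinel]


raw_mean = mean(readings)           # garbage in ...
clean_mean = mean(clean(readings))  # ... versus cleaned input

print(f"mean with garbage:   {raw_mean:.2f}")    # -182.40
print(f"mean after cleaning: {clean_mean:.2f}")  # 21.75
```

The calculation is identical in both cases. Only the quality of the input changed.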

Big data is not necessarily better

Like many things, bigger is not necessarily better.

Big Data has tended to come with its share of Big Hype. So long as we’re realistic about its potential, and recognize that our data is only as useful as the human intelligence we bring to it, minus the human biases with which we burden it, Big Data should, indeed, pay significant dividends.

Nate Silver

Beware big data bias: The tendency to assume that more data will always lead to a better outcome.

Use the null hypothesis

Don’t lie with data. I still remember the first time I was asked to create statistical reports to meet a preconceived analytical outcome. I was naive, but it came as a shock.

But men may construe things after their fashion, / Clean from the purpose of the things themselves.

Cicero, in William Shakespeare’s Julius Caesar

Torture the data, and it will confess to anything.

Ronald Coase
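One way to put the null hypothesis to work is a permutation test: assume the group labels make no difference, then count how often a random relabelling produces a gap at least as large as the one you observed. The sketch below is my own illustration in Python. The function name `permutation_p_value` and the sample numbers are hypothetical, and it enumerates every split, so it only suits small samples.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact one-sided permutation test on the difference of means.

    Null hypothesis: the group labels are irrelevant, so every split of
    the pooled values into groups of these sizes is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    observed = mean(group_a) - mean(group_b)
    total = 0
    at_least_as_extreme = 0
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mean(a) - mean(b) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Made-up measurements for two hypothetical groups.
variant = [12, 15, 14, 16]
control = [11, 13, 12, 14]
print(f"p-value: {permutation_p_value(variant, control):.3f}")
```

The important part is deciding on the question and the test before looking at the result, rather than trying tests until one confesses.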

Data is useless without context

(I stole this one from Nate Silver.)

Context matters. How data is presented also has a huge impact. Give the right context to support an accurate interpretation of the data.

Protect personal data

As an industry we have a long way to go with data security.

The seven principles of the GDPR (General Data Protection Regulation) are a good start:

  1. Lawfulness, fairness and transparency
  2. Purpose limitation
  3. Data minimisation
  4. Accuracy
  5. Storage limitation
  6. Integrity and confidentiality
  7. Accountability

Image: Wow, this is how we were recording data 4,000 years ago.