Can what you Tweet predict whether you’ll have a heart attack?

A recent study by Johannes Eichstaedt, from the University of Pennsylvania, inferred people’s mental state by analysing their tweets. The researchers looked at 826 million tweets from 1400 US counties.

Let’s just take a moment to digest those figures.

826 million tweets. 1400 US counties.

That equates to about 10 million people and roughly half of all the counties in the US.

They found that in the communities where people wrote angrier tweets, there was a higher incidence of heart disease.

This isn’t surprising. But at a community level, the study predicted heart disease more accurately than a meta-study that examined traditional factors such as socio-economic grouping and demographics.

That means it produces a better model than traditional techniques. A better model means a more effective and efficient way to allocate resources. More lives saved, less money spent.

We think the reason for this is simple: bigger data.

Imagine how difficult it would be to collect that much data on that many people. Even if you managed to sidestep privacy issues and navigate the legal vortex of state legislation, the data would still be in thousands of different formats.

Without wishing to trivialise heart disease, this study is just the beginning.

The fact is that people choose to communicate in words, not numbers.

There is a vast amount of data that people choose to broadcast about themselves online, and now we can process that (unstructured) data at scale.

So whether you’re a government or a company, you can invest huge amounts of money and time into collecting specific numerical data to suit your purpose, or perhaps you can just listen to what people are already saying.

We’re experienced listeners with ears tuned to linguistics, so if you want to talk to us, email Chris.