Making the most of the 70% of data you don’t use

At least 70% of the data available to your customer insight team is unstructured.

That’s what Chris learned last week when he interviewed Gary Seaman, one of our preferred partners in the world of text analytics.

Gary’s work uses text analytics to help social enterprises make vital interventions in the lives of troubled young people. His models have saved people from becoming homeless.

He has quietly had great success doing this. (And if predictive analysis with text analytics can save more than 700 young people from the streets, imagine what it can do for the world of customer retention.)

We are unashamedly fans of his work.

He’s one of the few people we’ve met who combines structured and unstructured data for effective predictive analytics. Since the majority of all available data is unstructured, it’s vital to combine the two types to get a truly holistic view of your customers’ needs.

In this short interview, Gary tells us about how and why he started analysing free text, and how he developed his methodology for predictive analytics.

Some of the highlights are:

  1. Gary’s first step on any project is to agree the objectives in advance. Save money? Reduce time? Spot more people at risk? Whatever it is, have clear, measurable objectives.
  2. Gary then moves on to building a dictionary of specialist words and slang. That’s not always necessary; it depends on the software you use for your modelling. In his case, it meant identifying the words and concepts seen frequently in the case interviews with people at risk of homelessness, and then applying those to model a risk index for future cases.
  3. Text to numbers: The next move is to group the words people use into categories, so that each category brackets together a common concept. Doing this helps him create a numerical model that tags mentions of key concepts and categories.
  4. His models are then based on counts and co-occurrences of multiple “high risk” concepts mentioned in a particular case. This lets him make predictions about which individuals might become homeless, drop out of school, or become unemployed.
  5. Gary’s advice for anyone new to text analytics is: DON’T GET DISTRACTED ALONG THE WAY. Text analytics will always throw up unexpected information and insights, but it’s best to start with the end in mind, and never lose sight of your goal.
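
For readers who like to see the idea in code, steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration in Python, not Gary’s actual method: the category names, the word lists, the `tag_categories` and `risk_index` functions, and the weighting of co-occurrences are all assumptions made up for the example.

```python
import re
from itertools import combinations

# Hypothetical dictionary (step 2): each category brackets together
# words that point to the same concept (step 3). These terms are
# invented for illustration only.
CATEGORY_TERMS = {
    "housing": {"evicted", "hostel", "sofa"},
    "family": {"argument", "estranged"},
    "money": {"debt", "arrears"},
}


def tag_categories(text):
    """Turn free text into numbers: count mentions per category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = {category: 0 for category in CATEGORY_TERMS}
    for word in words:
        for category, terms in CATEGORY_TERMS.items():
            if word in terms:
                counts[category] += 1
    return counts


def risk_index(text, pair_weight=2):
    """Score a case from counts and co-occurrences (step 4).

    The score is the total number of mentions plus a bonus for every
    pair of distinct categories that appear together in the same case.
    The pair_weight of 2 is an arbitrary assumption.
    """
    counts = tag_categories(text)
    present = [category for category, n in counts.items() if n > 0]
    mentions = sum(counts.values())
    pairs = len(list(combinations(present, 2)))
    return mentions + pair_weight * pairs


if __name__ == "__main__":
    note = "He was evicted last month and is in debt after an argument at home."
    print(tag_categories(note))
    print(risk_index(note))
```

In a real project the dictionary would come from the case interviews themselves, and the score would be checked against known outcomes before anyone relied on it.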

About Gary: Gary works with non-profit groups and social enterprises to help them target their resources to the vulnerable people who need help most. He’s pioneered the use of text analytics in targeting life-altering resources on an individual level.

Thanks to him and his predictive modelling, more than 700 vulnerable young people have received the help they needed to avoid homelessness when they were most at risk.

About Verbal Identity: Verbal Identity uses text analytics enhanced with linguistics to identify the deep-root causes of marketing and customer experience problems, and uses creative writing to act on those insights.