Data is becoming a must-have for many companies today. According to IDC, the Big Data and business analytics industry reached $189 billion in 2019 and is projected to reach $274 billion by 2022.
Most of the industry, however, is focused on solutions for structured internal data. The arguments for structured internal data include the following: it is more accessible; it is more user-friendly and tangible; it is easier and faster to extract insights and results from; and it is more reliable.
The real issue with external data is that it is very noisy. Analysts project that by this year, there will be more than 1.7MB of data created per second for every person on Earth. There are more than 250 million posts per hour on social media, and there are endless sources (influencer posts, blogs, product reviews, patent filings, conference agendas, research papers, news sites) that contain a trove of data. Not surprisingly, businesses struggle to identify which data is most important and relevant to driving informed growth decisions.
It is this external data that is most valuable: the data that exists out of bounds tells the story about market trends, consumer sentiments, competitor developments, and technological innovation. Data that is unconnected, unstructured, and composed of large and extremely varied combinations results in more variables and relationships to analyze, explore, and test.
Take, for example, research on hydration trends. In the context of beverages, related terms may include “thirst,” but in personal care and beauty, relevant terms may be “moisturizer,” “hyaluronic,” and “water drench,” among others.
Product clustering is another major challenge. With the same products called different things by different vendors, even the leading e-commerce vendors are only able to achieve 85% accuracy. Identical products may have different images online or varying prices; conversely, products with the same name may be different.
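To make the naming problem concrete, here is a minimal sketch of name-based product grouping. It assumes a simple token-overlap (Jaccard) score and a greedy grouping pass; the listings, the `cluster_products` helper, and the 0.5 threshold are hypothetical and chosen only for illustration. This is not how any particular vendor or platform does it.

```python
from __future__ import annotations

import re


def tokens(name: str) -> set[str]:
    """Lowercase a product name and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tokens two names have in common, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_products(names: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy single pass: a name joins the first cluster whose first member
    is similar enough; otherwise it starts a new cluster."""
    clusters: list[list[str]] = []
    for name in names:
        for cluster in clusters:
            if jaccard(tokens(name), tokens(cluster[0])) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical listings: the first two describe the same product differently.
listings = [
    "Acme Hydrating Face Cream 50ml",
    "ACME face cream, hydrating (50 ml)",
    "Acme Hydrating Body Lotion 200ml",
]

for group in cluster_products(listings):
    print(group)
```

With these sample listings, the two face cream names fall into one group and the body lotion into another. A name-only score like this one cannot catch the reverse case described above, where products with the same name are actually different, which is one reason real systems also have to weigh images, prices, and other attributes.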
And finally, let’s look at the well-known exercise that proves the difficulty of natural language processing. The “Let’s eat Grandma” versus “Let’s eat, Grandma” meme shows how difficult it is to extract context from a phrase when the words are identical.
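A small sketch shows why this matters in practice. It assumes a common text-cleaning step that lowercases the text and drops punctuation before analysis; the two helper functions are hypothetical and use only Python's standard `re` module.

```python
import re


def bag_of_words(text: str) -> list[str]:
    """Lowercase the text and keep only word tokens, discarding punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def tokens_with_punctuation(text: str) -> list[str]:
    """Lowercase the text and keep punctuation marks as separate tokens."""
    return re.findall(r"[a-z']+|[^\w\s]", text.lower())


without_comma = "Let's eat Grandma"
with_comma = "Let's eat, Grandma"

# Once punctuation is dropped, the two sentences are indistinguishable.
print(bag_of_words(without_comma) == bag_of_words(with_comma))

# Keeping the comma as a token preserves the only signal that separates them.
print(tokens_with_punctuation(without_comma))
print(tokens_with_punctuation(with_comma))
```

Any pipeline that strips punctuation early sees the same three words in both sentences, so the difference in meaning has to be recovered from structure or context, not from the vocabulary alone.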
Data-driven organizations outperform their peers
As a result of these challenges, most companies ignore or limit their use of external data. In fact, according to a recent BI-survey, the mean number of external data sources integrated into an enterprise storehouse is three.
Data-driven organizations integrate many more sources of data. The data informs every decision they make and allows them to determine where to make investments and where to scale back. Getting this right as early as possible leads to higher profits, increased competitive advantages, and long-term brand value. An analysis by Flat World Solutions shows that businesses stand to gain up to $430 billion if they opt for a data-driven approach.
Harness the power of external data
Skai connects over 13,000 data sources, pulling data from social media posts, product listings, product reviews, patent filings, key opinion leaders’ posts, research papers, business news, conference programs, clinical trials, and point-of-sale data. It uses this data to track consumer sentiments and trends across various categories in the food and beverage, personal care, and beauty industries.
Using patented Natural Language Processing (NLP) techniques and applying domain expertise, the Skai platform can generate profound insights that typically come from small focus group research.
————————————–
*This blog post originally appeared on Signals-Analytics.com. Kenshoo acquired Signals-Analytics in December 2020. Read the press release.