Decision-makers at consumer brands are finally realizing the full transformative potential of external data – but they’re also realizing how difficult it is to source. Forrester reports that 87% of decision-makers in data and analytics have implemented or are planning initiatives to source more external data.
And those initiatives are growing outside of the IT team; 29% of those surveyed say that IT has primary ownership of data sourcing, down from 37% in 2016. To support these projects, organizations are increasingly turning to a new specialist: the data hunter, who identifies and vets external data sources. Building teams focused on external data is a lot of work, and many leaders are finding that selecting external data becomes harder to scale as the source list grows. Perhaps that’s why 66% of the decision-makers surveyed by Forrester report that they’re using or planning to use external service providers for data, analytics, and insights.
How can brands identify the right external data sources?
Identifying and choosing external data sources is a time-consuming task that stymies many organizations’ internal data teams. External providers like Skai do the hard work for you, taking the time to develop data identification methodologies that ensure the insights you glean are representative of the market.
At Skai, we begin this process by asking two important questions: what business questions are we trying to answer with this data, and what will the business ultimately do with this data? The answers dictate the data types and sources we select, which must meet four main criteria:
Accurate market share coverage. The external data sources we choose must be truly representative of a given market ecosystem, with enough textual data to produce useful analysis. Collecting too much data from single-brand retailers or other niche channels can introduce bias into the data sets; often, these smaller brands are already well-represented within more comprehensive data sources like Amazon or Walmart.
But ensuring accurate market coverage is a custom process for each market ecosystem. Some markets, like beauty, have relevant data spread over lots of ecommerce sites; other markets, like food, are more centralized on fewer, larger sites. For every new ecosystem, we review the market share breakdown over the most common ecommerce channels (Amazon, Walmart, and Target) before deciding whether to onboard smaller or more specialized sources.
Robust online presence. The Skai platform is unique in its ability to make accurate and revelatory connections between a product and what consumers and experts are thinking about that product. Product reviews and online discussions are essential to making these connections, and we have high standards. We make sure to source product reviews with rich product details and publication dates so we can accurately track both what consumers are saying and when they said it, producing a timeline of consumer sentiment.
Continuous and stable data access. We collect external data through third-party API integrations and by lawfully scraping publicly accessible sites. Every time a supplier modifies its back end, we must modify our code and algorithms accordingly to ensure continuous access to data. Maintaining access to these types of data sources is a prohibitively complex engineering task for most organizations, but it’s routine for Skai.
Excellent data quality. The data we collect must have comprehensive, high-value text descriptions so that our proprietary machine learning and natural language processing algorithms can accurately extract content. We then normalize that content, ensuring that all data sets are speaking the same language, allowing us to connect data sets and deliver reliable, accurate analyses.
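To make the idea of data sets “speaking the same language” concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not Skai’s actual pipeline: the function names, the record fields (`source`, `brand`, `text`), and the sample brand are all hypothetical. It shows one simple form of normalization, reducing differently formatted brand names to a single canonical key so that records from separate sources can be connected.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_brand(name: str) -> str:
    """Reduce a brand string to a canonical key (hypothetical helper).

    Strips accents, lowercases, and collapses punctuation and
    whitespace so that spelling variants map to the same key.
    """
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Replace every run of non-alphanumeric characters with one space.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def group_by_brand(records):
    """Group records from different sources under one normalized brand key."""
    grouped = defaultdict(list)
    for record in records:
        grouped[normalize_brand(record["brand"])].append(record)
    return dict(grouped)


# Illustrative records only; the sources, brand, and review text are made up.
records = [
    {"source": "retailer_a", "brand": "Acmé Beauty ", "text": "Great serum."},
    {"source": "retailer_b", "brand": "ACME  beauty", "text": "Too sticky."},
    {"source": "forum_c", "brand": "acme-beauty", "text": "Works on dry skin."},
]

grouped = group_by_brand(records)
for brand, items in grouped.items():
    print(brand, [item["source"] for item in items])
```

In this sketch, all three spellings collapse to one key, so reviews and discussions from the three sources can be analyzed together. Real-world normalization would also need to cover product names, attributes, and vocabulary, which this example does not attempt.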
Once we’ve identified the best assortment of data sources, we confirm our selections with industry subject matter experts who validate our research and identify any data types or parameters we may have missed. Identifying data sources is a complex, time-consuming and specialized task, but it’s pretty simple compared to the steps that follow: connecting and contextualizing that data.
Take it from a real Skai client:
“Trying to connect this many new and different external data sources into our existing system would have taken us years! Skai has done all of the hard work by collecting and contextualizing all of the data relevant to [our] ecosystem, giving us a more holistic view of what’s happening in our category.”
————————————–
*This blog post originally appeared on Signals-Analytics.com. Kenshoo acquired Signals-Analytics in December 2020. Read the press release.





