How Scalepath Sources and Modifies Data

Scalepath’s software is essentially a bridge between your market context and the large public datasets we use to understand the market. Knowing where that data comes from, and how it’s modified, is an important part of being confident in the results.

Here’s a look behind the curtain.

Data sources

We rely primarily on comprehensive public datasets from national statistics agencies. These agencies have one job: to count things. They ensure every business is counted and classified, without duplication or omission. They are also good at sharing that data, though not always in the most user-friendly way.

We’ve found these sources to be the most accurate way to count the businesses in each market by business size, geo, locations, and market vertical. We also use similar datasets to understand job roles, payroll, and more.

The primary data sources we use include:

US Census Bureau
US Bureau of Labor Statistics
Statistics Canada

This list is subject to change, and will expand as we provide data for additional countries and markets.

We generally don’t rely on databases of specific companies or leads, such as those provided by ZoomInfo. These databases aren’t built to size markets, so they have two problems: they don’t count every company, and they often contain duplicate companies or subsidiaries. They can be great tools when it’s time to turn your targeted SOM into a list of prospects, but it’s better to start with a complete dataset.

Data modification

We modify data in two primary ways: recategorizing and forecasting.

We often need to recategorize verticals, either by grouping multiple verticals together or by segmenting verticals further. We track more than 1,600 market verticals, but the categories used by the primary data sources above may not align with how users view the market, so additional work is needed.
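As a rough illustration of the grouping and segmenting described above, here is a minimal Python sketch. It is not Scalepath's actual implementation: the industry codes, counts, vertical names, and split shares are all made up, and the `recategorize` helper is hypothetical. Each source category maps to one or more custom verticals with a share, so grouping is a many-to-one mapping and segmenting is a one-to-many split.

```python
from collections import defaultdict

# Hypothetical business counts keyed by an agency-style industry code.
# Codes and numbers are invented for illustration only.
source_counts = {
    "5411": 1200,
    "5412": 800,
    "5415": 3000,
}

# Hypothetical mapping from each source code to custom verticals.
# Each entry is (custom vertical, share of that code's businesses);
# the shares for a single code are assumed to sum to 1.0.
recategorization = {
    "5411": [("Professional services", 1.0)],   # grouped with 5412
    "5412": [("Professional services", 1.0)],
    "5415": [("Software development", 0.6),     # segmented further
             ("IT consulting", 0.4)],
}


def recategorize(counts, mapping):
    """Redistribute source counts into custom verticals using the mapping."""
    result = defaultdict(float)
    for code, count in counts.items():
        for vertical, share in mapping[code]:
            result[vertical] += count * share
    return dict(result)


custom_counts = recategorize(source_counts, recategorization)
for vertical, count in sorted(custom_counts.items()):
    print(f"{vertical}: {count:.0f}")
```

Because every share set sums to 1.0, the total number of businesses is the same before and after recategorization, which is a useful sanity check on any mapping of this kind.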

We also build forecasts and projections to provide a view of the present market environment (since the agencies don’t publish data in real time) and of the future. We use machine learning algorithms to analyze decades of historical data, along with more current inputs, to build forecasts of where each market is heading.
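To make the idea of projecting from historical data concrete, here is a deliberately simple Python sketch. It fits a straight-line trend with ordinary least squares, which is a stand-in assumption and far simpler than the machine learning models mentioned above. The yearly counts and the function names `fit_linear_trend` and `project` are hypothetical.

```python
def fit_linear_trend(years, values):
    """Fit values = slope * year + intercept by ordinary least squares."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project(years, values, target_year):
    """Extend the fitted trend line to a year outside the history."""
    slope, intercept = fit_linear_trend(years, values)
    return slope * target_year + intercept


# Hypothetical history: business counts for one vertical, with the
# latest published figure lagging the present by a couple of years.
history_years = [2018, 2019, 2020, 2021, 2022]
history_counts = [1000, 1040, 1080, 1120, 1160]

estimate_2024 = project(history_years, history_counts, 2024)
print(f"Projected count for 2024: {estimate_2024:.0f}")
```

A straight line ignores seasonality, shocks, and changing growth rates, which is one reason a production forecast would use richer models and more current inputs.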

Pricing data

Pricing is a key input to understanding the revenue potential of a market.

Where possible, we use our customers’ own revenue and pricing models in their TAM model. If they’ve proven they can sell in a market, to specific customers and at specific amounts, this is often better data than a survey would provide, because a survey doesn’t account for their selling motion, brand, and product strength.

Where pricing data doesn’t exist, such as for a new geo, vertical, or product category, Scalepath can assist with a market survey to understand pricing trends and sensitivity in a given market.

Surveys and additional data

Occasionally, users need an additional layer of data to accurately size and understand their market. This may be the usage of a specific required technology, a selling motion, or other preferences that affect their market size. In these cases, we can input data from primary or secondary research into a filter field to apply it to the model.
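As a hedged illustration of how a research-based filter might combine with the business counts and pricing discussed earlier, consider the Python sketch below. The functions `apply_filter` and `revenue_potential`, the 25% adoption share, and the price are all hypothetical, and the arithmetic is an assumed simplification rather than Scalepath's actual model.

```python
def apply_filter(business_count, share):
    """Keep only the fraction of businesses that meets a research-based criterion."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return business_count * share


def revenue_potential(business_count, annual_price):
    """Estimate annual revenue potential as addressable businesses times price."""
    return business_count * annual_price


# Hypothetical inputs: 2,000 businesses in a vertical, a survey suggesting
# 25% of them use a required technology, and a $1,200 annual price.
total_businesses = 2000
adoption_share = 0.25
annual_price = 1200

addressable = apply_filter(total_businesses, adoption_share)
potential = revenue_potential(addressable, annual_price)
print(f"Addressable businesses: {addressable:.0f}")
print(f"Annual revenue potential: ${potential:,.0f}")
```

Several filters could be chained by multiplying their shares, though that assumes the criteria are independent of each other, which research data may not support.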

Conclusion

So, that’s how Scalepath approaches the data we use in our market models. If you have any questions about how we source, use, and modify data, please contact us.