Moody's Analytics Buffet Blog

Title: Using Data Buffet: Course 102 - The value-added stack

Author: Phillip Thorne

Question

In Data Buffet, what are historical, supplement, estimated, and forecast time series, how are they produced, and how do they add value to Moody's Analytics offerings?

Answer

This content originally appeared in the March 2012 issue of Data Buffet Monthly and has subsequently been updated.

[Figure: Stack: Historical, supplement, estimated, forecast]

When populating Data Buffet, we start with historical data gathered from third parties; fill gaps in sets of related indicators by supplementing with simple calculations; estimate major indicators to achieve high frequency, detail and geographic granularity; then forecast using our suite of macroeconomic models.

Historical verbatim

At the base of the stack are historical datasets that we contract to republish from third-party providers. In the U.S., these providers include a large number of government agencies and private industry organizations. Globally, we obtain cross-national datasets (which supply consistent sets of indicators across multiple countries) from international agencies such as the IMF and OECD, and nationally sourced data from agencies of individual countries (at a minimum, from the central bank and national statistics office). National sources also supply our subnational data.

Historical supplements

In some cases, we need to supplement the reported historical data. If the source provides only part of a related group of indicators, we can complete the set using arithmetic identities. Because it’s common to periodically rebase index series, we can splice predecessor series to extend history. And for analytic convenience, we’re able to reformat trends as indexes and seasonally adjust.
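
To make the splicing idea concrete, here is a minimal Python sketch. It is an illustration only, not Data Buffet's actual procedure: the function name `splice`, the link-at-first-overlap rule and the sample values are all assumptions made for this example.

```python
def splice(old, new):
    """Extend `new` backward with `old`, rescaled at the first shared period.

    Both arguments map a period label (here, a year) to an index value.
    Hypothetical helper for illustration; not a Data Buffet function.
    """
    overlap = sorted(set(old) & set(new))
    if not overlap:
        raise ValueError("series must share at least one period")
    link = overlap[0]
    # Ratio that converts values on the old base to the new base.
    factor = new[link] / old[link]
    spliced = {p: v * factor for p, v in old.items() if p < link}
    spliced.update(new)
    return dict(sorted(spliced.items()))


# Made-up index values: the predecessor uses 2004 = 100, the successor 2006 = 100.
old_base = {2004: 100.0, 2005: 150.0, 2006: 200.0}
new_base = {2006: 100.0, 2007: 105.0}

spliced = splice(old_base, new_base)
print(spliced)
```

The predecessor values are scaled by the ratio of the two series in the link period, so growth rates before the link are preserved while the levels match the new base.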

A special type of supplement is the frequency convenience series. Some indicators are commonly reported in several ways, such as the monthly sum of daily stock market volume, or the weekly average of a daily index. We start with the daily fundamental and then frequency-convert to one or more secondary forms. You can do the same by using Data Buffet’s frequency conversions, analytic transformations and formulas.
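
As a rough illustration of what a frequency conversion does, the following Python sketch derives a monthly sum and a weekly average from a handful of daily observations. The helper `convert` and the sample figures are hypothetical; this is not the Data Buffet conversion engine, and weeks are assumed here to follow ISO week numbering.

```python
from collections import defaultdict
from datetime import date


def convert(daily, key, how):
    """Group daily observations by key(date) and aggregate each group.

    Hypothetical helper for illustration; `how` is "sum" or "mean".
    """
    groups = defaultdict(list)
    for day, value in daily.items():
        groups[key(day)].append(value)
    aggregate = {"sum": sum, "mean": lambda xs: sum(xs) / len(xs)}[how]
    return {k: aggregate(v) for k, v in sorted(groups.items())}


# Made-up daily values spanning a month boundary.
daily = {
    date(2012, 1, 30): 100,
    date(2012, 1, 31): 120,
    date(2012, 2, 1): 90,
    date(2012, 2, 2): 110,
}

# Monthly sum, keyed by (year, month).
monthly_sum = convert(daily, lambda d: (d.year, d.month), "sum")
# Weekly average, keyed by (ISO year, ISO week number).
weekly_avg = convert(daily, lambda d: tuple(d.isocalendar()[:2]), "mean")

print(monthly_sum)
print(weekly_avg)
```

The same daily fundamental feeds both secondary forms; only the grouping key and the aggregation differ.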

Historical estimates

At the third level are estimates. An economic dataset can be characterized across three dimensions: time (frequency and lag), space (geography), and detail (commodity, industry, occupation, etc.). There’s usually a trade-off between the dimensions; for example, the U.S. FHFA “Monthly Interest Rate Survey” reports national figures 12 times a year, but metro-area figures only once. With one series in isolation, you’re stuck; but complementary datasets can be combined to improve each other’s usefulness.

The essential principle is to pair two series that generally move in parallel, where one is superior in at least one dimension, such that we can use the pattern of the superior series to reallocate the inferior series across its constituent parts (periods, areas or industries). In a simple case, if employment levels are reported for a state and its component counties by quarter and by year, respectively, then we can "quarterize" each of the counties to track the state. We provide methodologies in Mnemonic 411.
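
The "quarterize" case can be sketched in a few lines of Python. This shows simple proportional allocation, which is one way to apply the principle and is not necessarily the methodology documented in Mnemonic 411. It assumes that the annual county figure is the average of its quarters; the function name and all values are made up for the example.

```python
def quarterize(annual_level, state_quarters):
    """Spread an annual level across quarters using the state's pattern.

    Assumes the annual figure is the average of the quarterly levels, so
    the returned quarters average back to `annual_level`.
    Hypothetical helper for illustration only.
    """
    state_mean = sum(state_quarters) / len(state_quarters)
    return [annual_level * q / state_mean for q in state_quarters]


# Made-up employment levels, in thousands.
state_q = [980.0, 1000.0, 1020.0, 1000.0]  # state, reported quarterly
county_annual = 50.0                       # county, reported annually

county_q = quarterize(county_annual, state_q)
print(county_q)
```

Each county quarter inherits the state's within-year pattern, while the annual average stays equal to the reported county figure.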

For certain estimated datasets, such as our U.S. subnational employment-output-wages suite, we append future periods from the corresponding forecast.

Forecasts

Finally, we forecast select indicators using an elaborate mathematical model of the macroeconomy. Most of our models have a quarterly frequency and a 30-year horizon, and are updated monthly (the output is a new vintage). Each forecast series consists of a historical segment and a projected segment, and is driven by one or more historical or estimated series. The historical segment is based on its drivers, but because of intermediate processing, is not necessarily equal to them; hence, it’s incorrect to call its values “actuals.” (In the simplest case, a monthly driver converts to the quarterly forecast frequency, irreversibly losing detail.) Forecasts require assumptions, and by running repeatedly with a range of assumptions we produce alternative scenarios.

Mnemonics

You can identify the rank of a series by its mnemonic. Estimated series begin with R or XR, forecasts with F, and anything else is historical; moreover, a forecast-estimate pair will have similar mnemonics. However, these rules aren’t universal, so we urge use of the catalog (to select series) and Mnemonic 411 (to research them). If there’s any doubt, the “documentation” metadata will show if Moody’s Analytics touched the values.
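
The prefix rule of thumb can be written as a short Python function. Treat it strictly as a heuristic: as noted above, the rules aren't universal. The function name and the sample mnemonics are invented for this sketch and do not refer to real series.

```python
def rank_from_mnemonic(mnemonic):
    """Guess a series' rank from its mnemonic prefix.

    Heuristic only: the prefix rules are not universal, so the catalog and
    the series documentation remain the authoritative sources.
    """
    m = mnemonic.upper()
    if m.startswith(("R", "XR")):
        return "estimated"
    if m.startswith("F"):
        return "forecast"
    return "historical"


# Invented mnemonics, used only to exercise each branch.
for name in ["RABC.XYZ", "XRABC.XYZ", "FABC.XYZ", "ABC.XYZ"]:
    print(name, rank_from_mnemonic(name))
```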