Title: Using Data Buffet: Dissecting metadata
Author: Phillip Thorne
Question

For each time series on Data Buffet, what kinds of information are embedded in the Description and Source textual metadata fields?

Answer

I. Introduction

If you have a Data Buffet mnemonic, there's no need to guess at the definition of the associated time series.

Each of our series has two textual metadata fields, which you can read in Mnemonic 411 and in View mode, include as optional headers in basket output, and use when labeling charts. These fields briefly explain "what" the series is, "how" it's measured, "who" provides it, and "which" dataset it's part of. Additional information is provided by the catalog (context within a categorical hierarchy), the written background in Mnemonic 411, and the source itself (linked from Mnemonic 411).

We segment the metadata in a standard order with standard punctuation and symbology. To facilitate comparison against the source we use the source's terminology when practical (although we translate to English and may harmonize), and include the source's unique identifier when available.

The segments will vary, depending on the nature of the series. The examples (selected from Data Buffet) are illustrative and progressive; a more detailed discussion of available segments follows.

II. Examples

1. A quarterly series from a U.S. source

  • XBSDCBNARELOCQ.IUSA
  • Banking: Commercial banks - Nonaccrual assets - 1-4 family residential HELOCs, (Mil. USD, NSA)
  • U.S. Federal Deposit Insurance Corporation (FDIC): Statistics on Depository Institutions [NARELOC]

Description field:

  • Preamble: Banking:
  • General description: Commercial banks - Nonaccrual assets - 1-4 family residential HELOCs
  • Measurement units: Mil. USD ... symbolic for "millions of U.S. dollars"
  • Adjustment: NSA ... symbolic for "not seasonally adjusted"
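
The breakdown above can be sketched in Python for illustration. This is a minimal sketch, not a Data Buffet API: the function name `split_description` and the regular expression are assumptions. It only handles Descriptions that follow the standard layout (preamble, colon, general description, parenthesized unit-descriptor) and returns `None` for anything else rather than guessing.

```python
import re

# Hypothetical helper (not part of Data Buffet). Assumes the preamble ends at
# the first colon and the unit-descriptor is the final parenthesized group.
DESCRIPTION_PATTERN = re.compile(
    r"^(?P<preamble>[^:]+):\s*"   # preamble, up to the first colon
    r"(?P<general>.*?),?\s*"      # general description
    r"\((?P<units>[^()]*)\)\s*$"  # unit-descriptor in trailing parentheses
)


def split_description(description):
    """Split a standard-layout Description into its segments, or return None."""
    match = DESCRIPTION_PATTERN.match(description)
    if match is None:
        return None  # non-standard layout: do not guess
    # The adjustment, when present, follows the last comma in the unit-descriptor.
    measurement, _, adjustment = match.group("units").rpartition(",")
    if not measurement:
        # No comma found: the unit-descriptor holds a measurement only.
        measurement, adjustment = adjustment, ""
    return {
        "preamble": match.group("preamble").strip(),
        "general": match.group("general").strip(),
        "measurement": measurement.strip(),
        "adjustment": adjustment.strip(),
    }


example = ("Banking: Commercial banks - Nonaccrual assets - "
           "1-4 family residential HELOCs, (Mil. USD, NSA)")
print(split_description(example))
```

A Description without a preamble, such as the one in Example 5, makes the sketch return `None`, which is a reminder that not every series follows the standard layout.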

Source field:

  • Source entity, full name: U.S. Federal Deposit Insurance Corporation
  • Source entity, preferred abbreviation: (FDIC)
  • Dataset: Statistics on Depository Institutions
  • Series identifier: [NARELOC]

2. A national accounts series from a non-U.S. source

  • NACSGVAMMBCQ.IPRT
  • National Accounts: By Production - Gross value added at basic prices - Industry [05 to 33], (Mil. Ch. 2011 EUR, CDASA)
  • Instituto Nacional de Estatística - Portugal (Statistics Portugal): National Accounts - Table C.1.1.6 - GVA by industry A8 (chain linked volume data, quarterly) [CAE-Rev.3, ESA 2010]

Description field:

  • Preamble: National Accounts:
  • General description: By Production - Gross value added at basic prices - Industry
  • Category aggregate: [05 to 33] ... The taxonomy is identified in the Source field
  • Measurement units: Mil. Ch. 2011 EUR ... symbolic for "millions of chained year-2011 euros"
  • Adjustment: CDASA ... symbolic for "calendar day adjusted and seasonally adjusted"

Source field:

  • Source entity, native name: Instituto Nacional de Estatística
  • Source entity, English name: (Statistics Portugal)
  • Dataset: National Accounts - Table C.1.1.6 - GVA by industry A8 (chain linked volume data, quarterly)
  • Activity classification: CAE-Rev.3
  • National accounts framework: ESA 2010

3. With a legal memo

  • IR%TRGB15YBUM.IHUN
  • Interest rates: Government bonds - Bid yield - 15 year, (% p.a., NSA)
  • Thomson Reuters: Govt-Bond-Yield-TS - Govt-Bond-Yields; Moody's Analytics Calculated [USE IS RESTRICTED]

Description field:

  • Preamble: Interest rates:
  • General description: Government bonds - Bid yield - 15 year
  • Measurement units: % p.a. ... symbolic for "percent per annum"
  • Adjustment: NSA ... symbolic for "not seasonally adjusted"

Source field:

  • Legal memo: [USE IS RESTRICTED] ... Contact your sales representative for specific acceptable forms of use (e.g., citing one period, republishing the entire series, creating a chart, etc.)

4. With a complex index measurement and a definitional memo

  • IPIMMEGYYUM.IUKR
  • Industrial production index: General index [B to D], (Index CPPY=100 YTD, NSA)
  • State Statistics Service of Ukraine: Index of Industrial Production, by types of activity [CTEA-2010] [Post-Crimean Crisis boundaries]

Description field:

  • General description: General index ... The topmost series in a family of indexes; the term "total" is problematic.
  • Measurement units: Index CPPY=100 YTD ... symbolic for "index, corresponding period of previous year equals 100, year to date". This is one kind of moving-base index, a type common among eastern European sources.

Source field:

  • Activity classification: CTEA-2010
  • Memo: [Post-Crimean Crisis boundaries] ... For this dataset, a change in geographic scope requires "before" and "after" series. The phrasing is standardized to be distinctive and concise; full information appears in the associated Mnemonic 411 background.

5. With a non-standard unit-descriptor

  • IMFWBCAUA.IGBR
  • Current Account Balance [Estimates Start After 2013], (Billions U.s. Dollars)
  • International Monetary Fund (IMF): World Economic Outlook ©2015

Description field:

  • Memo: [Estimates Start After 2013] ... Series in this dataset include both actual and projected periods.
  • Measurement units: (Billions U.s. Dollars) ... This metadata has not been standardized; otherwise it would read (Bil. USD). Annual series are not subject to seasonal effects, so we omit "NSA" as redundant.

III. Design of the metadata

The Description field is limited to 500 characters, and the Source field to 255.

Description field

  • Preamble (followed by colon)
  • General description, with one or more clauses (separated by hyphens or semicolons)
  • List (separated by semicolons)
  • Aggregate (in square brackets)
  • Line number
  • Scale, base measurement, accumulation
  • Adjustment and annualization - NSA, SAAR, CDASA, etc.

Together, the measurement and adjustment comprise the unit-descriptor, separated by a comma and enclosed in parentheses. Annual series are not subject to seasonality, and so omit the adjustment as redundant (except in rare cases where the source specifically reports two distinct annual series, one of them labeled as seasonally adjusted).
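
As a rough illustration of that rule, the sketch below composes a unit-descriptor from its two parts. The function name `unit_descriptor` and its parameters are hypothetical. The `keep_annual_adjustment` flag stands in for the rare case where a source reports a seasonally adjusted annual series.

```python
def unit_descriptor(measurement, adjustment="NSA", annual=False,
                    keep_annual_adjustment=False):
    """Compose '(measurement, adjustment)'.

    Hypothetical helper: for annual series the adjustment is dropped as
    redundant unless keep_annual_adjustment is set.
    """
    if annual and not keep_annual_adjustment:
        return f"({measurement})"
    return f"({measurement}, {adjustment})"


print(unit_descriptor("Mil. USD", "NSA"))                   # quarterly series
print(unit_descriptor("Bil. USD", annual=True))             # annual series
print(unit_descriptor("Bil. USD", "SA", annual=True,
                      keep_annual_adjustment=True))         # rare exception
```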

Bracketed aggregates use the symbols "to" (consecutive range), ";" (nonconsecutive concatenation), and "ex." (excluding). Some sources use "+" and "-" but we consider these symbols to be ambiguous.
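
The aggregate symbols can be read mechanically, as in the following sketch. The function `parse_aggregate` is hypothetical, and the exact placement of "ex." inside the brackets (here assumed to be a form like "[10 to 12; 15 ex. 11]") is an assumption, since the examples above show only the "to" form. Ranges are returned as (start, end) pairs and are not expanded, because the member codes depend on the classification named in the Source field.

```python
def parse_aggregate(text):
    """Split a bracketed aggregate into included and excluded (start, end) pairs.

    Hypothetical helper. Assumes ' to ' marks a range, ';' separates items,
    and ' ex. ' introduces the excluded items.
    """
    body = text.strip().strip("[]")
    included_part, _, excluded_part = body.partition(" ex. ")

    def items(part):
        result = []
        for chunk in part.split(";"):
            chunk = chunk.strip()
            if not chunk:
                continue
            if " to " in chunk:
                start, end = (s.strip() for s in chunk.split(" to ", 1))
                result.append((start, end))
            else:
                result.append((chunk, chunk))  # single code: start == end
        return result

    return {"include": items(included_part), "exclude": items(excluded_part)}


print(parse_aggregate("[05 to 33]"))
print(parse_aggregate("[B to D]"))
print(parse_aggregate("[10 to 12; 15 ex. 11]"))  # made-up aggregate for illustration
```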

Source field

The Source field may contain multiple citations, separated by semicolons. There are three principal situations where we organize multiple citations:

  • The proximate source (data aggregator) from whom we obtain the data, followed by the ultimate source
  • The active source, followed by source(s) we used to extend the time series
  • The historical source, followed by "Moody's Analytics" when we have intervened

Each citation contains one or more of these segments, in this order:

  • Source's name in native language - mandatory
  • Source's preferred abbreviation
  • Name of the statistical release, report, or survey
  • Table number

Followed by zero or more of these segments, each in square brackets:

  • Source's unique identifier
  • Classification or taxonomy - activity, industry, product, etc.
  • Framework - national accounts, BOP, etc.
  • Break in currency, geography, or other definition
  • Copyright, usage restrictions, TOS, EULA, or similar
  • Projected period - by source
  • Known gaps
  • Calculation - by Moody's Analytics
  • Other memo
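
To make the citation layout concrete, here is a minimal Python sketch that separates a Source field into citations and pulls out the bracketed segments. The names `split_citations` and `parse_citation` are hypothetical. The sketch assumes that citations are separated by semicolons outside square brackets and that the source entity ends at the first colon. It does not try to classify what each bracketed segment means.

```python
import re


def split_citations(source):
    """Split a Source field on semicolons that fall outside square brackets."""
    citations, current, depth = [], [], 0
    for ch in source:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth = max(depth - 1, 0)
        if ch == ";" and depth == 0:
            citations.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    citations.append("".join(current).strip())
    return [c for c in citations if c]


def parse_citation(citation):
    """Return the entity, dataset, and bracketed segments of one citation."""
    bracketed = re.findall(r"\[([^\[\]]*)\]", citation)
    text = re.sub(r"\s*\[[^\[\]]*\]", "", citation).strip()
    entity, _, dataset = text.partition(":")
    return {"entity": entity.strip(), "dataset": dataset.strip(),
            "bracketed": bracketed}


source = ("Thomson Reuters: Govt-Bond-Yield-TS - Govt-Bond-Yields; "
          "Moody's Analytics Calculated [USE IS RESTRICTED]")
for citation in split_citations(source):
    print(parse_citation(citation))
```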

Intervention by Moody's Analytics

When Moody's Analytics has been involved in the synthesis of a series (i.e., a historical supplement, historical estimate, or forecast), our name appears in a secondary citation, with one of these fixed phrases:

  • Moody's Analytics Adjusted
  • Moody's Analytics Calculated
  • Moody's Analytics Estimated
  • Moody's Analytics Forecasted

Each phrase subsumes the prior; e.g., if you see "Moody's Analytics Estimated" you can assume that the processing pipeline may also have performed adjustment and calculation.
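
The subsumption rule can be expressed as an ordered list, as in this sketch. The function `possible_interventions` is hypothetical. It returns the kinds of processing that may have been applied, given the fixed phrase found in a Source field, and it relies on simple substring matching.

```python
# Fixed phrases in ascending order; each level subsumes the ones before it.
INTERVENTION_LEVELS = ["Adjusted", "Calculated", "Estimated", "Forecasted"]


def possible_interventions(source):
    """List the processing steps that may apply, based on the highest phrase found."""
    prefix = "Moody's Analytics "
    # Check from the highest level down, so the most inclusive phrase wins.
    for index in range(len(INTERVENTION_LEVELS) - 1, -1, -1):
        if prefix + INTERVENTION_LEVELS[index] in source:
            return INTERVENTION_LEVELS[: index + 1]
    return []  # no intervention phrase present


source = ("Thomson Reuters: Govt-Bond-Yield-TS - Govt-Bond-Yields; "
          "Moody's Analytics Calculated [USE IS RESTRICTED]")
print(possible_interventions(source))
```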

Symbology

  • The generic measurement is "count" with symbol "#".
  • In isolation, "%" means "percent of a whole."
  • Interest rates are per annum, "% p.a.".
  • Percentage rates of change denote the interval with "Q/Q" or similar.
  • Currencies use the ISO 4217 alpha-3 code.
  • Indexes of any kind use the keyword "Index".
  • Locational indexes use the Data Buffet geo code for the reference area.

See the reference hyperlinks below for more comprehensive lists of symbols and abbreviations.

Classifications and frameworks are identified with standard abbreviations (ISIC 4, NACE Rev. 2, 2008 SNA, BPM6, etc.). We do not at this time have a consolidated reference; please check the "new data" article for the dataset.

IV. Caveats

The preceding principles are general guidelines for our metadata, but do not apply universally. There are variations, particularly with U.S. series, legacy series, and series automatically translated from a bulk delivery. We advise against automated parsing of the fields to tokenize them and extract key-value pairs.

Non-ASCII characters (from accented European alphabets, etc.) may not appear correctly.

Our metadata are subject to change. When building a database, we advise against using them as primary keys; that's the function of our mnemonics.
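
One way to follow this advice is sketched below with Python's built-in `sqlite3` module. The table layout and the `upsert_series` helper are assumptions for illustration, not a Moody's Analytics schema. The mnemonic is the primary key, and the Description and Source are ordinary columns that can be overwritten when the metadata change.

```python
import sqlite3

# In-memory database for illustration; the schema is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE series (
        mnemonic    TEXT PRIMARY KEY,  -- stable identifier
        description TEXT,              -- may change over time
        source      TEXT               -- may change over time
    )
    """
)


def upsert_series(conn, mnemonic, description, source):
    """Insert a series, or overwrite its metadata if the mnemonic already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO series (mnemonic, description, source) "
        "VALUES (?, ?, ?)",
        (mnemonic, description, source),
    )


source = "International Monetary Fund (IMF): World Economic Outlook ©2015"
upsert_series(conn, "IMFWBCAUA.IGBR",
              "Current Account Balance [Estimates Start After 2013], "
              "(Billions U.s. Dollars)", source)
# A later, hypothetical revision of the Description updates the same row.
upsert_series(conn, "IMFWBCAUA.IGBR",
              "Current Account Balance [Estimates Start After 2013], "
              "(Bil. USD)", source)

print(conn.execute("SELECT mnemonic, description FROM series").fetchall())
```

Because the key is the mnemonic, the second call replaces the metadata in place instead of creating a second row.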

V. References

Updates

  • 29 May 2015 - Initial version.
  • 9 Feb. 2018 - Geo code: US to IUSA. Example with legal memo. Secondary source citation. "ECCA" retired.