Demographic Definitions 

The Audience API supports the following user demographic types. A demographic type may be either single-valued (user.gender: male or female) or multivalued, meaning that a user may have more than one demographic value associated with their profile (user.language: English and/or Chinese).

Please note:

  • Some model taxonomies are marked with a beta tag, due to expected future improvements.
Demographic Type Group-By Label Definition
Age user.age

Age categorization is based on a number of signals, including Twitter users’ self-declared birthdays, along with the accounts each user follows and the content they engage with.

Multivalued: Yes

Example values:

  • 13 to 24
  • 21 to 34
  • 25 to 49
  • 35 to 54
  • 18+
  • 21+
  • 35+
  • 50+
Gender user.gender

Gender classification is based on information our users share as they use Twitter, including profile names and accounts a user follows.

Multivalued: No

Example values:

  • Female
  • Male
Language user.language

We derive a user’s language from a number of different sources, including the language selected in a user’s profile settings and the languages that correspond to a user’s activity on Twitter.

Multivalued: Yes

Example values:

  • Chinese
  • English
Interests user.interest

We identify Twitter users’ interests using a proprietary algorithm that is based on a variety of signals, such as who a user follows, the content they Tweet and engage with on Twitter, and their Twitter bio.

Interests are reported at two different levels:

  1. A top-level interest family (e.g., Books and literature)
  2. A sub-interest within a family (e.g., Books and literature/Biographies and memoirs).

Note: A ‘/’ character separates the top-level interest from the sub-interest.

Multivalued: Yes

Example values:

  • Books and literature
  • Books and literature/Biographies and memoirs
  • Music and radio
  • Music and radio/Alternative
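
As an illustration of the two reporting levels, the following minimal sketch separates an interest value into its top-level family and optional sub-interest using the ‘/’ separator described above. The function name split_interest is hypothetical and is not part of the Audience API; the sketch assumes the first ‘/’ in a value is the separator.

```python
from typing import Optional, Tuple


def split_interest(value: str) -> Tuple[str, Optional[str]]:
    """Split a user.interest value into (family, sub_interest).

    Assumption: the first '/' separates the top-level interest family
    from the sub-interest. A value with no '/' is a top-level family only.
    """
    family, separator, sub_interest = value.partition("/")
    if not separator:
        return family.strip(), None
    return family.strip(), sub_interest.strip()


if __name__ == "__main__":
    for example in (
        "Books and literature",
        "Books and literature/Biographies and memoirs",
        "Music and radio/Alternative",
    ):
        print(split_interest(example))
```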
TV Genre (beta) user.tv.genre

TV interests at the genre level.

TV interests are based on the contents of a user’s Tweets, as well as engagement patterns. We employ a combination of machine learning and human curation to create a taxonomy of conversation around TV programming, with which we classify individual users and their conversations.

Multivalued: Yes

Example values:

  • Sci Fi
  • Sports Talk
TV Show (beta) user.tv.show

TV interests at the show level.

TV interests are based on the contents of a user’s Tweets, as well as engagement patterns. We employ a combination of machine learning and human curation to create a taxonomy of conversation around TV programming, with which we classify individual users and their conversations.

Multivalued: Yes

Example values:

  • Silicon Valley
  • American Idol

Note: We track TV shows across 20 markets: Argentina, Australia, Brazil, Canada, Chile, Colombia, France, Germany, India, Ireland, Italy, Japan, Mexico, Netherlands, South Africa, Spain, UK, US (English & Hispanic programming), Venezuela.

Location: Country user.location.country

User location categorized at the country level, classified according to ISO 3166-1.

Location on Twitter is based on current location as well as recent location history. Twitter uses several signals to determine location, such as web IP address, mobile GPS signal, mobile wi-fi signal, and real-time signals (such as the location a user includes in their Tweets) when available.

Multivalued: No

Example values:

  • Canada
  • United States
Location: Region user.location.region

User location categorized at the state/province level.

Multivalued: No

Example values:

  • Roma / Rome, IT
  • Wisconsin
Location: Metro user.location.metro

User location categorized at the metro/DMA level.

Multivalued: No

Example values:

  • Chicago, IL, US
  • San Francisco-Oakland-San Jose CA, US
Wireless Device Category user.device.os

Device category information is derived in aggregate from information collected via the Twitter mobile application. Categorization is device-specific (e.g., iOS devices, Android devices, etc.) but does not include operating system level details.

Multivalued: Yes

Example values:

  • Android devices
  • Blackberry phones and tablets
  • Desktop and laptop computers
  • iOS devices
  • Mobile web on other devices
Wireless Device Network user.device.network

Wireless carrier information is derived in aggregate from information collected via the Twitter mobile application.

Multivalued: Yes

Example values:

  • Sprint
  • Verizon Wireless

Understanding Audience Insights 

To retrieve insights for an audience, you define one or more groupings in your query, such as:

{
 "groupings": {
  "country_by_gender": {
   "group_by": [
    "user.gender",
    "user.location.country"
   ]
  }
 }
}

The way you define your grouping will dictate how aggregated demographic results are returned. Your grouping may have up to two demographic types, which will be evaluated in the order defined. In the above example, query results will first be grouped by gender, and then for each gender category value, the results will be grouped by country. The percentages returned represent the percentage of people in an audience with the reported demographic characteristics.
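
The grouping rules above can be sketched in Python. This is a minimal illustration, not an official client: the helper build_groupings and the constant MAX_TYPES_PER_GROUPING are hypothetical names, and the only rule enforced is the one stated in the text (a grouping may have up to two demographic types, evaluated in the order given).

```python
import json

# From the text above: a grouping may have up to two demographic types.
MAX_TYPES_PER_GROUPING = 2


def build_groupings(groupings):
    """Wrap {grouping_name: [group-by labels]} in a "groupings" object.

    The order of labels is preserved, because results are grouped by the
    first demographic type and then by the second.
    """
    body = {}
    for name, labels in groupings.items():
        if not 1 <= len(labels) <= MAX_TYPES_PER_GROUPING:
            raise ValueError(
                f"grouping {name!r} must have 1 to "
                f"{MAX_TYPES_PER_GROUPING} demographic types, got {len(labels)}"
            )
        body[name] = {"group_by": list(labels)}
    return {"groupings": body}


payload = build_groupings(
    {"country_by_gender": ["user.gender", "user.location.country"]}
)

if __name__ == "__main__":
    print(json.dumps(payload, indent=1))
```

Swapping the two labels would group by country first and then by gender within each country.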

Note: A grouping may return aggregate demographic results that do not sum to 100%. This happens because results are rounded to two decimal places and because only category values that exceed the minimum reporting threshold are returned. Because we add noise to the audience results, the rolled-up values for one-dimensional and two-dimensional queries may also differ slightly. If a grouping contains at least one multivalued demographic type (e.g., user.language or user.interest), the reported values may total more than 100%, because a user in the audience may have multiple category values within that demographic model.
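
To make the note concrete, here is a small sketch that checks whether a grouping's total can exceed 100%. The set of multivalued labels is taken from the definitions table above; the helper names and the percentages are made up for illustration and are not real API output.

```python
# Labels marked "Multivalued: Yes" in the demographic definitions above.
MULTIVALUED_TYPES = {
    "user.age",
    "user.language",
    "user.interest",
    "user.tv.genre",
    "user.tv.show",
    "user.device.os",
    "user.device.network",
}


def total_may_exceed_100(group_by):
    """True if any demographic type in the grouping is multivalued."""
    return any(label in MULTIVALUED_TYPES for label in group_by)


def total_percentage(results):
    """Sum the reported percentages for one grouping (hypothetical values)."""
    return round(sum(results.values()), 2)


if __name__ == "__main__":
    # Illustrative numbers only: some users are counted under both languages.
    language_results = {"English": 92.4, "Chinese": 11.35}
    print(total_may_exceed_100(["user.language"]))
    print(total_percentage(language_results))
```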


Reporting Thresholds 

The maximum value the Audience API will return for an audience characteristic is 100%. The minimum value we will report is 0.1%, and a demographic category must also contain at least 75 users. All results are displayed as percentages, rounded to the nearest hundredth of a percent (e.g., 7.27).
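
The thresholds can be expressed as a simple check. This is a sketch under stated assumptions: it treats both conditions (at least 0.1% and at least 75 users) as required, and it ignores the noise the API adds to results. The function name reportable_percentage is hypothetical, and the actual server-side logic is not documented here.

```python
from typing import Optional

MIN_PERCENT = 0.1  # minimum reported value, in percent
MIN_USERS = 75     # minimum users within a demographic category


def reportable_percentage(category_users: int, audience_size: int) -> Optional[float]:
    """Return the percentage rounded to two decimals, or None if the
    category falls below either reporting threshold.

    Assumption: both thresholds must be met; noise is not modeled.
    """
    if audience_size <= 0:
        raise ValueError("audience_size must be positive")
    percent = 100.0 * category_users / audience_size
    if category_users < MIN_USERS or percent < MIN_PERCENT:
        return None
    return round(percent, 2)


if __name__ == "__main__":
    print(reportable_percentage(727, 10_000))    # meets both thresholds
    print(reportable_percentage(50, 1_000))      # fewer than 75 users
    print(reportable_percentage(80, 1_000_000))  # below 0.1%
```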

In cases where none of the results satisfy the minimum threshold, you will receive the error message “Unable to return insights due to reporting threshold.”