Twitter PowerTrack Operators
The following operators are available to filter the Twitter firehose via PowerTrack. These operators will match specific types of Tweets, and can be combined using standard PowerTrack syntax.
| Operator | Description | ||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
bio_location: |
Matches tweets where the user's bio-level location contains the specified keyword or phrase. This operator performs a tokenized match, similar to the normal keyword rules on the message body.
The user bio location is a non-normalized, user-generated, free-form string.
See Examples |
||||||||||||||||||||||||
friends_count: |
Matches tweets where the author has a friends count (the number of users they follow) that falls within the given range.
If a single number is specified, any number equal to or higher will match.
Additionally, a range can be specified to match any number in the given range.
See Examples |
||||||||||||||||||||||||
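The range behaviour described for friends_count: (and shared by statuses_count:, listed_count:, and followers_count:) can be modelled with a small Python sketch. The helper name and its arguments are hypothetical and only illustrate the matching logic; they are not PowerTrack rule syntax, and treating the upper bound as inclusive is an assumption.

```python
from typing import Optional


def count_matches(actual: int, lower: int, upper: Optional[int] = None) -> bool:
    """Hypothetical model of the count-operator semantics described above."""
    if upper is None:
        # Single number: any count equal to or higher matches.
        return actual >= lower
    # Range: assumed here to be inclusive at both ends.
    return lower <= actual <= upper


print(count_matches(1500, 1000))        # True
print(count_matches(999, 1000))         # False
print(count_matches(1500, 1000, 2000))  # True
```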
url_contains: |
Matches activities with URLs that literally contain the given phrase or keyword. To search for patterns with punctuation in them (e.g. google.com), enclose the search term in quotes.
NOTE: If you're using Gnip's Enriched output format, we will match against Gnip's expanded URL as well.
See Examples |
||||||||||||||||||||||||
twitter_lang: |
Matches tweets that have been classified by Twitter as being of a particular language (if, and only if, the tweet has been classified). It is important to note that each activity is currently only classified as being of one language, so AND'ing together multiple languages will yield no results.
**Note:** if no language classification can be made the provided result is 'und' (for undefined).
The list below represents the currently supported languages and their corresponding BCP 47 language identifiers:
See Examples |
||||||||||||||||||||||||
retweets_of_status_id: |
Deliver only explicit retweets of the specified Tweet.
Note that the status ID used should be the ID of an original tweet and not a retweet. If extracting the ID of an original Tweet from within a Retweet for this purpose, look in the object.id field in Activity Streams format.
See Examples |
||||||||||||||||||||||||
place_contains: |
Matches tweets where the tagged place/location contains a given substring.
Place names are semi-normalized by the Twitter application, but there can be many variations. A substring match allows you to easily match across variations.
See Examples |
||||||||||||||||||||||||
has:media |
Matches Tweets that contain a media URL classified by Twitter, e.g. pic.twitter.com.
WARNING: Use this operator with care. Used by itself, with no other limiting clauses, it can generate a large volume of data. Currently, this will deliver double-digit percentages of the firehose when used by itself.
See Examples |
||||||||||||||||||||||||
in_reply_to_status_id: |
Deliver only explicit replies to the specified Tweet.
See Examples |
||||||||||||||||||||||||
statuses_count: |
Matches tweets where the author has posted a number of statuses that falls within the given range.
If a single number is specified, any number equal to or higher will match.
Additionally, a range can be specified to match any number in the given range.
See Examples |
||||||||||||||||||||||||
contains: |
Substring match for activities that have the given substring in the body, regardless of tokenization. In other words, this does a pure substring match, and does not consider word boundaries.
Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
sample: |
Returns a random sample of activities that match a rule rather than the entire set of activities. Sample percent must be represented by an integer value between 1 and 100. This operator applies to the entire rule and requires any "OR'd" terms be grouped.
**Important Note:** The sample operator first reduces the firehose to X%, and the rest of the rule is then applied to that reduced stream. Each Tweet in the firehose individually has a 10% chance of being in a 10% sample, a 1% chance of being in a 1% sample, a 50% chance of being in a 50% sample, and so on. In other words, sampling happens before the rest of the rule is evaluated.
Also, the sampling is deterministic, and you will get the same data sample in realtime as you would if you pulled the data historically.
See Examples |
||||||||||||||||||||||||
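The order of operations described for sample: (sample first, then apply the rest of the rule) can be sketched as follows. The hash-based bucketing is purely an assumption, chosen to show how a deterministic sample could return the same Tweets in realtime and historically; it is not Gnip's actual implementation, and the function names and the `id`/`text` fields are hypothetical.

```python
import hashlib


def in_sample(tweet_id: str, percent: int) -> bool:
    """Deterministically decide whether a Tweet falls in an X% sample (hypothetical scheme)."""
    if not 1 <= percent <= 100:
        raise ValueError("sample percent must be an integer between 1 and 100")
    digest = hashlib.sha256(tweet_id.encode("utf-8")).hexdigest()
    # The same ID always lands in the same bucket, so the sample is repeatable.
    return int(digest, 16) % 100 < percent


def deliver(tweets, percent, rule):
    """Reduce the stream to the sample first, then apply the rest of the rule."""
    sampled = [t for t in tweets if in_sample(t["id"], percent)]
    return [t for t in sampled if rule(t)]


tweets = [{"id": str(i), "text": "snow" if i % 2 else "rain"} for i in range(1000)]
mentions_snow = lambda t: "snow" in t["text"]

# Running the same sample twice yields the same Tweets.
print(deliver(tweets, 10, mentions_snow) == deliver(tweets, 10, mentions_snow))  # True
```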
has:profile_geo |
Matches tweets that have any [Profile Geo](http://support.gnip.com/enrichments/profile_geo.html) metadata, regardless of the actual value.
See Examples |
||||||||||||||||||||||||
has:geo |
Matches Tweets that have Tweet-specific geo location data provided from Twitter. This can be either "geo" lat-long coordinate, or a "location" in the form of a Twitter ["Place"](https://dev.twitter.com/overview/api/places), with corresponding display name, geo polygon, and other fields.
WARNING: Use this operator with care; it can generate a large volume of data. Currently, this will deliver 1-4% of the firehose independently.
See Examples |
||||||||||||||||||||||||
profile_locality: |
Matches on the "locality" field from the "address" object in the Profile Geo enrichment.
This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
bio_location_contains: |
Matches Tweets where the user's bio-level location contains the specified substring.
The user bio location is a non-normalized, user-generated, free-form string.
**Warning**: Broad or common location strings can result in the consumption of large volumes of data. For example, a bio_location_contains:"MA" rule intended to match all tweets from Massachusetts will also match "Alabama". Adding punctuation (e.g. ", MA" or ",MA") can help limit this data.
See Examples |
||||||||||||||||||||||||
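The "MA" pitfall described for bio_location_contains: is easy to reproduce with a plain substring test. This is only a sketch of the matching idea: the function name is hypothetical, and case-insensitive comparison is an assumption inferred from the "MA"/"Alabama" example.

```python
def bio_location_contains(bio_location: str, substring: str) -> bool:
    """Hypothetical model: case-insensitive substring test on the free-form bio location."""
    return substring.lower() in bio_location.lower()


print(bio_location_contains("Mobile, Alabama", "MA"))    # True (unintended match)
print(bio_location_contains("Boston, MA", ", MA"))       # True
print(bio_location_contains("Mobile, Alabama", ", MA"))  # False
```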
retweets_of: |
Matches tweets that are retweets of a specified user. Accepts both usernames and numeric Twitter Account IDs (NOT tweet status IDs).
See HERE or HERE for methods for looking up numeric Twitter Account IDs.
See Examples |
||||||||||||||||||||||||
"keyword1 keyword2"~N |
Commonly referred to as a proximity operator, this matches an activity where the keywords are no more than N tokens from each other.
If the keywords are in the opposite order, they cannot be more than N-2 tokens from each other.
Can have any number of keywords in quotes.
N cannot be greater than 6.
See Examples |
||||||||||||||||||||||||
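One possible reading of the proximity rules above, for the two-keyword case, is sketched below. Several details are assumptions: distance is taken as the difference between token positions, tokens are lower-cased runs of word characters, and the function name is hypothetical. The real operator may count tokens differently.

```python
import re


def proximity_match(text: str, first: str, second: str, n: int) -> bool:
    """Hypothetical model of "first second"~N for two keywords."""
    if n > 6:
        raise ValueError("N cannot be greater than 6")
    tokens = re.findall(r"\w+", text.lower())
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    for i in first_pos:
        for j in second_pos:
            if i < j and j - i <= n:
                return True  # same order as in the rule: within N tokens
            if j < i and i - j <= n - 2:
                return True  # opposite order: within N-2 tokens
    return False


print(proximity_match("love this snow so much", "love", "snow", 2))  # True
print(proximity_match("snow I love", "love", "snow", 2))             # False
print(proximity_match("snow I love", "love", "snow", 4))             # True
```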
is:verified |
Deliver only Tweets where the author is "verified" by Twitter. Can also be negated to exclude Tweets where the author is verified.
See Examples |
||||||||||||||||||||||||
bounding_box:[west_long south_lat east_long north_lat] |
Matches against the Exact Location (x,y) of the Activity when present, and in Twitter, against a "Place" geo polygon, where the Place is fully contained within the defined region.
- west_long and south_lat represent the southwest corner of the bounding box, where west_long is the longitude of that point and south_lat is the latitude.
- east_long and north_lat represent the northeast corner of the bounding box, where east_long is the longitude of that point and north_lat is the latitude.
- Width and height of the bounding box must be less than 25mi.
- Longitude is in the range of ±180.
- Latitude is in the range of ±90.
- All coordinates are in decimal degrees.
- Rule arguments are contained within brackets, space delimited.
See Examples |
||||||||||||||||||||||||
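The bounding_box: constraints listed above can be checked before a rule is submitted. The following is a minimal sketch: the helper names are hypothetical, and measuring width along the southern edge and height along the western edge with a haversine distance is an assumption about how the 25mi limit might be evaluated.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def haversine_mi(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def bounding_box_problems(west_long, south_lat, east_long, north_lat):
    """Return a list of constraint violations (empty if none were found)."""
    problems = []
    if not all(-180 <= lon <= 180 for lon in (west_long, east_long)):
        problems.append("longitude out of range")
    if not all(-90 <= lat <= 90 for lat in (south_lat, north_lat)):
        problems.append("latitude out of range")
    if problems:
        return problems
    width = haversine_mi(west_long, south_lat, east_long, south_lat)
    height = haversine_mi(west_long, south_lat, west_long, north_lat)
    if width >= 25 or height >= 25:
        problems.append("width and height must be less than 25mi")
    return problems


def bounding_box_rule(west_long, south_lat, east_long, north_lat):
    """Format the operator: arguments inside brackets, space delimited."""
    return f"bounding_box:[{west_long} {south_lat} {east_long} {north_lat}]"


print(bounding_box_problems(-105.3, 39.96, -105.18, 40.09))  # []
print(bounding_box_rule(-105.3, 39.96, -105.18, 40.09))
```

The coordinates in the usage lines are illustrative values only.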
bio_name_contains: |
Matches tweets where the user's display name (not username), as specified in their bio, contains a given substring.
See Examples |
||||||||||||||||||||||||
profile_point_radius:[long lat radius] |
Works like the standard point_radius: operator, but applies only to geo-location data contained in the Profile Geo enrichment.
See Examples |
||||||||||||||||||||||||
bio_contains: |
Matches tweets whose author's Twitter bio contains the given substring. To search for patterns with punctuation in them (e.g. start-up), enclose the search term in quotes.
See Examples |
||||||||||||||||||||||||
profile_subregion_contains: |
Matches on the "subRegion" field from the "address" object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region.
This is a substring match against the "subRegion" field, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
has:lang |
Matches activities which Gnip has classified as any language.
See Examples |
||||||||||||||||||||||||
"exact phrase match" |
Matches an exact phrase within the body of an activity. This is an exact match, and it is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two".
Note that this is not a substring match, and includes a check for word boundaries at the ends of the quoted phrase. For a pure substring match, see the contains: operator.
See Examples |
||||||||||||||||||||||||
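The difference between a quoted exact phrase and the contains: operator comes down to the word-boundary check. The sketch below illustrates that distinction only; both function names are hypothetical, the regular-expression approach is not how PowerTrack is known to be implemented, and case-insensitive matching is an assumption.

```python
import re


def phrase_match(body: str, phrase: str) -> bool:
    """Exact phrase with a word-boundary check at both ends of the phrase."""
    # re.escape means characters such as "/" need no manual escaping by the caller.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, body, flags=re.IGNORECASE) is not None


def contains_match(body: str, substring: str) -> bool:
    """Pure substring test that ignores word boundaries."""
    return substring.lower() in body.lower()


print(phrase_match("call me one/two times", "one/two"))  # True
print(phrase_match("everyone/twofold", "one/two"))       # False
print(contains_match("everyone/twofold", "one/two"))     # True
```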
has:links |
This operator matches activities which contain links in the message body.
See Examples |
||||||||||||||||||||||||
place: |
Matches tweets tagged with the specified location *or* Twitter place ID (see examples). Multi-word place names (“New York City”, “Palo Alto”) should be enclosed in quotes.
**Note:** See the [GET geo/search](https://dev.twitter.com/rest/reference/get/geo/search) public API endpoint for how to obtain Twitter place IDs.
See Examples |
||||||||||||||||||||||||
has:profile_geo_region |
Matches all activities that have a profileLocations.address.region value present in the payload.
See Examples |
||||||||||||||||||||||||
listed_count: |
Matches tweets where the author has been listed within Twitter a number of times that falls within the given range.
If a single number is specified, any number equal to or higher will match.
Additionally, a range can be specified to match any number in the given range.
See Examples |
||||||||||||||||||||||||
is:retweet |
Deliver only explicit retweets that match a rule. Can also be negated to exclude retweets that match a rule, so that only original content is delivered.
**Note:** This operator looks only for true Retweets, which use Twitter's retweet functionality. Quoted Tweets and Modified Tweets which do not use Twitter's retweet functionality will not be matched by this operator.
|
||||||||||||||||||||||||
has:hashtags |
Matches Tweets that contain a hashtag.
WARNING: Use this operator with care. Used by itself, with no other limiting clauses, it can generate a large volume of data. Currently, this will deliver double-digit percentages of the firehose when used by itself.
See Examples |
||||||||||||||||||||||||
point_radius:[lon lat radius] |
Matches against the Exact Location (x,y) of the Activity when present, and in Twitter, against a "Place" geo polygon, where the Place is fully contained within the defined region.
- Units of radius supported are miles (mi) and kilometers (km).
- Radius must be less than 25mi.
- Longitude is in the range of ±180
- Latitude is in the range of ±90
- All coordinates are in decimal degrees.
- Rule arguments are contained within brackets, space delimited.
See Examples |
||||||||||||||||||||||||
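The radius limit for point_radius: applies whether the radius is given in miles or kilometers, so a unit conversion is needed before checking it. The sketch below assumes the radius is written as a number immediately followed by its unit (for example 10mi or 15km); that notation and the helper names are assumptions, not confirmed syntax.

```python
import re

KM_PER_MILE = 1.609344


def radius_in_miles(radius: str) -> float:
    """Convert a radius such as '10mi' or '15km' to miles (assumed notation)."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)(mi|km)", radius)
    if m is None:
        raise ValueError("radius must be a number followed by mi or km")
    value, unit = float(m.group(1)), m.group(2)
    return value if unit == "mi" else value / KM_PER_MILE


def point_radius_rule(lon: float, lat: float, radius: str) -> str:
    """Validate the documented constraints and format the operator."""
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be in the range of ±180")
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be in the range of ±90")
    if radius_in_miles(radius) >= 25:
        raise ValueError("radius must be less than 25mi")
    # Arguments go inside brackets, space delimited.
    return f"point_radius:[{lon} {lat} {radius}]"


print(point_radius_rule(-105.27, 40.02, "10mi"))  # point_radius:[-105.27 40.02 10mi]
```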
to: |
Matches any activity that is in reply to a particular user.
The to: operator returns a subset match of the @mention operator.
The value must be the user’s numeric Account ID or username (excluding the @ character). See HERE for methods for looking up numeric Twitter Account IDs.
See Examples |
||||||||||||||||||||||||
has:profile_geo_subregion |
Matches all activities that have a profileLocations.address.subRegion value present in the payload.
See Examples |
||||||||||||||||||||||||
has:mentions |
Matches Tweets that mention another Twitter user.
WARNING: Use this operator with care. Used by itself, with no other limiting clauses, it can generate a large volume of data. Currently, this will deliver double-digit percentages of the firehose when used by itself.
See Examples |
||||||||||||||||||||||||
profile_subregion: |
Matches on the "subRegion" field from the "address" object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region.
This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
profile_locality_contains: |
Matches on the "locality" field from the "address" object in the Profile Geo enrichment.
This is a substring match against the "locality" field, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
profile_bounding_box:[west_long south_lat east_long north_lat] |
Works like the standard bounding_box: operator, but applies only to geo-location data contained in the Profile Geo enrichment.
See Examples |
||||||||||||||||||||||||
profile_country_code: |
Exact match on the "countryCode" field from the "address" object in the Profile Geo enrichment.
Uses a normalized set of two-letter country codes, based on the ISO 3166-1 alpha-2 specification. For conciseness, this operator is provided in lieu of an operator for the "country" field from the "address" object.
See Examples |
||||||||||||||||||||||||
profile_region: |
Matches on the "region" field from the "address" object in the Profile Geo enrichment.
This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
# |
Matches any activity with the given hashtag.
This operator performs an exact match, NOT a tokenized match, meaning the rule "2016" will match posts with the exact hashtag "2016", but not those with the hashtag "2016election".
Note that the hashtag operator relies on Twitter's entity extraction to match hashtags, rather than extracting the hashtag from the body itself. The description of how Twitter extracts entities can be found here: http://dev.twitter.com/pages/tweet_entities.
See Examples |
||||||||||||||||||||||||
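The exact-match behaviour of the hashtag operator can be sketched as a comparison against the list of hashtags that Twitter's entity extraction has already produced. The function name and the plain list of hashtag strings are hypothetical simplifications, and case-insensitive comparison is an assumption.

```python
def hashtag_matches(rule_hashtag: str, tweet_hashtags: list) -> bool:
    """Whole-hashtag comparison against already-extracted hashtag entities."""
    wanted = rule_hashtag.lstrip("#").lower()
    return any(tag.lstrip("#").lower() == wanted for tag in tweet_hashtags)


print(hashtag_matches("#2016", ["2016"]))          # True
print(hashtag_matches("#2016", ["2016election"]))  # False
```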
followers_count: |
Matches tweets where the author has a followers count within the given range.
If a single number is specified, any number equal to or higher will match.
Additionally, a range can be specified to match any number in the given range.
See Examples |
||||||||||||||||||||||||
@ |
Matches any Tweet that mentions the given username or user ID.
The to: operator returns a subset match of the @mention operator.
Note that the mention operator relies on Twitter's entity extraction to match mentions, rather than trying to extract the mention from the body itself. The description of how Twitter extracts entities can be found here: http://dev.twitter.com/pages/tweet_entities.
See Examples |
||||||||||||||||||||||||
source: |
Matches any tweet generated by the given source application. The value must be either the name of the application, or the application's URL. Cannot be used alone.
See Examples |
||||||||||||||||||||||||
bio_lang: |
Matches tweets where the user's bio-level language setting matches a given ISO 639-1 language code. Twitter does not support all languages in this list.
NOTE: This language setting simply changes the language in which Twitter displays its UI text (it does not translate Tweet text). THIS IS NOT A LANGUAGE CLASSIFICATION. Customers have reported that this setting is often left at its default of English even when the Tweets an account is generating are in another language.
We recommend using it in conjunction with Gnip’s language classification operator (lang) rather than as a standalone indicator of a user's or Tweet’s language.
See Examples |
||||||||||||||||||||||||
time_zone: |
Matches tweets where the user-selected time zone specified in a user's profile settings matches a given string.
These values are normalized to the options specified on a user's account settings page: https://twitter.com/account/settings
See Examples |
||||||||||||||||||||||||
from: |
Matches any activity from a specific user.
In Twitter, the value must be the user’s Twitter Account ID or username (excluding the @ character). See HERE or HERE for methods for looking up numeric Twitter Account IDs.
For some publishers, MD5-hashed email can be used.
See Examples |
||||||||||||||||||||||||
lang: |
Matches activities that have been classified by Gnip as being of a particular language (if, and only if, the activity has been classified). Current languages supported are:
See Examples |
||||||||||||||||||||||||
keyword |
Matches a keyword within the body of an activity. This is a tokenized match, meaning that your keyword string will be matched against the tokenized text of the activity body. Tokenization is based on punctuation, symbol, and separator Unicode basic plane characters. For example, an activity with the text "I like coca-cola" would be split into the following tokens: I, like, coca, cola. These tokens would then be compared to the keyword string used in your rule. To match strings containing punctuation (e.g. coca-cola), symbol, or separator characters, you must use a quoted exact phrase match.
See Examples |
||||||||||||||||||||||||
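The tokenization described for keyword matching can be approximated with Python's Unicode category data. This is a rough sketch rather than the actual tokenizer: the function names are hypothetical, splitting on general categories P*, S*, and Z* (plus whitespace) is an interpretation of "punctuation, symbol, and separator" characters, and case-insensitive comparison is an assumption.

```python
import unicodedata


def tokenize(text: str) -> list:
    """Split text on Unicode punctuation, symbol, and separator characters."""
    tokens, current = [], []
    for ch in text:
        if unicodedata.category(ch)[0] in "PSZ" or ch.isspace():
            # A delimiter ends the current token, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def keyword_match(text: str, keyword: str) -> bool:
    """A keyword matches only if it equals a whole token."""
    return keyword.lower() in (t.lower() for t in tokenize(text))


print(tokenize("I like coca-cola"))                    # ['I', 'like', 'coca', 'cola']
print(keyword_match("I like coca-cola", "coca"))       # True
print(keyword_match("I like coca-cola", "coca-cola"))  # False
```

The last line shows why a keyword containing punctuation never matches: the hyphen is consumed as a delimiter during tokenization.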
profile_region_contains: |
Matches on the "region" field from the "address" object in the Profile Geo enrichment.
This is a substring match against the "region" field, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
See Examples |
||||||||||||||||||||||||
has:profile_geo_locality |
Matches all activities that have a profileLocations.address.locality value present in the payload.
See Examples |
||||||||||||||||||||||||
country_code: |
Matches tweets where the country code associated with a tagged [place/location](https://dev.twitter.com/overview/api/places) matches the given ISO alpha-2 character code.
Valid ISO codes can be found here: [http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2](http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
See Examples |