PowerTrack Rules


Introduction  

Products utilizing PowerTrack rules deliver social data to you based on filtering rules you set up. Rules are made up of one or more ‘clauses’, where a clause is a keyword, exact phrase, or one of the many PowerTrack Operators. Before beginning to build PowerTrack rules, be sure to review the syntax described below, look through the list of available operators, and understand the restrictions around building rules. You should also be sure to understand the nuances of how rules are evaluated logically, in the ‘order of operations’ section.

Multiple clauses can be combined with both ‘and’ and ‘or’ logic. ‘And’ logic is specified with a space between clauses, and ‘or’ logic is specified with an upper-case OR. See below for more details…

Realtime and Historical PowerTrack (as well as Replay) support two forms of rules:

  • ‘Standard’ rules can be up to 1,024 characters long and contain up to 30 positive clauses (things you want to match or filter on) and 50 negative clauses (things you want to exclude and not match on).
  • ‘Long’ rules can be up to 2,048 characters long with no limits on the number of positive and negative clauses.

     Reach out to your account representative to switch the form of rules your stream uses.
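
As a rough illustration of the length limits above, the following sketch checks a rule string against the character limit for each form. The function name `fits_rule_form` is hypothetical, and the sketch assumes the limit is counted in characters as Python's `len` sees them; it does not attempt to count positive or negative clauses.

```python
STANDARD_MAX_CHARS = 1024  # 'standard' rule limit stated above
LONG_MAX_CHARS = 2048      # 'long' rule limit stated above


def fits_rule_form(rule: str, form: str = "standard") -> bool:
    """Return True if the rule's character count fits the given form.

    Assumption: the limit applies to the character count of the rule text.
    Clause counts (30 positive / 50 negative for standard rules) are not checked.
    """
    limits = {"standard": STANDARD_MAX_CHARS, "long": LONG_MAX_CHARS}
    if form not in limits:
        raise ValueError(f"unknown rule form: {form!r}")
    return len(rule) <= limits[form]


print(fits_rule_form("happy party"))
print(fits_rule_form("a" * 1500, form="long"))
```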

Building Rules with PowerTrack  

Keyword match

Keyword matches are similar to queries in a search interface (e.g. Google). For example, the following PowerTrack rule would match activities with ‘happy’ in the text body.

 happy

ANDing terms with white space

Adding another keyword is the same as adding another requirement for finding matches. For example, this rule would only match activities where both ‘happy’ and ‘party’ were present in the text, in either order – having a space between terms operates as boolean AND logic. Note that if you include an explicit AND it will be treated as a regular keyword.

 happy party

ORing terms with upper-case OR

Many situations actually call for boolean OR logic, however. This is easily accomplished as well. Note that the OR operator must be upper-case and a lower-case ‘or’ will be treated as a regular keyword.

 happy OR party

Negating terms

Still other scenarios might call for excluding results with certain keywords (a boolean NOT logic). For instance, activities with ‘happy’, but excluding any with ‘birthday’ in the text.

 happy -birthday

Grouping with parentheses

These types of logic can be combined using grouping with parentheses, and expanded to much more complex queries.

 (happy OR party) (holiday OR house) -(birthday OR democratic OR republican)

This is just the beginning though – while the above examples rely simply on tokenized matching for keywords, PowerTrack also offers operators to perform different types of matching on the text.

Exact match

 "happy birthday"

Substring match

 contains:day

Proximity match

 "happy birthday"~3

Further, other operators allow you to filter on unique aspects of social data, besides just the text. For example:

The user who is posting a Tweet

 from:user

Geo-tagged Tweets within 10 miles of Pearl St. in Boulder, CO

 point_radius:[-105.27346517 40.01924738 10.0mi]

Putting it all together

These can be combined with text filters using the same types of logic described above.

 (happy OR party) (holiday OR house OR "new year's eve") point_radius:[-105.27346517 40.01924738 10.0mi] lang:en -(birthday OR democratic OR republican)

Using various operators:

 (gnip OR from:688583 OR @gnip) ("powertrack operators" OR "streaming code"~4) contains:help bio_contains:developer has:links url_contains:github source:web (friends_count:1 OR followers_count:2000 OR listed_count:500 OR statuses_count:1000 OR is:verified OR klout_score:50) (country_code:US OR bio_location:CO OR bio_location_contains:Boulder OR time_zone:"Mountain Time (US & Canada)") -is:retweet (lang:en OR twitter_lang:en)

Also see below for converting this to JSON.
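
The combinations described in this section can also be assembled programmatically. The sketch below is one possible way to do that with plain string helpers; the names `all_of`, `any_of`, `negate`, and `phrase` are hypothetical and are not part of any PowerTrack client library. The `phrase` helper does not escape double quotes inside the phrase.

```python
def all_of(*clauses: str) -> str:
    """AND logic: clauses separated by a single space."""
    return " ".join(clauses)


def any_of(*clauses: str) -> str:
    """OR logic: clauses joined by upper-case OR, grouped in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def negate(clause: str) -> str:
    """NOT logic: a '-' immediately in front of a clause or group."""
    return "-" + clause


def phrase(text: str) -> str:
    """Exact match: the phrase wrapped in double quotes (no escaping done)."""
    return '"' + text + '"'


rule = all_of(
    any_of("happy", "party"),
    any_of("holiday", "house", phrase("new year's eve")),
    "lang:en",
    negate(any_of("birthday", "democratic", "republican")),
)
print(rule)
```

Always wrapping OR groups in parentheses keeps the result independent of the order of operations described later in this document.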


Boolean Syntax 

The examples in the previous section utilized various types of boolean logic and grouping. See the table below for additional detail regarding the syntax and requirements for each.

Logic Type: AND
PowerTrack Syntax: social data
Description: Whitespace between two operators results in AND logic between them.

Matches activities containing BOTH keywords ('social', 'data').

Do NOT use AND explicitly in your rule. Only use whitespace. An explicit AND will be treated like a regular keyword.

Logic Type: OR
PowerTrack Syntax: social OR data
Description: To OR together two operators, insert an all-caps OR, enclosed in whitespace, between them.

Matches activities with EITHER keyword ('social' OR 'data').

Note that if you combine OR and AND functionality in a single rule, you should understand the order of operations described in the Order of Operations section, and consider grouping operators together using parentheses as described below to ensure your rule behaves as expected.

You must use upper-case 'OR' in your rule. Lower-case 'or' will be treated as a regular keyword.

Logic Type: NOT
PowerTrack Syntax:
 social -data
 apple -(fruit OR orange)
 apple -(android phone)
Description: Insert a - character immediately in front of the operator or group of operators.

The first example rule shown matches activities containing the keyword 'social', but excludes those which contain the keyword 'data'.

Negated ORs are not allowed where the rule would request "everything in the firehose except the negation." E.g., apple OR -ipad is invalid because it would match all activities except those mentioning 'ipad'.

Logic Type: Grouping
PowerTrack Syntax: (social OR data) -(gnop OR ping)
Description: Parentheses around multiple operators create a functional "group".

Groups can be connected to clauses in the same manner as an individual clause via whitespace (AND) or ORs, and can be negated. However, note that the same restriction described above regarding negation/OR combination also applies to groups. For example, the following are examples of invalid syntax using groups:
 ipad OR -(iphone OR ipod)
 ipad OR (-iphone OR ipod)

Grouping is especially important where a single rule combines AND and OR functionality, due to the order of operations used to evaluate the rule. See below for more details.

Note that Operators may be either positive or negative.

Positive Operators  define what you want to include in the results. E.g. the ‘is:retweet’ operator says “I only want retweets.”

Negative Operators  define what you want to exclude from the results, and are created by using the Boolean NOT logic described above. E.g. ‘-is:retweet’ says “Exclude retweets from my rule, I only want to receive original Tweets, no retweets”

If using ‘standard’ rules, a single ‘standard’ PowerTrack rule can support up to 30 positive operators, and up to 50 negative operators, subject to a maximum length of 1,024 characters as documented here. If using ‘long’ rules, there is no limit on the number of positive and negative clauses, subject to a maximum length of 2,048 characters.


Order of Operations 

When combining AND and OR functionality in a single rule, the following order of operations will dictate how your rule is evaluated.

  1. Operators connected by AND logic are combined first
  2. Then, Operators connected with OR logic are applied

Example:

  • apple OR iphone ipad would be evaluated as apple OR (iphone ipad)
  • ipad iphone OR android would be evaluated as (ipad iphone) OR android

To eliminate uncertainty and ensure that your rules are evaluated as intended, group terms together with parentheses where appropriate. For example:

  • (apple OR iphone) ipad
  • iphone (ipad OR android)
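
To make the precedence concrete, here is a minimal sketch of an evaluator for keyword-only rules in which AND (whitespace) binds more tightly than OR. It is an illustration of the ordering described above, not PowerTrack's actual evaluator: the names `tokenize_rule` and `matches` are hypothetical, only bare keywords, OR, negation, and parentheses are handled, hyphenated keywords are not supported, and malformed rules are not validated.

```python
import re
from typing import List


def tokenize_rule(rule: str) -> List[str]:
    # Split into parentheses, the negation dash, and bare keywords.
    return re.findall(r"[()]|-|[^\s()-]+", rule)


def matches(rule: str, text: str) -> bool:
    """Evaluate a keyword-only rule against text, AND before OR."""
    words = set(re.findall(r"\w+", text.lower()))
    tokens = tokenize_rule(rule)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or() -> bool:
        # Lowest precedence: AND-groups separated by upper-case OR.
        nonlocal pos
        result = parse_and()
        while peek() == "OR":
            pos += 1
            rhs = parse_and()
            result = result or rhs
        return result

    def parse_and() -> bool:
        # Higher precedence: adjacent clauses are ANDed together.
        result = parse_unary()
        while peek() not in (None, "OR", ")"):
            rhs = parse_unary()
            result = result and rhs
        return result

    def parse_unary() -> bool:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "-":
            return not parse_unary()
        if token == "(":
            inner = parse_or()
            pos += 1  # consume the closing ")"
            return inner
        return token.lower() in words

    return parse_or()


text = "just bought an apple"
print(matches("apple OR iphone ipad", text))    # read as apple OR (iphone ipad)
print(matches("(apple OR iphone) ipad", text))  # explicit grouping changes the result
```

The two rules differ only in grouping, yet the first matches the sample text and the second does not, which is why explicit parentheses are recommended.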

Punctuation, Diacritics, and Case Sensitivity 

In PowerTrack Operators, characters with accents or diacritics are treated the same as normal characters and are not treated as word boundaries. For example, a rule of cumpleaños would only match activities containing the word cumpleaños and would not match activities containing cumplea, cumplean, or os.

All Operators are evaluated in a case-insensitive manner. For example, the rule Cat will match all of the following: cat, CAT, Cat.


Rule Tags 

As described here, each rule can be created with a tag. Each rule can have only one tag, with a maximum of 255 characters. Tags have no effect on rule uniqueness: a rule whose value is identical to an existing rule will be ignored even if its tag is different, because it is not possible to have two separate rules with the same value on the same stream or job. Tags have no effect on filtering, but they give each rule an identifier that your app can reference for business logic. Tags should be included with the JSON formatted rule at the time of creation via the API, as described in our documentation.


Putting Rules in JSON Format 

In order to add or delete a rule from a PowerTrack stream via the API, the rules must utilize JSON format. Essentially, this requires putting each rule into the following structure:

 {"value":"insert_rule_here"}

Rules with Double-quotes

If the ‘rule’ contains double-quote characters (“) associated with exact-match or other operators, they must be escaped using a backslash to distinguish them from the structure of the JSON format. For example, if your rule is:

 "social data" @gnip

The JSON formatted rule would be:

 {"value":"\"social data\" @gnip"}

Rules with Double-quote String Literals

To include a double-quote character as a string literal within an exact-match, it must be double-escaped. For example, for a rule matching on the exact phrase ‘Toys “R” Us’, including the double-quotes around R, the plain-text representation of this would look like the following:

  "Toys \"R\" Us"

Translating this to JSON format, you should use the following structure:

  {"value":"\"Toys \\\"R\\\" Us\""}
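
Rather than escaping by hand, a JSON library can produce the escaped form. The sketch below uses Python's standard `json` module; the helper name `rule_to_json` is hypothetical, and the compact `separators` argument is used only so the output has the same spacing as the examples in this section.

```python
import json


def rule_to_json(value: str) -> str:
    """Serialize a plain-text rule into the {"value": ...} JSON structure."""
    # json.dumps escapes double quotes and backslashes in the rule text.
    return json.dumps({"value": value}, separators=(",", ":"))


# Exact-match quotes are escaped once.
print(rule_to_json('"social data" @gnip'))

# The plain-text rule already contains \" for the literal quotes,
# so JSON encoding yields the double-escaped form.
print(rule_to_json(r'"Toys \"R\" Us"'))
```

Decoding the output with `json.loads` returns the original plain-text rule, which is a quick way to check that a hand-written rule was escaped correctly.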

More examples:

{"value":"happy"}

{"value":"happy party"}

{"value":"happy -birthday"}

{"value":"happy OR party"}

{"value":"from:twitterdev"}

{"value":"(happy OR party) (holiday OR house) -(birthday OR democratic OR republican)"}

{"value":"\"happy birthday\"~3","tag":null}

{"value":"(happy OR party) (holiday OR house OR \"new year's eve\") point_radius:[-105.27346517 40.01924738 10.0mi] lang:en -(birthday OR democratic OR republican)"}

 {"value":"(gnip OR from:688583 OR @gnip) (\"powertrack operators\" OR \"streaming code\"~4) contains:help bio_contains:developer has:links url_contains:github source:web (friends_count:1 OR followers_count:2000 OR listed_count:500 OR statuses_count:1000 OR is:verified OR klout_score:50) (country_code:US OR bio_location:CO OR bio_location_contains:Boulder OR time_zone:\"Mountain Time (US & Canada)\") -is:retweet (lang:en OR twitter_lang:en)"}

Rules with Tags

To include an optional Tag with your rule, as described above, simply include an additional “tag” field with the rule value:

 {"value":"\"social data\" @gnip","tag":"RULE-TAG-01"}

Formatting for API Requests

When adding or deleting rules from the PowerTrack stream via the API, multiple JSON formatted rules should be comma delimited, and wrapped in a JSON “rules” array, as shown below:

  {"rules":[{"value":"from:gnip"},{"value":"\"social data\" @gnip","tag":"RULE-TAG-01"}]}
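
A request body of this shape can be generated from a list of rules. The following is a minimal sketch under stated assumptions: `rules_payload` is a hypothetical helper, the 255-character tag limit is taken from the Rule Tags section, and dropping repeated values on the client side is a design choice of this sketch (the documentation only says that rules with identical values are ignored).

```python
import json

MAX_TAG_CHARS = 255  # tag limit from the Rule Tags section


def rules_payload(rules):
    """Build the {"rules": [...]} body from (value, tag) pairs; tag may be None."""
    seen = set()
    entries = []
    for value, tag in rules:
        if value in seen:
            continue  # same value is the same rule, regardless of tag
        seen.add(value)
        entry = {"value": value}
        if tag is not None:
            if len(tag) > MAX_TAG_CHARS:
                raise ValueError("tag exceeds 255 characters")
            entry["tag"] = tag
        entries.append(entry)
    return json.dumps({"rules": entries}, separators=(",", ":"))


print(rules_payload([
    ("from:gnip", None),
    ('"social data" @gnip', "RULE-TAG-01"),
]))
```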

List of Operators 

Below is a list of all operators supported in PowerTrack. Note that while many operators work across multiple data sources, others apply to only one source. See the Sources column for the data sources that a specific operator applies to.

Or, for a list of all the operators available for a specific source, see one of the following links.


Operator Description
bio_location: Matches tweets where the user's bio-level location contains the specified keyword or phrase. This operator performs a tokenized match, similar to the normal keyword rules on the message body. The user bio location is a non-normalized, user-generated, free-form string.
Gnip Rule Match No Match
bio_location:"boulder" actor.location.displayname:Boulder
actor.location.displayname:Boulder, CO
actor.location.displayname:Boulder Colorado
actor.location.displayname:Beautiful Boulder, CO
actor.location.displayname:BoCo
actor.location.displayname:Boulderado
actor.location.displayname:Colorado

Sources | Twitter
verb:update Matches activities where a previously created comment has been updated. Sources | Disqus
friends_count: Matches tweets where the author has a friends count (the number of users they follow) that falls within the given range. If a single number is specified, any number equal to or higher will match. Additionally, a range can be specified to match any number in the given range.
Gnip Rule Match No Match
friends_count:1000 Tweets (from user) that have friends_count:1000 or more Tweets (from user) that have friends_count:999 or less
friends_count:1000..10000 Tweets (from user) that have friends_count:1000
Tweets (from user) that have friends_count:6814
Tweets (from user) that have friends_count:10000
Tweets (from user) that have friends_count:999 or less
Tweets (from user) that have friends_count:10001 or more

Sources | Twitter
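
The single-number and range forms used by the count operators can be read as follows. This is only a sketch of the semantics stated above (a single number means "equal to or higher", a range is inclusive at both ends as the match examples show); `count_matches` is a hypothetical name and the argument is the text after the colon in the operator.

```python
def count_matches(spec: str, count: int) -> bool:
    """Check a count against '1000' (minimum) or '1000..10000' (inclusive range)."""
    if ".." in spec:
        low, high = (int(part) for part in spec.split("..", 1))
        return low <= count <= high
    return count >= int(spec)


print(count_matches("1000", 1000))
print(count_matches("1000..10000", 10001))
```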
is:article Matches only activities that are posted articles.
Gnip Rule Applies To Match No Match
is:article Posted Articles    

Sources | Wordpress
url_contains: Matches activities with URLs that literally contain the given phrase or keyword. To search for patterns with punctuation in them (e.g. google.com) enclose the search term in quotes. NOTE: If you're using Gnip's Enriched output format, we will match against Gnip's expanded URL as well.
Gnip Rule Match No Match
url_contains:gnip http://support.gnip.com/
https://github.com/abh1nav/gnippy
https://gn.ip.com
url_contains:"how-to" https://www.coachella.com/how-to-purchase/  
url_contains:teslas twitter_entities.urls.url: http://t.co/yECAbi9p6Q twitter_entities.urls.expanded_url: http://wrd.cm/1IfohKo gnip.urls.display_url: wrd.cm/1IfohKo gnip.urls.expanded_url: http://www.wired.com/2015/05/used-teslas/ (matches fully unwound URL)  

Sources | Disqus | IntenseDebate | Twitter | Wordpress
twitter_lang: Matches tweets that have been classified by Twitter as being of a particular language (if, and only if, the tweet has been classified). It is important to note that each activity is currently only classified as being of one language, so AND'ing together multiple languages will yield no results. **Note:** if no language classification can be made the provided result is 'und' (for undefined). The list below represents the currently supported languages and their corresponding BCP 47 language identifier:
  • Amharic - am
  • Arabic - ar
  • Armenian - hy
  • Bengali - bn
  • Bosnian - bs
  • Bulgarian - bg
  • Cherokee - chr
  • Chinese - zh
  • Croatian - hr
  • Danish - da
  • Dutch - nl
  • English - en
  • Estonian - et
  • Finnish - fi
  • French - fr
  • Georgian - ka
  • German - de
  • Greek - el
  • Gujarati - gu
  • Haitian - ht
  • Hebrew - iw
  • Hindi - hi
  • Hungarian - hu
  • Icelandic - is
  • Indonesian - in
  • Inuktitut - iu
  • Italian - it
  • Japanese - ja
  • Kannada - kn
  • Khmer - km
  • Korean - ko
  • Lao - lo
  • Latvian - lv
  • Lithuanian - lt
  • Malayalam - ml
  • Maldivian - dv
  • Marathi - mr
  • Myanmar-Burmese - my
  • Nepali - ne
  • Norwegian - no
  • Oriya - or
  • Panjabi - pa
  • Pashto - ps
  • Persian - fa
  • Polish - pl
  • Portuguese - pt
  • Romanian - ro
  • Russian - ru
  • Serbian - sr
  • Sindhi - sd
  • Sinhala - si
  • Slovak - sk
  • Slovenian - sl
  • Sorani Kurdish - ckb
  • Spanish - es
  • Swedish - sv
  • Tagalog - tl
  • Tamil - ta
  • Telugu - te
  • Thai - th
  • Tibetan - bo
  • Turkish - tr
  • Ukrainian - uk
  • Urdu - ur
  • Uyghur - ug
  • Vietnamese - vi
  • Welsh - cy

Gnip Rule Match No Match
twitter_lang:fr "C'est un plaisir de vous rencontrer!" "Nice to meet you!"

Sources | Twitter
type: Matches activities of a given type. Options include: answer, audio, chat, link, photo, quote, text, video. Can be negated to include all posts except for a given type.
Gnip Rule Match No Match
type:text All posts of type "text"  
type:photo All posts of type "photo"  

Sources
retweets_of_status_id: Deliver only explicit retweets of the specified Tweet. Note that the status ID used should be the ID of an original tweet and not a retweet. If extracting the ID of an original Tweet from within a Retweet for this purpose, look in the object.id field in Activity Streams format.
Gnip Rule Match
retweets_of_status_id:365697420392280064 Retweets of the Tweet with status 365697420392280064
retweets_of:gnip -retweets_of_status_id:365697420392280064 Retweets of Gnip, except for retweets of the specified Tweet status.

Sources | Twitter
place_contains: Matches tweets where the tagged place/location contains a given substring. Place names are semi-normalized by Twitter application but there can be many variations. A substring match allows you to easily match across variations.
Gnip Rule Match No Match
place_contains:USA Tweets that are geo-tagged with place.name:Colorado, USA
Tweets that are geo-tagged with place.name:Louisiana, USA
Tweets where place:null

Sources | Twitter
has:media Matches Tweets that contain a media url classified by Twitter, e.g. pic.twitter.com.
WARNING: Use this operator with care. Used by itself, with no other limiting clauses, it can generate large amounts of volume. Currently, this will deliver double digit percentages of the firehose when used by itself.
Gnip Rule Match No Match
has:media (Any Tweets that contain a media url as classified by Twitter including images and videos)  

Sources | Twitter
verb:delete Matches activities where a previously created comment has been deleted. Sources | Disqus
in_reply_to_status_id: Deliver only explicit replies to the specified Tweet.
Gnip Rule Match No Match
in_reply_to_status_id:365697420392280064 Replies to the Tweet with status 365697420392280064.
to:gnip -in_reply_to_status_id:365697420392280064 Replies to (rather than general @ mentions of) gnip, except for replies to the Tweet with status 365697420392280064

Sources | Twitter
statuses_count: Matches tweets where the author has posted a number of statuses that falls within the given range. If a single number is specified, any number equal to or higher will match. Additionally, a range can be specified to match any number in the given range.
Gnip Rule Match No Match
statuses_count:1000 Tweets (from user) that have statuses_count:1000 or more Tweets (from user) that have statuses_count:999 or less
statuses_count:1000..10000 Tweets (from user) that have statuses_count:1000
Tweets (from user) that have statuses_count:6814
Tweets (from user) that have statuses_count:10000
Tweets (from user) that have statuses_count:999 or less
Tweets (from user) that have statuses_count:10001 or more

Sources | Twitter
contains: Substring match for activities that have the given substring in the body, regardless of tokenization. In other words, this does a pure substring match, and does not consider word boundaries. Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match No Match
contains:phone Where is my phone?
That's a telephone
Pongo la telephono.
What is the ph0ne number?
contains:"$TWTR" How much is $TWTR stock?
How much is $TWTRstock?
Headlines with $GOOG$TWTR$FB today
Just setting up my TWTR Just setting up my $ TWTR

Sources | Disqus | IntenseDebate | Twitter | Wordpress
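
The difference from tokenized keyword matching can be sketched with an ordinary substring test. This assumes the case-insensitive evaluation described earlier and is only an approximation of the operator's behavior; the function name `contains` is hypothetical.

```python
def contains(substring: str, body: str) -> bool:
    """Pure substring match that ignores case and word boundaries."""
    return substring.lower() in body.lower()


# Matches inside a longer word, because word boundaries are not considered.
print(contains("phone", "That's a telephone"))

# Punctuation is part of the substring, so the "$" must be adjacent to "TWTR".
print(contains("$TWTR", "Headlines with $GOOG$TWTR$FB today"))
print(contains("$TWTR", "Just setting up my $ TWTR"))
```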
sample: Returns a random sample of activities that match a rule rather than the entire set of activities. Sample percent must be represented by an integer value between 1 and 100. This operator applies to the entire rule and requires any "OR'd" terms be grouped. **Important Note:** The sample operator first reduces the scope of the firehose to X%, and the rest of the rule is then applied to that sample. Each Tweet individually (of all Tweets) has a 10% chance of being in a 10% sample, a 1% chance of being in a 1% sample, a 50% chance of being in a 50% sample, etc. Also, the sampling is deterministic, and you will get the same data sample in realtime as you would if you pulled the data historically.
Gnip Rule Match No Match
dog sample:50 All of the Tweets matching the keyword dog within the 50% firehose sample.  
(dog OR cat) sample:25 All of the Tweets matching the keyword cat or the keyword dog within the 25% firehose sample.  
sample:2 2% of all tweets (Note: This is a stand-alone rule for 1-10% sample)  

Sources | Disqus | IntenseDebate | Twitter | Wordpress
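
The order of evaluation (sample first, then the rest of the rule) and the deterministic behavior can be illustrated with hash-based bucketing. How PowerTrack actually selects its sample is not documented here, so the hashing scheme below is purely an assumption chosen because it gives repeatable results; `in_sample` and `matches_sampled_keyword` are hypothetical names.

```python
import hashlib


def in_sample(activity_id: str, percent: int) -> bool:
    """Deterministically place an activity in or out of an X% sample.

    Assumption: bucketing by a hash of the activity ID. The same ID always
    lands in the same bucket, so repeated runs select the same activities.
    """
    if not 1 <= percent <= 100:
        raise ValueError("sample percent must be an integer between 1 and 100")
    digest = hashlib.sha256(activity_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def matches_sampled_keyword(activity_id: str, text: str, keyword: str, percent: int) -> bool:
    # The sample is applied first; the keyword is only checked within the sample.
    return in_sample(activity_id, percent) and keyword.lower() in text.lower().split()


print(in_sample("365697420392280064", 100))
```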
activity_url_contains: Matches activities where the activity URL (i.e. permalink) contains the given phrase or keyword. URL encodings are not encoded at this time. To search for patterns with punctuation in them (e.g. cnn.com) enclose the search term in quotes. **Applies to:** - Posted Article (Automattic): article URL - Comment (Automattic): comment URL (which contains article URL) - Like (Automattic): like URL (which contains article URL)
Gnip Rule Match No Match
activity_url_contains:"cnn.com" All posted articles, comments, and likes with cnn.com in the URL.  
activity_url_contains:"obama" All posted articles, comments, and likes with obama in the URL.  
activity_url_contains:"/cheryl_staton" http://wordpress.com/cheryl_staton/  

Sources | IntenseDebate | Wordpress
has:profile_geo Matches tweets that have any [Profile Geo](http://support.gnip.com/enrichments/profile_geo.html) metadata, regardless of the actual value.
Gnip Rule Match
cat has:profile_geo If account is enabled for the Profile-Geo Enrichment, this will match any Tweet that mentions the word “cat” and has any Gnip Profile Geo metadata derived from the user's bio "location". Tweets from accounts that do not have a bio "location" entered by the user

Sources | Twitter
has:geo Matches Tweets that have Tweet-specific geo location data provided from Twitter. This can be either "geo" lat-long coordinate, or a "location" in the form of a Twitter ["Place"](https://dev.twitter.com/overview/api/places), with corresponding display name, geo polygon, and other fields. WARNING: Use this operator with care, it can generate large amounts of volume. Currently, this will deliver 1-4% of the firehose independently.
Gnip Rule Match No Match
sale has:geo Any Tweets with geolocation data, either an exact lat/lon or a named "place", that also have the keyword 'sale' in the body of the Tweet Tweets that have keyword 'sale' but do not have a place/location
sale -has:geo Tweets that have keyword 'sale' that do not have a place/location

Sources | Twitter
profile_locality: Matches on the "locality" field from the "address" object in the Profile Geo enrichment. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match
profile_locality:boulder All Profile Geo Enrichments in ANY city named "Boulder"

Sources | Twitter
bio_location_contains: Matches Tweets where the user's bio-level location contains the specified substring. The user bio location is a non-normalized, user-generated, free-form string. **Warning**: use of broad or common locations strings can result in the consumption of large volumes of data (e.g. a bio_location_contains:"MA" rule with hopes of matching all tweets from Massachusetts, will also match "Alabama"). The addition of punctuation (e.g. ", MA" or ",MA") could help limit this data.
Gnip Rule Match No Match
bio_location_contains:"AZ" actor.location.displayname:Phoenix, AZ
actor.location.displayname:Beautiful Phoenix, AZ
actor.location.displayname:Aztec Ruins
actor.location.displayname:Arizona
actor.location.displayname:USA
bio_location_contains:", MA" actor.location.displayname:Boston, MA
actor.location.displayname:Andapa, Madagascar
actor.location.displayname:Alabama
actor.location.displayname:Mass

Sources | Twitter
retweets_of: Matches tweets that are retweets of a specified user. Accepts both usernames and numeric Twitter Account IDs (NOT tweet status IDs). See HERE or HERE for methods for looking up numeric Twitter Account IDs.
Gnip Rule Match No Match
retweets_of:justinbieber When verb:share this matches on the object.actor.preferredUsername:justinbieber
Retweets of organic tweets from justinbieber account
Retweets of retweets by justinbieber
Quoted justinbieber tweets
retweets_of:6264412    

Sources | Twitter
"keyword1 keyword2"~N Commonly referred to as a proximity operator, this matches an activity where the keywords are no more than N tokens from each other. If the keywords are in the opposite order, they can not be more than N-2 tokens from each other. Can have any number of keywords in quotes. N cannot be greater than 6.
Gnip Rule Match No Match
"love boulder"~4 Love everything about my town Boulder.
Boulder, I love living here.
I don’t love hiking, but I really like to visit Boulder.
Boulder is a place I love to visit.

Sources | Disqus | IntenseDebate | Twitter | Wordpress
is:verified Deliver only Tweets where the author is "verified" by Twitter. Can also be negated to exclude Tweets where the author is verified.
Gnip Rule Match No Match
dog is:verified Tweets from verified users with the keyword dog  
cat -is:verified Tweets only from not verified users with the keyword cat  
dog OR (cat is:verified) Tweets containing the keyword dog or Tweets from verified users with the keyword cat  
(dog OR cat) is:verified Tweets from verified users with either the keyword dog or the keyword cat  

Sources | Twitter
bounding_box:[west_long south_lat east_long north_lat] Matches against the Exact Location (x,y) of the Activity when present, and in Twitter, against a "Place" geo polygon, where the Place is fully contained within the defined region.
  • west_long and south_lat represent the southwest corner of the bounding box, where west_long is the longitude of that point, and south_lat is the latitude.
  • east_long and north_lat represent the northeast corner of the bounding box, where east_long is the longitude of that point, and north_lat is the latitude.
  • Width and height of the bounding box must be less than 25mi
  • Longitude is in the range of ±180
  • Latitude is in the range of ±90
  • All coordinates are in decimal degrees.
  • Rule arguments are contained within brackets, space delimited.
Gnip Rule Match No Match
bounding_box:[-105.301758 39.964069 -105.178505 40.09455] Tweets (with place) or Checkins with coordinates contained within a box drawn around Boulder, CO Tweets (with place) or Checkins outside the box drawn around Boulder, CO
Tweets without place defined.

Sources | Twitter
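
A small helper can format the operator and catch out-of-range coordinates before a rule is submitted. This is a sketch only: `bounding_box` here is a hypothetical Python function, it checks the longitude and latitude ranges and that the south latitude lies below the north latitude, and it does not verify the 25 mile limit on width and height.

```python
def bounding_box(west_long: float, south_lat: float,
                 east_long: float, north_lat: float) -> str:
    """Format a bounding_box clause after basic range checks.

    Not checked here: the requirement that width and height be under 25mi.
    """
    for lon in (west_long, east_long):
        if not -180 <= lon <= 180:
            raise ValueError(f"longitude out of range: {lon}")
    for lat in (south_lat, north_lat):
        if not -90 <= lat <= 90:
            raise ValueError(f"latitude out of range: {lat}")
    if south_lat >= north_lat:
        raise ValueError("south_lat must be less than north_lat")
    # Arguments are space delimited inside square brackets.
    return f"bounding_box:[{west_long} {south_lat} {east_long} {north_lat}]"


print(bounding_box(-105.301758, 39.964069, -105.178505, 40.09455))
```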
is:reblog Deliver only reblogs that match a rule. Can also be negated to exclude reblogs that match a rule from delivery and only original content is delivered.
Gnip Rule Match No Match
dog is:reblog    
cat -is:reblog    
dog OR (cat is:reblog)    
(dog OR cat) is:reblog    

Sources
bio_name_contains: Matches tweets where the user's display name (not username) as specified in their bio, contains a given substring.
Gnip Rule Match No Match
bio_name_contains:"Mike" (Any tweets from a user who said they were named Mike in their Twitter bio)  

Sources | Twitter
profile_point_radius:[long lat radius] Matches functionality described for the standard point_radius: operator, but only applies to geo-location data contained in the Profile Geo enrichment.
Gnip Rule Match
profile_point_radius:[-105.27346517 40.01924738 10.0mi] Profile Geo Enrichments with coordinates within 10 miles of 17th & Pearl St. in Boulder, CO

Sources | Twitter
bio_contains: Matches tweets whose author's Twitter bio contains the given substring. To search for patterns with punctuation in them (e.g. start-up) enclose the search term in quotes.
Gnip Rule Match No Match
bio_contains:CEO "CEO of ABC Corp" "COO at DEF, Inc."
bio_contains:"Start-up" "Start-up junkie" "Software Engineer startup @Gnip"
bio_contains:"bieber" "World's biggest @justinbieber fan" "I love biebs"

Sources | Twitter
post_title_contains: Automattic: Matches a substring within the title of a posted article. Use double quotes to match substrings that contain whitespace or punctuation. **Applies To:** - Posted Articles: article title - Comment: article title - Like: article title
Gnip Rule Match No Match
post_title_contains:Obama This presidency is an obamanation  

Sources | IntenseDebate | Wordpress
publisher: Matches the Automattic publisher, Wordpress.com, Wordpress.org, or IntenseDebate. Note that you can only run 1 publisher: operator per rule. The publisher operator can be negated if necessary.
Gnip Rule Applies to Match
publisher:wordpresscom Posted Articles
Comments
Likes
Activities from Wordpress.com
publisher:wordpressorg Posted Articles
Comments
Likes
Activities from Wordpress.org
publisher:intensedebate Posted Articles
Comments
Likes
Activities from IntenseDebate

Sources | IntenseDebate | Wordpress
profile_subregion_contains: Matches on the "subRegion" field from the "address" object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region. This is a substring match for activities that have the given substring in the body, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match
profile_subregion_contains:jefferson All Profile Geo Enrichments where the substring 'jefferson' appears in the subRegion (e.g. 'Jefferson County')

Sources | Twitter
has:lang Matches activities which Gnip has classified as any language.
Gnip Rule Match No Match
has:lang gnip.language.value: es gnip.language.value: null
twitter_lang:es (but gnip.language.value: null)

Sources | Disqus | IntenseDebate | Twitter | Wordpress
"exact phrase match" Matches an exact phrase within the body of an activity. This is an exact match, and it is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Note that this is not a substring match, and includes a check for word boundaries at the ends of the quoted phrase. For a pure substring match, see the contains: operator below.
Gnip Rule Match No Match
"call gnip" I need to call gnip, again
I need to call gnip again
call gnip
I called gnip
call gnip (multiple spaces)
call-gnip
call_gnip
"one/two" Maybe we can look at one/two different computers
One/two/three - fourth time's is a charm
call gnip
#one/two hashtags with punctuation don't work well
one//two slash happy
one\two

Sources | Disqus | IntenseDebate | Twitter | Wordpress
has:links This operator matches activities which contain links in the message body.
Gnip Rule Match No Match
cat has:links Here's a picture of my cat: bit.ly/cat
Adopt a cat at http://spca.org/cats
Check out @gnip
Check out #gnip

Sources | Disqus | IntenseDebate | Twitter | Wordpress
place: Matches tweets tagged with the specified location *or* Twitter place ID (see examples). Multi-word place names (“New York City”, “Palo Alto”) should be enclosed in quotes. **Note:** See the [GET geo/search](https://dev.twitter.com/rest/reference/get/geo/search) public API endpoint for how to obtain Twitter place IDs.
Gnip Rule Match No Match
place:"Rio de Janeiro" Tweets that are geo-tagged with the exact place.name Rio de Janeiro Tweets where place:null
place:Florida Tweets that are geo-tagged with the exact place.name:Florida Tweets that are geo-tagged with place.name:USA
Tweets where place:null
place:fd70c22040963ac7 Tweets that are geo-tagged with the exact Twitter place.id:fd70c22040963ac7
Tweets that are geo-tagged with Boulder, CO (place.id:fd70c22040963ac7)
Tweets where place.id:e21c8e4914eef2b3 (Note: this is the placeID for the state Colorado)
Tweets where place:null

See Examples Sources | Twitter
has:profile_geo_region Matches all activities that have a profileLocations.address.region value present in the payload.
Gnip Rule Match
profile_country_code:us has:profile_geo_region All Tweets with Profile Geo locations in the US that include region-level detail (e.g. US states).

See Examples Sources | Twitter
listed_count: Matches tweets where the number of times the author has been listed within Twitter falls within the given range. If a single number is specified, any number equal to or higher will match. Additionally, a range can be specified to match any number in the given range.
Gnip Rule Match No Match
listed_count:1000 Tweets (from user) that have listed_count:1000 or more Tweets (from user) that have listed_count:999 or less
listed_count:1000..10000 Tweets (from user) that have listed_count:1000
Tweets (from user) that have listed_count:6814
Tweets (from user) that have listed_count:10000
Tweets (from user) that have listed_count:999 or less
Tweets (from user) that have listed_count:10001 or more

See Examples Sources | Twitter
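The single-number and range forms of listed_count: described above can be sketched in a few lines of Python. The helper names parse_count_range and count_matches are hypothetical and only illustrate the documented semantics; they are not part of any PowerTrack API.

```python
from typing import Optional, Tuple


def parse_count_range(value: str) -> Tuple[int, Optional[int]]:
    """Parse '1000' or '1000..10000' into (low, high).

    high is None for the single-number form, which means "this value or higher".
    """
    if ".." in value:
        low_text, high_text = value.split("..", 1)
        low, high = int(low_text), int(high_text)
        if low > high:
            raise ValueError("range start must not exceed range end")
        return low, high
    return int(value), None


def count_matches(value: str, actual_count: int) -> bool:
    """Return True when actual_count satisfies the operator value."""
    low, high = parse_count_range(value)
    if high is None:
        return actual_count >= low
    # Both ends of the range are inclusive, as in the example table.
    return low <= actual_count <= high
```

For example, count_matches("1000..10000", 6814) is true, while count_matches("1000..10000", 10001) is false.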
is:retweet Deliver only explicit retweets that match a rule. Can also be negated to exclude retweets that match a rule from delivery and only original content is delivered. **Note:** This operator looks only for true Retweets, which use Twitter's retweet functionality. Quoted Tweets and Modified Tweets which do not use Twitter's retweet functionality will not be matched by this operator.
Gnip Rule Match No Match
dog is:retweet RT "I love my dog!" "My dog is the best."
cat -is:retweet "My cat > your dog." RT "My cat is the best."
is:retweet (dog OR cat) RT "I can't wait to get a dog!" "Would you rather have a dog or a cat?"
See Examples Sources | Twitter
has:hashtags Matches Tweets that contain a hashtag. WARNING: Use this operator with care. Used by itself, with no other limiting clauses, it can generate large amounts of volume. Currently, this will deliver double digit percentages of the firehose when used by itself.
Gnip Rule Match No Match
cat has:hashtags My cat is too fat. #diet
My cat just had kittens. #cute
 

See Examples Sources | Twitter
is:like Matches only activities that are Likes. This includes Likes, Un-Likes, Vote-Ups, and Vote-downs.
Gnip Rule Match No Match
is:like    

See Examples Sources | IntenseDebate | Wordpress
is:comment Matches only activities that are comments.
Gnip Rule Applies To Match No Match
is:comment Comments    

See Examples Sources | IntenseDebate | Wordpress
point_radius:[lon lat radius] Matches against the Exact Location (x,y) of the Activity when present, and in Twitter, against a "Place" geo polygon, where the Place is fully contained within the defined region.
  • Units of radius supported are miles (mi) and kilometers (km).
  • Radius must be less than 25mi.
  • Longitude is in the range of ±180.
  • Latitude is in the range of ±90.
  • All coordinates are in decimal degrees.
  • Rule arguments are contained within brackets and are space delimited.
Gnip Rule Match No Match
point_radius:[-105.27346517 40.01924738 0.5mi] Geo-tagged Tweets within 0.5 miles of 17th and Pearl Street in Boulder, CO. Geo-tagged Tweets more than 0.5 miles from 17th and Pearl Street in Boulder, CO.
Tweets without place defined
point_radius:[2.355128 48.861118 16km] Geo-tagged Tweets within 16 kilometers of the center of Paris, France Geo-tagged Tweets more than 16 kilometers from the center of Paris, France
Tweets without place defined

See Examples Sources | Twitter
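As a rough illustration of the point_radius: argument format and limits, the Python sketch below parses a "[lon lat radius]" argument and tests a single coordinate against it using the haversine great-circle distance. The function names are hypothetical, the choice of haversine is an assumption (the distance formula used by PowerTrack is not documented here), and the sketch covers only exact point locations, not the Place polygon containment case.

```python
import math
import re

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
KM_PER_MILE = 1.609344
MAX_RADIUS_MILES = 25.0  # radii above this are rejected (see Restrictions)


def parse_point_radius(argument: str):
    """Parse '[lon lat radius]' into (lon, lat, radius_km), checking documented limits."""
    match = re.fullmatch(r"\[\s*(\S+)\s+(\S+)\s+([0-9.]+)(mi|km)\s*\]", argument.strip())
    if match is None:
        raise ValueError("expected '[lon lat radius]' with a unit of mi or km")
    lon, lat = float(match.group(1)), float(match.group(2))
    radius, unit = float(match.group(3)), match.group(4)
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be within ±180")
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be within ±90")
    radius_miles = radius if unit == "mi" else radius / KM_PER_MILE
    if radius_miles > MAX_RADIUS_MILES:
        raise ValueError("radius greater than 25mi is not supported")
    return lon, lat, radius_miles * KM_PER_MILE


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_radius(argument: str, lon: float, lat: float) -> bool:
    """True when (lon, lat) lies within the circle described by the argument."""
    center_lon, center_lat, radius_km = parse_point_radius(argument)
    return haversine_km(center_lon, center_lat, lon, lat) <= radius_km
```

Note that the argument order is longitude first, then latitude, which is the reverse of how many mapping tools display coordinates.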
to: Matches any activity that is in reply to a particular user. The to: operator returns a subset match of the @mention operator. The value must be the user’s numeric Account ID or username (excluding the @ character). See HERE for methods for looking up numeric Twitter Account IDs.
Gnip Rule Match No Match
Twitter
to:gnip
to:16958875
Tweets that have a Tweet ID from @Gnip as the in_reply_to: id specified
Reply to a tweet sent originally by @gnip (Twitter ID = 16958875)
"in_reply_to_status_id_str":"841679557522513920","in_reply_to_user_id_str":"16958875"
Tweets that start with @Gnip "in_reply_to_status_id_str":null,"in_reply_to_user_id_str":"63046977","in_reply_to_screen_name":"gnip"
Tweets that mention @gnip but do not start with @gnip
Quote tweets of tweets from @gnip

See Examples Sources | Twitter
has:profile_geo_subregion Matches all activities that have a profileLocations.address.subRegion value present in the payload.
Gnip Rule Match
profile_country_code:us has:profile_geo_subregion Tweets with Profile Geo locations in the US that include sub-region (county) level detail.

See Examples Sources | Twitter
has:mentions Matches Tweets that mention another Twitter user. WARNING: Use this operator with care. Used by itself, with no other limiting clauses, it can generate large amounts of volume. Currently, this will deliver double digit percentages of the firehose when used by itself.
Gnip Rule Match No Match
"best friends" has:mentions Tweets that mention other users and have the phrase "best friends"  
enemies -has:mentions Tweets that have the keyword enemies and do not mention other users  

See Examples Sources | Twitter
profile_subregion: Matches on the "subRegion" field from the "address" object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match
profile_subregion:"San Francisco County" All Profile Geo Enrichments where the subRegion is San Francisco County.
profile_subregion:"San Mateo County" All Profile Geo Enrichments where the subRegion is San Mateo County.

See Examples Sources | Twitter
profile_locality_contains: Matches on the "locality" field from the "address" object in the Profile Geo enrichment. This is a substring match for activities that have the given substring in this field, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match
profile_locality_contains:haven All Profile Geo Enrichments in ANY city containing the substring "haven" including "New Haven," "West Haven," and "Lock Haven"

See Examples Sources | Twitter
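The substring behaviour of profile_locality_contains: can be pictured with the short Python sketch below. The function name is hypothetical, and the case-insensitive comparison is an assumption inferred from the example above, where the lower-case value "haven" matches "New Haven".

```python
def locality_contains(rule_value: str, locality: str) -> bool:
    """Substring test against a Profile Geo locality value.

    Assumption: the comparison ignores case, as the "haven" example suggests.
    """
    return rule_value.casefold() in locality.casefold()


# Sample locality values; only those containing "haven" should be kept.
localities = ["New Haven", "West Haven", "Lock Haven", "Hartford"]
matched = [name for name in localities if locality_contains("haven", name)]
```

Because the match is a pure substring test, a short value can match many unrelated localities, so it is worth pairing it with a country or region clause.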
profile_bounding_box:[west_long south_lat east_long north_lat] Matches functionality described for the standard bounding_box: operator, but only applies to geo-location data contained in the Profile Geo enrichment.
Gnip Rule Match
profile_bounding_box:[-105.301758 39.964069 -105.178505 40.09455] Profile Geo Enrichments with coordinates contained within a box drawn around Boulder, CO

See Examples Sources | Twitter
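The argument order of profile_bounding_box: (west longitude, south latitude, east longitude, north latitude) is easy to get wrong, so here is a minimal Python sketch of a containment test. The function names are hypothetical, the sketch assumes the box does not cross the antimeridian, and it does not check the 25mi edge limit described under Restrictions.

```python
def parse_bounding_box(argument: str):
    """Parse '[west_long south_lat east_long north_lat]' into four floats."""
    parts = argument.strip().lstrip("[").rstrip("]").split()
    if len(parts) != 4:
        raise ValueError("expected four space-delimited coordinates")
    west, south, east, north = (float(part) for part in parts)
    # Assumption: west < east, i.e. the box does not wrap around ±180.
    if not -180.0 <= west < east <= 180.0:
        raise ValueError("longitudes must satisfy -180 <= west < east <= 180")
    if not -90.0 <= south < north <= 90.0:
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    return west, south, east, north


def point_in_box(argument: str, lon: float, lat: float) -> bool:
    """True when (lon, lat) falls inside the box, edges included."""
    west, south, east, north = parse_bounding_box(argument)
    return west <= lon <= east and south <= lat <= north
```

With the Boulder box from the example above, a point at (-105.27, 40.015) falls inside, while a point in Denver at (-104.99, 39.74) does not.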
profile_country_code: Exact match on the "countryCode" field from the "address" object in the Profile Geo enrichment. Uses a normalized set of two-letter country codes, based on the ISO-3166-1-alpha-2 specification. To be concise, this operator is provided in lieu of an operator for the "country" field from the "address" object.
Gnip Rule Match
profile_country_code:us All Profile Geo Enrichments in the United States.

See Examples Sources | Twitter
profile_region: Matches on the "region" field from the "address" object in the Profile Geo enrichment. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use "one/two", not "one\/two". Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match
profile_region:"New York" All Profile Geo Enrichments in New York state

See Examples Sources | Twitter
source_url_contains: Matches activities where the post's author has attributed the post's content to a URL containing the given substring. This is an optional field that is editorially entered by the post's author.
Gnip Rule Match
source_url_contains:"cnn.com" All activities where the content has been attributed to a cnn.com web page by the post's author  
source_url_contains:"obama" All activities where the content has been attributed to a web page with "obama" in the URL by the post's author

See Examples Sources
# Matches any activity with the given hashtag. This operator performs an exact match, NOT a tokenized match, meaning the rule "2016" will match posts with the exact hashtag "2016", but not those with the hashtag "2016election". Note that the hashtag operator relies on Twitter's entity extraction to match hashtags, rather than extracting the hashtag from the body itself. The description of how Twitter extracts entities can be found here: http://dev.twitter.com/pages/tweet_entities.
Gnip Rule Match No Match
#politics All posts tagged with #politics  
#2016_election All posts tagged with #2016_election Posts tagged with #2016
#boulderfire All posts tagged with #boulderfire Posts tagged with #boulderfirefighters

See Examples Sources | Twitter
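Because the hashtag operator works on extracted entities rather than on the body text, its exact-match behaviour can be sketched in Python as a comparison against an entity list. The function name is hypothetical. The payload shape (an "entities" object holding a "hashtags" list whose items carry a "text" value without the leading #) and the case-insensitive comparison are both assumptions made for illustration.

```python
def hashtag_rule_matches(rule_hashtag: str, activity: dict) -> bool:
    """Exact (non-tokenized) match of a hashtag against extracted entities.

    Assumptions: hashtags live under activity["entities"]["hashtags"] with a
    "text" key, and the comparison ignores case.
    """
    wanted = rule_hashtag.lstrip("#").casefold()
    hashtags = activity.get("entities", {}).get("hashtags", [])
    return any(tag.get("text", "").casefold() == wanted for tag in hashtags)


# Illustrative payload fragments, not real activities.
tagged_2016 = {"entities": {"hashtags": [{"text": "2016"}]}}
tagged_election = {"entities": {"hashtags": [{"text": "2016election"}]}}
```

Here hashtag_rule_matches("#2016", tagged_2016) is true, while the same rule does not match tagged_election, mirroring the "2016" versus "2016election" example above.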
followers_count: Matches tweets where the author has a followers count within the given range. If a single number is specified, any number equal to or higher will match. Additionally, a range can be specified to match any number in the given range.
Gnip Rule Match No Match
followers_count:1000 Tweets (from user) that have followers_count:1000 or more Tweets (from user) that have followers_count:999 or less
followers_count:1000..10000 Tweets (from user) that have followers_count:1000
Tweets (from user) that have followers_count:6814
Tweets (from user) that have followers_count:10000
Tweets (from user) that have followers_count:999 or less
Tweets (from user) that have followers_count:10001 or more

See Examples Sources | Twitter
verb:post Matches activities where a new comment has been created.

Sources | Disqus
@ Matches any Tweet that mentions the given username or user ID. The to: operator returns a subset match of the @mention operator. Note that the mention operator relies on Twitter's entity extraction to match mentions, rather than trying to extract the mention from the body itself. The description of how Twitter extracts entities can be found here: http://dev.twitter.com/pages/tweet_entities.
@ gnip
Gnip Rule Match No Match
@gnip cool @gnip stuff
"entities":{"user_mentions":[{"screen_name":"gnip","name":"Gnip, Inc.","id_str":"16958875"}]
cool stuff @gnipeng
cool stuff #gnip

See Examples Sources | Twitter
source: Matches any tweet generated by the given source application. The value must be either the name of the application, or the application's URL. Cannot be used alone.
Gnip Rule Match No Match
cat source:web cool cat (if the tweet was created at twitter.com) cool cat (if the tweet was created from an iPhone)
cat -source:web neat cat (if the tweet was NOT from twitter.com) neat cat (if the tweet was created at twitter.com)
cat source:"Twitter for iPhone" neat cat (if the tweet was from an iPhone Twitter App) neat cat (if the tweet was created from an Android)
cat source:iphone neat cat (if the tweet was from an iPhone Twitter App)
generator.displayName:Twitter for iPhone
 
cat source:"Android" neat cat (if the tweet was from an Android Twitter App)
generator.displayName:Twitter for Android
 
cat source:tweetdeck neat cat (if the tweet was by TweetDeck)
generator.link:https://about.twitter.com/products/tweetdeck
 
cat source:Emily neat cat (if the tweet was created by the EmilyTestPublicAPI App)
generator.displayName:EmilyTestPublicAPI
 

See Examples Sources | Twitter
bio_lang: Matches tweets where the user's bio-level language setting matches a given ISO 639-1 language code. Twitter does not support all languages in this list. NOTE: This language setting simply changes the language in which Twitter displays its UI text (it does not translate Tweet text). THIS IS NOT A LANGUAGE CLASSIFICATION. Customers have reported that this setting is often left at its default of English even when the Tweets an account is generating are in another language. We recommend using it in conjunction with Gnip’s language classification operator (lang) rather than as a standalone indicator of a user's or Tweet’s language.
Gnip Rule Match No Match
bio_lang:fr Tweets from accounts with language setting: Français - French  
bio_lang:nl Tweets from accounts with language setting: Nederlands - Dutch  

See Examples Sources | Twitter
time_zone: Matches tweets where the user-selected time zone specified in a user's profile settings matches a given string. These values are normalized to the options specified on a user's account settings page: [https://twitter.com/account/settings]
Gnip Rule Match No Match
time_zone:"Eastern Time (US & Canada)" Tweets from accounts that have their account time zone set to "(GMT -04:00) Eastern Time (US & Canada)" at the time of the tweet Tweets from accounts that do not have their account time zone set to "Eastern Time (US & Canada)"
time_zone:"Dublin" Tweets from accounts that have their account time zone set to "(GMT+01:00) Dublin" at the time of the tweet Tweets from accounts that have their account time zone set to "(GMT+01:00) West Central Africa" Note: Time zones are specific, not grouped by UTC offset.

See Examples Sources | Twitter
from: Matches any activity from a specific user. In Twitter, the value must be the user’s Twitter Account ID or username (excluding the @ character). See HERE or HERE for methods for looking up numeric Twitter Account IDs. For some publishers, MD5-hashed email can be used.
Gnip Rule Match No Match
from:17200003 All original tweets from user 17200003
Retweets of others' tweets by user 17200003
Replies made by user 17200003 on others' tweets
Tweets from this user 17200003, regardless of user's changed username
Retweets of user 17200003's tweets by other users
from:mikesmith All original tweets from user mikesmith
Retweets of others' tweets by mikesmith
Retweets of mikesmith tweets by other users
Tweets from this user, with a different or changed username

See Examples Sources | Disqus | IntenseDebate | Twitter | Wordpress
lang: Matches activities that have been classified by Gnip as being of a particular language (if, and only if, the activity has been classified). Current languages supported are:
  • ar - Arabic
  • da - Danish
  • de - German
  • el - Greek
  • en - English
  • es - Spanish
  • fa - Persian
  • fi - Finnish
  • fr - French
  • he - Hebrew
  • it - Italian
  • id - Indonesian
  • ja - Japanese
  • ko - Korean
  • nl - Dutch
  • no - Norwegian
  • pl - Polish
  • pt - Portuguese
  • ru - Russian
  • sv - Swedish
  • th - Thai
  • tr - Turkish
  • uk - Ukrainian
  • zh - Chinese
It is important to note that each activity is currently only classified as being of one language, so AND'ing together multiple languages will yield no results. Also note that not every activity is classified as being of a particular language.
Gnip Rule Match No Match
lang:de Guten Morgen! Good morning!
cat lang:en I'm taking my cat to prom I'm taking my dog to prom

See Examples Sources | Disqus | IntenseDebate | Twitter | Wordpress
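Since each activity is classified as a single language, combining several lang: clauses only makes sense with OR logic. The Python sketch below builds such a clause; the helper name build_lang_clause is hypothetical, and the supported codes are copied from the list above.

```python
SUPPORTED_LANGS = {
    "ar", "da", "de", "el", "en", "es", "fa", "fi", "fr", "he", "it", "id",
    "ja", "ko", "nl", "no", "pl", "pt", "ru", "sv", "th", "tr", "uk", "zh",
}


def build_lang_clause(codes):
    """Return a lang: clause that matches any of the given language codes."""
    if not codes:
        raise ValueError("at least one language code is required")
    unknown = [code for code in codes if code not in SUPPORTED_LANGS]
    if unknown:
        raise ValueError("unsupported language codes: " + ", ".join(unknown))
    if len(codes) == 1:
        return "lang:" + codes[0]
    # A space between lang: clauses would mean AND and could never match,
    # so the alternatives are joined with an upper-case OR and grouped.
    return "(" + " OR ".join("lang:" + code for code in codes) + ")"


rule = "cat " + build_lang_clause(["en", "de"])
```

The resulting rule string is cat (lang:en OR lang:de), which matches activities containing "cat" that were classified as either English or German.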
post_title: Matches an exact phrase within the title of a posted article (Automattic). **Applies To:** - Posted articles (Automattic): article title - Comment (Automattic): article title - Like (Automattic): article title
Gnip Rule Match No Match
post_title:"Big Data" Making sense of big data within your company  

See Examples Sources | IntenseDebate | Wordpress
keyword Matches a keyword within the body of an activity. This is a tokenized match, meaning that your keyword string will be matched against the tokenized text of the activity body -- tokenization is based on punctuation, symbol, and separator Unicode basic plane characters. For example, an activity with the text "I like coca-cola" would be split into the following tokens: I, like, coca, cola. These tokens would then be compared to the keyword string used in your rule. To match strings containing punctuation (e.g. coca-cola), symbol, or separator characters, you must use a quoted exact match as described below.
Gnip Rule Match No Match
gnip I need to call gnip
Check out gnip's documentation.
I love the @gnip blog.
Check out Gnip.
#gniprocks
cola Ice cold cola on a hot day
I like coca-cola!
I like cocacola!
snow please let it snow!

twitter_entities.urls.display_url: https://en.wikipedia.org/wiki/Snow

gnip.urls.expanded_url: http://www.snowdays.com/2015/01/how-to-get-more-snow-days/
it is finally snowing!
Coachella Hanging out at #coachella NEW.PICS.FROM.COACHELLA2015!

See Examples Sources | Disqus | IntenseDebate | Twitter | Wordpress
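The tokenized keyword match described above can be approximated in Python by splitting text on punctuation, symbol, and separator characters. This is only a sketch: the function names are hypothetical, the use of Unicode general categories P, S, and Z as the delimiter set and the case-insensitive comparison are assumptions, and the sketch looks at body text only, not at the URL fields shown in the example table.

```python
import unicodedata


def tokenize(text: str):
    """Split text into tokens at whitespace, punctuation, symbols, and separators."""
    tokens, current = [], []
    for ch in text:
        # Categories starting with P (punctuation), S (symbol), or Z (separator)
        # end the current token; so does any whitespace character.
        if ch.isspace() or unicodedata.category(ch)[0] in "PSZ":
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def keyword_matches(keyword: str, text: str) -> bool:
    """True when the keyword equals one of the text's tokens, ignoring case."""
    wanted = keyword.casefold()
    return any(token.casefold() == wanted for token in tokenize(text))
```

Under these assumptions, tokenize("I like coca-cola") gives the tokens I, like, coca, cola, so the keyword cola matches "I like coca-cola!" but not "I like cocacola!".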
thread_url_contains: Matches activities posted to a web page whose URL contains the given phrase or keyword. URL encodings are not encoded at this time. To search for patterns with punctuation in them (e.g. google.com), enclose the search term in quotes.
Gnip Rule Match
thread_url_contains:"cnn.com" All activities posted to cnn.com
thread_url_contains:"obama" All activities posted to a web page with "obama" in the URL

See Examples Sources | Disqus
profile_region_contains: Matches on the "region" field from the "address" object in the Profile Geo enrichment. This is a substring match for activities that have the given substring in this field, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
Gnip Rule Match
profile_region_contains:carolina All Profile Geo Enrichments in North or South Carolina (or other regions of the world containing the string "Carolina")

See Examples Sources | Twitter
has:profile_geo_locality Matches all activities that have a profileLocations.address.locality value present in the payload.
Gnip Rule Match
profile_country_code:us has:profile_geo_locality All Tweets with Profile Geo locations in the US that include city-level detail.

See Examples Sources | Twitter
country_code: Matches tweets where the country code associated with a tagged [place/location](https://dev.twitter.com/overview/api/places) matches the given ISO alpha-2 character code. Valid ISO codes can be found here: [http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2](http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
Gnip Rule Match No Match
country_code:us Tweets with the United States place/location country code
location.twitter_country_code:US
 
country_code:GB Tweets with the Great Britain place/location country code
location.twitter_country_code:GB
 
country_code:UK No matches, UK is not an ISO alpha-2 country code  
country_code:USA No matches, USA is not an ISO alpha-2 country code  

See Examples Sources | Twitter

Restrictions 

  1. Stop words are not allowed as stand-alone terms in queries. If you need to find a phrase that contains a stop word, either pair it with an additional term, or use the exact match operators such as “on the roof”. As long as there is at least one required and allowed term in the rule, it will be allowed. Please note that this list of stop words is subject to change, but the current stop words we use are: "a", "an", "and", "at", "but", "by", "com", "from", "http", "https", "if", "in", "is", "it", "its", "me", "my", "or", "rt", "the", "this", "to", "too", "via", "we", "www", "you"

  2. Rules cannot consist of only negated terms/operators. For example, ‘-cat -dog’ is not valid.

  3. Realtime and Historical PowerTrack (as well as Replay) support two forms of rules:
    • ‘Standard’ rules:
      • The entire string for a rule may be no more than 1024 characters, including all operators and spaces, with no single term exceeding 128 characters.
      • Rules may contain no more than 30 positive clauses (things you want to match or filter on). If you exceed this limit when trying to create a rule, you will receive a 422 error, with a message indicating that you have exceeded one of the clause limits.
      • Rules may contain no more than 50 negative clauses. If you exceed this limit when trying to create a rule, you will receive a 422 error, with a message indicating that you have exceeded one of the clause limits.
    • ‘Long’ rules can be up to 2,048 characters long, with no single term exceeding 128 characters, and with no limits on the number of positive and negative clauses.


           Reach out to your account representative to switch the form of rules your stream uses.

  4. Negated ORs are not supported, for example: apple OR -lang:en

  5. Each rule may only have 1 tag. However, the tag is simply treated as a string, and may contain up to 255 characters, including - ; and other punctuation.

  6. Geo rules with a radius greater than 25mi are not supported. Geo rules with a bounding box that has any edge longer than 25mi are not supported.

  7. A rule keyword or input can start with either a digit (0-9) or any non-punctuation character. Current punctuation characters are defined as the ASCII characters below. Any keyword or input that needs to start with or contains punctuation must be “quoted”. A keyword cannot contain a colon or parentheses unless you quote it.
! % & \ ' ( ) * + - . / ; < = > ? \\ , : # @ \t \r \n " [] _
and the Unicode ranges:
U+007B -- U+00BF
U+02B0 -- U+037F
U+2000 -- U+2BFF
U+FF00 -- U+FF03
U+FF05 -- U+FF0F
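Several of the restrictions above can be checked on the client side before a rule is submitted. The Python sketch below is one possible pre-flight check, not the validation PowerTrack itself performs: the function names are hypothetical, clauses are split with a simple regular expression that treats quoted phrases and bracketed arguments as single clauses, and a negated clause on either side of an OR is flagged as a negated OR (an assumption that generalizes the single example given).

```python
import re

STOP_WORDS = {
    "a", "an", "and", "at", "but", "by", "com", "from", "http", "https", "if",
    "in", "is", "it", "its", "me", "my", "or", "rt", "the", "this", "to",
    "too", "via", "we", "www", "you",
}

# (max characters, max positive clauses, max negative clauses) per rule form.
LIMITS = {"standard": (1024, 30, 50), "long": (2048, None, None)}

# A clause is an optional '-', then either a quoted phrase or bracketed
# argument (with an optional operator prefix) or a bare run of characters.
CLAUSE_PATTERN = re.compile(r'-?(?:[^\s()"\[]*(?:"[^"]*"|\[[^\]]*\])|[^\s()"]+)')


def split_clauses(rule: str):
    """Split a rule into clauses, keeping the OR operator as its own item."""
    return CLAUSE_PATTERN.findall(rule)


def validate_rule(rule: str, form: str = "standard"):
    """Return a list of problems found; an empty list means none were detected."""
    max_chars, max_positive, max_negative = LIMITS[form]
    errors = []
    if len(rule) > max_chars:
        errors.append(f"rule is longer than {max_chars} characters")
    tokens = split_clauses(rule)
    clauses = [token for token in tokens if token != "OR"]
    positive = [c for c in clauses if not c.startswith("-")]
    negative = [c for c in clauses if c.startswith("-")]
    if any(len(c.lstrip("-")) > 128 for c in clauses):
        errors.append("a single term is longer than 128 characters")
    if max_positive is not None and len(positive) > max_positive:
        errors.append(f"more than {max_positive} positive clauses")
    if max_negative is not None and len(negative) > max_negative:
        errors.append(f"more than {max_negative} negative clauses")
    # Covers both "only negated terms" and "only stand-alone stop words".
    if not any(c.lower() not in STOP_WORDS for c in positive):
        errors.append("rule needs at least one positive clause that is not a stop word")
    for index, token in enumerate(tokens):
        neighbours = tokens[max(index - 1, 0):index] + tokens[index + 1:index + 2]
        if token == "OR" and any(n.startswith("-") for n in neighbours):
            errors.append("negated ORs are not supported")
            break
    return errors
```

For example, validate_rule("happy party") returns an empty list, while validate_rule("-cat -dog") and validate_rule("apple OR -lang:en") each report one problem. A rule that passes this check can still be rejected by the service, for instance for the punctuation and geo limits that the sketch does not cover.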