Search API Reference


Methods 

Method                         Description
POST /search/:stream           Retrieve recent Tweets matching the specified PowerTrack rule.
POST /search/:stream/counts    Retrieve the number of recent Tweets matching the specified PowerTrack rule.

Authentication 

All requests to the Search API must use HTTP Basic Authentication, constructed from a valid email address and password combination used to log into your account at console.gnip.com. Credentials must be passed as the Authorization header for each request.
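
As an illustration of the paragraph above, the following minimal sketch builds the Authorization header by hand using only the Python standard library. The function name basic_auth_header is hypothetical, and most HTTP clients can construct this header for you.

```python
import base64


def basic_auth_header(email: str, password: str) -> dict:
    """Return an Authorization header for HTTP Basic Authentication."""
    # Basic credentials are the base64 encoding of "user:password".
    raw = f"{email}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The returned dictionary can be merged into the headers of every request, since credentials must accompany each one.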


POST /search/:stream 

Retrieves Tweets which match the specified query from within the last 30 days.

Note: This functionality can also be accomplished using a GET request, instead of a POST, by encoding the parameters described below into the URL.

Request Parameters 

query The equivalent of one Gnip PowerTrack rule, with up to 30 positive clauses, 50 negations and 1024 characters. Supported PowerTrack operator matches behave as they do in other PowerTrack based products.

This parameter should include ALL portions of the PowerTrack rule, including all operators, and portions of the rule should not be separated into other parameters of the query.

NOTE: Not all PowerTrack operators are supported. Exception details are outlined below.
publisher The publisher’s data on which the query will be executed. Twitter is the only supported publisher at this time.
fromDate Optional. The oldest UTC timestamp from which the activities will be provided. The timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute). Using only fromDate with no toDate parameter delivers the most recent results for a query, going back no further than the fromDate. If neither the fromDate nor the toDate parameter is used, the API delivers all results (up to the maximum) for the entire 30-day index, starting at the time of the request and going backwards.
toDate Optional. The latest UTC timestamp to which the activities will be provided. The timestamp is in minute granularity and is not inclusive (i.e. 11:59 does not include the 59th minute of the hour). Using only toDate with no fromDate parameter delivers the most recent results older than the specified toDate. If neither the fromDate nor the toDate parameter is used, the API delivers all results (up to the maximum) for the entire 30-day index, starting at the time of the request and going backwards.
maxResults Optional. The maximum number of search results to be returned by a request. A number between 10 and the system limit (currently 500). By default, a request response will return 100 results.
next Optional. This parameter is used to get the next "page" of results, as described in the Pagination section below. The value used with the parameter is pulled directly from the response provided by the Search API, and should not be modified.
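
The parameters above can be assembled into a JSON request body roughly as follows. This is a sketch under stated assumptions: the helper names to_timestamp and build_search_body are hypothetical, the client-side checks simply mirror the limits listed in this section, and maxResults is sent as a string because the curl example in this reference does so.

```python
import json
from datetime import datetime, timezone


def to_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC yyyymmddhhmm string."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M")


def build_search_body(query, from_date=None, to_date=None,
                      max_results=None, next_token=None) -> str:
    """Return the JSON body for a search request."""
    if len(query) > 1024:
        raise ValueError("query is limited to 1024 characters")
    if max_results is not None and not 10 <= max_results <= 500:
        raise ValueError("maxResults must be between 10 and 500")
    # The whole PowerTrack rule goes in "query"; nothing is split out.
    body = {"publisher": "twitter", "query": query}
    if from_date is not None:
        body["fromDate"] = to_timestamp(from_date)
    if to_date is not None:
        body["toDate"] = to_timestamp(to_date)
    if max_results is not None:
        # Assumption: sent as a string, as in the curl example.
        body["maxResults"] = str(max_results)
    if next_token is not None:
        body["next"] = next_token  # passed through unmodified
    return json.dumps(body)
```

Omitted optional parameters are left out of the body entirely, so the API defaults described above still apply.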


Additional Details 

Available Timeframe Last 30 days
Query Format The equivalent of one Gnip PowerTrack rule, with up to 30 positive clauses, 50 negations and 1024 characters. Supported PowerTrack operators behave as they do in other PowerTrack based products. Not all PowerTrack operators are currently supported. Exception details are outlined below.
Rate Limit 8 requests per 4 seconds, aggregated across all requests for data or /counts.
Compliance All data delivered via the Search API is compliant at the time of delivery.
Available Enrichments Language Classification, URL Expansion, Klout, Profile Geo
Realtime Availability Data is available in the index within 30 seconds of generation on the Twitter Platform.
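
To stay under the rate limit listed above, a client can throttle itself before sending. The sketch below is one possible approach, a sliding-window limiter; the class name SlidingWindowLimiter and its injectable clock and sleep arguments are hypothetical and exist only to make the example easy to test. It assumes all data and /counts requests from your app pass through the same limiter instance.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span."""

    def __init__(self, max_requests=8, window_seconds=4.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent requests, oldest first

    def acquire(self):
        """Block until one more request fits inside the window."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Wait until the oldest recorded request leaves the window.
            self._sleep(self._window - (now - self._sent[0]))
```

Call acquire() immediately before each request. A server-side 429 response is still possible if other processes share the same account, so the limiter complements error handling rather than replacing it.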

Pagination

When making a Search API query, there may be more matching Tweets than can be returned for a single request. Specifically, if your query matches more Tweets than the “maxResults” parameter used in the request (or more than 100 results if no “maxResults” parameter is used), the response for that request will include a “next” element. The “next” value can be used in a subsequent request to retrieve the next portion of the matching Tweets for that query (i.e. the next “page”), and it will continue to be returned in subsequent queries until you have reached the last “page” of results for that query.

The value of the “next” element will change with each query and should be treated as an opaque string. The “next” element will look like the following in the response body:

 {
      "next":"dPLQKDqRdqGIqW5FaJpJrRPxcGFerysVh9HK3KWNAN0xs+ZoTo91WBiWV+7qggUriayry6qWCD/e4ZcSs5LhQrOMDO03L75KhGQaDWLenldBwKDZl9V1nIqlQYhkHeQWOQYGBgI4Qx0iWAQj3evzGH6SSVEcmswyZKeEQk2siHQG2Q1q1Xbry1eTFPEsudYe5ocglboTEXBzbFaAkrB3/OD7L6OXJKJ98opuKyL2fmvAxgNUQA+QcJg3qvQOxoltuttNS409ZBBJqLxRwwI18V/wWyrorkLtI599uUmCVRzH89T56vRc3bkGxz02sJL3KOiUDWGe2zVLLyiZhBfxZiBdNKFMceSSpH2/JBBMptqs/8pOwNxCtsrbpID5UIC",
      "results":
      [
           {--Tweet 1--},
           {--Tweet 2--},
           ...
           {--Tweet 500--}
      ]
 }

To request the next “page” of data, you must make the exact same query as the original, including toDate and fromDate if used, but pass in the “next” value from the previous response with the request. This can be utilized with either a GET or POST request. However, the “next” parameter must be URL encoded in the case of a GET request.

The response to a subsequent request might look like the following (note the new Tweets and different “next” value):

 {
      "next":"dPLQKDqRdqFYFqMfz7Xo0Vyzx6jBaN3z/sR2hCDbpBFR6eLXGwiRF1cQ/gqwZNRjn2+4dmzcLggADcKv6wBS5ijLRszPlYzQHdHA4N2QufmSeG4wL249CJwc5r6y7GxKnV1GD+W449UF7TSPqYeCzhlYpJHyylmYZIP364YfODiEf+pd459ZLy15VLzmnKF22W/IGaQ5q7FptgizjDKRmbhXuBqemMv7qFiqM/rlh0O5E2yN0s0aJ0xxrJ/DxjFMrLH1/S3wWg3vbd0TT/gaMxx/Mmrxy48gLK/19yQHpWyTSHB91YJuqn+g/Me+Haez1HHW02tJQqgYjB5PFkjIoUSINTZsvVwLbvimbNLOe2MILpxDnPxvubnxcL0agHhx",
      "results":
      [
           {--Tweet 501--},
           {--Tweet 502--},
           ...
           {--Tweet 1000--}
      ]
 }

You can continue to pass in the “next” element from your previous query until you have received all Tweets from the time period covered by your query. When you receive a response that does not include a “next” element, it means that you have reached the last page and no additional data is available in your time range.

Additional Notes

  • When using a fromDate or toDate in a search request, you will only get results from within your time range, and the “next” parameter will only correspond to the time range defined. When you reach the last group of results within your time range, you will not receive a next element.
  • The “next” element can be used with any maxResults value (above the minimum 10) or the default value of 100. The maxResults determines how many Tweets are returned in each response, but does not prevent you from eventually getting all results.
  • The “next” element works exactly the same for high-volume queries, allowing you to page through all of the results within a given minute.
  • The “next” element does not expire. Multiple requests using the same “next” value will receive the same results, regardless of when the request is made. However, note that Search operates within a rolling 30-day window, so if the data a “next” element was pointing to has rolled off the window, the “next” element will not return any results.
  • When paging through results using the “next” parameter, you may encounter duplicates at the edges of the query. Your app should be tolerant of these.
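
The paging behavior described in this section can be sketched as a simple loop. This is an illustration only: fetch_page is a hypothetical callable that sends one request with the given parameters and returns the parsed JSON response, and de-duplicating on the Tweet "id" field is one possible way to tolerate duplicates at page edges.

```python
def iter_all_tweets(fetch_page, request):
    """Yield every Tweet for `request`, following "next" tokens."""
    seen_ids = set()
    params = dict(request)
    while True:
        page = fetch_page(params)
        for tweet in page.get("results", []):
            tweet_id = tweet.get("id")
            if tweet_id is not None and tweet_id in seen_ids:
                continue  # tolerate duplicates at page edges
            seen_ids.add(tweet_id)
            yield tweet
        token = page.get("next")
        if token is None:
            return  # no "next" element: this was the last page
        # Repeat the exact same query, adding only the opaque token.
        params = dict(request, next=token)
```

Each iteration reuses the original parameters unchanged, including any fromDate and toDate, and only swaps in the latest “next” value.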

Responses

The following responses may be returned by the API for these requests. Most error responses include a string with additional details in the body. For non-200 responses, clients should retry the request after applying any necessary modifications.

Successful requests will include the following elements:

"next" Optional. Contains the pagination token described above, where applicable.
"results" A comma-delimited JSON array of Tweets, using the standard data format described here.

Response Details

Status Text Description
200 OK The request was successful. The JSON response will be similar to the following:
            {
               "next":"dPLQKDqRdqFYFqMfz7Xo0Vyzx6jBaN3z/sR2hCDbpBFR6eLXGwiRF1cQ/gqwZNRjn2+4dmzcLggADcKv6wBS5ijLRszPlYzQHdHA4N2QufmSeG4wL249CJwc5r6y7GxKnV1GD+W449UF7TSPqYeCzhlYpJHyylmYZIP364YfODiEf+pd459ZLy15VLzmnKF22W/IGaQ5q7FptgizjDKRmbhXuBqemMv7qFiqM/rlh0O5E2yN0s0aJ0xxrJ/DxjFMrLH1/S3wWg3vbd0TT/gaMxx/Mmrxy48gLK/19yQHpWyTSHB91YJuqn+g/Me+Haez1HHW02tJQqgYjB5PFkjIoUSINTZsvVwLbvimbNLOe2MILpxDnPxvubnxcL0agHhx",
               "results":
               [
            	    {"id":"tag:search.twitter.com,2005:340572103852048385","objectType":"activity","actor":{"objectType":"person","id":"id:twitter.com:369141964","link":"http://www.twitter.com/muratcoskunbu","displayName":"king size","postedTime":"2011-09-06T21:01:54.000Z","image":"http://a0.twimg.com/profile_images/3700775684/f17ff2dd78e982bfd8e339118f42d0e2_normal.jpeg","summary":"yaz geçer yenisi gelir..","links":[{"href":null,"rel":"me"}],"friendsCount":955,"followersCount":1069,"listedCount":0,"statusesCount":2983,"twitterTimeZone":"Quito","verified":false,"utcOffset":"-18000","preferredUsername":"muratcoskunbu","languages":["tr"],"favoritesCount":663},"verb":"post","postedTime":"2013-05-31T20:54:51.000Z","generator":{"displayName":"Twitter for iPhone","link":"http://twitter.com/download/iphone"},"provider":{"objectType":"service","displayName":"Twitter","link":"http://www.twitter.com"},"link":"http://twitter.com/muratcoskunbu/statuses/340572103852048385","body":"1915 te Çanakkaleyi geçemeyen güçler hepimizin eline teitter yüklü  iphone verip bu işler olacak mı sandın,,yanlış sandın!","object":{"objectType":"note","id":"object:search.twitter.com,2005:340572103852048385","summary":"1915 te Çanakkaleyi geçemeyen güçler hepimizin eline teitter yüklü  iphone verip bu işler olacak mı sandın,,yanlış sandın!","link":"http://twitter.com/muratcoskunbu/statuses/340572103852048385","postedTime":"2013-05-31T20:54:51.000Z"},"favoritesCount":0,"twitter_entities":{"hashtags":[],"symbols":[],"urls":[],"user_mentions":[]},"twitter_filter_level":"medium","retweetCount":0,"gnip":{"language":{"value":"en"}}},
                	{"id":"tag:search.twitter.com,2005:340572102514061312","objecttype"....
               ]
            }
            
400 Bad Request Generally, this response occurs due to the presence of invalid JSON in the request, or where the request failed to send any JSON payload. The corresponding JSON message will look similar to the following:

{"error":{"message":"Invalid JSON. The body must be in the format {\"query\":\"cat OR kitten\",\"maxResults\":\"50\"}","sent":"2014-07-16T17:30:12+00:00"}}
            
401 Unauthorized HTTP authentication failed due to invalid credentials. Log in to console.gnip.com with your credentials to ensure you are using them correctly with your request. The corresponding JSON message will look similar to the following:

{"error":{"message":"Unauthorized: Couldn't authenticate user/password.","sent":"2014-07-16T17:31:38+00:00"}}
			
404 Not Found The resource was not found at the URL to which the request was sent, likely because an incorrect URL was used.
422 Unprocessable Entity This is returned due to invalid parameters in the query -- e.g. invalid PowerTrack rules. The corresponding JSON message will look similar to one of the following:

{"error":{"message":"Could not accept your search request: no viable alternative at character '&' (at position 1)","sent":"2014-07-16T17:29:37+00:00"}}

{"error":{"message":"Could not accept your search request: maxResults parameter can only be between 10 and 500.","sent":"2014-07-16T17:26:43+00:00"}}

{"error":{"message":"Could not accept your search request: Couldn't validate search stream.","sent":"2014-07-16T17:38:48+00:00"}}
            
429 Unknown Code Your app has exceeded the limit on connection requests. The corresponding JSON message will look similar to the following:

{"error":{"message":"Rate limit exceeded","sent":"2014-07-16T18:07:27+00:00"}}
			
502 Proxy Error There was an error on Gnip's side. Retry your request using an exponential backoff pattern.
503 Service Unavailable There was an error on Gnip's side. Retry your request using an exponential backoff pattern.
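
The retry guidance for 502 and 503 responses could be implemented along these lines. This is a sketch, not a prescribed client: send is a hypothetical callable that performs one request and returns a (status, body) pair, and the attempt count and base delay are arbitrary assumptions. Other error codes are raised immediately so that the request can be corrected first.

```python
import json
import time

RETRYABLE = {502, 503}  # the codes this reference says to retry with backoff


def error_message(body: str) -> str:
    """Return the "message" from an error body, or the raw body if absent."""
    try:
        return json.loads(body)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        return body


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns 200, backing off exponentially."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return json.loads(body)
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"{status}: {error_message(body)}")
        # Delay doubles on each retry: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
```

Adding random jitter to each delay is a common refinement when many clients retry at once.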


Example POST Request 

  • Request parameters in a POST request are sent via a JSON-formatted body, as shown below.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.
curl -X POST -u<username> "https://search.gnip.com/accounts/{ACCOUNT_NAME}/search/{LABEL}.json" -d '{"publisher":"twitter","query":"gnip","maxResults":"200","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>","next":"dPLQKDqRdqFYFqMfz7Xo0Vyzx6jBaN3z/sR2hCDbpBFR6eLXGwiRF1cQ/gqwZNRjn2+4dmzcLggADcKv6wBS5ijLRszPlYzQHdHA4N2QufmSeG4wL249CJwc5r6y7GxKnV1GD+W449UF7TSPqYeCzhlYpJHyylmYZIP364YfODiEf+pd459ZLy15VLzmnKF22W/IGaQ5q7FptgizjDKRmbhXuBqemMv7qFiqM/rlh0O5E2yN0s0aJ0xxrJ/DxjFMrLH1/S3wWg3vbd0TT/gaMxx/Mmrxy48gLK/19yQHpWyTSHB91YJuqn+g/Me+Haez1HHW02tJQqgYjB5PFkjIoUSINTZsvVwLbvimbNLOe2MILpxDnPxvubnxcL0agHhx"}'

Example GET Request 

  • Request parameters in a GET request are encoded into the URL, using standard URL encoding.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.
curl -u<username> "https://search.gnip.com/accounts/{ACCOUNT_NAME}/search/{LABEL}.json?publisher=twitter&query=cat&maxResults=500&fromDate=<yyyymmddhhmm>&toDate=<yyyymmddhhmm>&next=dPLQKDqRdqFYFqMfz7Xo0Vyzx6jBaN3z%2FsR2hCDbpBFR6eLXGwiRF1cQ%2FgqwZNRjn2%2B4dmzcLggADcKv6wBS5ijLRszPlYzQHdHA4N2QufmSeG4wL249CJwc5r6y7GxKnV1GD%2BW449UF7TSPqYeCzhlYpJHyylmYZIP364YfODiEf%2Bpd459ZLy15VLzmnKF22W%2FIGaQ5q7FptgizjDKRmbhXuBqemMv7qFiqM%2Frlh0O5E2yN0s0aJ0xxrJ%2FDxjFMrLH1%2FS3wWg3vbd0TT%2FgaMxx%2FMmrxy48gLK%2F19yQHpWyTSHB91YJuqn%2Bg%2FMe%2BHaez1HHW02tJQqgYjB5PFkjIoUSINTZsvVwLbvimbNLOe2MILpxDnPxvubnxcL0agHhx"
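
Because “next” tokens and PowerTrack rules often contain characters such as "/", "+" and spaces, it is safer to let a library do the URL encoding for a GET request. The sketch below uses Python's urllib.parse; the function name build_get_url is hypothetical, and the host and path pattern are taken from the curl examples in this reference.

```python
from urllib.parse import quote, urlencode


def build_get_url(account_name: str, label: str, params: dict) -> str:
    """Return a GET URL with every parameter value percent-encoded."""
    base = (
        "https://search.gnip.com/accounts/"
        f"{quote(account_name, safe='')}/search/{quote(label, safe='')}.json"
    )
    # quote (rather than the default quote_plus) with safe="" encodes
    # "/" as %2F, "+" as %2B and a space as %20.
    return base + "?" + urlencode(params, quote_via=quote, safe="")
```

The same function works for the first request and for follow-up pages: include the “next” value in params exactly as it appeared in the previous response and let the encoder handle it.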



POST /search/:stream/counts 

Returns counts representing the number of Tweets that match the specified query during the requested timeframe.

Note: This functionality can also be accomplished using a GET request, instead of a POST, by encoding the parameters described below into the URL.

Parameters/Details 

query The equivalent of one Gnip PowerTrack rule, with up to 30 positive clauses, 50 negations and 1024 characters. Supported PowerTrack operators behave as they do in other PowerTrack based products.

This parameter should include ALL portions of the PowerTrack rule, including all operators, and portions of the rule should not be separated into other parameters of the query.

NOTE: Not all PowerTrack operators are currently supported. Exception details are outlined here.
publisher The publisher’s data on which the query will be executed. Twitter is the only supported publisher at this time.
fromDate Optional. The oldest UTC timestamp from which the activity counts will be provided. Timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute).
toDate Optional. The latest UTC timestamp to which the activity counts will be provided. Timestamp is in minute granularity and is not inclusive (i.e. 11:59 does not include the 59th minute of the hour).
bucket Optional. The unit of time for which count data will be provided. Count data can be returned for every day, hour or minute in the requested timeframe. By default, hourly counts will be provided. Options: "day", "hour", "minute"
Count Precision The counts delivered through this endpoint reflect the number of activities that occurred and do not reflect any later compliance events (deletions, scrub geos). Some activities counted may not be available via the data endpoint due to user compliance actions.

Example POST Request 

Request parameters in a POST request are sent via a JSON-formatted body, as shown below.

curl -X POST -u<username> "https://search.gnip.com/accounts/{ACCOUNT_NAME}/search/{LABEL}/counts.json" -d '{"publisher":"twitter", "query":"gnip","bucket":"hour","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>"}'

Example GET Request 

Request parameters in a GET request are encoded into the URL, using standard URL encoding.

curl -u<username> "https://search.gnip.com/accounts/{ACCOUNT_NAME}/search/{LABEL}/counts.json?publisher=twitter&query=gnip&bucket=hour&fromDate=<yyyymmddhhmm>&toDate=<yyyymmddhhmm>"

Responses

The following responses may be returned by the API for these requests. Most error responses include a string with additional details in the body. For non-200 responses, clients should retry the request after applying any necessary modifications.

Successful requests will include the following elements:

"timePeriod" A timestamp representing the start of a time “bucket”, with each bucket representing a day, hour, or minute, as specified in the API request
"count" The count of activities for the associated time “bucket”

Response Details

Status Text Description
200 OK The request was successful. The JSON response will be similar to the following:
{
  "results":[
	{"timePeriod":"201305300000","count":17167},
	{"timePeriod":"201305300100","count":18183},
	{"timePeriod":"201305300200","count":17972},
	{"timePeriod":"201305300300","count":18539},
	{"timePeriod":"201305300400","count":18298},
	{"timePeriod":"201305300500","count":15705},
	{"timePeriod":"201305300600","count":14750},
	{"timePeriod":"201305300700","count":13839},
	{"timePeriod":"201305300800","count":14022},
	{"timePeriod":"201305300900","count":14649},
	{"timePeriod":"201305301000","count":15106},
	{"timePeriod":"201305301100","count":17638},
	{"timePeriod":"201305301200","count":19435},
	{"timePeriod":"201305301300","count":20494},
	{"timePeriod":"201305301400","count":21149},
	{"timePeriod":"201305301500","count":21716},
	{"timePeriod":"201305301600","count":23719},
	{"timePeriod":"201305301700","count":23499},
	{"timePeriod":"201305301800","count":21705},
	{"timePeriod":"201305301900","count":22159},
	{"timePeriod":"201305302000","count":21902},
	{"timePeriod":"201305302100","count":20255},
	{"timePeriod":"201305302200","count":19021},
	{"timePeriod":"201305302300","count":17821}
  ]
}
            
400 Bad Request Generally, this response occurs due to the presence of invalid JSON in the request, or where the request failed to send any JSON payload. The corresponding JSON message will look similar to the following:

{"error":{"message":"Invalid JSON. The body must be in the format {\"query\":\"cat OR kitten\",\"bucket\":\"hour\"}","sent":"2014-07-21T17:33:29+00:00"}}
            
401 Unauthorized HTTP authentication failed due to invalid credentials. Log in to console.gnip.com with your credentials to ensure you are using them correctly with your request. The corresponding JSON message will look similar to the following:

{"error":{"message":"Unauthorized: Couldn't authenticate user/password.","sent":"2014-07-16T17:31:38+00:00"}}
			
404 Not Found The resource was not found at the URL to which the request was sent, likely because an incorrect URL was used.
422 Unprocessable Entity This is returned due to invalid parameters in the query -- e.g. invalid PowerTrack rules. The corresponding JSON message will look similar to the following:

{"error":{"message":"Could not accept your count request: no viable alternative at character '&' (at position 1)","sent":"2014-07-21T17:36:51+00:00"}}
            
429 Unknown Code Your app has exceeded the limit on connection requests. The corresponding JSON message will look similar to the following:

{"error":{"message":"Rate limit exceeded","sent":"2014-07-16T18:07:27+00:00"}}
			
502 Proxy Error There was an error on Gnip's side. Retry your request using an exponential backoff pattern.
503 Service Unavailable There was an error on Gnip's side. Retry your request using an exponential backoff pattern.

Note: For Counts requests only, this error may occur for very high-volume queries (e.g. "twitter") that attempt to return counts for the full 30-day time period of the index. In this scenario, your app should split the request into multiple requests covering smaller time segments before retrying. This scenario applies ONLY to Counts requests, not to requests that return Tweets.
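
One way to break a large Counts request into smaller time segments is sketched below. The names split_range and total_count are hypothetical, and fetch_counts stands in for whatever function your app uses to send one Counts request and return the parsed JSON response. The datetimes passed in are assumed to already be in UTC.

```python
from datetime import datetime, timedelta

STAMP = "%Y%m%d%H%M"  # yyyymmddhhmm, as in the fromDate/toDate examples


def split_range(start: datetime, end: datetime, step: timedelta):
    """Yield (fromDate, toDate) strings covering [start, end) in chunks."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + step, end)
        # fromDate is inclusive and toDate is exclusive, so adjacent
        # chunks can share a boundary without counting a minute twice.
        yield cursor.strftime(STAMP), chunk_end.strftime(STAMP)
        cursor = chunk_end


def total_count(fetch_counts, start, end, step):
    """Sum "count" over every bucket returned for each chunk."""
    total = 0
    for from_date, to_date in split_range(start, end, step):
        response = fetch_counts(from_date, to_date)
        total += sum(bucket["count"] for bucket in response["results"])
    return total
```

For example, a 30-day request could be issued as thirty one-day requests by passing step=timedelta(days=1); a smaller step may be needed for the highest-volume queries.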