Methods 

Each method is listed with its description:

  • POST /search/fullarchive/accounts/:account_name/:label
    Retrieve Tweets matching the specified PowerTrack rule.
  • POST /search/fullarchive/accounts/:account_name/:label/counts
    Retrieve the number of Tweets matching the specified PowerTrack rule.

Where:

  • :account_name is the (case-sensitive) name associated with your account, as displayed at console.gnip.com
  • :label is the (case-sensitive) label associated with your Search endpoint, as displayed at console.gnip.com

Your complete Full-Archive Search API endpoint is displayed at https://console.gnip.com.


Authentication 

All requests to the Full-Archive Search API must use HTTP Basic Authentication, constructed from a valid email address and password combination used to log into your account at https://console.gnip.com. Credentials must be passed as the Authorization header for each request.
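As an illustration of what passing credentials "as the Authorization header" involves, here is a minimal Python sketch of how a Basic Authentication header is typically built. The function name `basic_auth_header` and the sample credentials are hypothetical; most HTTP clients (including cURL's -u option) build this header for you.

```python
import base64

def basic_auth_header(email: str, password: str) -> dict:
    """Build the Authorization header for HTTP Basic Authentication."""
    # Basic auth is the Base64 encoding of "user:password".
    raw = f"{email}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}

# Hypothetical credentials, for illustration only.
headers = basic_auth_header("user@example.com", "s3cret")
print(headers["Authorization"].split()[0])
```

Because Base64 is an encoding and not encryption, the header should only be sent over HTTPS.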


Request/Response Behavior 

Using the fromDate and toDate parameters, you can request any time period back to the very first Tweet (March 21, 2006). However, a single response will be limited to the lesser of your specified maxResults or 31 days. If the available data or your time range exceeds your specified maxResults or 31 days, you will receive a ‘next’ token, which you should use to paginate through the remainder of your specified time range.

For example, suppose you want all of the data or counts from Jan 1, 2012 to June 1, 2012. You can specify that full five-month time period in your request, but the API will respond with the data for May only and provide a ‘next’ token to pull the data for the preceding 31 days, and so on, until you’ve received all of the data back to Jan 1, 2012.
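The arithmetic behind this example can be sketched in Python. This is only an estimate of the minimum number of responses, assuming each response covers exactly 31 days counted back from the end of the range; the real page boundaries are decided by the API, and data requests can be cut shorter by maxResults. The function name `expected_windows` is hypothetical.

```python
from datetime import datetime, timedelta

def expected_windows(from_date, to_date, max_days=31):
    """Split [from_date, to_date) into windows of at most max_days, newest first."""
    windows = []
    end = to_date
    while end > from_date:
        # Step back 31 days, but never past the requested fromDate.
        start = max(from_date, end - timedelta(days=max_days))
        windows.append((start, end))
        end = start
    return windows

windows = expected_windows(datetime(2012, 1, 1), datetime(2012, 6, 1))
for start, end in windows:
    print(start.strftime("%Y%m%d%H%M"), "->", end.strftime("%Y%m%d%H%M"))
```

Under these assumptions the five-month range needs at least five responses, the first covering May 1 to June 1.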


Pagination

When making both data and count requests it is likely that there is more data than can be returned in a single response. When that is the case the response will include a ‘next’ token. The ‘next’ token is provided as a root-level JSON attribute. Whenever a ‘next’ token is provided, there is additional data to retrieve so you will need to keep making API requests.

Note: The ‘next’ token behavior differs slightly for data and counts requests, and both are described below with example responses provided in the API Reference section.

Data Pagination

Requests for data will likely generate more data than can be returned in a single response. Each data request includes a parameter that sets the maximum number of activities to return per request. The “maxResults” parameter defaults to 100 and can be set to any value from 10 to 500. If your query matches more Tweets than the “maxResults” value used in the request, the response will include a “next” token (as a root-level JSON attribute). This “next” token can be used in a subsequent request to retrieve the next portion of the matching Tweets for that query (i.e. the next “page”). “next” tokens will continue to be provided until you have reached the last “page” of results for that query, at which point no “next” token is provided.

To request the next “page” of data, you must make the exact same query as the original, including toDate and fromDate, if used, and also include a “next” request parameter set to the value from the previous response. This can be utilized with either a GET or POST request. However, the “next” parameter must be URL encoded in the case of a GET request.
You can continue to pass in the “next” element from your previous query until you have received all Tweets from the time period covered by your query. When you receive a response that does not include a “next” element, it means that you have reached the last page and no additional data is available for the specified query and time range.
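The paging procedure described above can be sketched as a small Python loop. This is a minimal sketch, not a client library: `send` is a hypothetical callable that you supply, which posts the request body to your endpoint and returns the parsed JSON response as a dict.

```python
from typing import Callable, Iterator

def iter_pages(send: Callable[[dict], dict], params: dict) -> Iterator[dict]:
    """Yield each response page, following "next" tokens until none is returned."""
    request = dict(params)  # original query, fromDate, toDate, maxResults
    while True:
        response = send(request)
        yield response
        token = response.get("next")
        if token is None:
            break  # last page: the response carries no "next" token
        # Same request as the original, plus the token from the previous response.
        request = {**params, "next": token}
```

Keeping `params` unchanged across iterations reflects the requirement that every follow-up request repeat the exact original query.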

Counts Pagination

The counts API provides Tweet volumes associated with a query on a daily, hourly, or per-minute basis. The “counts” API endpoint returns a timestamped array of counts covering at most 31 days per response. If you request more than 31 days of counts, you will be provided a “next” token. As with the data “next” tokens, you must make the exact same query as the original and also include a “next” request parameter set to the value from the previous response.

Beyond requesting more than 31 days of counts, there is another scenario when a “next” token is provided. For higher volume queries, there is the potential that the generation of counts will take long enough to trigger a response timeout. When this occurs you will receive fewer than 31 days of counts, but will be provided a “next” token in order to continue making requests for the entire payload of counts. Important: A response cut short by a timeout contains only full “buckets”, so 2.5 days of generated counts would result in 2 full day “buckets”.

Additional Notes

  • When using a fromDate or toDate in a search request, you will only get results from within your time range. When you reach the last group of results within your time range, you will not receive a next element.
  • The “next” element can be used with any maxResults value between 10-500 (with a default value of 100). The maxResults determines how many Tweets are returned in each response, but does not prevent you from eventually getting all results.
  • The next element does not expire. Multiple requests using the same “next” query will receive the same results, regardless of when the request is made.
  • When paging through results using the “next” parameter, you may encounter duplicates at the edges of the query. Your application should be tolerant of these.
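One way to tolerate duplicates at page edges is to track the identifiers you have already seen. The sketch below assumes each Tweet object carries a unique "id" field; the actual field name depends on the data format you receive, so treat it as an assumption. The function name `dedupe_tweets` is hypothetical.

```python
def dedupe_tweets(tweets, seen_ids=None):
    """Return tweets in their original order, skipping ids that were already seen."""
    seen_ids = set() if seen_ids is None else seen_ids
    unique = []
    for tweet in tweets:
        tweet_id = tweet["id"]  # assumed unique identifier field
        if tweet_id in seen_ids:
            continue
        seen_ids.add(tweet_id)
        unique.append(tweet)
    return unique

# Share one set across pages so an edge duplicate on the next page is dropped.
seen = set()
page_1 = dedupe_tweets([{"id": "1"}, {"id": "2"}], seen)
page_2 = dedupe_tweets([{"id": "2"}, {"id": "3"}], seen)
```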



Data Endpoint  

POST /search/:stream

Endpoint pattern:

/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}.json

This endpoint returns data for the specified query and time period. If a time period is not specified the time parameters will default to the last 30 days.
Note: This functionality can also be accomplished using a GET request, instead of a POST, by encoding the parameters described below into the URL.


Data Request Parameters 

Each parameter is listed with its description, whether it is required, and a sample value.

query
The equivalent of one Gnip PowerTrack rule, with up to 2048 characters (and no limits on the number of positive and negative clauses).

This parameter should include ALL portions of the PowerTrack rule, including all operators, and portions of the rule should not be separated into other parameters of the query.

Items to Note:
  • Not all PowerTrack operators are supported. Supported Operators are listed HERE.
  • Some PowerTrack operator matches behave differently than they do in other PowerTrack based products as described HERE.

Required: Yes
Sample Value: (snow OR cold OR blizzard) weather

tag
Tags can be used to segregate rules and their matching data into different logical groups. If a rule tag is provided, and you are ingesting data in the Activity Stream format, the rule tag is included in the gnip.matching_rules attribute.

It is recommended to assign rule-specific UUIDs to rule tags and maintain desired mappings on the client side.

Required: No
Sample Value: 8HYG54ZGTU

fromDate
The oldest UTC timestamp (back to 3/21/2006) from which the activities will be provided. Timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute).

Specified: Using only the fromDate with no toDate parameter will deliver results for the query going back in time from now( ) until the fromDate.

Not Specified: If a fromDate is not specified, the API will deliver all of the results for 30 days prior to now( ) or the toDate (if specified).

If neither the fromDate nor the toDate parameter is used, the API will deliver all results for the most recent 30 days, starting at the time of the request, going backwards.

Required: No
Sample Value: 201207220000

toDate
The latest, most recent UTC timestamp to which the activities will be provided. Timestamp is in minute granularity and is not inclusive (i.e. 11:59 does not include the 59th minute of the hour).

Specified: Using only the toDate with no fromDate parameter will deliver the most recent 30 days of data prior to the toDate.

Not Specified: If a toDate is not specified, the API will deliver all of the results from now( ) for the query going back in time to the fromDate.

If neither the fromDate nor the toDate parameter is used, the API will deliver all results for the most recent 30 days, starting at the time of the request, going backwards.

Required: No
Sample Value: 201208220000

maxResults
The maximum number of search results to be returned by a request. A number between 10 and the system limit (currently 500). By default, a request response will return 100 results.

Required: No
Sample Value: 500

next
This parameter is used to get the next "page" of results as described HERE. The value used with the parameter is pulled directly from the response provided by the API, and should not be modified.

Required: No
Sample Value: NTcxODIyMDMyODMwMjU1MTA0
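To show how these parameters fit together, here is a minimal Python sketch that assembles a JSON request body. The helper names `to_gnip_timestamp` and `build_data_request` are hypothetical, the limits are taken from the descriptions above, and the sketch assumes you pass timezone-aware datetimes and send maxResults as an integer.

```python
import json
from datetime import datetime, timezone

def to_gnip_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC yyyymmddhhmm string."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M")

def build_data_request(query, from_date=None, to_date=None, max_results=100, tag=None):
    """Assemble the JSON body for a data request; unset optional fields are omitted."""
    if len(query) > 2048:
        raise ValueError("query exceeds 2048 characters")
    if not 10 <= max_results <= 500:
        raise ValueError("maxResults must be between 10 and 500")
    body = {"query": query, "maxResults": max_results}
    if from_date is not None:
        body["fromDate"] = to_gnip_timestamp(from_date)
    if to_date is not None:
        body["toDate"] = to_gnip_timestamp(to_date)
    if tag is not None:
        body["tag"] = tag
    return json.dumps(body)

body = build_data_request(
    "(snow OR cold OR blizzard) weather",
    from_date=datetime(2012, 7, 22, tzinfo=timezone.utc),
    to_date=datetime(2012, 8, 22, tzinfo=timezone.utc),
    max_results=500,
    tag="8HYG54ZGTU",
)
print(body)
```

The whole rule goes into the single "query" field, in line with the note that portions of the rule should not be split into other parameters.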


Additional Details 

Available Timeframe: March 21, 2006 - Present

Query Format: The equivalent of one Gnip PowerTrack rule, with up to 2048 characters (and no limits on the number of positive and negative clauses).

Items to Note:
  • Not all PowerTrack operators are supported. Supported Operators are listed HERE.
  • Some PowerTrack operator matches behave differently than they do in other PowerTrack based products as described HERE.

Rate Limit: Partners will be rate limited at both minute and second granularity. The per-minute rate limit will vary by partner as specified in your contract. However, these per-minute rate limits are not intended to be used in a single burst. Regardless of your per-minute rate limit, all partners will be limited to a maximum of 20 requests per second, aggregated across all requests for data and/or counts.

Compliance: All data delivered via the Full-Archive Search API is compliant at the time of delivery.

Realtime Availability: Data is available in the index within 30 seconds of generation on the Twitter Platform.
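Since the per-second limit applies across all data and counts requests, a client may want to space its calls out. The following is a minimal client-side sketch, assuming a single process and the 20 requests per second ceiling stated above; the class name `RequestThrottle` is hypothetical, and your contract's per-minute limit would need separate handling.

```python
import time

class RequestThrottle:
    """Space out calls so that at most max_per_second are started per second."""

    def __init__(self, max_per_second=20, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self):
        """Block until the next request may be sent, then reserve the slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval

throttle = RequestThrottle()
throttle.wait()  # call once before each data or counts request
```

Spacing requests evenly, instead of sending them in a burst, matches the note that the limits are not intended to be used in a single burst.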

Example Data Requests and Responses 

Example POST Request

  • Request parameters in a POST request are sent via a JSON-formatted body, as shown below.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example POST (using cURL) command for making an initial data request:

curl -X POST -u<username> "https://gnip-api.twitter.com/search/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}.json" -d '{"query":"gnip","maxResults":"500","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>"}'

If the API data response includes a “next” token, below is a subsequent request that consists of the original request, with the “next” parameter set to the provided token:

curl -X POST -u<username> "https://gnip-api.twitter.com/search/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}.json" -d '{"query":"gnip","maxResults":"500","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>",
"next":"NTcxODIyMDMyODMwMjU1MTA0"}'

Example GET Request 

  • Request parameters in a GET request are encoded into the URL, using standard URL encoding.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example GET (using cURL) command for making an initial data request:

curl -u<username> "https://gnip-api.twitter.com/search/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}.json?query=gnip&maxResults=500&fromDate=<yyyymmddhhmm>&toDate=<yyyymmddhhmm>"
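For rules that contain spaces, parentheses, or other reserved characters, the parameters need URL encoding before they go into a GET request. Here is a minimal Python sketch using the standard library; the function name `build_get_url` and the placeholder account name and label are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE = "https://gnip-api.twitter.com/search/fullarchive/accounts/{account}/{label}.json"

def build_get_url(account, label, params):
    """Return the endpoint URL with params percent-encoded into the query string."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    return BASE.format(account=account, label=label) + "?" + urlencode(params, quote_via=quote)

url = build_get_url(
    "ACCOUNT_NAME",
    "LABEL",
    {"query": "(snow OR cold) weather", "maxResults": 500},
)
print(url)
```

A "next" token can be passed through the same `params` dict so that it is encoded along with the other parameters.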


Example Data Responses 

Below is an example response to a data query. This example assumes that there were more than “maxResults” Tweets available, so a “next” token is provided for subsequent requests. If “maxResults” or fewer Tweets are associated with your query, no “next” token would be included in the response.

The value of the “next” element will change with each query and should be treated as an opaque string. The “next” element will look like the following in the response body:

{
    "results":
      [
           {--Tweet 1--},
           {--Tweet 2--},
           ...
           {--Tweet 500--}
      ],
    "next":"NTcxODIyMDMyODMwMjU1MTA0",  
    "requestParameters":
      {
        "maxResults":500,
        "fromDate":"201101010000",
        "toDate":"201201010000"
      }
 }

The response to a subsequent request might look like the following (note the new Tweets and different “next” value):

{
      "results":
      [
           {--Tweet 501--},
           {--Tweet 502--},
           ...
           {--Tweet 1000--}
      ],
      "next":"R2hCDbpBFR6eLXGwiRF1cQ",
      "requestParameters":
      {
        "maxResults":500,
        "fromDate":"201101010000",
        "toDate":"201201010000"
      }
 }

You can continue to pass in the “next” element from your previous query until you have received all Tweets from the time period covered by your query. When you receive a response that does not include a “next” element, it means that you have reached the last page and no additional data is available in your time range.


Counts Endpoint  

POST /search/:stream/counts

Endpoint pattern:

/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}/counts.json

This endpoint returns counts (data volumes) for the specified query. If a time period is not specified, the time parameters will default to the last 30 days. Data volumes are returned as a timestamped array on a daily, hourly (default), or per-minute basis.

Note: This functionality can also be accomplished using a GET request, instead of a POST, by encoding the parameters described below into the URL.


Counts Request Parameters 

Each parameter is listed with its description, whether it is required, and a sample value.

query
The equivalent of one Gnip PowerTrack rule, with up to 2048 characters (and no limits on the number of positive and negative clauses).

This parameter should include ALL portions of the PowerTrack rule, including all operators, and portions of the rule should not be separated into other parameters of the query.

Items to Note:
  • Not all PowerTrack operators are supported. Supported Operators are listed HERE.
  • Some PowerTrack operator matches behave differently than they do in other PowerTrack based products as described HERE.

Required: Yes
Sample Value: (snow OR cold OR blizzard) weather

fromDate
The oldest UTC timestamp (back to 3/21/2006) from which the activities will be provided. Timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute).

Specified: Using only the fromDate with no toDate parameter, the API will deliver counts (data volumes) for the query going back in time from now( ) until the fromDate. If the fromDate is older than 31 days from now( ), you will receive a next token to page through your request.

Not Specified: If a fromDate is not specified, the API will deliver counts (data volumes) for 30 days prior to now( ) or the toDate (if specified).

If neither the fromDate nor the toDate parameter is used, the API will deliver counts (data volumes) for the most recent 30 days, starting at the time of the request, going backwards.

Required: No
Sample Value: 201207220000

toDate
The latest, most recent UTC timestamp to which the activities will be provided. Timestamp is in minute granularity and is not inclusive (i.e. 11:59 does not include the 59th minute of the hour).

Specified: Using only the toDate with no fromDate parameter will deliver the most recent counts (data volumes) for 30 days prior to the toDate.

Not Specified: If a toDate is not specified, the API will deliver counts (data volumes) for the query going back in time to the fromDate. If the fromDate is more than 31 days from now( ), you will receive a next token to page through your request.

If neither the fromDate nor the toDate parameter is used, the API will deliver counts (data volumes) for the most recent 30 days, starting at the time of the request, going backwards.

Required: No
Sample Value: 201208220000

bucket
The unit of time for which count data will be provided. Count data can be returned for every day, hour or minute in the requested timeframe. By default, hourly counts will be provided. Options: "day", "hour", "minute"

Required: No
Sample Value: minute

next
This parameter is used to get the next "page" of results as described HERE. The value used with the parameter is pulled directly from the response provided by the API, and should not be modified.

Required: No
Sample Value: NTcxODIyMDMyODMwMjU1MTA0

Additional Details

Available Timeframe: March 21, 2006 - Present

Query Format: The equivalent of one Gnip PowerTrack rule, with up to 2048 characters (and no limits on the number of positive and negative clauses).

Items to Note:
  • Not all PowerTrack operators are supported. Supported Operators are listed HERE.
  • Some PowerTrack operator matches behave differently than they do in other PowerTrack based products as described HERE.

Rate Limit: Partners will be rate limited at both minute and second granularity. The per-minute rate limit will vary by partner as specified in your contract. However, these per-minute rate limits are not intended to be used in a single burst. Regardless of your per-minute rate limit, all partners will be limited to a maximum of 20 requests per second, aggregated across all requests for data and/or counts.

Count Precision: The counts delivered through this endpoint reflect the number of activities that occurred and do not reflect any later compliance events (deletions, scrub geos). Some activities counted may not be available via the data endpoint due to user compliance actions.



Example Counts Requests and Responses 

Example POST Request

  • Request parameters in a POST request are sent via a JSON-formatted body, as shown below.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example POST (using cURL) command for making an initial counts request:

curl -X POST -u<username> "https://gnip-api.twitter.com/search/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}/counts.json" -d '{"query":"gnip","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>","bucket":"day"}'

If the API counts response includes a “next” token, below is a subsequent request that consists of the original request, with the “next” parameter set to the provided token:

curl -X POST -u<username> "https://gnip-api.twitter.com/search/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}/counts.json" -d '{"query":"gnip","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>","bucket":"day",
"next":"YUcxO87yMDMyODMwMjU1MTA0"}'

Example GET Request

  • Request parameters in a GET request are encoded into the URL, using standard URL encoding.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example GET (using cURL) command for making an initial counts request:

curl -u<username> "https://gnip-api.twitter.com/search/fullarchive/accounts/{ACCOUNT_NAME}/{LABEL}/counts.json?query=gnip&bucket=day&fromDate=<yyyymmddhhmm>&toDate=<yyyymmddhhmm>"


Example Counts Responses

Below is an example response to a counts (data volume) query. This example response includes a next token, meaning the counts request was for more than 31 days, or that the submitted query had a large enough volume associated with it to trigger a partial response.

The value of the “next” element will change with each query and should be treated as an opaque string. The “next” element will look like the following in the response body:

{
  "results": [
    { "timePeriod": "201101010000", "count": 32 },
    { "timePeriod": "201101020000", "count": 45 },
    { "timePeriod": "201101030000", "count": 57 },
    { "timePeriod": "201101040000", "count": 123 },
    { "timePeriod": "201101050000", "count": 134 },
    { "timePeriod": "201101060000", "count": 120 },
    { "timePeriod": "201101070000", "count": 43 },
    { "timePeriod": "201101080000", "count": 65 },
    { "timePeriod": "201101090000", "count": 85 },
    { "timePeriod": "201101100000", "count": 32 },
    { "timePeriod": "201101110000", "count": 23 },
    { "timePeriod": "201101120000", "count": 85 },
    { "timePeriod": "201101130000", "count": 32 },
    { "timePeriod": "201101140000", "count": 95 },
    { "timePeriod": "201101150000", "count": 109 },
    { "timePeriod": "201101160000", "count": 34 },
    { "timePeriod": "201101170000", "count": 74 },
    { "timePeriod": "201101180000", "count": 24 },
    { "timePeriod": "201101190000", "count": 90 },
    { "timePeriod": "201101200000", "count": 85 },
    { "timePeriod": "201101210000", "count": 93 },
    { "timePeriod": "201101220000", "count": 48 },
    { "timePeriod": "201101230000", "count": 37 },
    { "timePeriod": "201101240000", "count": 54 },
    { "timePeriod": "201101250000", "count": 52 },
    { "timePeriod": "201101260000", "count": 84 },
    { "timePeriod": "201101270000", "count": 120 },
    { "timePeriod": "201101280000", "count": 34 },
    { "timePeriod": "201101290000", "count": 83 },
    { "timePeriod": "201101300000", "count": 23 },
    { "timePeriod": "201101310000", "count": 12 }
   ],
  "totalCount":2027, 
  "next":"NTcxODIyMDMyODMwMjU1MTA0",
  "requestParameters":
    {
      "bucket":"day",
      "fromDate":"201101010000",
      "toDate":"201201010000"
    }
}

The response to a subsequent request might look like the following (note the new counts timeline and different “next” value):

{
  "results": [
    { "timePeriod": "201102010000", "count": 45 },
    { "timePeriod": "201102020000", "count": 76 },
     ....
    { "timePeriod": "201103030000", "count": 13 }
 ],
 "totalCount":3288, 
 "next":"WE79fnakFanyMDMyODMwMjU1MTA0",
 "requestParameters":
    {
      "bucket":"day",
      "fromDate":"201101010000",
      "toDate":"201201010000"
    }
}

You can continue to pass in the “next” element from your previous query until you have received all counts from the query time period. When you receive a response that does not include a “next” element, it means that you have reached the last page and no additional counts are available in your time range.
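Once you have collected every counts page, you may want a single timeline and an overall total. The Python sketch below merges the "results" arrays from several responses. It relies only on the "timePeriod" and "count" fields shown in the examples; the function name `merge_counts` is hypothetical, and keying by timePeriod is a defensive assumption in case a bucket appears on more than one page.

```python
def merge_counts(pages):
    """Combine the "results" arrays of several counts responses into one timeline."""
    timeline = {}
    for page in pages:
        for bucket in page["results"]:
            # Keyed by timePeriod so a repeated bucket is not counted twice.
            timeline[bucket["timePeriod"]] = bucket["count"]
    # yyyymmddhhmm strings sort chronologically when sorted as text.
    ordered = sorted(timeline.items())
    total = sum(count for _, count in ordered)
    return ordered, total

pages = [
    {"results": [{"timePeriod": "201101010000", "count": 32},
                 {"timePeriod": "201101020000", "count": 45}]},
    {"results": [{"timePeriod": "201102010000", "count": 45}]},
]
ordered, total = merge_counts(pages)
print(total)
```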


HTTP Response Codes 

Each status code and text is listed with its description:

200 OK: The request was successful. The JSON response will be similar to the example responses above.
400 Bad Request: Generally, this response occurs due to the presence of invalid JSON in the request, or where the request failed to send any JSON payload.
401 Unauthorized: HTTP authentication failed due to invalid credentials. Log in to console.gnip.com with your credentials to ensure you are using them correctly with your request.
404 Not Found: The resource was not found at the URL to which the request was sent, likely because an incorrect URL was used.
422 Unprocessable Entity: This is returned due to invalid parameters in a query or when a query is too complex for us to process, e.g. invalid PowerTrack rules or too many phrase operators, rendering a query too complex.
429 Unknown Code: Your app has exceeded the limit on connection requests. The response includes a corresponding JSON message.
500 Internal Server Error: There was an error on Gnip's side. Retry your request using an exponential backoff pattern.
502 Proxy Error: There was an error on Gnip's side. Retry your request using an exponential backoff pattern.
503 Service Unavailable: There was an error on Gnip's side. Retry your request using an exponential backoff pattern.
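The exponential backoff pattern recommended for 500, 502, and 503 responses can be sketched as follows. This is one possible shape, not a prescribed implementation: `send` is a hypothetical callable that performs the request and returns a (status, body) pair, and the attempt count and one-second base delay are illustrative assumptions.

```python
import time

RETRYABLE = {500, 502, 503}  # server-side errors the table says to retry

def with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait after each failure: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
    return status, body
```

Client errors such as 400, 401, 404, and 422 are returned immediately, since repeating the same request would not change the outcome.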