Key Features:
- Real-time data delivery: Get data matching your rules in near-real time.
- Precise filtering: Filter for exactly the data you are looking for using Boolean queries with operators.
- Delivery: JSON response over HTTP/1.1 chunked transfer encoding.
- Local datacenter support: Fetch posts only from the local datacenter to reduce latency by avoiding replication lag.
Authentication
Powerstream API endpoints use OAuth 2.0 Bearer Token authentication. Include an Authorization: Bearer <token> header with each request to use these endpoints.
Quick Start
This section shows how to get started quickly with the PowerStream endpoints using Python and the requests library. Install it via pip install requests. All examples use OAuth 2.0 Bearer Token authentication. Replace YOUR_BEARER_TOKEN with your actual token (store it securely, e.g., via os.getenv('BEARER_TOKEN')).
We’ll cover each endpoint with code snippets. Assume these imports at the top:
Setup
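A minimal setup sketch for the snippets in this guide. The host `https://api.x.com` and the helper names `build_headers` and `make_session` are assumptions made for illustration; only the `/2/powerstream` path, the Bearer scheme, and the `BEARER_TOKEN` environment variable come from this guide.

```python
import os

import requests

# Assumption: the host is illustrative; only the /2/powerstream path appears in this guide.
BASE_URL = "https://api.x.com/2/powerstream"

# Read the token from the environment instead of hard-coding it.
BEARER_TOKEN = os.getenv("BEARER_TOKEN", "YOUR_BEARER_TOKEN")


def build_headers(token=BEARER_TOKEN):
    """Return the headers sent with every request (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def make_session(token=BEARER_TOKEN):
    """Return a requests.Session that attaches the auth headers to every call."""
    session = requests.Session()
    session.headers.update(build_headers(token))
    return session
```

Reusing one session keeps the underlying connection open between rule-management calls, which may save a little setup time per request.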
1. Create Rules (POST /rules)
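A minimal sketch of adding rules with `requests`. The host, the `rules`/`value`/`tag` body shape, and the helper names are assumptions for illustration; confirm the exact schema against the endpoint reference before relying on it.

```python
import os

import requests

# Assumption: illustrative host; the /rules path comes from this guide.
BASE_URL = "https://api.x.com/2/powerstream"
HEADERS = {"Authorization": f"Bearer {os.getenv('BEARER_TOKEN', 'YOUR_BEARER_TOKEN')}"}


def build_create_payload(rules):
    """Build the request body from (value, tag) pairs. The body shape is an assumption."""
    return {"rules": [{"value": value, "tag": tag} for value, tag in rules]}


def create_rules(rules, timeout=10):
    """POST the rules to /rules and return the decoded JSON response."""
    response = requests.post(
        f"{BASE_URL}/rules",
        headers=HEADERS,
        json=build_create_payload(rules),
        timeout=timeout,
    )
    response.raise_for_status()  # surface 4xx/5xx errors instead of ignoring them
    return response.json()


# Example call (needs a valid token and network access):
# create_rules([("(x OR facebook) iphone", "phones"), ("from:xdevelopers has:images", "dev-images")])
```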
Add rules to filter your stream.
2. Delete Rules (POST /rules)
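A hedged sketch of removing rules by ID or by value. This guide does not show the delete request format, so the `_method=delete` query parameter and the `rule_ids` / `rules` body shapes below are assumptions borrowed from the legacy PowerTrack convention; the host and helper names are also illustrative.

```python
import os

import requests

# Assumption: illustrative host; the /rules path comes from this guide.
BASE_URL = "https://api.x.com/2/powerstream"
HEADERS = {"Authorization": f"Bearer {os.getenv('BEARER_TOKEN', 'YOUR_BEARER_TOKEN')}"}


def build_delete_payload(rule_ids=None, values=None):
    """Build a delete body, preferring IDs over values. Both shapes are assumptions."""
    if rule_ids:
        return {"rule_ids": list(rule_ids)}
    if values:
        return {"rules": [{"value": value} for value in values]}
    raise ValueError("Provide rule_ids or values")


def delete_rules(rule_ids=None, values=None, timeout=10):
    """POST the delete body to /rules and return the decoded JSON response."""
    response = requests.post(
        f"{BASE_URL}/rules",
        headers=HEADERS,
        params={"_method": "delete"},  # assumption: legacy PowerTrack-style delete flag
        json=build_delete_payload(rule_ids, values),
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()


# Example call (needs a valid token and network access):
# delete_rules(rule_ids=["1234567890"])
```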
Remove rules by ID (recommended) or value.
3. Get Rules (GET /rules)
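A minimal sketch of listing the active rules. The host, the helper names, and the response field names (`rules`, `id`, `value`) are assumptions for illustration; inspect the actual response before depending on them.

```python
import os

import requests

# Assumption: illustrative host; the /rules path comes from this guide.
BASE_URL = "https://api.x.com/2/powerstream"
HEADERS = {"Authorization": f"Bearer {os.getenv('BEARER_TOKEN', 'YOUR_BEARER_TOKEN')}"}


def get_rules(timeout=10):
    """GET /rules and return the decoded JSON response."""
    response = requests.get(f"{BASE_URL}/rules", headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return response.json()


def summarize_rules(body):
    """Return (id, value) pairs from a response body. Field names are assumptions."""
    return [(rule.get("id"), rule.get("value")) for rule in body.get("rules", [])]


# Example call (needs a valid token and network access):
# for rule_id, value in summarize_rules(get_rules()):
#     print(rule_id, value)
```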
Fetch all active rules.
4. PowerStream (GET /stream)
Connect to the stream for real-time Posts. Use stream=True for line-by-line reading. Implement reconnect logic for robustness.
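The steps above can be sketched as a generator that reads the stream line by line and reconnects after failures. The host, the timeouts, the retry limits, and the assumption that each non-blank line is one JSON object are illustrative; the stream URL follows the `/2/powerstream` path quoted elsewhere in this guide.

```python
import json
import os
import time

import requests

# Assumption: illustrative host; adjust the URL to the stream endpoint you use.
STREAM_URL = "https://api.x.com/2/powerstream"
HEADERS = {"Authorization": f"Bearer {os.getenv('BEARER_TOKEN', 'YOUR_BEARER_TOKEN')}"}


def parse_line(raw):
    """Decode one line of the stream; blank lines return None."""
    if not raw or not raw.strip():
        return None
    return json.loads(raw)


def stream_posts(url=STREAM_URL, max_retries=5):
    """Yield decoded Posts, reconnecting with a doubling delay after failures."""
    delay, retries = 1, 0
    while retries <= max_retries:
        try:
            # stream=True keeps the connection open so lines can be read as they arrive.
            with requests.get(url, headers=HEADERS, stream=True, timeout=(10, 90)) as response:
                response.raise_for_status()
                delay, retries = 1, 0  # connected: reset the retry state
                for raw in response.iter_lines():
                    post = parse_line(raw)
                    if post is not None:
                        yield post
        except requests.RequestException as exc:
            print(f"Stream error: {exc}")
        retries += 1
        time.sleep(delay)
        delay = min(delay * 2, 60)  # cap the wait at 60 seconds


# Example use (needs a valid token and network access):
# for post in stream_posts():
#     print(post)
```

This sketch retries every `requests` error the same way; in practice an authentication failure is unlikely to succeed on retry and may be better handled by stopping.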
Local Datacenter Support
For latency optimization, Powerstream can deliver only posts that originated in the local datacenter where the connection is established. This avoids replication lag, so these posts arrive faster than posts from other datacenters. To enable this, append the query parameter ?localDcOnly=true to the stream endpoint (e.g., /2/powerstream?localDcOnly=true). The datacenter you are connected to is indicated both in the initial data payload of the stream and as an HTTP header in the response.
To use in code:
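A minimal sketch that builds the stream request with `localDcOnly=true` without sending it, so the final URL can be checked first. The host and the helper name `local_dc_request` are assumptions; the parameter name and path come from this guide.

```python
import requests

# Assumption: illustrative host; the /2/powerstream path comes from this guide.
STREAM_URL = "https://api.x.com/2/powerstream"


def local_dc_request(token, local_only=True):
    """Prepare (but do not send) the stream request, optionally limited to the local datacenter."""
    params = {"localDcOnly": "true"} if local_only else {}
    request = requests.Request(
        "GET",
        STREAM_URL,
        headers={"Authorization": f"Bearer {token}"},
        params=params,
    )
    return request.prepare()


# Example use (needs a valid token and network access):
# prepared = local_dc_request("YOUR_BEARER_TOKEN")
# with requests.Session().send(prepared, stream=True, timeout=(10, 90)) as response:
#     print(dict(response.headers))  # inspect the headers for the datacenter indicator
```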
When the localDcOnly parameter is enabled, the response to the initial stream connection includes headers indicating which local datacenter is being used.
Tip: To optimize latency, set up connections from different geographic locations (e.g., one near Atlanta on the US East Coast and another near Portland on the US West Coast), enabling localDcOnly=true for each. This provides faster access to posts from each respective datacenter. Aggregate the streams on your end to combine cross-datacenter data.
Operators
To set filtering rules, you can use keywords and operators. The available operators are listed below.
Field-Based Operators
User Operators
| Operator | Summary | Example |
|---|---|---|
| from: | Matches posts from a specific user | from:xdevelopers or from:123456 |
| to: | Matches posts directed to a specific user | to:jvaleski |
| retweets_of: | Matches reposts of a specific user | retweets_of:xdevelopers |
Content Operators
| Operator | Summary | Example |
|---|---|---|
| contains: | Matches posts containing specific text/keywords | contains:hello or contains:-2345.432 |
| url_contains: | Matches posts with URLs containing specific text | url_contains:"com/willplayforfood" |
| lang: | Matches posts in specific languages | lang:en |
Entity Operators
| Operator | Summary | Example |
|---|---|---|
| has: | Matches posts containing specific entities (Options: mentions, geo, links, media, lang, symbols, images, videos) | has:images, has:geo, has:mentions |
| is: | Matches posts of specific types or with specific properties (Options: retweet, reply) | is:retweet, is:reply |
Location Operators
| Operator | Summary | Example |
|---|---|---|
| place: | Matches posts from specific places/locations | place:"Belmont Central", place:02763fa2a7611cf3 |
| bounding_box: | Matches posts within a geographic bounding box | bounding_box:[-112.424083 42.355283 -112.409111 42.792311] |
| point_radius: | Matches posts within a radius of a point | point_radius:[-111.464973 46.371179 25mi], point_radius:[-111.464973 46.371179 15km] |
Advanced/Content Operators
| Operator | Summary | Example |
|---|---|---|
| bio: | Matches posts from users with specific bio content (Uses phrase matching) | N/A |
| bio_name: | Matches posts from users with specific name in bio (Uses phrase matching) | N/A |
Additional Operators
| Operator | Summary | Example |
|---|---|---|
| retweets_of_status_id: | Matches reposts of specific posts | retweets_of_status_id:1234567890123456789 |
| in_reply_to_status_id: | Matches replies to specific posts | in_reply_to_status_id:1234567890123456789 |
Non-Field Operators
Special Syntax Operators
| Operator | Summary | Example |
|---|---|---|
| @ | Mention operator | @username |
| Phrase matching | Matches exact phrases | "exact phrase" |
Logical Operators
| Operator | Summary | Example |
|---|---|---|
| OR | Logical OR between expressions | x OR facebook |
| Space/AND | Logical AND between expressions | x facebook (both terms must be present) |
| () | Grouping for complex expressions | (x OR facebook) iphone |
| - | Negation/exclusion | x -facebook (x but not facebook) |
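To illustrate how the operators above combine, here is a small sketch that composes rule strings from parts. The helper names (`any_of`, `all_of`, `exclude`, `phrase`) are hypothetical conveniences and not part of the API; the output is a plain rule string using the syntax from the tables above.

```python
def any_of(*terms):
    """Group terms with OR, e.g. (x OR facebook)."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Join terms with spaces, which act as a logical AND."""
    return " ".join(terms)


def exclude(term):
    """Negate a term with a leading minus sign."""
    return f"-{term}"


def phrase(text):
    """Wrap text in double quotes for exact phrase matching."""
    return f'"{text}"'


rule = all_of(any_of("x", "facebook"), "iphone", exclude("android"), "lang:en")
print(rule)  # (x OR facebook) iphone -android lang:en
```

Building rules from small helpers like these can make long queries easier to review than hand-written strings, at the cost of one more layer to maintain.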
Responses
The payload of the Powerstream API uses the same format as the legacy GNIP PowerTrack API. A sample JSON response looks like:
Limits & Best Practices
- Rate Limits: 50 requests/24h for rule management; no limit on streams (but connection limits apply).
- Reconnection: Use exponential backoff when reconnecting after a disconnect.
- Monitoring: Use Connection: keep-alive headers.
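The reconnection advice above can be sketched as a capped exponential delay schedule with optional jitter. The base delay, growth factor, cap, and attempt count are assumptions chosen for illustration, not values documented for this API.

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6):
    """Yield exponentially growing delays in seconds, never exceeding the cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def with_jitter(delay, rng=random.random):
    """Scale the delay by a random factor in [0, 1) so clients do not retry in lockstep."""
    return delay * rng()


print(list(backoff_delays()))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

A reconnect loop would sleep for `with_jitter(delay)` before each new attempt and start the schedule over once a connection succeeds.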