Yellowstone gRPC vs WebSockets: choosing a real-time Solana data pipeline
Both transports stream the same chain, but they come from different places inside the validator and fail in different ways. A field guide to choosing and operating the right pipeline for your workload.
Every real-time Solana system eventually faces the same fork in the road: WebSocket subscriptions or Yellowstone gRPC. They look like two flavors of the same thing, since both push you chain events as they happen. But they originate from different subsystems inside the validator, carry different data, and fail in different ways. Picking wrong costs you either months of unnecessary plumbing or hundreds of milliseconds you can't get back.
01 Two doors into the validator
A common misconception is that WebSocket subscriptions are “Geyser with JSON.” They're not. The WebSocket PubSub API is part of the validator's built-in RPC service, fed by the same internal notifications that serve HTTP reads. Yellowstone gRPC is a Geyser plugin: a shared library loaded into the validator process that taps account writes, transactions, entries, and block metadata at the moment the runtime commits them, then serves them over its own gRPC server.
                 ┌─────────────────────────────────────────┐
                 │            solana validator             │
                 │                                         │
replay/banking ──┼──▶ runtime commits state                │
                 │      │                                  │
                 │      ├──▶ geyser plugin interface       │
                 │      │      │                           │
                 │      │      └──▶ yellowstone gRPC ──────┼──▶ protobuf stream
                 │      │           (in-process)           │    (accounts, txs,
                 │      │                                  │     slots, blocks)
                 │      └──▶ rpc service                   │
                 │             │                           │
                 │             └──▶ pubsub websocket ──────┼──▶ JSON notifications
                 │                  (accountSubscribe,     │    per subscription
                 │                   logsSubscribe, …)     │
                 └─────────────────────────────────────────┘
That architectural difference drives everything downstream: what data you get, how fresh it is, how it's encoded, and what happens when something breaks.
02 WebSocket subscriptions in practice
The PubSub API gives you a JSON-RPC subscription per question: accountSubscribe for one account's changes, programSubscribe for every account owned by a program, logsSubscribe for transaction logs mentioning an address, slotSubscribe and blockSubscribe for chain progress. It speaks the same commitment levels as HTTP reads, works from any language with a WebSocket client, and requires zero schema tooling. Watching PumpFun activity is a dozen lines:
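A minimal sketch of such a watcher, assuming the standard `logsSubscribe` request and `logsNotification` frame shapes. The helper names `logsSubscribeRequest` and `parseLogsNotification` are ours for illustration, and the program address is passed in rather than hard-coded:

```typescript
// Hypothetical helper names; the JSON shapes follow Solana's logsSubscribe PubSub method.
type LogsEvent = { slot: number; signature: string; failed: boolean; logs: string[] };

// Build the JSON-RPC frame that subscribes to logs mentioning one address.
function logsSubscribeRequest(programId: string, id: number = 1): string {
  return JSON.stringify({
    jsonrpc: "2.0",
    id,
    method: "logsSubscribe",
    params: [{ mentions: [programId] }, { commitment: "processed" }],
  });
}

// Parse one incoming frame. Returns null for anything that is not a
// logsNotification, such as the subscription-id acknowledgement.
function parseLogsNotification(raw: string): LogsEvent | null {
  const msg = JSON.parse(raw);
  if (!msg || msg.method !== "logsNotification") return null;
  const { context, value } = msg.params.result;
  return {
    slot: context.slot,
    signature: value.signature,
    failed: value.err !== null,
    logs: value.logs,
  };
}

// Wiring, assuming a browser-style WebSocket global and your own WS_URL / PROGRAM_ID:
//   const ws = new WebSocket(WS_URL);
//   ws.onopen = () => ws.send(logsSubscribeRequest(PROGRAM_ID));
//   ws.onmessage = (ev) => {
//     const event = parseLogsNotification(String(ev.data));
//     if (event) console.log(event.slot, event.signature, event.logs.length);
//   };
```

Keeping the frame building and parsing as pure functions makes them easy to unit test without a live endpoint.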
Honest assessment of what you just built:
- It was fast to write. No protobufs, no codegen, no client library. This is the transport's genuine superpower.
- You received log lines, not state. logsSubscribe hands you strings; recovering amounts, mints, and accounts means parsing instructions yourself or making follow-up HTTP calls, which adds back the latency you subscribed to avoid.
- JSON costs you twice. Once on the wire (base64-in-JSON inflation), once in JSON.parse at 5,000 notifications per second.
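To put a rough number on the wire-side inflation: padded base64 emits 4 characters for every 3 raw bytes, before any JSON framing is added. A small sketch of that arithmetic (the function names are ours):

```typescript
// Length of standard padded base64 for `rawBytes` bytes: 4 characters per 3-byte group.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Ratio of encoded size to raw size, ignoring the surrounding JSON envelope.
function base64Inflation(rawBytes: number): number {
  return base64Length(rawBytes) / rawBytes;
}

// A 165-byte SPL token account becomes 220 base64 characters, about 1.33x.
const tokenAccountEncoded = base64Length(165);
```

The JSON envelope (keys, quotes, context object) comes on top of that ratio, so small accounts fare worse than the 4/3 figure suggests.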
03 Yellowstone gRPC in practice
Yellowstone inverts the model. You open one bidirectional stream and send one SubscribeRequest describing everything you want: transactions touching certain accounts, account writes for certain owners, slots, blocks, entries. All of it is filtered server-side inside the plugin before a single byte crosses the network. The same PumpFun watcher, upgraded:
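A sketch of what that single request might look like. The types below are local stand-ins, and the field names (`accountInclude`, `accountExclude`, `accountRequired`, `commitment`) are our assumption about the Yellowstone schema; check them against the proto your client was generated from:

```typescript
// Local stand-ins for the Yellowstone request types (assumed field names).
const CommitmentLevel = { PROCESSED: 0, CONFIRMED: 1, FINALIZED: 2 } as const;
type Commitment = (typeof CommitmentLevel)[keyof typeof CommitmentLevel];

interface TransactionFilter {
  vote: boolean;             // include vote transactions?
  failed: boolean;           // include failed transactions?
  accountInclude: string[];  // match if any of these accounts appear
  accountExclude: string[];  // drop if any of these accounts appear
  accountRequired: string[]; // match only if all of these appear
}

interface SubscribeRequestSketch {
  transactions: Record<string, TransactionFilter>;
  slots: Record<string, object>;
  commitment: Commitment;
}

// One request describing everything the watcher wants: transactions that
// touch `programId`, plus slot progress, at the chosen commitment.
function programWatcherRequest(
  programId: string,
  commitment: Commitment = CommitmentLevel.PROCESSED,
): SubscribeRequestSketch {
  return {
    transactions: {
      // The map key is a caller-chosen label for this filter.
      watcher: {
        vote: false,
        failed: false,
        accountInclude: [programId],
        accountExclude: [],
        accountRequired: [],
      },
    },
    slots: { progress: {} },
    commitment,
  };
}
```

With a real client you would write this object to the open stream once and then read updates from it. Generated request types may also require the remaining filter maps (accounts, blocks, entries) to be present as empty objects.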
What changed, concretely:
- Full transactions, not logs. Instructions, account keys, pre/post balances, metadata. Decoded amounts are a borsh-parse away, with no follow-up reads.
- processed at the source. Geyser emits the moment the runtime commits, so you see activity slots before a confirmed WS notification describes it.
- Protobuf on the wire. Smaller payloads, cheaper decode, and a typed schema that fails at compile time instead of at 3 a.m.
- The cost is real tooling. A gRPC client, TLS config, token auth, ping handling, and a build step. Budget a day, not an hour.
04 Latency and payload: the numbers
We measured both transports side by side on the same machine, subscribed to the same program, against endpoints in the same datacenter, recording when each transport first told us about each transaction:
Two things worth internalizing. First, commitment dominates transport: gRPC at confirmed is slower than WebSocket at processed, because waiting for the cluster to vote costs more than any serialization format saves. Second, at equal commitment, gRPC still wins by 50 to 150ms at the tail. The JSON encode/decode and the PubSub notification machinery are pure overhead the Geyser path never pays.
Choose your commitment level first; that's a correctness decision. Then choose the transport that delivers that commitment with the least machinery between the runtime and your code.
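If you want to reproduce this kind of first-sight comparison, one way is to record the first time each transport reports each signature and then look at the distribution of the differences. The sketch below is not our benchmark harness; the class and function names are hypothetical, and timestamps are passed in so you can use whatever clock you trust:

```typescript
type Transport = "ws" | "grpc";

// Remembers the first time each transport reported each signature.
class FirstSightRecorder {
  private seen = new Map<string, { ws?: number; grpc?: number }>();

  record(transport: Transport, signature: string, atMs: number): void {
    const entry = this.seen.get(signature) ?? {};
    if (entry[transport] === undefined) entry[transport] = atMs; // first sighting only
    this.seen.set(signature, entry);
  }

  // ws minus grpc in ms, for signatures seen on both transports.
  // Positive values mean gRPC reported the transaction first.
  deltas(): number[] {
    const out: number[] = [];
    this.seen.forEach((e) => {
      if (e.ws !== undefined && e.grpc !== undefined) out.push(e.ws - e.grpc);
    });
    return out;
  }
}

// Nearest-rank percentile; returns NaN for an empty sample.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}
```

Call `record` from both message handlers with the same monotonic clock, run both subscriptions at the same commitment level, and compare `percentile(deltas(), 50)` with `percentile(deltas(), 99)` to see how much of the gap lives in the tail.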
05 Failure modes and recovery
Steady-state latency is what gets benchmarked; recovery behavior is what gets you paged. The transports diverge sharply here.
WebSocket: detect and reconcile
A dropped WS connection loses everything in the gap. Your recovery story is to detect fast (heartbeats, idle timers), resubscribe, then reconcile what you missed via HTTP reads. Accept that for log-shaped data, perfect reconciliation may be impossible.
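The detect and reconcile steps can be sketched as three small pieces: an idle timer, a capped backoff for resubscribe attempts, and a catch-up request. The names and default values here are illustrative assumptions; `getSignaturesForAddress` with its `until` option is the standard HTTP RPC method for listing signatures newer than one you already hold:

```typescript
// Flags a connection as stale when no frame (data or pong) arrived within idleMs.
class IdleWatchdog {
  private idleMs: number;
  private lastSeenMs: number;

  constructor(idleMs: number, nowMs: number) {
    this.idleMs = idleMs;
    this.lastSeenMs = nowMs;
  }

  // Call on every incoming frame.
  touch(nowMs: number): void {
    this.lastSeenMs = nowMs;
  }

  // Call from a periodic timer; true means tear down and reconnect.
  isStale(nowMs: number): boolean {
    return nowMs - this.lastSeenMs > this.idleMs;
  }
}

// Exponential backoff with a cap for resubscribe attempts (attempt starts at 0).
function backoffMs(attempt: number, baseMs: number = 250, capMs: number = 10000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// JSON-RPC body asking for signatures newer than the last one processed.
// Results arrive newest first, at most 1000 per call.
function reconcileRequest(address: string, lastSeenSignature: string, id: number = 1): string {
  return JSON.stringify({
    jsonrpc: "2.0",
    id,
    method: "getSignaturesForAddress",
    params: [address, { until: lastSeenSignature, limit: 1000 }],
  });
}
```

This recovers which transactions you missed, not the log lines themselves; fetching each missed transaction is a further round of HTTP reads.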
gRPC: resume from slot
Yellowstone keeps a short in-memory slot buffer and accepts a fromSlot on subscribe. Track the last slot you processed, and a reconnect becomes a resume instead of a gap:
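A minimal sketch of that bookkeeping. `SlotCursor` and `withResume` are hypothetical names, and encoding `fromSlot` as a string is an assumption about how your client represents 64-bit integers:

```typescript
// Tracks the highest slot seen so far.
class SlotCursor {
  private last: number | null = null;

  observe(slot: number): void {
    if (this.last === null || slot > this.last) this.last = slot;
  }

  lastSlot(): number | null {
    return this.last;
  }
}

// Returns a copy of the original request, adding fromSlot once a slot has been seen.
// Assumption: the client wants 64-bit slots as strings; adjust if yours uses numbers.
function withResume<T extends object>(request: T, cursor: SlotCursor): T & { fromSlot?: string } {
  const last = cursor.lastSlot();
  if (last === null) return { ...request };
  return { ...request, fromSlot: String(last) };
}
```

Resuming from the last observed slot rather than the one after it can replay updates you already handled, so dedupe by signature on your side. If the requested slot has already aged out of the server's buffer, expect the resume to fail or skip ahead and fall back to a fresh subscribe plus reconciliation.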
06 Choosing your pipeline
Our decision table, distilled from operating both fleets:
07 When you want neither
There's a third option this comparison quietly assumes away: not running the pipeline at all. Both transports hand you raw chain data. You still own program-specific decoding, IDL drift when protocols upgrade, and the unglamorous work of keeping parsers correct across 37 programs' worth of instruction layouts.
That layer is a product. Our Enhanced Streams deliver decoded, schema-stable events (swaps, launches, graduations, liquidity changes) over the same WebSocket and gRPC transports, and Program Streams expose every instruction and event of every major program as 1,074 individually subscribable topics. The transport trade-offs in this post still apply; the decoding burden doesn't.
The summary we'd put on a sticky note: WebSocket is the right default; Yellowstone gRPC is the right ceiling. Start with the transport you can ship this afternoon, instrument your first-sight latency honestly, and when the numbers in Table 1 start costing you money, the gRPC migration is a well-marked road. Both doors are on our endpoints. Run the comparison yourself.
Every benchmark in this blog runs against our public endpoints.
Spin up an RPC, WebSocket, or gRPC endpoint in under a minute — flat pricing, no request caps — and reproduce the numbers for your own workload.