The anatomy of a sub-50ms Solana RPC request

Where the milliseconds actually go between your client and a Solana node: connection setup, commitment levels, queueing, and the calls that quietly dominate your p99, plus how we engineer each stage down.

NoLimitNodes Engineering

Infrastructure Team

Jun 2, 2026 · updated Jun 10, 2026 · 14 min read


A Solana slot lasts about 400 milliseconds. If your RPC round-trip costs 200 of them, you're reacting to a chain that has already moved on. This post walks the full path of a JSON-RPC request, from your client's socket to a node's bank state and back, and accounts for every place the time goes. Most of it hides where nobody looks.

~400ms · slot time · The clock every trading system races against
50ms · our request budget · Warm connection, same-region client
0ms · queueing target · No artificial rate-limit queues on any plan
p99 · the number that matters · Averages hide the requests that hurt

01 Where latency actually lives

When developers see slow RPC, they usually blame the node. Sometimes they're right. But a request spends its life in five distinct stages, and the node's execution time is frequently the smallest of them:

request path · client → node → client

  client                    edge / LB                  solana node
  ──────                    ─────────                  ───────────
  ① connect ──────────────▶
     dns + tcp + tls          (80–250ms cold,
                               ~0ms warm)
  ② transmit ─────────────▶ ③ queue ────────────────▶
     serialize + send          wait for a worker         (0ms … unbounded)
                                                       ④ execute
                                                          read bank state
                                                          (1–30ms typical)
  ⑥ parse    ◀───────────────────────────────────────  ⑤ respond
     deserialize JSON                                     serialize + send

Fig. 1: The JSON-RPC round trip. The diagram numbers six steps; Table 1 folds ⑤ respond and ⑥ parse into one, giving the five stages discussed here. Cold-start costs (parenthesized) disappear with connection reuse; queueing disappears with capacity.

Stage	Cold connection	Warm connection	Who controls it
DNS + TCP + TLS handshake	80–250ms	~0ms	You (connection reuse)
Request transmit	0.5–2ms	0.5–2ms	Physics (RTT)
Provider-side queueing	0–500ms+	0–500ms+	Your provider
Node execution	1–30ms	1–30ms	Method choice + node hardware
Response transfer + parse	1–20ms	1–20ms	Payload size

Table 1: Typical contribution of each stage to total round-trip time, measured from a same-region client. Queueing is the only stage with an unbounded tail.

Three of the five stages are negotiable. The handshake disappears if you keep connections alive. Queueing disappears if your provider has capacity and doesn't throttle you into a holding pattern. Execution time collapses if you choose methods the node can answer from hot state. The rest of this post takes each in turn.
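As a rough sanity check, the ranges in Table 1 can be summed to see what each negotiable stage is worth. The sketch below is illustrative only: the stage names and the `roundTrip` helper are made up for this example, and the open-ended queueing tail is capped at 500ms so the sum stays finite.

```typescript
// Stage ranges in milliseconds, copied from Table 1. The open-ended
// "500ms+" queueing tail is capped at 500 here purely for illustration.
type Range = { min: number; max: number }

const stages: Record<string, { cold: Range; warm: Range }> = {
  handshake: { cold: { min: 80, max: 250 }, warm: { min: 0, max: 0 } },
  transmit: { cold: { min: 0.5, max: 2 }, warm: { min: 0.5, max: 2 } },
  queueing: { cold: { min: 0, max: 500 }, warm: { min: 0, max: 500 } },
  execution: { cold: { min: 1, max: 30 }, warm: { min: 1, max: 30 } },
  response: { cold: { min: 1, max: 20 }, warm: { min: 1, max: 20 } },
}

// Sum the best and worst case for one connection state, optionally
// leaving out stages (e.g. queueing on a provider with spare capacity).
function roundTrip(kind: 'cold' | 'warm', skip: string[] = []): Range {
  let min = 0
  let max = 0
  for (const [name, stage] of Object.entries(stages)) {
    if (skip.includes(name)) continue
    min += stage[kind].min
    max += stage[kind].max
  }
  return { min, max }
}

console.log(roundTrip('cold'))               // { min: 82.5, max: 802 }
console.log(roundTrip('warm', ['queueing'])) // { min: 2.5, max: 52 }
```

With a warm connection and no queueing, the worst case in the table lands at roughly the 50ms budget; either a cold handshake or a queue is enough to break it.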

02 The wire tax: connections you keep paying for

A fresh HTTPS connection costs one round trip for TCP and at least one more for TLS 1.3. From a client with a 20ms round trip to the endpoint, that's 40 to 60ms gone before the node has seen a single byte of your request, and a DNS lookup only adds to it. That alone consumes most or all of a 50ms budget. The highest-impact change most teams can make is embarrassingly mundane: stop opening connections.
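The arithmetic behind that 40 to 60ms figure can be sketched in a few lines. The helper name is hypothetical, and it assumes full handshakes (no session resumption or 0-RTT) and ignores DNS:

```typescript
// Round trips spent before the first request byte can be sent.
// TCP needs one; a full TLS 1.3 handshake adds one, TLS 1.2 adds two.
const TLS_ROUND_TRIPS = { '1.3': 1, '1.2': 2 } as const

function handshakeCostMs(rttMs: number, tls: keyof typeof TLS_ROUND_TRIPS): number {
  const tcpRoundTrips = 1
  return rttMs * (tcpRoundTrips + TLS_ROUND_TRIPS[tls])
}

console.log(handshakeCostMs(20, '1.3')) // 40
console.log(handshakeCostMs(20, '1.2')) // 60
```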

First, measure it. curl will itemize the bill for you:

latency-breakdown.sh

bash

# Measure where the time goes on a single getLatestBlockhash call
curl -s -o /dev/null -w 'dns:        %{time_namelookup}s
tcp:        %{time_connect}s
tls:        %{time_appconnect}s
first byte: %{time_starttransfer}s
total:      %{time_total}s\n' \
  -X POST https://your-endpoint.nolimitnodes.com \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"getLatestBlockhash","params":[{"commitment":"confirmed"}]}'

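curl's timers are cumulative from the start of the request, so the handshake cost is the gap between tcp and tls. If the tls line accounts for most of total, your client is paying the handshake tax on every call. In Node, the fix is a single long-lived dispatcher: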

rpc-client.ts

typescript

import { Agent, fetch } from 'undici'

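// One agent for the lifetime of the process. Connections stay open and
// are reused across calls instead of being opened one per request.
// allowH2 asks undici to negotiate HTTP/2 so concurrent calls can share
// a single socket; where it is not enabled, the agent speaks HTTP/1.1
// and spreads calls over up to `connections` sockets.
const agent = new Agent({
  allowH2: true,                 // opt in to HTTP/2 (check your undici version's default)
  keepAliveTimeout: 60_000,      // keep idle sockets for 60s
  keepAliveMaxTimeout: 600_000,  // ceiling for server-sent keep-alive hints
  connections: 16,               // per-origin socket cap
  pipelining: 1,
})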

export async function rpc<T>(method: string, params: unknown[]): Promise<T> {
  const res = await fetch(process.env.RPC_URL!, {
    method: 'POST',
    dispatcher: agent,
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ jsonrpc: '2.0', id: 1, method, params }),
  })
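  // Surface HTTP-level failures (a 429, for example) before parsing JSON.
  if (!res.ok) throw new Error(`${method}: HTTP ${res.status}`)
  const json = (await res.json()) as { result?: T; error?: { message: string } }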
  if (json.error) throw new Error(`${method}: ${json.error.message}`)
  return json.result as T
}

Libraries like @solana/web3.js reuse connections internally if you reuse the Connection object. The classic mistake is constructing a new client per request inside a serverless handler. Every invocation pays the cold tax, and your p50 quietly triples.
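One way to avoid the per-request mistake is to create the client lazily at module scope, so warm invocations reuse it. The following is a minimal sketch, not any platform's official pattern: `once`, `getClient`, and `handler` are hypothetical names, and the object literal stands in for a real client such as a web3.js Connection or the agent above.

```typescript
// Hypothetical helper: create a value on first use, then hand back the
// same instance for every later call in this process.
function once<T>(create: () => T): () => T {
  let instance: T | undefined
  let created = false
  return () => {
    if (!created) {
      instance = create()
      created = true
    }
    return instance as T
  }
}

// Stand-in for an expensive client. `createdCount` exists only to make
// the reuse visible; the URL is a placeholder.
let createdCount = 0
const getClient = once(() => {
  createdCount++
  return { url: 'https://example.invalid' }
})

// Hypothetical handler: module scope survives between warm invocations,
// so only the first call after a cold start constructs the client.
function handler(): string {
  return getClient().url
}

handler()
handler()
console.log(createdCount) // 1
```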

03 Commitment is a freshness knob

Most read methods accept a commitment level, and it changes which bank state answers your query. It doesn't change how fast the node runs, just how old the data you get back is allowed to be:

Commitment	Lag behind tip	Rollback risk	Use it for
processed	0 slots	Yes, the slot can be skipped	Price feeds, MEV, anything you re-check
confirmed	~1–2 slots	Practically none after supermajority vote	Trading logic, balance checks, most apps
finalized	~32 slots (≈13s)	None	Settlement, accounting, compliance

Table 2: Commitment levels and the staleness they imply. Slot lag is relative to the node's view of the tip.

The latency angle: asking for finalized when confirmed would do adds up to 13 seconds of data staleness to every decision your system makes, which dwarfs anything you'll ever save optimizing the wire. We see well-built bots running confirmed reads with processed pre-checks. The fast path speculates, the slow path verifies.
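To put the staleness numbers side by side, here is a small worked calculation. It assumes the approximate 400ms slot time and the upper-bound slot lags from Table 2; `stalenessMs` is an illustrative name, not part of any API.

```typescript
const SLOT_MS = 400 // approximate slot time used throughout this post

// Upper-bound slot lag per commitment level, taken from Table 2.
const lagSlots = { processed: 0, confirmed: 2, finalized: 32 } as const
type Commitment = keyof typeof lagSlots

// Worst-case age of the data behind a read, in milliseconds.
function stalenessMs(commitment: Commitment): number {
  return lagSlots[commitment] * SLOT_MS
}

for (const c of ['processed', 'confirmed', 'finalized'] as const) {
  console.log(c, stalenessMs(c), 'ms')
}
// processed 0 ms
// confirmed 800 ms
// finalized 12800 ms
```

At 12,800ms, a finalized read is as stale as 256 back-to-back requests at the 50ms budget.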

04 The calls that eat your p99

Not all methods are priced equally by reality. getBalance is a hash-map lookup. getProgramAccounts against the SPL Token program is a scan over hundreds of millions of accounts. Same API, same HTTP status code, four orders of magnitude apart.

Taming getProgramAccounts

If you must call getProgramAccounts (gPA), give the node every chance to skip work. dataSize narrows the scan to accounts of one exact size; memcmp filters on bytes at a fixed offset; dataSlice shrinks the response. Here's the canonical “find token accounts for a mint” query done properly:

gpa-filtered.json

json

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "getProgramAccounts",
  "params": [
    "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA",
    {
      "commitment": "confirmed",
      "encoding": "base64",
      "filters": [
        { "dataSize": 165 },
        { "memcmp": { "offset": 0, "bytes": "So11111111111111111111111111111111111111112" } }
      ],
      "dataSlice": { "offset": 64, "length": 8 }
    }
  ]
}
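With that dataSlice, each returned account carries only the 8-byte amount field, a little-endian u64. A minimal decoding sketch follows; the type and function names are illustrative, and `GpaEntry` lists only the response fields used here.

```typescript
// Shape of one getProgramAccounts result entry with base64 encoding
// (only the fields used here).
type GpaEntry = { pubkey: string; account: { data: [string, string] } }

// Decode an 8-byte little-endian u64 from a base64 string.
function decodeU64LE(b64: string): bigint {
  const raw = atob(b64)
  if (raw.length !== 8) throw new Error(`expected 8 bytes, got ${raw.length}`)
  const view = new DataView(new ArrayBuffer(8))
  for (let i = 0; i < 8; i++) view.setUint8(i, raw.charCodeAt(i))
  return view.getBigUint64(0, true) // true = little-endian
}

// Map each token account address to its raw amount in base units.
function amountsByAccount(entries: GpaEntry[]): Map<string, bigint> {
  const out = new Map<string, bigint>()
  for (const e of entries) out.set(e.pubkey, decodeU64LE(e.account.data[0]))
  return out
}

// 1,000,000 as a little-endian u64 is 40 42 0f 00 00 00 00 00,
// which base64-encodes to "QEIPAAAAAAA=".
console.log(decodeU64LE('QEIPAAAAAAA=')) // 1000000n
```

Amounts are kept as bigint because a u64 can exceed the range a JavaScript number represents exactly.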

Beyond gPA, the usual suspects in a slow trace:

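getBlock with full transaction detail: megabytes of JSON per call. Set transactionDetails to "signatures" (or "none") and rewards to false when you don't need the bodies. maxSupportedTransactionVersion only controls which transaction versions the node will return, not how much data comes back.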
getSignaturesForAddress on hot addresses: paginating a DEX program's history through RPC is archaeology with a teaspoon. Historical questions belong in datasets, not request loops.
Polling anything: if you call the same method every 400ms to detect change, you have reinvented a push feed, badly. That workload is what gRPC streams are for.
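For the getBlock case, trimming the response is a matter of request parameters. The sketch below builds a lean request body; `leanGetBlock` is a hypothetical helper, while the config fields are the ones getBlock documents.

```typescript
type GetBlockConfig = {
  commitment: 'confirmed' | 'finalized'
  transactionDetails: 'full' | 'accounts' | 'signatures' | 'none'
  rewards: boolean
  maxSupportedTransactionVersion: number
}

// Build a getBlock request that returns signatures only.
function leanGetBlock(slot: number, id = 1) {
  const config: GetBlockConfig = {
    commitment: 'confirmed',
    transactionDetails: 'signatures',  // no transaction bodies or metadata
    rewards: false,                    // skip the rewards array
    maxSupportedTransactionVersion: 0, // accept versioned transactions
  }
  return { jsonrpc: '2.0', id, method: 'getBlock', params: [slot, config] }
}

// The slot number here is arbitrary.
console.log(JSON.stringify(leanGetBlock(250_000_000)))
```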

05 Queueing: the silent killer

Here's the part of the request path nobody benchmarks: what happens between your packet arriving at the provider's edge and the node beginning to execute it. Under load, oversubscribed providers make a choice. They can reject excess requests with a 429, or queue them so success rates look clean while latency quietly bleeds. Queueing is invisible in uptime dashboards and very visible in your fill prices.
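One way to make silent queueing visible from the client side is to put a hard deadline on every call, so a request stuck in a queue shows up as an error rather than a slow success. This is a sketch under assumptions: `withDeadline` and `DeadlineExceeded` are hypothetical names, and the race only stops waiting; it does not cancel the underlying request.

```typescript
class DeadlineExceeded extends Error {
  label: string
  budgetMs: number
  constructor(label: string, budgetMs: number) {
    super(`${label} exceeded ${budgetMs}ms budget`)
    this.name = 'DeadlineExceeded'
    this.label = label
    this.budgetMs = budgetMs
  }
}

// Resolve with the work's result, or reject once the budget has elapsed.
function withDeadline<T>(label: string, budgetMs: number, work: Promise<T>): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new DeadlineExceeded(label, budgetMs)), budgetMs)
  })
  // Clear the timer either way so it never keeps the process alive.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer))
}
```

Wrapping a call as `withDeadline('getBalance', 50, somePromise)` and counting the `DeadlineExceeded` rejections gives a queueing signal that success-rate dashboards miss.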

Your users don't experience your median. They experience your p99, at the worst possible moment, when everyone else is also hammering the chain.

the rule we design our fleet around

This is why we publish percentiles instead of averages. It's also why you should measure them yourself; the harness is twenty lines:

bench.ts

typescript

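// p50 tells you about your demo. p99 tells you about production.
import { rpc } from './rpc-client' // the helper from rpc-client.ts; adjust the path

const N = 1_000
const samples: number[] = []

// Warm-up: pay connection setup once so it doesn't land in the samples.
await rpc('getLatestBlockhash', [{ commitment: 'confirmed' }])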

for (let i = 0; i < N; i++) {
  const t0 = performance.now()
  await rpc('getLatestBlockhash', [{ commitment: 'confirmed' }])
  samples.push(performance.now() - t0)
}

samples.sort((a, b) => a - b)
const pct = (p: number) => samples[Math.min(N - 1, Math.floor((p / 100) * N))]

console.table({
  p50: pct(50).toFixed(1) + ' ms',
  p90: pct(90).toFixed(1) + ' ms',
  p99: pct(99).toFixed(1) + ' ms',
  max: samples[N - 1].toFixed(1) + ' ms',
})
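To see why the average is the wrong number to watch, consider a synthetic sample where 2 requests in 100 hit a queue. The data below is invented for illustration; the percentile rule is the same index rule bench.ts uses.

```typescript
// Same index rule as bench.ts, as a standalone function.
function percentile(sorted: number[], p: number): number {
  const i = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))
  return sorted[i]
}

// Synthetic data: 98 requests at 12ms, 2 that sat in a queue for 400ms.
const latencies: number[] = [...Array(98).fill(12), 400, 400].sort((a, b) => a - b)
const mean = latencies.reduce((sum, x) => sum + x, 0) / latencies.length

console.log(mean)                      // 19.76
console.log(percentile(latencies, 50)) // 12
console.log(percentile(latencies, 99)) // 400
```

The mean moves by under 8ms and the median not at all, while the p99 reports the full 400ms that those two requests actually experienced.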

06 How we engineer the path

Everything above applies to any provider. Here's what we specifically do with the stages we control:

No artificial queues. Flat pricing with no request caps means we never have to slow-walk your traffic to protect a metering tier. Capacity planning is our problem, by design.
Heavy methods run on segregated pools. A whale calling unfiltered gPA lands on scan-optimized nodes and can't head-of-line-block your getLatestBlockhash.
HTTP/2 at the edge, hot state at the node. Handshakes are amortized, and the methods that dominate real traffic are answered from memory-resident bank state.
Push over poll wherever possible. Our WebSocket and Yellowstone gRPC endpoints exist precisely so your hot path never contains a polling loop.

Method	p50	p99
getLatestBlockhash	11ms	34ms
getBalance	10ms	31ms
getMultipleAccounts (32 keys)	14ms	47ms
getTransaction	16ms	52ms
simulateTransaction	24ms	78ms
getProgramAccounts (filtered, dataSize + memcmp)	89ms	310ms

Table 3: Round-trip latency by method against a NoLimitNodes shared endpoint. 1,000 sequential requests per method, warm HTTP/2 connection, same-region client, confirmed commitment. Reproduce with bench.ts above.

Note the last row: even filtered, gPA costs about 8× a state read at p50. No provider engineering makes a scan free. The honest fix is to move standing queries off the request path entirely, which is the subject of our streaming field guide.
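As a rough illustration of moving a standing query off the request path, the sketch below builds an accountSubscribe message for a WebSocket endpoint and parses the notifications that come back. The helper names and the `AccountUpdate` shape are made up for this example; the message formats follow the standard Solana WebSocket API.

```typescript
// JSON-RPC message that opens an account subscription.
function accountSubscribeMessage(pubkey: string, id = 1): string {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'accountSubscribe',
    params: [pubkey, { encoding: 'base64', commitment: 'confirmed' }],
  })
}

type AccountUpdate = { slot: number; lamports: number; subscription: number }

// Returns null for anything that isn't an account notification,
// such as the acknowledgement that carries the subscription id.
function parseAccountNotification(raw: string): AccountUpdate | null {
  const msg = JSON.parse(raw)
  if (msg.method !== 'accountNotification') return null
  const { result, subscription } = msg.params
  return { slot: result.context.slot, lamports: result.value.lamports, subscription }
}
```

In use, the subscribe message is sent once when the socket opens, and every later change arrives as a push; no request is issued per slot.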

07 A latency checklist

Before you file a ticket that says “RPC is slow,” run down this list:

One long-lived client per process, never per request. Verify with the curl breakdown.
Deploy in the same region as your endpoint. Physics is undefeated; 120ms of RTT can't be engineered away downstream.
Use confirmed by default. Escalate to finalized only where settlement demands it.
Filter every gPA with dataSize + memcmp, and question why it's on the hot path at all.
Batch point reads with getMultipleAccounts instead of fanning out singles.
Replace polling loops with subscriptions. WebSocket for convenience, gRPC for the last milliseconds.
Measure p99 from production conditions, continuously, not once during vendor selection.
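The batching item can be sketched as follows. getMultipleAccounts accepts at most 100 keys per call, so a larger set has to be chunked; `chunk` and `multipleAccountsRequests` are hypothetical helpers, and the commitment and encoding shown are just example choices.

```typescript
const MAX_KEYS_PER_CALL = 100 // getMultipleAccounts limit per request

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size))
  return out
}

// One request body per chunk, instead of one getAccountInfo per key.
function multipleAccountsRequests(pubkeys: string[]) {
  return chunk(pubkeys, MAX_KEYS_PER_CALL).map((keys, i) => ({
    jsonrpc: '2.0',
    id: i + 1,
    method: 'getMultipleAccounts',
    params: [keys, { commitment: 'confirmed', encoding: 'base64' }],
  }))
}

// 250 placeholder keys become 3 requests rather than 250.
const placeholderKeys = Array.from({ length: 250 }, (_, i) => `key-${i}`)
console.log(multipleAccountsRequests(placeholderKeys).length) // 3
```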

A sub-50ms request isn't one trick. It's the absence of a dozen small taxes: handshakes you didn't reuse, queues you didn't see, scans you didn't filter, staleness you didn't choose. Every stage of the path is measurable, and everything measurable is fixable. The numbers in this post run against our public endpoints; we'd genuinely like you to check them.

