Solana Historical Blocks vs Live Streams: When to Use Which (2026)

Solana historical blocks vs Yellowstone gRPC live streams: three production failures, a use-case decision table, and the hybrid catch-up pattern explained.

NoLimitNodes Engineering

Infrastructure Team

Jun 27, 2026 · 12 min read


Three teams. Three different bugs. Same root cause.

Team 1: Liquidation bot fires on positions already cleared. The positions appeared open in their data. They had been closed 11 slots earlier. The bot processed a Yellowstone stream that dropped during a validator rotation. The gap was silent: no error, no alert, just stale state.

Team 2: Arb detector flags price divergence between Raydium and Orca. The signal is real in the data. Not real on-chain. The team built their backtest against a recorded WebSocket feed. The recording missed 40 competing transactions that failed. Their backtest showed an edge that never existed.

Team 3: Indexer running off a block archive is accurate to the previous midnight. A user queries it for a wallet that moved 2.3 SOL 14 minutes earlier. The indexer returns the pre-transfer balance. Six support tickets over three days before the team traced it to the data source. They thought they had a historical data problem. They had a real-time data problem.

All three chose the wrong data source for the job. Historical blocks and live streams aren't interchangeable. They solve different problems.

Timeline showing Solana archive coverage from genesis to previous midnight, the gap zone Team 3 queried into, and the live stream head at the current slot

01 Completeness vs Recency: What Each Tool Actually Guarantees

Historical blocks are a complete record: every slot from genesis to the previous midnight, every transaction in each slot, the pre- and post-transaction balance of every account a transaction touched, and every failed attempt. Nothing is missing, because the archive is built from the chain itself.

Live streams are a present-tense channel. Yellowstone gRPC, built as a Geyser plugin running inside the validator process, delivers events within 5–50ms of inclusion. That speed requires a tradeoff: if you're not connected when an event occurs, you don't receive it. The stream is current. It's not complete.

The question isn't which tool is better. It's which property your use case requires. If you need to know what happened, you need history. If you need to act before someone else does, you need a stream. Building a backtest on a stream or a live trading bot on an archive are both broken by design, not by implementation.

02 Why Team 2's Backtest Was Wrong

WebSocket recordings feel like historical data. They're not.

When you record a live feed to replay later, whether over WebSocket or Yellowstone gRPC, you capture exactly what your subscription matched while you were connected. On Yellowstone, you don't capture failed transactions unless your filter explicitly sets failed: true. You don't capture events on programs you weren't subscribed to at the time. You don't capture slots where your connection dropped.

On PumpFun during a hot launch, failed transactions typically outnumber successful ones by somewhere between 4:1 and 9:1. Every bot that tried to snipe and lost generates a failed transaction. Those failures are the competition. A backtest that doesn't see them is modeling a market with no other participants. The strategy will look profitable. It won't be.

The block archive captures all of it. Here's what the difference looks like in practice:

audit-block.py

python

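import json
from pathlib import Path

VOTE_PROGRAM = "Vote111111111111111111111111111111111111111"
PUMPFUN      = "6EF8rrecthR5Dkzon8Nwu78hRvfCKubJ14M5uBEwF6P"

def audit_block(block_path: str) -> dict:
    """Count PumpFun transactions, and how many of them failed, in one archived block."""
    with open(block_path) as f:
        block = json.load(f)

    # A getBlock result does not carry its own slot number, and parentSlot + 1
    # is wrong whenever the preceding slot was skipped. Take the slot from the
    # file name instead, assuming the block_<slot>.json naming used below.
    slot = int(Path(block_path).stem.rsplit("_", 1)[-1])

    # Drop vote transactions so they don't swamp the counts.
    # With "json" encoding, accountKeys is a plain list of base58 strings.
    txs = [
        tx for tx in block.get("transactions", [])
        if VOTE_PROGRAM not in tx["transaction"]["message"]["accountKeys"]
    ]

    pumpfun_total  = 0
    pumpfun_failed = 0

    for tx in txs:
        accounts = tx["transaction"]["message"]["accountKeys"]
        if PUMPFUN not in accounts:
            continue
        pumpfun_total += 1
        # meta.err is null for a successful transaction
        if tx["meta"]["err"] is not None:
            pumpfun_failed += 1

    return {
        "slot":              slot,
        "non_vote_txs":      len(txs),
        "pumpfun_txs":       pumpfun_total,
        "pumpfun_failed":    pumpfun_failed,
        "pumpfun_fail_rate": pumpfun_failed / pumpfun_total if pumpfun_total else 0,
    }

result = audit_block("./blocks/block_280000000.json")
print(f"Slot {result['slot']}: {result['pumpfun_txs']} PumpFun txs, "
      f"{result['pumpfun_fail_rate']:.0%} failed")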

Run this on a few blocks during active launch periods. The failure rate will recalibrate how you think about competition. We've run it on blocks from a dozen PumpFun launches. The number is reliably 80–90%. A WebSocket recording will never show you this. If you're building a full backtest on top of the archive, our Solana backtesting guide covers the complete DuckDB and Python workflow.
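To go from one block to a few, the per-block results can be pooled. The following is a minimal sketch, assuming each input dict carries the pumpfun_txs and pumpfun_failed keys that audit_block returns; summarize_audits is a hypothetical helper name, and the sample numbers are invented purely to show the arithmetic.

```python
def summarize_audits(results: list[dict]) -> dict:
    """Pool per-block audit results into one failure rate and a failed:successful ratio."""
    total = sum(r["pumpfun_txs"] for r in results)
    failed = sum(r["pumpfun_failed"] for r in results)
    succeeded = total - failed
    return {
        "blocks": len(results),
        "pumpfun_txs": total,
        "pumpfun_failed": failed,
        # Pooled rate: each block is weighted by its transaction count
        "fail_rate": failed / total if total else 0.0,
        # None when nothing succeeded, to avoid dividing by zero
        "failed_per_success": failed / succeeded if succeeded else None,
    }


# Illustrative numbers only, not measurements
sample = [
    {"pumpfun_txs": 40, "pumpfun_failed": 34},
    {"pumpfun_txs": 60, "pumpfun_failed": 51},
]
summary = summarize_audits(sample)
print(f"{summary['fail_rate']:.0%} failed, "
      f"{summary['failed_per_success']:.1f} failed per success")
```

Pooling the counts, rather than averaging per-block rates, keeps a block with three PumpFun transactions from weighing as much as one with three hundred.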

03 Why Team 1's Bot Fired on Cleared Positions

The live stream isn't a historical record. It's a present-tense channel that requires you to maintain state.

When Team 1's Yellowstone connection dropped during a validator leader rotation, they missed 11 slots. The position close event was in one of those slots. Their consumer reconnected, resumed processing, and never received the close. Their internal state said the position was still open. The bot acted on stale state.

This isn't a bug in Yellowstone. The protocol delivers events once to connected consumers. It makes no promise about what you missed while disconnected. The gap detection responsibility falls entirely on the consumer.

The fix is two lines of logic wrapped around every message you process. If you're building the subscription side from scratch, our Yellowstone Python guide covers authentication, proto generation, and keepalive configuration before you reach this point.

stream-with-gap-detection.py

python

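import asyncio

import aiohttp
import grpc
import geyser_pb2
import geyser_pb2_grpc

ENDPOINT = "your-endpoint.nolimitnodes.com:10000"
API_KEY  = "your-api-key"
RPC      = "https://api.mainnet-beta.solana.com"
KAMINO   = "KLend2g3cP87fffoy8q1mQqGKjrxjC8boSyAYavgmjD"

async def stream_kamino_with_gap_detection() -> None:
    # Kept outside the reconnect loop: a fresh connection must compare against
    # the last slot seen before the drop, or the gap goes unnoticed.
    last_slot: int = 0

    credentials = grpc.ssl_channel_credentials()
    options = [
        ("grpc.keepalive_time_ms",            10_000),
        ("grpc.keepalive_timeout_ms",          5_000),
        ("grpc.keepalive_permit_without_calls",    1),
    ]

    request = geyser_pb2.SubscribeRequest(
        # Slot updates drive gap detection. A filtered transaction feed skips
        # every slot with no Kamino transaction, so its slot numbers alone
        # cannot tell a quiet slot from a missed one.
        slots={"slots": geyser_pb2.SubscribeRequestFilterSlots()},
        transactions={
            "kamino": geyser_pb2.SubscribeRequestFilterTransactions(
                account_include=[KAMINO],
                failed=True,
            )
        },
    )

    while True:
        try:
            async with grpc.aio.secure_channel(ENDPOINT, credentials, options=options) as channel:
                stub = geyser_pb2_grpc.GeyserStub(channel)

                async for update in stub.Subscribe(
                    iter([request]),
                    metadata=[("x-api-key", API_KEY)],
                ):
                    if update.HasField("transaction"):
                        process_update(update)
                        continue
                    if not update.HasField("slot"):
                        continue

                    slot = update.slot.slot
                    if slot <= last_slot:
                        continue   # same slot again at a later commitment level

                    if last_slot > 0 and slot > last_slot + 1:
                        missed = list(range(last_slot + 1, slot))
                        print(f"WARNING: missed slots {missed[0]} to {missed[-1]} ({len(missed)} slots)")
                        await backfill_from_archive(missed)

                    last_slot = slot
        except grpc.aio.AioRpcError as err:
            print(f"stream dropped ({err.code()}); reconnecting")
            await asyncio.sleep(1)


async def backfill_from_archive(slots: list[int]) -> None:
    """Fetch each missed slot with getBlock and replay it."""
    # One HTTP session for the whole gap instead of one per slot
    async with aiohttp.ClientSession() as session:
        for s in slots:
            payload = {"jsonrpc": "2.0", "id": 1, "method": "getBlock",
                       "params": [s, {"encoding": "json",
                                      "transactionDetails": "full",
                                      "rewards": False,
                                      "maxSupportedTransactionVersion": 0}]}
            async with session.post(RPC, json=payload) as resp:
                data = await resp.json()
            # Skipped slots come back as an error with no "result"
            block = data.get("result")
            if block:
                process_block_from_rpc(s, block)


def process_update(update) -> None:
    pass   # your logic here

def process_block_from_rpc(slot: int, block: dict) -> None:
    pass   # your logic here

if __name__ == "__main__":
    asyncio.run(stream_kamino_with_gap_detection())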

Without backfill_from_archive, the gap is silent. The bot continues. The state is wrong. The only signal is incorrect behavior downstream, which may not surface until many slots, or several minutes, later.

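One edge case before you ship: not every slot jump is a missed event. Solana legitimately skips slots: when the scheduled leader produces no block, that slot stays empty. The index.parquet file in the archive marks these with tx_count = 0. Before triggering a backfill, check whether the missing slots are all empty. If every slot in the gap has tx_count = 0, there's nothing to fetch. The chain produced no transactions in that range. A filtered subscription adds a second false positive: slots with no matching transaction never appear in a transaction feed, so measure gaps against slot updates rather than transaction slots.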
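The skipped-slot check can be sketched as a small filter that runs before any backfill. This assumes the slot and tx_count columns of index.parquet have already been loaded into a plain dict; slots_needing_backfill is a hypothetical helper, and the slot numbers are made up. Because a fresh gap may be more recent than the index covers, a slot absent from the index is treated here as unknown and kept.

```python
def slots_needing_backfill(missed: list[int], tx_counts: dict[int, int]) -> list[int]:
    """Return the missed slots that may hold transactions.

    tx_counts maps slot -> tx_count as read from the archive index.
    A slot missing from the index is treated as unknown and kept,
    since the index may simply not cover it yet.
    """
    return [s for s in missed if tx_counts.get(s, 1) > 0]


# Hypothetical gap 11-14: two skipped slots, one with transactions, one not indexed
missed = [11, 12, 13, 14]
index_rows = {11: 0, 12: 0, 13: 7}
print(slots_needing_backfill(missed, index_rows))
```

Only slots 13 and 14 survive the filter here: the first has transactions, the second is not in the index and so cannot be ruled out.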

Slot sequence diagram showing received slots 008-010, a missed gap of slots 011-014 where the connection dropped, then resumed at slot 015 with gap detection triggered

04 Why Team 3's Indexer Gave a Wrong Balance

The archive is correct. It's also a day old.

Team 3's indexer was built correctly for their original use case: batch analytics against historical data. When they added a real-time balance query endpoint, they kept the same data source. An archive updated at midnight can't answer questions about 14 minutes ago. The data isn't wrong. It simply doesn't exist yet.

The user who filed those six tickets wasn't misusing the product. They queried a balance endpoint and got a balance. The balance was real, as of the archive cutoff at the previous midnight. There was no error message, no stale-data warning, no indication anything was off. The gap between the archive cutoff and the query time is invisible unless the system is designed to surface it. We've seen teams spend days debugging this before realizing the data source was the wrong choice entirely.

This is a use-case mismatch, not a data quality problem. The archive is complete up to its cutoff. The cutoff isn't now.
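One way to make the cutoff visible is to return it with every archive-derived answer. The sketch below is illustrative only: it assumes the archive's cutoff is the most recent UTC midnight, and archive_cutoff, balance_response, and the response field names are hypothetical rather than part of any existing API.

```python
from datetime import datetime, timezone


def archive_cutoff(now: datetime) -> datetime:
    """Most recent UTC midnight at or before `now` (expects a timezone-aware datetime)."""
    now_utc = now.astimezone(timezone.utc)
    return now_utc.replace(hour=0, minute=0, second=0, microsecond=0)


def balance_response(lamports: int, now: datetime) -> dict:
    """Wrap an archive-derived balance with the moment it is valid for."""
    cutoff = archive_cutoff(now)
    age = now.astimezone(timezone.utc) - cutoff
    return {
        "lamports": lamports,
        "as_of": cutoff.isoformat(),
        # How far behind "now" this answer is, so callers can reject stale data
        "staleness_seconds": int(age.total_seconds()),
    }


now = datetime(2026, 6, 27, 14, 0, tzinfo=timezone.utc)
print(balance_response(5_000_000_000, now))
```

A caller that sees a staleness of several hours can then decide for itself whether the answer is usable, instead of silently treating it as current.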

If you need current state, you need a live stream. An indexer that must serve both historical analysis and current balances requires both sources: archive for catch-up and history, live stream for current state. Trying to extend the archive's coverage window (re-running it every hour, polling getBlock in near-real-time) produces neither the accuracy of the archive nor the latency of a stream. It produces a slower, more expensive version of the wrong tool.

05 When to Use Historical Blocks vs Live Streams

Each of those failures came down to one bad call at the design stage. Here's the full use-case breakdown:

Use Case	Right Tool	Why
Backtesting any strategy	Historical blocks	Needs failed txs and complete slot sequence
MEV / arb execution	Live stream	Needs sub-slot latency
Token launch sniping (simulation)	Historical blocks	Competition visible in failed txs
Token launch sniping (live)	Live stream	Must detect within current slot
Liquidation bot (backtesting)	Historical blocks	Account state replay from pre/postBalances
Liquidation bot (live)	Live stream + gap detection	Stale state = wrong fires
Incident investigation	Historical blocks	Reconstruct what actually happened
Index initial sync	Historical blocks	Live stream cannot supply past slots
Index ongoing updates	Live stream	Archive lag too high for current state
Compliance / audit trail	Historical blocks	Immutable, deterministic, verifiable
Real-time price monitoring	Live stream	Sub-second updates required
Gap backfill in live pipeline	Historical blocks	Fill what the stream missed

Use-case decision table: historical blocks vs Yellowstone gRPC live streams.

The one trap in this table: “live stream” and “historical blocks” aren't always an either-or choice. There's a third mode: running both at once.

Decision flowchart: Need data before last midnight? Yes leads to Archive. No leads to: Need to act within the current slot? Yes leads to Live Stream with gap detection. No leads to Hybrid Pattern.

06 The Hybrid: Running Both Without a Gap

Any indexer that must be accurate to the current slot eventually needs this pattern. Same goes for any strategy pipeline that backtests historically then runs live.

The naive approach: finish historical processing, then start the live stream. The problem is timing. By the time the archive run completes, the stream's current slot is hours or days ahead. There's a gap between where the archive ends and where the stream picks up.

The right approach is an overlap window.

Start the Yellowstone subscription first, before touching the archive. Let it buffer into a queue while you do nothing else with it. That queue will grow while historical processing runs, and that growth is the point.

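Then process historical blocks from your target start slot forward. The live buffer accumulates in the background the whole time. When historical processing reaches the slot where the buffer started, that's your switchover. If the archive's cutoff falls before the buffer's first slot, fetch the slots in between with getBlock, as in the gap backfill in section 03, before switching over.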

At that point, drain the buffer. Any slot showing up in both sources: take the archive version. It's complete, verified against the chain, already processed. The buffer copy is a duplicate. Discard it.

Once the buffer is empty and you're processing the live head, shut down the historical reader. There's nothing left for it to do.

hybrid-pattern.py

python

import json
import threading
from collections import deque
from pathlib import Path

import duckdb

# Shared state between the live subscriber thread and the historical reader
live_queue:        deque = deque()
queue_lock               = threading.Lock()
buffer_start_slot: int   = 0   # first slot the live buffer saw; 0 until then

def enqueue_live_event(slot: int, tx: dict) -> None:
    """Called by the live subscriber for every event it receives."""
    global buffer_start_slot
    with queue_lock:
        if buffer_start_slot == 0:
            buffer_start_slot = slot   # the switchover target for the historical run
        live_queue.append((slot, tx))

def process_historical_range(archive_dir: str,
                              start_slot: int,
                              stop_at_slot: int) -> None:
    """Replay archived blocks in slot order, skipping empty slots."""
    con = duckdb.connect()
    # Slot bounds are passed as query parameters rather than formatted into the SQL
    rows = con.execute(f"""
        SELECT slot, file_name
        FROM read_parquet('{archive_dir}/index.parquet')
        WHERE slot BETWEEN ? AND ?
          AND tx_count > 0
        ORDER BY slot
    """, [start_slot, stop_at_slot]).fetchall()

    for slot, fname in rows:
        with open(Path(archive_dir) / fname) as f:
            block = json.load(f)
        process_block(slot, block)

def drain_overlap_and_go_live(catchup_slot: int) -> None:
    """Drain the buffer, dropping slots the archive already covered."""
    while True:
        with queue_lock:
            if not live_queue:
                break   # buffer empty: handle new events as they arrive from here on
            slot, tx = live_queue.popleft()
        if slot <= catchup_slot:
            continue    # already covered by archive
        process_live_tx(slot, tx)

def process_block(slot: int, block: dict) -> None:
    pass   # your logic here

def process_live_tx(slot: int, tx: dict) -> None:
    pass   # your logic here

Skip the overlap and you create an uncovered window: slots too recent for the archive and too old for a fresh live subscription. That window is where production incidents hide. We've seen this exact gap cause incidents in pipelines that looked correct on paper. The overlap isn't a performance optimization. It's a correctness requirement.
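The effect of the overlap is easy to check on toy data. The sketch below is a simplified model, not the pipeline itself: merge_sources and uncovered_slots are hypothetical helpers, the slot numbers are invented, and every slot is assumed to hold a block.

```python
def merge_sources(archive_slots: list[int],
                  live_events: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Archive wins for every slot up to its last one; live events continue after that."""
    catchup_slot = max(archive_slots)
    merged = [(slot, "archive") for slot in sorted(archive_slots)]
    merged += [(slot, tx) for slot, tx in live_events if slot > catchup_slot]
    return merged


def uncovered_slots(last_archive_slot: int, first_live_slot: int) -> list[int]:
    """Slots neither source covers; empty when the two sources overlap or touch."""
    return list(range(last_archive_slot + 1, first_live_slot))


archive = [100, 101, 102, 103, 104, 105]
buffered = [(103, "tx-a"), (104, "tx-b"), (106, "tx-c"), (107, "tx-d")]

merged = merge_sources(archive, buffered)
print([slot for slot, _ in merged])   # slots 103 and 104 appear once, from the archive
print(uncovered_slots(105, 103))      # buffer started before the archive ended
print(uncovered_slots(105, 109))      # subscription started too late
```

With the buffer started early, the merged sequence has no duplicates and no holes. Start the subscription after the historical run instead and slots 106 through 108 belong to neither source.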

Three-phase hybrid pattern diagram: Phase 1 live stream starts and buffers, Phase 2 overlap window where historical processing catches up and buffer is drained with deduplication, Phase 3 live stream only

07 Before You Write the First Line

Most teams that get this wrong aren't confused about what the tools do. They're confused about what their use case actually needs.

The first thing to settle is time. Do you need data from before this morning? Backtesting, incident investigation, compliance, index bootstrapping: all of these reach back past the last midnight. A live stream can't supply those slots. There's no workaround. The archive is the only source.

The second thing to settle is latency. Do you need to act within the current slot? If yes, the archive is out entirely. Not because it's slow for an archive. Because it isn't designed for execution at all. A pipeline that pulls historical blocks and then tries to fire trades is wrong by construction, not by configuration.

The third question is the one that causes the most production bugs: do you need failed transactions? The archive always includes them. The live stream includes them only if you explicitly set failed: true in your filter. Miss that flag and you're building the same backtest Team 2 built. A model where every competing bot succeeds, every snipe lands, and the market has no friction. That model doesn't exist on-chain.

One edge case production indexers eventually hit: validator forks. Solana occasionally produces competing blocks for the same slot. Your live stream may process events from the non-canonical fork before the chain resolves. When that slot appears in the historical archive, it contains the canonical block. Any state derived from the non-canonical fork needs to be rolled back. If your live-derived state diverges from the archive for the same slot, a fork resolution is the most likely explanation, not a data quality bug.
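A cheap way to spot this is to compare block hashes per slot once the archive catches up. The sketch below assumes you record the blockhash your live pipeline saw for each slot, by whatever means your subscription exposes it, and that you read the blockhash field from the archived getBlock result; divergent_slots is a hypothetical helper and the hashes are placeholders.

```python
def divergent_slots(live_blockhashes: dict[int, str],
                    archive_blockhashes: dict[int, str]) -> list[int]:
    """Slots where the live pipeline saw a different block than the archive holds."""
    return sorted(
        slot
        for slot, live_hash in live_blockhashes.items()
        if slot in archive_blockhashes and archive_blockhashes[slot] != live_hash
    )


# Placeholder hashes: slot 202 was seen on a fork that did not become canonical
live = {200: "hashA", 201: "hashB", 202: "hashX", 203: "hashD"}
archive = {200: "hashA", 201: "hashB", 202: "hashC"}
print(divergent_slots(live, archive))
```

Slots the archive has not reached yet, like 203 here, are left out rather than flagged. Any slot this returns is a candidate for rolling back live-derived state and replaying from the archive copy.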

If time and latency both matter (you need current accuracy built on a historical foundation), you're not choosing between the two tools. You need the hybrid pattern described above, running both with an overlap window.
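The questions above can be written down as a small function, if only to force the requirements to be stated explicitly. This is a sketch under simplifying assumptions: the latency question is collapsed into a single needs_current_state flag, and choose_source, its parameters, and its return shape are hypothetical.

```python
def choose_source(needs_past_slots: bool,
                  needs_current_state: bool,
                  needs_failed_txs: bool = False) -> dict:
    """Map the three design questions to a data source plus reminders."""
    if not (needs_past_slots or needs_current_state):
        raise ValueError("state at least one requirement")

    if needs_past_slots and needs_current_state:
        source = "hybrid"
    elif needs_past_slots:
        source = "historical_blocks"
    else:
        source = "live_stream"

    notes = []
    if source != "historical_blocks":
        # Anything involving a stream inherits the stream's failure modes
        notes.append("add gap detection and backfill")
        if needs_failed_txs:
            notes.append("set failed: true in the transaction filter")
    return {"source": source, "notes": notes}


print(choose_source(needs_past_slots=True, needs_current_state=False))
print(choose_source(needs_past_slots=True, needs_current_state=True, needs_failed_txs=True))
```

The archive-only case carries no notes because the archive already includes failed transactions and has no connection to drop.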

08 Frequently Asked Questions

What is the difference between Solana historical blocks and live streams?

Two different guarantees. Historical blocks have every transaction from genesis to the previous UTC midnight: deterministic, complete, failed attempts included. Live streams deliver in milliseconds but don't replay. Disconnect and you lose the gap. One is a record. The other is a channel.

When should I use Solana historical block data?

Backtesting, incident investigation, compliance, index bootstrapping. Anything that needs failed transactions. If your question is “what happened,” the archive is the only honest answer. Same slot range always returns the same result. That determinism matters more than you'd think until you need it.

When should I use Yellowstone gRPC live streams?

Execution workloads: MEV, live arb, token launch sniping, liquidations, price alerts. The archive stops at the previous UTC midnight, so it can lag by up to a full day. You can't trade off a day-old ledger.

Can I use WebSocket recordings as a substitute for historical block data?

No. They miss failed transactions, programs you weren't subscribed to, and everything during connection drops. On PumpFun during a launch, that's 80–90% of activity. A backtest built on recordings models a market that doesn't exist.

What is a missed event in a Yellowstone gRPC stream?

Any transaction that happened while you were disconnected. gRPC doesn't replay. You reconnect, you've lost that window. Track slot sequence numbers, detect the jump, backfill from the archive.

How do I detect a gap in a Yellowstone gRPC stream?

Watch the slot sequence. Receive slot 010 then 015? You missed 011 through 014. Fetch via getBlock or from the archive before continuing. Check index.parquet first: if tx_count = 0 for those slots, it's a validator skip and there's nothing to backfill.

What is the NLN Historical Raw Blocks archive?

Complete getBlock archive from Solana genesis to previous UTC midnight. Delivered as tar.zst via signed URLs. index.parquet maps slot numbers to filenames with tx_counts so you can query slot ranges without downloading everything.

What use cases require both historical blocks and live streams?

Indexers (archive for sync, stream for ongoing updates), strategy pipelines (archive for backtesting, stream for execution), and any live pipeline that needs gap backfill. If you need a historical foundation with current accuracy, you're running both. That's the hybrid pattern.

If your pipeline needs the complete Solana ledger from genesis through yesterday, NLN Historical Raw Blocks delivers it with a slot index and per-file SHA-256 manifest. For the live side, NLN Yellowstone gRPC runs on owned bare metal in Frankfurt with decoded events across 37 programs, 1,074 typed event types, and no per-event metering on higher tiers. Still evaluating Yellowstone providers? Our 2026 provider comparison benchmarks six options on latency, decoded event coverage, and pricing structure.

