Solana Geyser Plugin (2026): Build, Deploy & 6 Failure Modes

How the Agave Geyser plugin interface works, a complete Rust build from cargo new to mainnet, and the six production failure modes we have debugged on real customer plugins: variant mismatch, silent panic, startup double-count, and more.

NoLimitNodes Engineering

Infrastructure Team

Jun 17, 2026 · 22 min read


We host Geyser plugins for a living, which means we've debugged more broken ones than we'd like to admit. Variant mismatches caught at 2am. Data-integrity issues traced back to five missing lines of code. And the one that still bothers me: a plugin that silently lost data for two weeks while the validator kept reporting itself as completely healthy. No alerts. No errors. The data just stopped coming, and we found it during a routine audit. The variant mismatch is annoying; the silent panic is the one nobody warns you about. If you read nothing else, read failure mode 3.

01 What the Solana Geyser plugin actually is

A Geyser plugin is a Rust .so (a shared library) that the Agave validator loads at startup and runs inside its own process. It gets callbacks when accounts are written, when transactions confirm, when slots change status. Your code sees state at the moment the runtime commits it, before the RPC layer knows anything happened.

Geyser path vs RPC path latency


slot fires
  geyser path:  your .so runs in-process                  < 1ms
  rpc path:     RPC -> JSON/WebSocket -> your app          ~10-150ms

Fig. 1: A Geyser plugin runs in-process. The RPC layer sees state after the Geyser callback has already fired.

If you've been polling getProgramAccounts at 250ms, or using accountSubscribe over WebSocket, you're on the RPC path. A Geyser plugin bypasses it entirely.

The interface is maintained by Anza (the team that took over Solana Labs' validator work) and ships as agave-geyser-plugin-interface on crates.io, currently at version 3.1.5. If you want a quick definition before diving in, our Geyser plugin glossary entry covers the basics. The Solana Labs repository was archived January 22, 2025. Before you add a single dependency, point it at anza-xyz/agave, not the old solana-geyser-plugin-interface crate. Nobody is maintaining that anymore.

Yellowstone gRPC (what most people in the Solana space loosely call “the streaming gRPC thing”) is itself a Geyser plugin. When you stream from Helius, Alchemy, Triton, or our own gRPC endpoint, the data started as a callback in code structurally identical to what you'd write here.

02 Why it exists

Validators were falling behind consensus because people were using them as databases. A validator fielding thousands of getProgramAccounts calls per second (AMM pools, NFT listings, orderbooks, token holder lists) can't process transactions and answer database queries at the same time. Geyser replaced a failed second-node approach with a simpler idea: instead of a second node playing catch-up, you get a hook inside the first one. Accounts are pushed to your code the moment they're written. Your code decides what to do with them. The validator moves on without waiting.

Magic Eden ran this experiment and measured it. Their engineering team documented switching from RPC-based state tracking to a Geyser pipeline and got “almost 30x faster” end-to-end performance, from chain event to visible UI change. That number comes from operating one of the highest-traffic NFT platforms on Solana, and they published it. We've seen similar jumps on our own validators. Teams come in polling RPC at 500ms and leave with sub-millisecond account updates. The gap is that wide.

03 The trait: nine methods and one change that mattered

The GeyserPlugin trait has nine methods. Realistically you'll implement two or three; the rest have default no-op implementations. Your plugin compiles to a cdylib, exports a C-ABI entry point named _create_plugin, and the validator dlopen()s it at startup.

Solana 1.16 changed &mut self to &self on every callback except on_load and on_unload. Before that, every callback acquired an exclusive write lock on your plugin struct—a serious bottleneck under load. This is why you'll see old plugins using RefCell for interior mutability. Today, every field your plugin touches from callback code must be thread-safe: Mutex, RwLock, AtomicU64, channels. RefCell and Cell won't compile.
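
To make the `&self` constraint concrete, here is a minimal std-only sketch of callback-side state. `PluginState` and its fields are hypothetical names, not part of the interface crate; the point is only that every mutation goes through an atomic or a lock because no `&mut` access is available.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical plugin state; names are illustrative only.
#[derive(Debug, Default)]
pub struct PluginState {
    /// Total callbacks seen. An atomic needs no lock.
    updates_seen: AtomicU64,
    /// Latest slot observed per account key, guarded by a Mutex.
    last_slot: Mutex<HashMap<Vec<u8>, u64>>,
}

impl PluginState {
    /// Same shape as a `&self` callback: shared access only.
    pub fn record(&self, pubkey: &[u8], slot: u64) {
        self.updates_seen.fetch_add(1, Ordering::Relaxed);
        // Recover the map even if another thread panicked while holding the lock.
        let mut map = self.last_slot.lock().unwrap_or_else(|e| e.into_inner());
        map.insert(pubkey.to_vec(), slot);
    }

    pub fn updates_seen(&self) -> u64 {
        self.updates_seen.load(Ordering::Relaxed)
    }

    pub fn last_slot(&self, pubkey: &[u8]) -> Option<u64> {
        let map = self.last_slot.lock().unwrap_or_else(|e| e.into_inner());
        map.get(pubkey).copied()
    }
}
```

A lock on the hot path has a cost of its own, so in a real plugin you would probably keep the locked section this short or hand the work to a channel instead.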

Lifecycle methods

name() is the only truly required method. Name it after whatever program you're indexing, not “plugin”. “pumpfun-indexer” tells you everything at 3am; “plugin” tells you nothing.

on_load() is where you parse the config, open your database pool, and spawn worker threads. Most guides skip the second parameter: is_reload. When a hot-reload is triggered without a full validator restart, this comes in true. We use it to swap filter configs and reconnect to new database endpoints without downtime.

on_unload() is where things go wrong when teams skip it. Every resource you open in on_load must be closed here, in order, with a timeout. See failure mode #6.

The account callback

update_account() is the one that matters. Agave's own docs say “any delay here may cause the validator to fall behind the network.” That's not a suggestion.

The account argument is a versioned enum. On current Agave that's ReplicaAccountInfoVersions::V0_0_3. Match the wrong variant behind a catch-all arm and the validator starts, loads the plugin, prints no errors, and your handler does nothing. The Rust compiler won't warn you: the match is perfectly valid Rust, and every update just falls through to the wildcard. We come back to this in failure mode #1.

write_version inside is not a per-account counter. It's global: a single atomic counter that increments with every account write anywhere on the validator. Same account written twice in one slot? write_version is the only way to know which came last. slot alone won't tell you.
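
The ordering rule can be sketched in memory without a database. This is an illustration under assumptions: `AccountWrite` and `LatestWrites` are made-up types, and the comparison mirrors the `WHERE accounts.write_version < EXCLUDED.write_version` guard used later in the tutorial.

```rust
use std::collections::HashMap;

/// Hypothetical record of one account write; fields are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct AccountWrite {
    pub slot: u64,
    pub write_version: u64,
    pub lamports: u64,
}

/// Keeps only the newest write per account, ordered by write_version.
#[derive(Debug, Default)]
pub struct LatestWrites {
    by_pubkey: HashMap<Vec<u8>, AccountWrite>,
}

impl LatestWrites {
    /// Stores `incoming` only if it is newer than what we already hold.
    /// Returns true when the stored state changed.
    pub fn apply(&mut self, pubkey: &[u8], incoming: AccountWrite) -> bool {
        let is_newer = self
            .by_pubkey
            .get(pubkey)
            .map_or(true, |current| incoming.write_version > current.write_version);
        if is_newer {
            self.by_pubkey.insert(pubkey.to_vec(), incoming);
        }
        is_newer
    }

    pub fn get(&self, pubkey: &[u8]) -> Option<&AccountWrite> {
        self.by_pubkey.get(pubkey)
    }
}
```

Two writes to the same account in slot 10 arrive here with the same `slot`, so only the `write_version` comparison decides which one survives, even if they are delivered out of order.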

The txn field inside ReplicaAccountInfoV3 is Option<&SanitizedTransaction>. During startup replay (when is_startup is true) this is always None. Unwrap it without checking and your plugin panics on startup.

Slot status variants

update_slot_status() has more variants than any blog post I've seen actually documents:

FirstShredReceived: earliest possible signal, before processing even starts
CreatedBank: execution environment is live for this slot
Completed: all shreds received, not yet replayed
Processed: replayed, but 5–10% of these never get confirmed
Confirmed: supermajority vote; what production indexers actually use
Rooted: permanent, ~32 seconds of latency for the guarantee
Dead(String): slot rejected, the string tells you why

The Confirmed vs Processed distinction is real. Processed slots get skipped at a 5–10% rate. If your indexer acts on Processed events and the slot gets skipped, you've acted on data that doesn't exist on-chain. Most production indexers wait for Confirmed, which gives the supermajority guarantee without the ~32-second wait for Rooted.
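
One way to act only on confirmed data is to hold updates per slot and release them when the slot's status arrives. The sketch below is an assumption-laden illustration: `Status` is a local stand-in for a subset of the interface's enum, and `ConfirmedGate` is a hypothetical helper, not something the crate provides.

```rust
use std::collections::HashMap;

/// Local stand-in for a subset of the slot status variants (illustrative).
#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Processed,
    Confirmed,
    Rooted,
    Dead(String),
}

/// Holds raw updates per slot until that slot is Confirmed (or Rooted).
#[derive(Debug, Default)]
pub struct ConfirmedGate {
    pending: HashMap<u64, Vec<Vec<u8>>>,
}

impl ConfirmedGate {
    pub fn push(&mut self, slot: u64, update: Vec<u8>) {
        self.pending.entry(slot).or_default().push(update);
    }

    /// Returns the updates that are now safe to act on (possibly empty).
    pub fn on_status(&mut self, slot: u64, status: &Status) -> Vec<Vec<u8>> {
        match status {
            // Rooted also releases, in case Confirmed was never observed.
            Status::Confirmed | Status::Rooted => {
                self.pending.remove(&slot).unwrap_or_default()
            }
            // A dead slot never lands on-chain: discard what we buffered.
            Status::Dead(_) => {
                self.pending.remove(&slot);
                Vec::new()
            }
            // Processed is not enough; keep waiting.
            Status::Processed => Vec::new(),
        }
    }

    pub fn pending_slots(&self) -> usize {
        self.pending.len()
    }
}
```

This sketch never expires slots that receive no terminal status, so a production version would also need a bound on how long entries may sit in `pending`.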

Feature flags

feature flags

rust

fn account_data_notifications_enabled(&self) -> bool { true }
fn account_data_snapshot_notifications_enabled(&self) -> bool { true }
fn transaction_notifications_enabled(&self) -> bool { false }  // off by default
fn entry_notifications_enabled(&self) -> bool { false }         // off by default

Return false for what you don't need. The validator genuinely skips those codepaths. Transaction and entry notifications are off by default; you have to explicitly turn them on. Account notifications are the opposite: leave them returning true when you only want transactions and you pay for every account write, which is usually why CPU usage is higher than expected on a high-throughput program.

04 Building a Solana Geyser plugin in Rust: full tutorial

The plugin below filters by program owner and writes account updates to Postgres. The hot-path/worker channel split is the one part that needs explaining; the rest is boilerplate you set up once.

Scaffold and manifest

scaffold

bash

cargo new --lib my-indexer-plugin
cd my-indexer-plugin

Cargo.toml

toml

[package]
name = "my-indexer-plugin"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
agave-geyser-plugin-interface = "=3.1.5"  # pin to the exact release your validator runs
tokio = { version = "1", features = ["full"] }
sqlx = { version = "0.7", features = ["runtime-tokio-rustls", "postgres"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
bs58 = "0.5"

crate-type = ["cdylib"] is the line most teams miss on the first try. Without it, cargo build --release produces a regular Rust library the validator can't dlopen(). It needs a dynamic library with a C-ABI entry point. That's what cdylib produces.

Version compatibility is non-negotiable: the agave-geyser-plugin-interface version in your Cargo.toml must match the validator binary's version exactly, built with the same Rust toolchain. A mismatch produces either a silent dlopen() failure or undefined behavior across the FFI boundary. When Agave upgrades, you recompile. No workaround exists.

The plugin

src/lib.rs

rust

use agave_geyser_plugin_interface::geyser_plugin_interface::{
    GeyserPlugin, GeyserPluginError, ReplicaAccountInfoVersions,
    ReplicaBlockInfoVersions, ReplicaTransactionInfoVersions, SlotStatus,
    Result as GeyserResult,
};
use serde::Deserialize;
use std::sync::mpsc::{channel, Sender};
use tokio::runtime::Runtime;

#[derive(Debug, Deserialize)]
struct Config {
    target_program: String,
    database_url: String,
}

struct AccountJob {
    pubkey: Vec<u8>,
    owner: Vec<u8>,
    lamports: u64,
    data: Vec<u8>,
    slot: u64,
    write_version: u64,
}

// GeyserPlugin has Debug as a supertrait, so the plugin struct must implement it.
#[derive(Debug)]
pub struct MyIndexer {
    cfg: Option<Config>,
    sender: Option<Sender<AccountJob>>,
    runtime: Option<Runtime>,
}

impl GeyserPlugin for MyIndexer {
    fn name(&self) -> &'static str { "my-indexer" }

    fn on_load(&mut self, config_file: &str, _is_reload: bool) -> GeyserResult<()> {
        let cfg: Config = serde_json::from_str(
            &std::fs::read_to_string(config_file)
                .map_err(|e| GeyserPluginError::Custom(Box::new(e)))?
        ).map_err(|e| GeyserPluginError::Custom(Box::new(e)))?;

        let (tx, rx) = channel::<AccountJob>();
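        // Return an error instead of panicking inside the validator process.
        let rt = Runtime::new().map_err(|e| GeyserPluginError::Custom(Box::new(e)))?;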
        let db_url = cfg.database_url.clone();

        rt.spawn(async move {
            let pool = sqlx::PgPool::connect(&db_url).await
                .expect("db connect failed");
            while let Ok(job) = rx.recv() {
                // write_version is a global monotonic counter, not per-account.
                // Same account written twice in one slot: write_version tells
                // you which came second. slot alone is insufficient.
                let _ = sqlx::query(
                    "INSERT INTO accounts (pubkey, owner, lamports, data, slot, write_version)
                     VALUES ($1, $2, $3, $4, $5, $6)
                     ON CONFLICT (pubkey) DO UPDATE
                       SET lamports      = EXCLUDED.lamports,
                           data          = EXCLUDED.data,
                           slot          = EXCLUDED.slot,
                           write_version = EXCLUDED.write_version
                       WHERE accounts.write_version < EXCLUDED.write_version"
                )
                .bind(&job.pubkey)
                .bind(&job.owner)
                .bind(job.lamports as i64)
                .bind(&job.data)
                .bind(job.slot as i64)
                .bind(job.write_version as i64)
                .execute(&pool).await;
            }
        });

        self.cfg     = Some(cfg);
        self.sender  = Some(tx);
        self.runtime = Some(rt);
        Ok(())
    }

    fn on_unload(&mut self) {
        // Drop sender first: the worker sees the channel close and drains
        // any in-flight writes before the runtime shuts down.
        drop(self.sender.take());
        if let Some(rt) = self.runtime.take() {
            rt.shutdown_timeout(std::time::Duration::from_secs(10));
        }
    }

    fn account_data_notifications_enabled(&self) -> bool { true }
    fn transaction_notifications_enabled(&self) -> bool { false }
    fn entry_notifications_enabled(&self) -> bool { false }

    fn update_account(
        &self,
        account: ReplicaAccountInfoVersions,
        slot: u64,
        _is_startup: bool,
    ) -> GeyserResult<()> {
        // Hot path. Filter in microseconds, hand off, return.
        // Any I/O here falls behind the validator.
        let info = match account {
            ReplicaAccountInfoVersions::V0_0_3(i) => i,
            _ => return Ok(()),
        };

        let target = self.cfg.as_ref().unwrap().target_program.as_str();
        if bs58::encode(info.owner).into_string() != target {
            return Ok(());
        }

        let _ = self.sender.as_ref().unwrap().send(AccountJob {
            pubkey:        info.pubkey.to_vec(),
            owner:         info.owner.to_vec(),
            lamports:      info.lamports,
            data:          info.data.to_vec(),
            slot,
            write_version: info.write_version,
        });
        Ok(())
    }

    fn update_slot_status(&self, _: u64, _: Option<u64>, _: &SlotStatus) -> GeyserResult<()> { Ok(()) }
    fn notify_end_of_startup(&self) -> GeyserResult<()> { Ok(()) }
    fn notify_transaction(&self, _: ReplicaTransactionInfoVersions, _: u64) -> GeyserResult<()> { Ok(()) }
    fn notify_block_metadata(&self, _: ReplicaBlockInfoVersions) -> GeyserResult<()> { Ok(()) }
}

#[no_mangle]
pub unsafe extern "C" fn _create_plugin() -> *mut dyn GeyserPlugin {
    Box::into_raw(Box::new(MyIndexer {
        cfg: None, sender: None, runtime: None,
    }))
}
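
The owner check in update_account above base58-encodes the owner into a new String on every call. A possible alternative, sketched here under assumptions, is to decode the target program id once (for example in on_load) and compare raw bytes on the hot path. `OwnerFilter` is a hypothetical helper, and the decode step itself is left out so the sketch stays std-only.

```rust
use std::convert::TryFrom;

/// Length of a Solana public key in bytes.
pub const PUBKEY_LEN: usize = 32;

/// Hypothetical hot-path filter holding the already-decoded program id.
#[derive(Debug)]
pub struct OwnerFilter {
    target: [u8; PUBKEY_LEN],
}

impl OwnerFilter {
    /// `decoded` is the program id after base58 decoding, done once at load time.
    /// Returns None if it is not exactly 32 bytes.
    pub fn from_decoded(decoded: &[u8]) -> Option<Self> {
        let target = <[u8; PUBKEY_LEN]>::try_from(decoded).ok()?;
        Some(OwnerFilter { target })
    }

    /// Allocation-free comparison, suitable for calling on every account update.
    #[inline]
    pub fn matches(&self, owner: &[u8]) -> bool {
        owner == self.target.as_slice()
    }
}
```

Rejecting a malformed program id in `from_decoded` also means a typo in the config can fail at load time rather than silently matching nothing.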

Build and deploy

build

bash

cargo build --release
# → target/release/libmy_indexer_plugin.so

my-plugin.json

json

{
  "libpath": "/etc/solana/plugins/libmy_indexer_plugin.so",
  "target_program": "6EF8rrecthR5Dkzon8Nwu78hRvfCKubJ14M5uBEwF6P",
  "database_url": "postgres://user:pass@localhost/solana"
}

validator startup

bash

agave-validator \
  --geyser-plugin-config /etc/solana/my-plugin.json \
  [... your normal validator flags]

05 Six failure modes that will ruin your week

We've had support tickets for all of these. Some more than once.

1. The variant mismatch

Your plugin compiles. The validator starts. No errors anywhere. No data comes out. Agave 3.x ships ReplicaAccountInfoVersions::V0_0_3. If you're pattern-matching V0_0_2 (which most tutorials still show) with a wildcard arm that returns Ok(()), the compiler says nothing. The match is perfectly valid Rust; the callback fires, every update falls through to the wildcard, and your handling code never executes. You'll spend two hours adding print statements before you find it.

Write a unit test before you deploy: construct ReplicaAccountInfoVersions::V0_0_3 directly, call update_account, assert the channel receiver got a job. If the receiver is empty, you found the bug on your laptop. That's the right place to find it.
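
Alongside the unit test, it can help to make the wildcard arm observable. The sketch below uses a local stand-in enum (`AccountVersions`) rather than the real interface type, and `Counters` is a hypothetical metrics struct; the idea is only that an unexpected variant increments a counter instead of vanishing.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Local stand-in for a versioned account enum. The real type lives in the
/// interface crate and carries borrowed account data.
pub enum AccountVersions {
    V2(u64),
    V3(u64),
}

/// Hypothetical counters you would export to your metrics system.
#[derive(Debug, Default)]
pub struct Counters {
    pub handled: AtomicU64,
    pub unhandled_variant: AtomicU64,
}

/// Handles only V3, but counts every other variant instead of dropping it silently.
pub fn handle(account: AccountVersions, counters: &Counters) -> Option<u64> {
    match account {
        AccountVersions::V3(lamports) => {
            counters.handled.fetch_add(1, Ordering::Relaxed);
            Some(lamports)
        }
        _ => {
            // A non-zero value here after deploy means the validator is
            // sending a variant this build does not understand.
            counters.unhandled_variant.fetch_add(1, Ordering::Relaxed);
            None
        }
    }
}
```

With this in place, a plugin built against the wrong variant shows `handled` stuck at zero while `unhandled_variant` climbs, which is much easier to alert on than silence.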

2. The slow callback

Your update_account takes 1.5ms per call instead of 100µs. The validator is still producing blocks. But it's starting to skip slots. A skipped slot means transactions landed and your callback never fired for them. Permanently. You won't see this in your plugin logs (there's nothing to log when a callback doesn't execute). You'll see it as gaps in indexer data, or a trading strategy underperforming by exactly the margin of missed events, discovered weeks later in a post-mortem.

Monitor callback p99 continuously. Not “is the validator running” but callback duration specifically. Different metrics. Only one catches this.
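
A rough way to track callback duration without taking a lock is a fixed-bucket histogram of atomics. Everything here is an assumption for illustration: the bucket bounds, the `LatencyHistogram` name, and the choice to report a bucket's upper bound rather than an exact percentile.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Upper bounds of the latency buckets in microseconds (illustrative values).
const BOUNDS_US: [u64; 5] = [50, 200, 500, 2_000, 5_000];

/// Lock-free histogram: one counter per bound plus one overflow bucket.
#[derive(Debug, Default)]
pub struct LatencyHistogram {
    buckets: [AtomicU64; 6],
}

impl LatencyHistogram {
    pub fn record(&self, elapsed: Duration) {
        let us = elapsed.as_micros() as u64;
        let idx = BOUNDS_US
            .iter()
            .position(|&bound| us <= bound)
            .unwrap_or(BOUNDS_US.len());
        self.buckets[idx].fetch_add(1, Ordering::Relaxed);
    }

    /// Times `f` and records how long it took; wrap the callback body in this.
    pub fn time<T>(&self, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.record(start.elapsed());
        out
    }

    /// Upper bound (µs) of the bucket that contains quantile `q`, or None if
    /// nothing was recorded. u64::MAX marks the overflow bucket.
    pub fn quantile_upper_bound_us(&self, q: f64) -> Option<u64> {
        let counts: Vec<u64> = self
            .buckets
            .iter()
            .map(|b| b.load(Ordering::Relaxed))
            .collect();
        let total: u64 = counts.iter().sum();
        if total == 0 {
            return None;
        }
        let rank = ((total as f64) * q).ceil().max(1.0) as u64;
        let mut seen = 0u64;
        for (i, count) in counts.iter().enumerate() {
            seen += *count;
            if seen >= rank {
                return Some(BOUNDS_US.get(i).copied().unwrap_or(u64::MAX));
            }
        }
        None
    }
}
```

In a plugin you would call `time` around the body of update_account and export `quantile_upper_bound_us(0.99)` from a background thread, never from the callback itself.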

3. The silent panic

This is the one I mentioned at the top. GitHub issue #27283, closed as “not planned.” When a Geyser plugin panics inside a callback, the validator keeps reporting itself as healthy. RPC health endpoints return ok. Block production continues. Only that specific callback stops executing, silently.

If notify_transaction panics, you stop receiving transaction notifications. If update_account panics, accounts stop being indexed. The validator has no idea either happened. No alert, no log entry on the validator side. Your indexer looks fine. Trading strategies running off it start underperforming slightly. We had a customer spend twelve days in that loop before we asked them to add a heartbeat counter inside the callback itself.

Mitigations we use in production:

panic_on_db_errors: true in your PostgreSQL plugin config: forces the validator to terminate on database errors rather than continue without the data.
std::panic::catch_unwind around your callback body: turns a panic into an Err return.
A heartbeat counter emitted from inside the callback: not a generic “is the validator running” check, but something tracking whether this specific callback fired in the last 5 seconds. Alert on that. It's the only signal that catches the cases the first two miss.
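
The catch_unwind and heartbeat mitigations can be combined in one small wrapper. This is a minimal sketch, not the plugin's real API: `Heartbeat` and `guard` are hypothetical names, and it assumes the crate is built with the default unwinding panic strategy (with `panic = "abort"` there is nothing to catch).

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}

/// Hypothetical per-callback health record.
#[derive(Debug, Default)]
pub struct Heartbeat {
    last_ok_unix_ms: AtomicU64,
    panics: AtomicU64,
}

impl Heartbeat {
    /// Runs `body`, turning a panic into Err and stamping the heartbeat on success.
    pub fn guard<T>(&self, body: impl FnOnce() -> T) -> Result<T, String> {
        match catch_unwind(AssertUnwindSafe(body)) {
            Ok(value) => {
                self.last_ok_unix_ms.store(now_ms(), Ordering::Relaxed);
                Ok(value)
            }
            Err(payload) => {
                self.panics.fetch_add(1, Ordering::Relaxed);
                // Panic payloads are usually &str or String.
                let msg = payload
                    .downcast_ref::<&str>()
                    .map(|s| s.to_string())
                    .or_else(|| payload.downcast_ref::<String>().cloned())
                    .unwrap_or_else(|| "non-string panic payload".to_string());
                Err(msg)
            }
        }
    }

    /// True if no successful callback completed within `max_age_ms`.
    pub fn is_stale(&self, max_age_ms: u64) -> bool {
        now_ms().saturating_sub(self.last_ok_unix_ms.load(Ordering::Relaxed)) > max_age_ms
    }

    pub fn panics(&self) -> u64 {
        self.panics.load(Ordering::Relaxed)
    }
}
```

A monitoring thread would poll `is_stale(5_000)` and alert when it turns true, which covers both a panicking callback and one that has simply stopped being called.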

4. The startup double-count

Validator restarts. Snapshot replay fires update_account with is_startup: true for every account in the snapshot. If your callback doesn't distinguish startup writes from live writes, you re-process everything. For a basic indexer that upserts by pubkey, that's fine. For anything that counts events, computes rolling aggregates, or writes one row per transaction, startup replay is silent corruption.

There's also a second piece that gets teams: during validator startup, account updates arrive for slots N through roughly N+150 before the corresponding slot status notifications are sent (GitHub issue #28871). If your indexer needs a slot notification before finalizing an account write, those 150 slots of account updates will arrive with no slot to attach them to. Buffer them, release only after notify_end_of_startup fires and slot notifications have started coming in.
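
The buffering rule described above might look like the following. It is a sketch under assumptions: `Update` and `StartupBuffer` are hypothetical, snapshot-replay updates (is_startup true) are simply skipped as a counting indexer would, and live updates are held until both end-of-startup and the first slot status have been seen.

```rust
/// Hypothetical account update; fields are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct Update {
    pub slot: u64,
    pub pubkey: Vec<u8>,
}

/// Holds live updates until startup has ended and slot statuses are flowing.
#[derive(Debug, Default)]
pub struct StartupBuffer {
    startup_done: bool,
    slot_status_seen: bool,
    skipped_startup: u64,
    held: Vec<Update>,
}

impl StartupBuffer {
    fn ready(&self) -> bool {
        self.startup_done && self.slot_status_seen
    }

    fn drain_if_ready(&mut self) -> Vec<Update> {
        if self.ready() {
            std::mem::take(&mut self.held)
        } else {
            Vec::new()
        }
    }

    /// Returns the updates that may be processed now (possibly empty).
    pub fn on_update(&mut self, update: Update, is_startup: bool) -> Vec<Update> {
        if is_startup {
            // Snapshot replay: not a new event, so do not count it.
            self.skipped_startup += 1;
            return Vec::new();
        }
        if self.ready() {
            vec![update]
        } else {
            self.held.push(update);
            Vec::new()
        }
    }

    /// Call from notify_end_of_startup.
    pub fn on_end_of_startup(&mut self) -> Vec<Update> {
        self.startup_done = true;
        self.drain_if_ready()
    }

    /// Call on the first (and every later) slot status notification.
    pub fn on_slot_status(&mut self) -> Vec<Update> {
        self.slot_status_seen = true;
        self.drain_if_ready()
    }

    pub fn skipped_startup(&self) -> u64 {
        self.skipped_startup
    }
}
```

An upserting indexer would route the startup updates to its normal write path instead of skipping them; the skip is only right for pipelines that count events.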

5. The slot ordering trap

SlotStatus::Confirmed and SlotStatus::Processed can arrive in either order. Almost nobody's state machine is built for it, and when it happens it's intermittent enough that you might not catch it for weeks.

From Agave 3.0 onward, notify_transaction is guaranteed to arrive before SlotStatus::Processed for the same slot. Before 3.0, either could arrive first. Use write_version for ordering account writes within a slot. Use SlotStatus::Rooted for permanent finality. Don't build ordering logic on slot status arrival sequence.

6. Missing on_unload

Your plugin has a worker pool. on_unload is a no-op. The validator restarts for a routine Agave bump. The sender drops while the worker is mid-INSERT. Some rows commit, some don't, you can't tell which.

We've had teams find this months after deploy, during an audit, staring at half-written account state they couldn't explain. Every resource opened in on_load must be released in on_unload, in order, with a timeout:

on_unload — correct order

rust

fn on_unload(&mut self) {
    drop(self.sender.take());  // close the channel first so the worker stops
    if let Some(rt) = self.runtime.take() {
        rt.shutdown_timeout(std::time::Duration::from_secs(10));
    }
}

Drop the sender before shutting down the runtime. The worker needs to see the channel close and finish its in-flight writes before the async runtime disappears underneath it. Get the order wrong and you're back to partial writes.
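
One caveat worth checking against your own build: as we read Tokio's runtime documentation, tasks spawned on a runtime are dropped at their next yield point once shutdown begins, so an async worker may not get to finish its queue. A way to make the drain explicit is to have the worker signal completion and wait for that signal with a timeout. The sketch below uses a plain thread and std channels; `Worker` is a hypothetical helper, not part of the plugin above.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical worker handle: `jobs` feeds the worker, `done` fires once the
/// worker has processed everything that was left in the queue.
pub struct Worker<T> {
    jobs: Option<Sender<T>>,
    done: Receiver<usize>,
}

impl<T: Send + 'static> Worker<T> {
    pub fn spawn(mut process: impl FnMut(T) + Send + 'static) -> Self {
        let (jobs_tx, jobs_rx) = channel::<T>();
        let (done_tx, done_rx) = channel::<usize>();
        thread::spawn(move || {
            let mut processed = 0usize;
            // recv() keeps returning queued jobs after the sender is dropped
            // and only errors once the queue is empty.
            while let Ok(job) = jobs_rx.recv() {
                process(job);
                processed += 1;
            }
            let _ = done_tx.send(processed);
        });
        Worker { jobs: Some(jobs_tx), done: done_rx }
    }

    /// Returns false once the worker has been shut down or has died.
    pub fn send(&self, job: T) -> bool {
        self.jobs.as_ref().map_or(false, |tx| tx.send(job).is_ok())
    }

    /// Closes the channel, then waits up to `timeout` for the drain to finish.
    /// Returns the number of jobs processed, or None on timeout or worker panic.
    pub fn shutdown(&mut self, timeout: Duration) -> Option<usize> {
        drop(self.jobs.take());
        self.done.recv_timeout(timeout).ok()
    }
}
```

In on_unload you would call `shutdown` first and tear down the database pool or runtime only after it returns, logging loudly when it returns None.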

06 Testing before you touch mainnet

Most teams compile, copy the .so to the validator, watch logs for ten minutes, and ship it. We stopped doing that the first time we hit failure mode #1 on a production validator: plugin loaded cleanly, zero callbacks firing, no errors anywhere. Every plugin now goes through three steps:

Unit test the callback. Construct ReplicaAccountInfoVersions::V0_0_3 directly in a test, call update_account, assert the channel receiver got a job. Twenty minutes to write. Catches variant mismatches on your laptop, which is where you want to catch them, not during a weekend deploy.
Staging validator, 24 hours. Run a non-voting validator on a separate identity keypair that follows mainnet but doesn't vote. Every customer plugin gets a day on it before mainnet. Watch callback p99 the whole time. If it creeps past 500µs, find the cause before that's your production box missing slots.
Restart soak, 50 times. Restart the staging validator 50 times and check the database after each for partial writes. We've had teams skip this and find the problem three months later during an audit. 50 restarts surfaces it in an afternoon.

If it breaks on any of the three, you have a specific, reproducible failure before it gets anywhere near mainnet.

07 Performance: what healthy actually looks like

Metric	You're fine	Watch this	Something is wrong
Callback p50	< 50 µs	100–500 µs	> 1 ms
Callback p99	< 200 µs	500 µs–2 ms	> 5 ms
Skipped slots / 24h	0	1–5	> 10
on_unload drain time	< 500 ms	1–5 s	> 30 s
Channel backlog (steady state)	< 100 items	100–1,000	unbounded growth

Callback health thresholds from production Geyser plugin operation.
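
The channel backlog row is only measurable if the channel exposes its depth, which the unbounded std channel in the tutorial does not. One possible approach, sketched here as an assumption rather than what the tutorial plugin does, is a bounded channel wrapped with counters. `metered_channel` and its types are hypothetical, and dropping jobs when the queue is full is a policy choice you may not want.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::Arc;

/// Hypothetical queue metrics shared by both ends of the channel.
#[derive(Debug, Default)]
pub struct QueueStats {
    pub depth: AtomicUsize,
    pub dropped: AtomicU64,
}

pub struct MeteredSender<T> {
    tx: SyncSender<T>,
    stats: Arc<QueueStats>,
}

pub struct MeteredReceiver<T> {
    rx: Receiver<T>,
    stats: Arc<QueueStats>,
}

pub fn metered_channel<T>(
    capacity: usize,
) -> (MeteredSender<T>, MeteredReceiver<T>, Arc<QueueStats>) {
    let (tx, rx) = sync_channel(capacity);
    let stats = Arc::new(QueueStats::default());
    (
        MeteredSender { tx, stats: Arc::clone(&stats) },
        MeteredReceiver { rx, stats: Arc::clone(&stats) },
        stats,
    )
}

impl<T> MeteredSender<T> {
    /// Never blocks the caller. Returns false if the job was dropped.
    pub fn try_send(&self, job: T) -> bool {
        // Count first so the receiver can never decrement below zero.
        self.stats.depth.fetch_add(1, Ordering::Relaxed);
        if self.tx.try_send(job).is_ok() {
            return true;
        }
        self.stats.depth.fetch_sub(1, Ordering::Relaxed);
        self.stats.dropped.fetch_add(1, Ordering::Relaxed);
        false
    }
}

impl<T> MeteredReceiver<T> {
    /// Blocks until a job arrives; None once every sender is gone.
    pub fn recv(&self) -> Option<T> {
        let job = self.rx.recv().ok()?;
        self.stats.depth.fetch_sub(1, Ordering::Relaxed);
        Some(job)
    }
}
```

Exporting `depth` gives you the steady-state backlog from the table, and a non-zero `dropped` is a louder signal than memory quietly growing behind an unbounded queue.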

Data path	Slot latency p90	Account latency p90
RPC polling (getProgramAccounts)	~150 ms	N/A
WebSocket (accountSubscribe)	~10 ms	~374 ms
Yellowstone gRPC	~5 ms	~215 ms
In-process Geyser plugin	< 1 ms	< 1 ms

Slot and account latency by data path. Numbers from Triton One's published benchmarks and NLN operational data.

08 Do you actually need a Geyser plugin?

Most teams who ask us this don't. Genuinely.

You probably need one if:

You're running a sniper, MEV searcher, or liquidator where a missed event has a real dollar cost
You need to write to a backend no managed stream supports (proprietary schema, internal message bus, something custom)
Your filter logic can't be expressed in Yellowstone's filter language (cross-program predicates, composite discriminator matching)
You're integrating a validator into a larger custody or exchange system with specific data contracts
You have a regulatory or data-sovereignty requirement to own the full stack

You probably don't if:

You haven't tried a managed Yellowstone gRPC stream yet. Start there honestly.
You only need account state at human timescales (sub-second WebSocket is fine)
You're submitting transactions (staked RPC connections, not plugins)
Updates every second or two are sufficient. WebSocket handles that with zero infrastructure.

We tell people this even when they've already signed up. A managed stream genuinely solves 90% of real-time data problems with a fraction of the complexity. Not sure which side of that line you're on? Read our Yellowstone gRPC vs WebSockets guide before committing to the plugin path.

09 The streaming ecosystem in 2026

Provider	Model	Latency	Replay	Notes
NoLimitNodes	Hosted plugin or gRPC	In-process µs (plugin); ms (gRPC)	N/A	Only provider hosting custom .so plugins at flat rate
Alchemy	Per-TB gRPC	5–15ms avg	48h / 6,000+ slots	Multi-region; first-slot-data guarantees
Helius LaserStream	Tiered monthly	Up to 8ms faster via preprocessed shreds	24h	9 global regions; 1.3 GB/s JS client throughput
Triton Dragon's Mouth	Per-GB gRPC	Sub-5ms	~1,000 slots	Built Yellowstone; also offers Fumarole and Old Faithful
Chainstack	Per-stream monthly	N/A	~100 slots (Global), ~3,000 (dedicated)	5 concurrent filters per connection; mainnet only

Five gRPC stream providers, five genuinely different business models.

NOTE / The gRPC keepalive problem

Cloud load balancers kill idle gRPC streams after 60–90 seconds. Yellowstone sends server pongs every 15 seconds in response to client pings. Skip the client-side ping and your stream silently dies. Events stop arriving with no error.

keepalive ping/pong

text

// Client sends every 30 seconds:
{ ping: { id: 1 } }
// Server responds:
{ pong: { id: 1 } }

Miss the pong, reconnect. Don't implement it and you'll diagnose a dead connection as a “data issue” for an hour.
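
The client-side bookkeeping for that rule is small. The sketch below models only the timing logic and leaves the actual gRPC calls out; `Keepalive`, `Action`, and the interval and timeout values are assumptions for illustration, not part of any Yellowstone client.

```rust
use std::time::{Duration, Instant};

/// What the caller should do after a tick.
#[derive(Debug, PartialEq)]
pub enum Action {
    Idle,
    SendPing(i32),
    Reconnect,
}

/// Hypothetical ping/pong tracker driven by the caller's clock.
#[derive(Debug)]
pub struct Keepalive {
    interval: Duration,
    pong_timeout: Duration,
    next_id: i32,
    last_ping_at: Option<Instant>,
    awaiting: Option<(i32, Instant)>,
}

impl Keepalive {
    pub fn new(interval: Duration, pong_timeout: Duration) -> Self {
        Keepalive {
            interval,
            pong_timeout,
            next_id: 1,
            last_ping_at: None,
            awaiting: None,
        }
    }

    /// Call periodically with the current time.
    pub fn tick(&mut self, now: Instant) -> Action {
        if let Some((_, sent_at)) = self.awaiting {
            // A ping is outstanding: either keep waiting or give up.
            if now.duration_since(sent_at) >= self.pong_timeout {
                return Action::Reconnect;
            }
            return Action::Idle;
        }
        let due = self
            .last_ping_at
            .map_or(true, |t| now.duration_since(t) >= self.interval);
        if !due {
            return Action::Idle;
        }
        let id = self.next_id;
        self.next_id = self.next_id.wrapping_add(1);
        self.last_ping_at = Some(now);
        self.awaiting = Some((id, now));
        Action::SendPing(id)
    }

    /// Call when a pong arrives; pongs for other ids are ignored.
    pub fn on_pong(&mut self, id: i32) {
        if let Some((want, _)) = self.awaiting {
            if want == id {
                self.awaiting = None;
            }
        }
    }
}
```

Passing `now` in rather than reading the clock inside `tick` keeps the logic testable; the stream loop would map `SendPing` to a ping request and `Reconnect` to tearing down and resubscribing.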

10 How teams actually ship a Geyser plugin in 2026

	Self-hosted validator	Hosted plugin	Managed gRPC stream
You operate	Validator + host + on-call rotation	Nothing	A gRPC client
You write	the .so	the .so	a stream consumer
Monthly cost	High (hardware + ops)	Flat monthly	Tiered by usage
Latency floor	In-process (µs)	In-process (µs)	Network hop (ms)
Config changes	Up to 1h validator restart	30 min redeployment	Live
Who it fits	Exchanges, custody, regulated infra	Indexers, bots, custom pipelines	90% of builders

Three real paths, genuinely different trade-offs.

The config restart cost is real and consistently underestimated. On mainnet, a validator restart can take up to an hour: snapshot download, replay, catching up to tip. Hot-reload workarounds exist but require custom implementation to get right. On hosted, a config change is a file upload and a 30-minute staging run.

Self-hosting is the right call when you have a genuine requirement to own the hardware (regulatory, custody, or you're already running a validator for consensus). “We want full control” alone isn't a reason. On hosted plans you own your .so and your config. You just don't carry the hardware cost or the on-call burden.

TIP / NoLimitNodes Geyser Plugin Hosting

Ship your .so and a config.json. We run ABI verification, smoke test on a non-voting staging validator, and promote to production. Under 30 minutes from upload to live on mainnet.

Plan	What's included
Standard	Validator hosting, maintenance, Agave upgrades (<72h SLA), log streaming
Ultra	2 plugins + 30 hours of plugin development per month
Enterprise	Dedicated validator, contractual SLA with service credits, named support engineer

Infrastructure: 256GB RAM / 4TB NVMe / AMD EPYC / 10Gbps unmetered. Owned racks, not cloud reseller margin.

The short version: a Solana Geyser plugin is in-process on the validator. The RPC is not. Write the filter in the callback, do the I/O in a worker. The six things that break it: variant mismatch, slow callback, silent panic, startup double-count with notification gap, slot ordering assumptions, missing on_unload. 90% of builders should start with a managed gRPC stream. The 10% who need custom consumer logic should host a plugin, not self-host a validator.

Not sure which camp you're in? The Yellowstone gRPC vs WebSockets guide answers it in one read.


Solana Geyser Plugin (2026): Build, Deploy & 6 Failure Modes

NoLimitNodes Engineering

Infrastructure Team

Jun 17, 202622 min read

On this page +

01What the Solana Geyser plugin actually is#

Geyser path vs RPC path latency


slot fires
  geyser path:  your .so runs in-process                  < 1ms
  rpc path:     RPC -> JSON/WebSocket -> your app          ~10-150ms

Fig. 1: A Geyser plugin runs in-process. The RPC layer sees state after the Geyser callback has already fired.

If you've been polling getProgramAccounts at 250ms, or using accountSubscribe over WebSocket, you're on the RPC path. A Geyser plugin bypasses it entirely.

02Why it exists#

03The trait: nine methods and one change that mattered#

Lifecycle methods

name() is the only truly required method. Name it after whatever program you're indexing, not “plugin”. “pumpfun-indexer” tells you everything at 3am; “plugin” tells you nothing.

on_unload() is where things go wrong when teams skip it. Every resource you open in on_load must be closed here, in order, with a timeout. See failure mode #6.

The account callback

update_account() is the one that matters. Agave's own docs say “any delay here may cause the validator to fall behind the network.” That's not a suggestion.

Slot status variants

update_slot_status() has more variants than any blog post I've seen actually documents:

FirstShredReceived: earliest possible signal, before processing even starts
CreatedBank: execution environment is live for this slot
Completed: all shreds received, not yet replayed
Processed: replayed, but 5–10% of these never get confirmed
Confirmed: supermajority vote; what production indexers actually use
Rooted: permanent, ~32 seconds of latency for the guarantee
Dead(String): slot rejected, the string tells you why

Feature flags

feature flags

rust

fn account_data_notifications_enabled(&self) -> bool { true }
fn account_data_snapshot_notifications_enabled(&self) -> bool { true }
fn transaction_notifications_enabled(&self) -> bool { false }  // off by default
fn entry_notifications_enabled(&self) -> bool { false }         // off by default

04Building a Solana Geyser plugin in Rust: full tutorial#

The plugin below filters by program owner and writes account updates to Postgres. The hot-path/worker channel split is the one part that needs explaining; the rest is boilerplate you set up once.

Scaffold and manifest

scaffold

bash

cargo new --lib my-indexer-plugin
cd my-indexer-plugin

Cargo.toml

toml

[package]
name = "my-indexer-plugin"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
agave-geyser-plugin-interface = { git = "https://github.com/anza-xyz/agave" }
tokio = { version = "1", features = ["full"] }
sqlx = { version = "0.7", features = ["runtime-tokio-rustls", "postgres"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
bs58 = "0.5"

The plugin

src/lib.rs

rust

use agave_geyser_plugin_interface::geyser_plugin_interface::{
    GeyserPlugin, GeyserPluginError, ReplicaAccountInfoVersions,
    ReplicaBlockInfoVersions, ReplicaTransactionInfoVersions, SlotStatus,
    Result as GeyserResult,
};
use serde::Deserialize;
use std::sync::mpsc::{channel, Sender};
use tokio::runtime::Runtime;

#[derive(Deserialize)]
struct Config {
    target_program: String,
    database_url: String,
}

struct AccountJob {
    pubkey: Vec<u8>,
    owner: Vec<u8>,
    lamports: u64,
    data: Vec<u8>,
    slot: u64,
    write_version: u64,
}

pub struct MyIndexer {
    cfg: Option<Config>,
    sender: Option<Sender<AccountJob>>,
    runtime: Option<Runtime>,
}

impl GeyserPlugin for MyIndexer {
    fn name(&self) -> &'static str { "my-indexer" }

    fn on_load(&mut self, config_file: &str, _is_reload: bool) -> GeyserResult<()> {
        let cfg: Config = serde_json::from_str(
            &std::fs::read_to_string(config_file)
                .map_err(|e| GeyserPluginError::Custom(Box::new(e)))?
        ).map_err(|e| GeyserPluginError::Custom(Box::new(e)))?;

        let (tx, rx) = channel::<AccountJob>();
        let rt = Runtime::new().unwrap();
        let db_url = cfg.database_url.clone();

        rt.spawn(async move {
            let pool = sqlx::PgPool::connect(&db_url).await
                .expect("db connect failed");
            while let Ok(job) = rx.recv() {
                // write_version is a global monotonic counter, not per-account.
                // Same account written twice in one slot: write_version tells
                // you which came second. slot alone is insufficient.
                let result = sqlx::query(
                    "INSERT INTO accounts (pubkey, owner, lamports, data, slot, write_version)
                     VALUES ($1, $2, $3, $4, $5, $6)
                     ON CONFLICT (pubkey) DO UPDATE
                       SET lamports      = EXCLUDED.lamports,
                           data          = EXCLUDED.data,
                           slot          = EXCLUDED.slot,
                           write_version = EXCLUDED.write_version
                       WHERE accounts.write_version < EXCLUDED.write_version"
                )
                .bind(&job.pubkey)
                .bind(&job.owner)
                .bind(job.lamports as i64)
                .bind(&job.data)
                .bind(job.slot as i64)
                .bind(job.write_version as i64)
                .execute(&pool).await;
                // Surface failed writes: a discarded Err here is silent data loss.
                if let Err(e) = result {
                    eprintln!("my-indexer: account write failed: {e}");
                }
            }
        });

        self.cfg     = Some(cfg);
        self.sender  = Some(tx);
        self.runtime = Some(rt);
        Ok(())
    }

    fn on_unload(&mut self) {
        // Drop sender first: the worker sees the channel close and drains
        // any in-flight writes before the runtime shuts down.
        drop(self.sender.take());
        if let Some(rt) = self.runtime.take() {
            rt.shutdown_timeout(std::time::Duration::from_secs(10));
        }
    }

    fn account_data_notifications_enabled(&self) -> bool { true }
    fn transaction_notifications_enabled(&self) -> bool { false }
    fn entry_notifications_enabled(&self) -> bool { false }

    fn update_account(
        &self,
        account: ReplicaAccountInfoVersions,
        slot: u64,
        _is_startup: bool,
    ) -> GeyserResult<()> {
        // Hot path. Filter in microseconds, hand off, return.
        // Any I/O here falls behind the validator.
        let info = match account {
            ReplicaAccountInfoVersions::V0_0_3(i) => i,
            _ => return Ok(()),
        };

        let target = self.cfg.as_ref().unwrap().target_program.as_str();
        if bs58::encode(info.owner).into_string() != target {
            return Ok(());
        }

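        // send() only fails when the worker has gone away (for example after
        // a panic). Return the error so the validator logs it instead of
        // dropping updates silently.
        self.sender.as_ref().unwrap().send(AccountJob {
            pubkey:        info.pubkey.to_vec(),
            owner:         info.owner.to_vec(),
            lamports:      info.lamports,
            data:          info.data.to_vec(),
            slot,
            write_version: info.write_version,
        }).map_err(|_| GeyserPluginError::AccountsUpdateError {
            msg: "worker channel closed".to_string(),
        })?;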
        Ok(())
    }

    fn update_slot_status(&self, _: u64, _: Option<u64>, _: &SlotStatus) -> GeyserResult<()> { Ok(()) }
    fn notify_end_of_startup(&self) -> GeyserResult<()> { Ok(()) }
    fn notify_transaction(&self, _: ReplicaTransactionInfoVersions, _: u64) -> GeyserResult<()> { Ok(()) }
    fn notify_block_metadata(&self, _: ReplicaBlockInfoVersions) -> GeyserResult<()> { Ok(()) }
}

#[no_mangle]
pub unsafe extern "C" fn _create_plugin() -> *mut dyn GeyserPlugin {
    Box::into_raw(Box::new(MyIndexer {
        cfg: None, sender: None, runtime: None,
    }))
}

Build and deploy

build

bash

cargo build --release
# → target/release/libmy_indexer_plugin.so

my-plugin.json

json

{
  "libpath": "/etc/solana/plugins/libmy_indexer_plugin.so",
  "target_program": "6EF8rrecthR5Dkzon8Nwu78hRvfCKubJ14M5uBEwF6P",
  "database_url": "postgres://user:pass@localhost/solana"
}

validator startup

bash

agave-validator \
  --geyser-plugin-config /etc/solana/my-plugin.json \
  [... your normal validator flags]

05Six failure modes that will ruin your week#

We've had support tickets for all of these. Some more than once.

1. The variant mismatch

2. The slow callback

Monitor callback p99 continuously. Not “is the validator running” but callback duration specifically. Different metrics. Only one catches this.
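
One way to get that number, as a minimal sketch: wrap the callback body in a timer and have a background task read a percentile every few seconds. CallbackTimer and its methods are hypothetical names, not part of the Geyser interface, and the Mutex is only the simplest thing that works; a lock-free histogram would be the more careful choice on a hot path.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical helper: collects callback durations so a background task
/// can report percentiles.
pub struct CallbackTimer {
    samples_us: Mutex<Vec<u64>>,
}

impl CallbackTimer {
    pub fn new() -> Self {
        Self { samples_us: Mutex::new(Vec::new()) }
    }

    /// Record one callback duration, stored in microseconds.
    pub fn record(&self, elapsed: Duration) {
        self.samples_us.lock().unwrap().push(elapsed.as_micros() as u64);
    }

    /// Run a closure, record how long it took, and pass its result through.
    pub fn time<T>(&self, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.record(start.elapsed());
        out
    }

    /// Nearest-rank percentile over the samples collected so far, then reset.
    /// Returns None when nothing has been recorded since the last call.
    pub fn drain_percentile(&self, pct: f64) -> Option<u64> {
        let mut samples = std::mem::take(&mut *self.samples_us.lock().unwrap());
        if samples.is_empty() {
            return None;
        }
        samples.sort_unstable();
        let rank = ((pct / 100.0) * samples.len() as f64).ceil() as usize;
        Some(samples[rank.clamp(1, samples.len()) - 1])
    }
}
```

Inside update_account this would look like timer.time(|| { /* filter and hand off */ }), with a separate thread calling drain_percentile(99.0) on an interval and exporting the result to whatever metrics system you already run.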

3. The silent panic

4. The startup double-count

5. The slot ordering trap

6. Missing on_unload

on_unload — correct order

rust

fn on_unload(&mut self) {
    drop(self.sender.take());  // close the channel first so the worker stops
    if let Some(rt) = self.runtime.take() {
        rt.shutdown_timeout(std::time::Duration::from_secs(10));
    }
}
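
Why the order matters can be checked without a validator. The sketch below swaps the tokio runtime for a plain thread (an assumption made to keep it std-only) and uses hypothetical helper names: the worker keeps receiving queued jobs after the sender is dropped and only stops once the queue is empty.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Stand-in for the plugin's worker: drains jobs until every Sender is
/// dropped, then returns how many it processed.
pub fn spawn_worker() -> (Sender<u64>, JoinHandle<usize>) {
    let (tx, rx) = channel::<u64>();
    let handle = thread::spawn(move || {
        let mut processed = 0;
        // recv() keeps yielding queued jobs after the sender is gone and
        // returns Err only once the queue is empty.
        while let Ok(_job) = rx.recv() {
            processed += 1;
        }
        processed
    });
    (tx, handle)
}

/// Shut down in the same order as on_unload: close the channel, then wait.
pub fn shutdown(tx: Sender<u64>, handle: JoinHandle<usize>) -> usize {
    drop(tx);
    handle.join().expect("worker panicked")
}
```

Queue a thousand jobs, call shutdown, and the returned count should match what was sent. If the wait happened before the drop, the worker would still be blocked in recv() when the timeout expired.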

06Testing before you touch mainnet#

Unit test the callback. Construct ReplicaAccountInfoVersions::V0_0_3 directly in a test, call update_account, assert the channel receiver got a job. Twenty minutes to write. Catches variant mismatches on your laptop, which is where you want to catch them, not during a weekend deploy.
Staging validator, 24 hours. Run a non-voting validator on a separate identity keypair that follows mainnet. Every customer plugin gets a day on it before mainnet. Watch callback p99 the whole time. If it creeps past 500µs, find the cause before it's your production box missing slots.
Restart soak, 50 times. Restart the staging validator 50 times and check the database after each for partial writes. We've had teams skip this and find the problem three months later during an audit. 50 restarts surfaces it in an afternoon.

If it breaks on any of the three, you have a specific, reproducible failure before it gets anywhere near mainnet.
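
For the first of those, a std-only sketch of the shape such a test can take. A real test would construct ReplicaAccountInfoVersions::V0_0_3 and call update_account on the plugin; here the filter-and-hand-off step is pulled out into a plain function so the example compiles without the interface crate. Job and filter_and_enqueue are hypothetical names, and Job is a trimmed-down stand-in for AccountJob.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical, trimmed-down stand-in for AccountJob.
#[derive(Debug, PartialEq)]
pub struct Job {
    pub pubkey: Vec<u8>,
    pub slot: u64,
    pub write_version: u64,
}

/// The filter-and-hand-off step of update_account as a plain function.
/// Returns true if a job was queued.
pub fn filter_and_enqueue(
    target_owner: &[u8],
    owner: &[u8],
    pubkey: &[u8],
    slot: u64,
    write_version: u64,
    sender: &Sender<Job>,
) -> bool {
    if owner != target_owner {
        return false;
    }
    sender
        .send(Job { pubkey: pubkey.to_vec(), slot, write_version })
        .is_ok()
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::sync::mpsc::channel;

    #[test]
    fn matching_owner_is_queued() {
        let (tx, rx) = channel();
        assert!(filter_and_enqueue(&[7; 32], &[7; 32], &[1; 32], 42, 9, &tx));
        let job = rx.try_recv().expect("job should be queued");
        assert_eq!(job.slot, 42);
        assert_eq!(job.write_version, 9);
    }

    #[test]
    fn other_owner_is_dropped() {
        let (tx, rx) = channel();
        assert!(!filter_and_enqueue(&[7; 32], &[8; 32], &[1; 32], 42, 9, &tx));
        assert!(rx.try_recv().is_err());
    }
}
```

Comparing raw owner bytes also sidesteps base58 encoding on every callback, though that assumes the target program ID has been decoded to bytes once at load time.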

07Performance: what healthy actually looks like#

Metric	You're fine	Watch this	Something is wrong
Callback p50	< 50 µs	100–500 µs	> 1 ms
Callback p99	< 200 µs	500 µs–2 ms	> 5 ms
Skipped slots / 24h	0	1–5	> 10
on_unload drain time	< 500 ms	1–5 s	> 30 s
Channel backlog (steady state)	< 100 items	100–1,000	unbounded growth

Callback health thresholds from production Geyser plugin operation.
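
Channel backlog is the one row you cannot read off std's mpsc channel directly, because it has no len(). A minimal sketch of one workaround is a shared atomic counter wrapped around both halves. All names here are hypothetical, and treating anything above 1,000 items as worth investigating is an assumption: the table's third column describes a growth trend, not a fixed number.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{channel, Receiver, SendError, Sender};
use std::sync::Arc;

/// Sender half that counts queued items (hypothetical wrapper).
pub struct CountingSender<T> {
    inner: Sender<T>,
    depth: Arc<AtomicUsize>,
}

/// Receiver half that decrements the shared counter on every receive.
pub struct CountingReceiver<T> {
    inner: Receiver<T>,
    depth: Arc<AtomicUsize>,
}

pub fn counting_channel<T>() -> (CountingSender<T>, CountingReceiver<T>) {
    let (tx, rx) = channel();
    let depth = Arc::new(AtomicUsize::new(0));
    (
        CountingSender { inner: tx, depth: depth.clone() },
        CountingReceiver { inner: rx, depth },
    )
}

impl<T> CountingSender<T> {
    pub fn send(&self, item: T) -> Result<(), SendError<T>> {
        // Count before sending so the receiver can never decrement first.
        self.depth.fetch_add(1, Ordering::Relaxed);
        self.inner.send(item).map_err(|e| {
            self.depth.fetch_sub(1, Ordering::Relaxed);
            e
        })
    }

    /// Items sent but not yet received.
    pub fn backlog(&self) -> usize {
        self.depth.load(Ordering::Relaxed)
    }
}

impl<T> CountingReceiver<T> {
    /// Blocks for the next item; None once every sender has been dropped.
    pub fn recv(&self) -> Option<T> {
        let item = self.inner.recv().ok()?;
        self.depth.fetch_sub(1, Ordering::Relaxed);
        Some(item)
    }
}

/// Map a steady-state backlog onto the table's first two columns.
pub fn backlog_status(items: usize) -> &'static str {
    match items {
        0..=99 => "fine",
        100..=1000 => "watch",
        _ => "investigate", // assumption: the table only says "unbounded growth"
    }
}
```

Sample backlog() on an interval and alert on the trend rather than on a single reading; a burst that drains within seconds is normal, while a number that only goes up is the problem.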

Data path	Slot latency p90	Account latency p90
RPC polling (getProgramAccounts)	~150 ms	N/A
WebSocket (accountSubscribe)	~10 ms	~374 ms
Yellowstone gRPC	~5 ms	~215 ms
In-process Geyser plugin	< 1 ms	< 1 ms

Slot and account latency by data path. Numbers from Triton One's published benchmarks and NoLimitNodes operational data.

08Do you actually need a Geyser plugin?#

Most teams who ask us this don't. Genuinely.

You probably need one if:

You're running a sniper, MEV searcher, or liquidator where a missed event has a real dollar cost
You need to write to a backend no managed stream supports (proprietary schema, internal message bus, something custom)
Your filter logic can't be expressed in Yellowstone's filter language (cross-program predicates, composite discriminator matching)
You're integrating a validator into a larger custody or exchange system with specific data contracts
You have a regulatory or data-sovereignty requirement to own the full stack

You probably don't if:

You haven't tried a managed Yellowstone gRPC stream yet. Start there, honestly.
You only need account state at human timescales (sub-second WebSocket is fine)
You're submitting transactions (staked RPC connections, not plugins)
Updates every second or two are sufficient. WebSocket handles that with zero infrastructure.

09The streaming ecosystem in 2026#

Provider	Model	Latency	Replay	Notes
NoLimitNodes	Hosted plugin or gRPC	In-process µs (plugin); ms (gRPC)	N/A	Only provider hosting custom .so plugins at flat rate
Alchemy	Per-TB gRPC	5–15ms avg	48h / 6,000+ slots	Multi-region; first-slot-data guarantees
Helius LaserStream	Tiered monthly	Up to 8ms faster via preprocessed shreds	24h	9 global regions; 1.3 GB/s JS client throughput
Triton Dragon's Mouth	Per-GB gRPC	Sub-5ms	~1,000 slots	Built Yellowstone; also offers Fumarole and Old Faithful
Chainstack	Per-stream monthly	N/A	~100 slots (Global), ~3,000 (dedicated)	5 concurrent filters per connection; mainnet only

Five gRPC stream providers, five genuinely different business models.

NOTE / The gRPC keepalive problem

keepalive ping/pong

text

// Client sends every 30 seconds:
{ ping: { id: 1 } }
// Server responds:
{ pong: { id: 1 } }

Miss the pong, reconnect. Don't implement it and you'll diagnose a dead connection as a “data issue” for an hour.
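
The bookkeeping behind "miss the pong, reconnect" is small enough to sketch as a state machine with no transport attached. Keepalive and Action are hypothetical names, and the 10-second pong timeout used in the usage note is an assumption; only the 30-second ping interval comes from the exchange above.

```rust
use std::time::{Duration, Instant};

/// What the connection loop should do next.
#[derive(Debug, PartialEq)]
pub enum Action {
    Wait,
    SendPing(i32),
    Reconnect,
}

/// Tracks ping/pong state; sending and receiving are left to the caller.
pub struct Keepalive {
    interval: Duration,
    pong_timeout: Duration,
    last_ping: Instant,
    awaiting: Option<i32>,
    next_id: i32,
}

impl Keepalive {
    pub fn new(now: Instant, interval: Duration, pong_timeout: Duration) -> Self {
        Self { interval, pong_timeout, last_ping: now, awaiting: None, next_id: 1 }
    }

    /// Call on every loop tick with the current time.
    pub fn poll(&mut self, now: Instant) -> Action {
        let since_ping = now.duration_since(self.last_ping);
        match self.awaiting {
            // A ping is outstanding and the pong is overdue: the link is dead.
            Some(_) if since_ping >= self.pong_timeout => Action::Reconnect,
            Some(_) => Action::Wait,
            // Nothing outstanding and the interval has elapsed: ping again.
            None if since_ping >= self.interval => {
                let id = self.next_id;
                self.next_id += 1;
                self.awaiting = Some(id);
                self.last_ping = now;
                Action::SendPing(id)
            }
            None => Action::Wait,
        }
    }

    /// Call when a pong arrives; ids that don't match the outstanding ping are ignored.
    pub fn on_pong(&mut self, id: i32) {
        if self.awaiting == Some(id) {
            self.awaiting = None;
        }
    }
}
```

Create it with Keepalive::new(Instant::now(), Duration::from_secs(30), Duration::from_secs(10)), call poll from the stream loop, and act on the result: send the ping, keep reading, or tear the connection down and resubscribe.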

10How teams actually ship a Geyser plugin in 2026#

	Self-hosted validator	Hosted plugin	Managed gRPC stream
You operate	Validator + host + on-call rotation	Nothing	A gRPC client
You write	the .so	the .so	a stream consumer
Monthly cost	High (hardware + ops)	Flat monthly	Tiered by usage
Latency floor	In-process (µs)	In-process (µs)	Network hop (ms)
Config changes	Up to 1h validator restart	30 min redeployment	Live
Who it fits	Exchanges, custody, regulated infra	Indexers, bots, custom pipelines	90% of builders

Three real paths, genuinely different trade-offs.

TIP / NoLimitNodes Geyser Plugin Hosting

Ship your .so and a config.json. We run ABI verification, smoke test on a non-voting staging validator, and promote to production. Under 30 minutes from upload to live on mainnet.

Plan	What's included
Standard	Validator hosting, maintenance, Agave upgrades (<72h SLA), log streaming
Ultra	2 plugins + 30 hours of plugin development per month
Enterprise	Dedicated validator, contractual SLA with service credits, named support engineer

Infrastructure: 256GB RAM / 4TB NVMe / AMD EPYC / 10Gbps unmetered. Owned racks, not cloud reseller margin.

Not sure which camp you're in? The Yellowstone gRPC vs WebSockets guide answers it in one read.
