Historical Dataset

Solana trading datasets, parsed and packaged for backtests

Per-month Parquet and CSV bundles of parsed Solana trades, pool events, mints, and transfers. Pick a program, set a duration, pay once. Download links arrive within 24 hours.

$200/mo per program. 30% off at 6 months, 50% off at 12. Drop the bundle into DuckDB or pandas in two minutes.

Parquet + CSV · Per-month bundles · Up to 12 months back · 40+ programs · 50% off at 12 months · DuckDB-friendly

Datasets in every program bundle

Parsed row-per-event tables with documented schemas and normalized decimals.

| Event | Type | Description | Frequency | Latency |
|---|---|---|---|---|
| dex_trades | event | One row per swap across Raydium, Orca, Meteora, Jupiter, PumpSwap. Includes amount_in, amount_out, USD value, pool address, route length, signer. | Very high | |
| pool_events | event | Pool initialize, deposit, withdraw, position open/close, fee collect. Bin and tick resolution preserved where the program supports it. | High | |
| token_mints | event | Every new SPL or Token-2022 mint with metadata, mint authority, freeze authority, decimals, supply, and the creator wallet. | Medium | |
| token_transfers | event | Decoded SPL transfers with sender, recipient, mint, decimals normalized, and a USD value computed off the pricing oracle nearest the slot. | Very high | |
| pumpfun_events | event | create, trade, graduate decoded with bonding-curve reserves, virtual reserves, and SOL/token amounts in native units. | High | |
| jupiter_routes | event | Per-aggregator-call breakdown: hops, dex names, intermediate mints, slippage realized, total fee paid by the swapper. | High | |
| liquidity_changes | event | Net base/quote reserve deltas per pool per slot. Drives TVL backfills and impermanent-loss research. | Medium | |
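
As an illustration of how per-slot reserve deltas can drive a TVL backfill, the sketch below accumulates liquidity_changes rows into running reserves per pool. The field names (pool, slot, base_delta, quote_delta), the sample rows, and the assumption that the window starts at pool initialization (reserves begin at zero) are hypothetical; check the schema shipped with the bundle for the real column names.

```python
from collections import defaultdict
from decimal import Decimal


def running_reserves(rows):
    """Turn net per-slot deltas into running base/quote reserves per pool.

    `rows` is an iterable of dicts with hypothetical keys:
    pool, slot, base_delta, quote_delta (already decimal-normalized).
    """
    totals = defaultdict(lambda: [Decimal(0), Decimal(0)])
    history = []
    # Process each pool's rows in slot order so the running sum is meaningful.
    for row in sorted(rows, key=lambda r: (r["pool"], r["slot"])):
        reserves = totals[row["pool"]]
        reserves[0] += Decimal(row["base_delta"])
        reserves[1] += Decimal(row["quote_delta"])
        history.append({
            "pool": row["pool"],
            "slot": row["slot"],
            "base_reserve": reserves[0],
            "quote_reserve": reserves[1],
        })
    return history


# Made-up sample rows, not real chain data.
sample = [
    {"pool": "POOL_A", "slot": 101, "base_delta": "-10", "quote_delta": "1520"},
    {"pool": "POOL_A", "slot": 100, "base_delta": "50", "quote_delta": "7500"},
    {"pool": "POOL_A", "slot": 105, "base_delta": "2.5", "quote_delta": "-380"},
]
history = running_reserves(sample)
```

Multiplying each running reserve by a price series would give a TVL curve; that step is left out because the pricing columns are not specified here.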

Catalog scale and pricing at a glance

Last reviewed 2026-04-29

  • Programs covered: 40+. DEX, DeFi, token, NFT, infrastructure. New programs onboarded by request. Verified 2026-04-29.
  • History depth: 12 months. Rolling 12-month window for most programs. From-genesis bundles available on contract. Verified 2026-04-29.
  • dex_trades row count: ~3.2B / month. Combined across all DEX programs in a typical month, parsed and de-duplicated. Verified 2026-04-29.
  • Compressed bundle size: 4-12 GB. Per-program-per-month tar.zst at typical activity levels.
  • Starting price: $200/mo. Per program per month. 30% off at 6 months, 50% off at 12 months.

Parsed datasets vs raw getBlock JSON

The cheapest historical Solana data on the planet is AWS Public Blockchain Data. Free getBlock JSON on S3, updated daily, going back to genesis. If you have a tolerance for shell pipelines and an ETL team, you don't need us.

Most teams don't. The first wall is the IDL set: every Solana program has its own Borsh layout, and AMM v4, CLMM, and CPMM are three different layouts under the “Raydium” umbrella alone. Add Orca Whirlpool, Meteora DLMM and DAMM, Jupiter V6 inner instructions, Pump.fun's bonding curve, and you're maintaining a few thousand lines of decoder code that breaks every time someone redeploys with a new discriminator.
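
To make the discriminator problem concrete, here is a minimal sketch of how a decoder for an Anchor-built program typically dispatches on the first 8 bytes of instruction data. The Anchor convention (first 8 bytes of SHA-256 over "global:<instruction_name>") is real; the instruction names and the two-field argument layout are hypothetical, and non-Anchor programs use their own tagging schemes.

```python
import hashlib
import struct


def anchor_discriminator(instruction_name: str) -> bytes:
    """First 8 bytes of sha256("global:<name>"), the Anchor instruction tag."""
    return hashlib.sha256(f"global:{instruction_name}".encode()).digest()[:8]


# Hypothetical argument layout: two little-endian u64 fields after the tag.
SWAP_ARGS = struct.Struct("<QQ")


def decode_swap(data: bytes) -> dict:
    amount_in, min_amount_out = SWAP_ARGS.unpack_from(data, 8)
    return {"amount_in": amount_in, "min_amount_out": min_amount_out}


# One decoder per known discriminator; real decoder sets hold many of these.
DECODERS = {anchor_discriminator("swap"): decode_swap}


def decode(data: bytes):
    handler = DECODERS.get(data[:8])
    # An unknown discriminator means the instruction is silently skipped.
    return handler(data) if handler else None


args = SWAP_ARGS.pack(1_000_000, 990_000)
decoded = decode(anchor_discriminator("swap") + args)
# Same arguments under a renamed instruction: the old decoder no longer matches.
missed = decode(anchor_discriminator("swap_v2") + args)
```

The second call returns None even though the payload is identical, which is the failure mode a redeploy with a renamed instruction produces.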

The second wall is decimals. SPL Token amounts are raw integers, and the decimals that scale them are stored on the mint; Token-2022 mints can carry extensions such as transfer fees that change the amount actually received; PumpSwap quotes one side in lamports and the other in token base units. Normalize one wrong and your USD column is off by a factor of a thousand for an entire program.
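
The factor-of-a-thousand failure is easy to reproduce. The sketch below scales a raw integer amount by a mint's decimals and shows what happens when a 9-decimal amount (lamports) is read with a 6-decimal scale. The helper name to_ui_amount is an illustrative choice, not part of any bundle schema.

```python
from decimal import Decimal


def to_ui_amount(raw_amount: int, decimals: int) -> Decimal:
    """Scale a raw on-chain integer amount by the mint's decimals."""
    return Decimal(raw_amount).scaleb(-decimals)


raw = 1_500_000_000                 # 1.5 SOL expressed in lamports (9 decimals)

correct = to_ui_amount(raw, 9)      # read with the right scale
wrong = to_ui_amount(raw, 6)        # read as if the mint had 6 decimals

error_factor = wrong / correct      # how far off the mis-scaled value is
```

Using Decimal rather than float keeps the scaling exact, which matters once amounts are multiplied by prices and summed over billions of rows.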

We sell the parsed result. Trades, pools, mints, transfers, routes. Per-program-per-month Parquet, schema documented, USD-normalized, decimal-corrected, signed-URL delivered. The first day of work is opening DuckDB and writing your model query, not writing a Borsh decoder.

What's in the catalog

The catalog is split four ways: DEX, DeFi, Token/NFT, and Infrastructure. Each program ships one or more datasets; the most-bought are dex_trades, pool_events, and token_mints.

DEX / AMM
  • Raydium AMM v4, CLMM, CPMM
  • Orca Whirlpool
  • Meteora DLMM, DAMM, DBC
  • Jupiter V6 aggregator
  • Phoenix order book
  • OpenBook v2
  • Lifinity, Stabble, Gavel
  • PumpSwap, Heaven, Boop
DeFi / lending
  • Kamino lending + farms
  • MarginFi v2
  • Drift perps + spot
  • Marinade liquid staking
  • Solayer restaking
  • Zeta options
  • Sharky NFT lending
Token / NFT
  • SPL Token + Token-2022
  • Metaplex Core + Token Metadata
  • Bubblegum compressed NFTs
  • Pump.fun
  • Moonshot
  • Virtuals
Infrastructure
  • System program transfers
  • Stake program
  • Address Lookup Tables
  • Name Service
  • Circle CCTP
  • Memo
  • Swig session keys

Don't see your program? It's usually a one-time ingestion to add. Tell us the program ID and which instructions you care about and we'll quote it.

Who actually buys these datasets

Quant teams running backtests

Six months of Raydium plus Meteora trades, joined to mint metadata, joined to Jupiter routes, sitting in DuckDB. Run the strategy in seconds, not over a weekend on a flaky RPC scrape.

ML training pipelines

Token-launch outcome models, rug-pull classifiers, MEV detectors. The signal lives in the parsed instructions, not in raw logs, and you don't want to spend three months building the labeler.

Research and journalism

Volume-by-DEX charts, attacker-flow tracing, exchange-deposit attribution. Three months of trades plus transfers usually covers the brief.

Tax and compliance vendors

Per-wallet trade history with cost basis sourced from the same Parquet bundle the rest of the company uses. No more disagreement between the analytics team and the compliance team about what a trade was.

Internal data warehouses

Drop the monthly bundle into Snowflake or Iceberg, replace the home-grown ingest pipeline, free up two engineers to work on the actual product. The most common reason teams renew.

Liquidity-provider analytics

Per-position P&L on Whirlpool and DLMM with realized fee collection events resolved against tick or bin movement. Hard to compute without a parsed dataset; trivial with one.

Pricing and how the discounts work

Base price is $200 a month per program. That covers every dataset for that program: trades, pool events, mints, all of it. You don't buy “trades” and “pool_events” as two SKUs, you buy the program.

| Term | Per-month rate | Effective $/program | Discount |
|---|---|---|---|
| 1 month | $200 | $200 | 0% |
| 6 months | $140 | $840 total | 30% |
| 12 months | $100 | $1,200 total | 50% |

Multiple programs stack. Most quant customers run three to five programs at the 12-month rate, which lands around $300 to $500 a month all-in for parsed data covering most of Solana DeFi. Compared to a Bitquery enterprise contract or a Dune-export pipeline, the math is unsubtle.
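
The arithmetic behind those figures can be sketched as a small quote function. It assumes the discount applies uniformly to every program in the order and that only the three published terms exist; the function and field names are illustrative, not an actual pricing API.

```python
BASE_MONTHLY_USD = 200
# Published terms only: months -> percent discount.
TERM_DISCOUNT_PCT = {1: 0, 6: 30, 12: 50}


def quote(programs: int, months: int) -> dict:
    """Price an order of `programs` programs for a published term length."""
    if months not in TERM_DISCOUNT_PCT:
        raise ValueError(f"no published rate for a {months}-month term")
    if programs < 1:
        raise ValueError("need at least one program")
    rate = BASE_MONTHLY_USD * (100 - TERM_DISCOUNT_PCT[months]) // 100
    return {
        "per_program_monthly": rate,
        "monthly_all_in": rate * programs,
        "total": rate * programs * months,
    }


single_six = quote(programs=1, months=6)    # one program, 6-month term
three_year = quote(programs=3, months=12)   # low end of the typical quant order
five_year = quote(programs=5, months=12)    # high end of the typical quant order
```

Terms between the published tiers raise an error here rather than guessing a rate, since custom ranges are quoted separately.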

Custom range, custom format, redistribution license, or a program we don't list? Reach out via Talk to Sales. The base rate covers the standard SKU; everything else gets quoted.

Where we sit vs Bitquery, Dune, and AWS

Three competitors come up in every sales call.

Bitquery is the GraphQL incumbent for parsed Solana data. They're excellent at flexible queries with complex JOINs and very deep history. They charge per query and per dataset, which means the bill scales with how curious your analysts are. If your job is “I want to ask thirty different questions and see what sticks,” Bitquery is the right tool. If your job is “ship me Parquet I can put on disk,” we're cheaper.

Dune Analytics has Solana coverage on top of Spellbook with manual model curation. Strong for ad-hoc SQL dashboards. CSV export caps and rate limits make it painful to use as a real backfill source. Most teams query Dune for one-off charts and buy the Parquet from us for production modeling.

AWS Public Blockchain Data ships the raw getBlock JSON to S3 for free. The price is right; the parsing burden is not. Onboarding even a single AMM into a usable internal schema is a multi-engineer-month project. Worth it if you have the team. Most teams who try it end up buying parsed data from someone after about month two.

We're not the right answer for every workload. We are the right answer when you want parsed Solana trades on disk tomorrow at a flat predictable price. If that's the job, this is the cheapest path that doesn't end with you maintaining decoders.

Frequently asked questions

Each bundle is a tar.zst of Parquet files: one Parquet file per day for the requested program and dataset, plus a manifest.json describing the schema, the slot range, the parsed-instruction discriminators, and the SHA-256 of every file. CSV is also available if you don't want Parquet, although the Parquet bundles compress to about a quarter of the size and DuckDB reads them directly.
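
Because the manifest carries a SHA-256 for every file, a download can be checked before it goes anywhere near a model. The sketch below assumes the bundle is already extracted and that manifest.json has a "files" object mapping each relative file name to its hex digest; that key name and shape are assumptions, so adapt them to the manifest you receive.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-GB Parquet files never load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path) -> list[str]:
    """Return names of files that are missing or whose hash differs.

    Assumes a hypothetical manifest shape: {"files": {"<name>": "<sha256 hex>"}}.
    """
    manifest = json.loads((bundle_dir / "manifest.json").read_text())
    bad = []
    for name, expected in manifest["files"].items():
        path = bundle_dir / name
        if not path.is_file() or sha256_file(path) != expected.lower():
            bad.append(name)
    return bad
```

A call such as verify_bundle(Path("raydium-2026-03")) (a made-up directory name) returns an empty list when every file matches its recorded digest.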

Download a month of Solana trades today

$200/mo per program, 30% off at 6 months, 50% off at 12. Multiple programs stack. Custom ranges and from-genesis bundles available on contract.

Talk to Sales