Tutorials

Learn to explore 6.7 million physical samples from scientific collections worldwide using modern browser-based tools.

Start Here

Tutorial | What You’ll Learn
Interactive Explorer | Browse samples on a 3D globe with H3-clustered, zoom-adaptive rendering
Search Explorer | Faceted search and filter across all 6.7M samples with cross-filtering
Deep-Dive Analysis | Comprehensive DuckDB-WASM analysis with Observable JS: charts, maps, statistics
Technical: Narrow vs Wide | Schema comparison and performance benchmarks for the PQG data formats

What’s in the Data?

Source | Samples | Focus
SESAR | 4.6M | Earth science: rocks, minerals, sediments, soils
OpenContext | 1M | Archaeology: artifacts, excavation materials
GEOME | 605K | Biology: genomic and tissue specimens
Smithsonian | 322K | Natural history: museum collections

Data Files

All data is hosted on data.isamples.org, which supports HTTP range requests, so DuckDB-WASM downloads only the bytes it needs.
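
To make the range-request idea concrete, here is a minimal sketch of how a client could locate a Parquet file's footer metadata without downloading the whole file. It relies on the Parquet layout, in which the final 8 bytes hold a 4-byte little-endian footer length followed by the magic string "PAR1". The helper names are illustrative assumptions, and this is not DuckDB-WASM's actual implementation, which handles these requests internally.

```javascript
// Sketch only: helper names are hypothetical, not part of any library.

// Range header value asking for the last `n` bytes of a resource.
function suffixRange(n) {
  return `bytes=-${n}`;
}

// Range header value for an inclusive byte span.
function byteRange(start, endInclusive) {
  return `bytes=${start}-${endInclusive}`;
}

// Given the final 8 bytes of a Parquet file (as a Uint8Array),
// validate the magic string and return the footer metadata length.
function footerLength(tail) {
  if (tail.length !== 8) throw new Error("expected exactly 8 bytes");
  const magic = String.fromCharCode(...tail.slice(4));
  if (magic !== "PAR1") throw new Error("not a Parquet file");
  // The first 4 bytes are a little-endian unsigned 32-bit integer.
  return new DataView(tail.buffer, tail.byteOffset, 4).getUint32(0, true);
}

// Range header value covering the footer metadata, which sits
// immediately before the 8-byte tail.
function footerSpan(fileSize, metaLength) {
  const end = fileSize - 8 - 1;
  const start = end - metaLength + 1;
  return byteRange(start, end);
}
```

In a browser, the first step would be something like `fetch(url, { headers: { Range: suffixRange(8) } })`, where `url` is a placeholder for one of the hosted files; a server that honors the header answers with status 206 and only those bytes. The footer then tells the reader where each column chunk lives, which is what lets later requests skip unneeded columns and row groups.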

File | Size | Description
Wide format | 292 MB | One row per entity, all sources; the primary file for tutorials. A stable alias redirects to the current dated build (isamples_YYYYMM_wide.parquet).
Wide + H3 | 292 MB | Wide format with H3 spatial indices for globe visualizations
Facet summaries | 2 KB | Pre-computed filter counts; loads instantly
H3 clusters (res4) | 0.6 MB | Zoomed-out globe view
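
If you want to record which dated build an analysis used, a small helper can parse or produce names that follow the isamples_YYYYMM_wide.parquet pattern shown above. This is a sketch: the function names are hypothetical, and the example date in the usage note is made up and does not refer to a known build.

```javascript
// Sketch only: matches the naming pattern isamples_YYYYMM_wide.parquet.
const DATED_WIDE = /^isamples_(\d{4})(0[1-9]|1[0-2])_wide\.parquet$/;

// Return { year, month } for a dated build name, or null if it does not match.
function parseBuildName(name) {
  const m = DATED_WIDE.exec(name);
  return m ? { year: Number(m[1]), month: Number(m[2]) } : null;
}

// Format a build name for a given year and month (1-12).
function buildName(year, month) {
  if (!Number.isInteger(month) || month < 1 || month > 12) {
    throw new RangeError("month must be an integer from 1 to 12");
  }
  const yyyy = String(year).padStart(4, "0");
  const mm = String(month).padStart(2, "0");
  return `isamples_${yyyy}${mm}_wide.parquet`;
}
```

For example, `parseBuildName(buildName(2025, 1))` gives back the year and month it was built from, while a name with an invalid month such as 13 is rejected.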

Why Browser-Based?

Our approach, GeoParquet plus DuckDB-WASM, provides:

  • Universal access: No installation; works in Chrome, Firefox, Edge, Safari, and Brave
  • Fast analysis: 5-10x faster than downloading the full dataset before querying it
  • Memory efficient: Analyze a roughly 300 MB dataset using less than 100 MB of browser memory
  • Minimal transfer: HTTP range requests download only the columns and rows you need (typically <1 MB to start)
  • Reproducible: All code is visible and foldable on the tutorial pages
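
The "minimal transfer" point depends on asking only for the columns you need. Below is a minimal sketch of building such a query for DuckDB's `read_parquet` function. The helper names, the column names, and the URL in the usage note are assumptions for illustration, not the actual schema or file paths.

```javascript
// Sketch only: helper names are hypothetical.

// Quote an SQL identifier, doubling any embedded double quotes.
function quoteIdent(name) {
  return `"${String(name).replace(/"/g, '""')}"`;
}

// Quote an SQL string literal, doubling any embedded single quotes.
function quoteLiteral(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Build a query that reads only the listed columns and caps the row count.
// Because Parquet is columnar, naming columns explicitly (rather than
// SELECT *) lets the reader skip the bytes of every other column.
function projectedQuery(url, columns, limit) {
  if (!Array.isArray(columns) || columns.length === 0) {
    throw new Error("select at least one column");
  }
  if (!Number.isInteger(limit) || limit < 0) {
    throw new RangeError("limit must be a non-negative integer");
  }
  const cols = columns.map(quoteIdent).join(", ");
  return `SELECT ${cols} FROM read_parquet(${quoteLiteral(url)}) LIMIT ${limit}`;
}
```

A call such as `projectedQuery("https://example.org/wide.parquet", ["pid", "label"], 10)` produces a statement you could pass to a DuckDB-WASM connection's `query` method. Both the URL and the column names here are placeholders.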

For Developers

All tutorial source code is on GitHub. Want to build your own analysis? Fork the repo, modify a .qmd file, and run quarto preview.