Lance Columnar Data Format
Lance is a columnar data format and an alternative to Parquet. It provides 100x faster random access and automatic versioning, and it is optimized for computer vision, bioinformatics, spatial, and ML data. It is compatible with Apache Arrow and DuckDB.
§Create a Dataset
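use std::sync::Arc;

use arrow_array::{RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
use lance::{dataset::WriteParams, Dataset};

// Build a single-column schema and one empty batch to write.
let schema = Arc::new(Schema::new(vec![Field::new("test", DataType::Int64, false)]));
let batches = vec![RecordBatch::new_empty(schema.clone())];
let reader = RecordBatchIterator::new(batches.into_iter().map(Ok), schema);

// `uri` is the dataset location; it is assumed to be defined by the caller.
let write_params = WriteParams::default();
Dataset::write(reader, &uri, Some(write_params)).await.unwrap();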
§Scan a Dataset
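use arrow_array::RecordBatch;
use futures::StreamExt;
use lance::Dataset;

// `path` is the location of an existing dataset; it is assumed to be defined by the caller.
let dataset = Dataset::open(&path).await.unwrap();
let mut scanner = dataset.scan();
// Stream the record batches and collect them into a vector.
let batches: Vec<RecordBatch> = scanner
    .try_into_stream()
    .await
    .unwrap()
    .map(|b| b.unwrap())
    .collect::<Vec<RecordBatch>>()
    .await;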
Re-exports§
pub use dataset::Dataset;
Modules§
- Extend Arrow Functionality
- Extends DataFusion
- Lance Dataset
- Secondary Index
- I/O utilities.
- Various utilities
Structs§
Enums§
Functions§
- Creates and loads a Dataset from the given path. Infers the storage backend to use from the scheme in the given table path.
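To illustrate what inferring a storage backend from a path's scheme can look like, here is a minimal, standard-library-only sketch. It is not Lance's implementation: the `Backend` enum, the `infer_backend` function, and the set of recognized schemes are all hypothetical names chosen for this example, and it assumes that a path without a `scheme://` prefix refers to the local file system.

```rust
/// Hypothetical storage backends. These names are illustrative only and
/// are not types exported by the lance crate.
#[derive(Debug, PartialEq)]
enum Backend {
    LocalFs,
    S3,
    Gcs,
    Azure,
    Unknown(String),
}

/// Guess a backend from the scheme prefix of `uri`.
/// Assumption: a path with no `scheme://` prefix is a local file path.
fn infer_backend(uri: &str) -> Backend {
    match uri.split_once("://") {
        // No scheme separator at all: treat the input as a plain local path.
        None => Backend::LocalFs,
        // Compare the scheme case-insensitively.
        Some((scheme, _rest)) => match scheme.to_ascii_lowercase().as_str() {
            "file" => Backend::LocalFs,
            "s3" => Backend::S3,
            "gs" => Backend::Gcs,
            "az" => Backend::Azure,
            other => Backend::Unknown(other.to_string()),
        },
    }
}

fn main() {
    for uri in ["/tmp/data.lance", "s3://bucket/data.lance", "gs://bucket/data.lance"] {
        println!("{uri} -> {:?}", infer_backend(uri));
    }
}
```

Returning an explicit `Unknown` variant, rather than silently falling back to the local file system, keeps a mistyped scheme from being treated as a relative path; whether Lance itself behaves this way is not stated in this page.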