delta-lake
Here are 172 public repositories matching this topic...
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Updated Nov 30, 2024 - Java
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
Updated Nov 30, 2024 - Java
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Updated Nov 27, 2024 - Scala
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Updated Oct 24, 2024 - Rust
A native Rust library for Delta Lake, with bindings into Python
Updated Dec 1, 2024 - Rust
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Updated May 8, 2024 - Scala
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Updated Nov 27, 2024 - Java
An open protocol for secure data sharing
Updated Nov 27, 2024 - Scala
Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
Updated Dec 1, 2024 - Python
Analytical database for data-driven Web applications 🪶
Updated Nov 29, 2024 - Rust
Amazon SageMaker Local Mode Examples
Updated Aug 4, 2024 - Python
Iceberg/Delta Columnstore Table in Postgres
Updated Nov 27, 2024 - C++
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
Updated Oct 28, 2024 - Python
Sample project to demonstrate data engineering best practices
Updated Feb 24, 2024 - Python
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Updated Dec 15, 2023 - Dockerfile
A Minimalistic Rust Implementation of Delta Sharing Server.
Updated Dec 1, 2024 - Rust
This repository exemplifies a simple ELT process using delta to perform upsert and remove data files that aren't in the latest state of the transaction log for the table.
Updated Feb 24, 2022 - Python
Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Updated Feb 15, 2023 - HTML