Stars

Datalake

Data lake
21 repositories (each entry gives primary language, stars, forks, and last update)

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.

Java 875 292 Updated Nov 26, 2024

LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.

Java 2,387 424 Updated Nov 25, 2024

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

Java 1,044 130 Updated Nov 27, 2024

Smart Automation Tool for building modern Data Lakes and Data Pipelines

Scala 111 20 Updated Nov 26, 2024

lakeFS - Data version control for your data lake | Git for data

Go 4,461 359 Updated Nov 26, 2024

Dremio - the missing link in modern data

Java 1,383 445 Updated Oct 25, 2024

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activelo…

Python 8,197 631 Updated Nov 27, 2024

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

Java 2,440 959 Updated Nov 27, 2024

Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com

Rust 7,887 753 Updated Nov 27, 2024

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licen…

Java 1,110 580 Updated Jan 12, 2023

Playbook of Kyuubi and Arctic Demo

Dockerfile 6 3 Updated Nov 20, 2022

Lakehouse storage system benchmark

Scala 66 9 Updated Feb 22, 2023

A version control system to manage large files.

Go 291 13 Updated Feb 7, 2023

Apache InLong - a one-stop, full-scenario integration framework for massive data

Java 1,400 530 Updated Nov 26, 2024

Java 1,612 281 Updated Nov 19, 2024

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

Java 1,098 345 Updated Nov 27, 2024

A curated list of awesome lakehouse frameworks, applications, etc.

17 2 Updated Jul 31, 2024

An industry-first universal solution for geospatial data, tailored to data lakehouse systems

Java 63 4 Updated Oct 24, 2023

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Java 921 147 Updated Nov 22, 2024

Open Control Plane for Tables in Data Lakehouse

Java 312 52 Updated Nov 25, 2024

Open, Multi-modal Catalog for Data & AI

Java 2,449 394 Updated Nov 26, 2024