Open protocol for decentralized exchange and transformation of data
Website | Reference Implementation | Original Whitepaper | Chat
Open Data Fabric is an open protocol specification for decentralized exchange and transformation of semi-structured data that aims to holistically address many shortcomings of modern data management systems and workflows.
The goal of this specification is to develop a method of data exchange that would:
- Enable worldwide collaboration around data cleaning, enrichment, and derivation
- Create an environment of verifiable trust between participants without the need for a central authority
- Enable a high degree of data reuse, making quality data more readily available
- Improve liquidity of data by speeding up the data propagation times from publishers to consumers
- Create a feedback loop between data consumers and publishers, allowing them to collaborate on better data availability, recency, and design
The ODF protocol is a Web 3.0 technology that powers a distributed structured data supply chain, providing timely, high-quality, and verifiable data for data science, smart contracts, and web applications.
- Original Whitepaper (July 2020)
- Kamu Blog: Introducing Open Data Fabric
- Talk: Open Data Fabric for Research Data Management
- PyData Global 2021 Talk: Time: The most misunderstood dimension in data modelling
- Data+AI Summit 2020 Talk: Building a Distributed Collaborative Data Pipeline
More tutorials and articles can be found in the kamu-cli documentation.
The specification is actively evolving, and we welcome feedback.
See also our Roadmap for future direction and RFC archive for the record of changes.
Coordinator implementations:
- kamu-cli - data management tool that serves as the reference implementation.
Engine implementations:
- kamu-engine-spark - engine based on Apache Spark.
- kamu-engine-flink - engine based on Apache Flink.
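A coordinator (such as kamu-cli) maintains a dataset's metadata and delegates the actual data processing to one of the engines above. As a rough illustration only - field names are approximate and may differ from the current specification - a derivative dataset manifest might declare which engine executes its transformation:

```yaml
# Illustrative sketch of an ODF dataset manifest; names and exact structure
# are approximate, and the dataset/query names are hypothetical.
kind: DatasetSnapshot
version: 1
content:
  name: com.example.trades.filtered   # hypothetical derivative dataset
  kind: Derivative
  metadata:
    - kind: SetTransform
      inputs:
        - datasetRef: com.example.trades   # hypothetical upstream dataset
      transform:
        kind: Sql
        engine: flink   # the coordinator delegates execution to this engine
        query: >
          SELECT event_time, symbol, price
          FROM `com.example.trades`
          WHERE price > 0
```

Because transformations like this are recorded in the dataset's metadata, other participants can later re-run them with the same engine and verify the result, which is the basis of the verifiable trust described above.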
The specification was originally developed by Kamu as part of the kamu-cli data management tool. While developing it, we quickly realized that the very essence of what we're trying to build - a collaborative open data processing pipeline based on verifiable trust - requires full transparency and openness on our part. We strongly believe in the potential of our ideas to bring data management to the next level: to provide better-quality data faster to the people who need it to innovate, fight diseases, build better businesses, and make informed political decisions. Therefore, we saw it as our duty to share these ideas with the community, make the system as inclusive as possible of existing technologies and future innovations, and work together to build the momentum needed to achieve such a radical change.
- RFC-000: RFC Template
- RFC-001: Record Offsets
- RFC-002: Logical Data Hashes
- RFC-003: Content Addressability
- RFC-004: Metadata Extensibility
- RFC-005: New Annotation Metadata Events
- RFC-006: Store Checkpoints as Files
- RFC-007: Simple Transfer Protocol
- RFC-008: Smart Transfer Protocol
- RFC-009: Ingest Source State
- RFC-010: Data Schema in Metadata
- RFC-011: Push Ingest Sources
- RFC-012: Recommend base16 encoding for textual representation of hashes and DIDs
- RFC-013: Enum representation in YAML encoding
- RFC-014: Minimizing scanning for last offset and block
- RFC-015: Unified changelog stream schema