OpenDCAI

Define the future of Data-centric AI together

👋 Welcome

✨ We are dedicated to advancing research and open-source tools in Data-Centric Artificial Intelligence (DCAI). ✨

🚀 Our goal is to develop effective and efficient DCAI systems and algorithms that support and enhance the performance of AI models and applications.

🤝 Community

[Image: community QR code]

Pinned

  1. DataFlow Public

    Easy data preparation with the latest LLM-based operators and pipelines.

    Python · 2.2k stars · 149 forks

  2. MyScaleDB Public

    Forked from OriginHubAI/MyScaleDB

    AI Database for unified, scalable SQL + vector data management, search and analytics

    C++ · 40 stars · 1 fork

  3. DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 97 stars · 8 forks

  4. Paper2Any Public

    Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

    Python · 533 stars · 32 forks

Repositories

Showing 10 of 24 repositories
  • DataFlow-WebUI
    Python 8 8 0 0 Updated Jan 7, 2026
  • DataFlow Public

    Easy data preparation with the latest LLM-based operators and pipelines.

    Python 2,163 Apache-2.0 149 8 1 Updated Jan 7, 2026
  • .github Public
    0 0 0 0 Updated Jan 7, 2026
  • Paper2Any Public

    Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

    Python 533 Apache-2.0 32 0 1 Updated Jan 7, 2026
  • DataFlow-Doc Public

    Documentation for DataFlow, a data-centric AI system for LLMs.

    Python 11 26 4 0 Updated Jan 6, 2026
  • MyScaleDB Public Forked from OriginHubAI/MyScaleDB

    AI Database for unified, scalable SQL + vector data management, search and analytics

    C++ 40 Apache-2.0 34 0 0 Updated Jan 5, 2026
  • DataFlow-MM-Doc Public

    Documentation for DataFlow-MM

    Python 2 6 0 1 Updated Jan 4, 2026
  • DataFlow-Agent
    Python 23 Apache-2.0 3 0 1 Updated Jan 3, 2026
  • DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python 97 8 0 0 Updated Jan 3, 2026
  • DataFlex-Doc Public

    DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python 2 7 0 0 Updated Dec 27, 2025