Stars
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
Start building LLM-empowered multi-agent applications in an easier way.
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
A programming framework for agentic AI 🤖
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
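Three- or four-level city data for provinces, cities, districts/counties, and townships, with pinyin annotations, coordinates, and administrative boundary outlines; most recently collected on 2024-06-16. Provides CSV files, supports online conversion to multi-level cascading JS code and generic JSON, and provides software to convert to shp, geojson, and sql or to import into a database; includes JS collection source code that runs in the browser. Combines administrative division data from the Ministry of Civil Affairs of the People's Republic of China, the National Bureau of Statistics, Amap (高德地图), and Tencent Maps.
Chinese text classification using BERT and ERNIE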
Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
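A simpler, friendlier interface that wraps the details of indexing tools such as Elasticsearch and Lucene and provides a general-purpose search service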
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
A Pythonic wrapper for the Wikipedia API
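A FlinkSQL data masking and row-level permission solution, with source code. It supports user-level data masking and row-level data access control, so that a given user can access only masked data or the rows they are authorized to see. This is a Flink solution for the real-time domain, similar to the Row-level Filter and Column Masking features of Hive Ranger in offline data warehouses.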
Unsupervised text tokenizer for Neural Network-based text generation.
RedisShake is a Redis data processing and migration tool.
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…
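MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche cultures and even Martian script (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
Software that supports standardized schema definitions and automated deployment of product packages. It is designed to deploy, upgrade, uninstall, and configure each service in a product package, reducing the cost of manual operations.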
The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
An Open Standard for lineage metadata collection