Langchain_Community: SQL LanguageParser #28430

Merged: 19 commits, Dec 19, 2024

anushak18
Contributor

Description

(This PR has contributions from @khushiDesai, @ashvini8, and @ssumaiyaahmed).

This PR addresses Issue #11229, which requests SQL support in document parsing. The SQL support is integrated into the generic TreeSitter parsing library, allowing LangChain users to easily load SQL codebases as smaller, manageable "documents."

This pull request adds a new SQLSegmenter class, which provides the SQL integration.

Issue

Issue #11229: Add support for a variety of languages to LanguageParser

Testing

We added a file, test_sql.py, with several tests to verify that the SQLSegmenter works. The tests are:

  • test_is_valid: checks SQL validity.
  • test_extract_functions_classes: checks that individual SQL statements are extracted.
  • test_simplify_code: checks that SQL code with comments is simplified.
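
To make the three behaviours under test concrete, here is a minimal, dependency-free sketch. It is not the SQLSegmenter from this PR, which relies on tree-sitter. `ToySQLSegmenter`, its keyword list, its naive split on `;`, and the `-- Code for:` placeholder format are all assumptions made for illustration. Only the three method names are taken from the test names above.

```python
import re
from typing import List

# Hypothetical keyword list; a real parser would use a full SQL grammar.
_KEYWORDS = ("SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER")


class ToySQLSegmenter:
    """Illustrative stand-in only; NOT the tree-sitter based SQLSegmenter."""

    def __init__(self, code: str) -> None:
        self.code = code

    def extract_functions_classes(self) -> List[str]:
        # Drop `--` line comments, then split naively on ';'.
        # (This ignores semicolons inside string literals.)
        body = re.sub(r"--[^\n]*", "", self.code)
        return [s.strip() + ";" for s in body.split(";") if s.strip()]

    def is_valid(self) -> bool:
        # Treat the input as valid if every statement starts with a known keyword.
        stmts = self.extract_functions_classes()
        return bool(stmts) and all(s.upper().startswith(_KEYWORDS) for s in stmts)

    def simplify_code(self) -> str:
        # Replace each statement with a one-line comment naming its first line.
        return "\n".join(
            f"-- Code for: {s.splitlines()[0]}"
            for s in self.extract_functions_classes()
        )


sample = """-- list active users
SELECT id, name FROM users WHERE active = 1;

UPDATE users SET active = 0 WHERE id = 7;
"""

segmenter = ToySQLSegmenter(sample)
print(segmenter.is_valid())
print(segmenter.extract_functions_classes())
print(segmenter.simplify_code())
```

The sketch yields one chunk per statement, and the simplified view keeps a short placeholder for each chunk. The real class may differ in both respects.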

dosubot bot added labels on Nov 30, 2024:

  • size:L: This PR changes 100-499 lines, ignoring generated files.
  • community: Related to langchain-community
  • Ɑ: doc loader: Related to document loader module (not documentation)

ccurme enabled auto-merge (squash) on December 19, 2024 at 20:29
ccurme merged commit 26bdf40 into langchain-ai:master on Dec 19, 2024
18 checks passed

@RokeToke115

Could someone provide an example of how to use the SQLSegmenter? I tried to use it, but couldn't get it to work.
