Polars AI is a command-line interface (CLI) with a companion Rust crate/library. It lets you chat with your Polars DataFrames in plain text and uses OpenAI's GPT-3.5 Turbo to assist with data exploration and manipulation tasks.
Polars AI allows you to:
- Chat with your Polars DataFrames using plain text queries.
- Perform data analysis tasks such as filtering and aggregating through AI-generated Rust code.
- Visualize data using charts and plots (coming soon).
- Installation 🚀
- Getting Started 🏁
- Usage 🧑💻
- Examples 💡
- Contributing 🤝
- License 📜
To build Polars AI from source, follow these installation steps:
- Install Rust (if not already installed) by following the instructions at Rust Install.
- Fork the repository on GitHub: click the "Fork" button at the top right of the GitHub repository page.
- Clone your fork of the Polars AI repository to your local machine:
$ git clone https://github.com/yourusername/polars-ai.git
- Build the project using Rust's package manager, Cargo:
$ cd polars-ai
$ cargo build --release
- Set the OpenAI API key:
$ export OPENAI_API_KEY=sk-
- Run the CLI:
$ ./target/release/polars-ai help
Alternatively, you can install Polars AI directly with Cargo, the Rust package manager:
- Install the crate:
$ cargo install polars-ai
- Set the OpenAI API key:
$ export OPENAI_API_KEY=sk-
- Run the CLI:
$ polars-ai help
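Both installation paths rely on the OPENAI_API_KEY environment variable. If you call OpenAI from your own Rust code, a small check like the one below can fail early when the key is missing. This is only a sketch: `check_api_key` is a hypothetical helper written for illustration and is not part of the Polars AI API.

```rust
use std::env;

/// Hypothetical helper (not part of Polars AI): validates a raw key value.
/// Returns the trimmed key, or an error message when it is missing or blank.
fn check_api_key(raw: Option<String>) -> Result<String, String> {
    match raw {
        Some(key) if !key.trim().is_empty() => Ok(key.trim().to_string()),
        Some(_) => Err("OPENAI_API_KEY is set but empty".to_string()),
        None => Err("OPENAI_API_KEY is not set".to_string()),
    }
}

fn main() {
    // `env::var` returns Err when the variable is unset or not valid Unicode;
    // `.ok()` turns both cases into None.
    match check_api_key(env::var("OPENAI_API_KEY").ok()) {
        Ok(_) => println!("OPENAI_API_KEY found"),
        Err(msg) => eprintln!("error: {msg}"),
    }
}
```

Keeping the validation in a function that takes an `Option<String>` makes it easy to test without touching the real environment.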
Before you begin, make sure you have the data you want to analyze. Polars AI works with Polars DataFrames; the CLI examples below load one from a CSV file passed with -f.
With Polars AI, you can chat with your DataFrames using plain text queries. Pass your question to the CLI, for example:
$ export OPENAI_API_KEY=sk-
$ polars-ai input -f examples/datasets/flights.csv show
📊 DataFrame:
shape: (18, 7)
┌────────────┬───────────┬─────────┬─────────────────┬───────────────┬──────────┬──────────┐
│ DayofMonth ┆ DayOfWeek ┆ Carrier ┆ OriginAirportID ┆ DestAirportID ┆ DepDelay ┆ ArrDelay │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ str ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞════════════╪═══════════╪═════════╪═════════════════╪═══════════════╪══════════╪══════════╡
│ 19 ┆ 5 ┆ DL ┆ 11433 ┆ 13303 ┆ -3 ┆ 1 │
│ 19 ┆ 5 ┆ DL ┆ 14869 ┆ 12478 ┆ 0 ┆ -8 │
│ 19 ┆ 5 ┆ DL ┆ 14057 ┆ 14869 ┆ -4 ┆ -15 │
│ 19 ┆ 5 ┆ DL ┆ 15016 ┆ 11433 ┆ 28 ┆ 24 │
│ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … │
│ 19 ┆ 5 ┆ DL ┆ 10397 ┆ 12451 ┆ 71 ┆ null │
│ 19 ┆ 5 ┆ DL ┆ 12451 ┆ 10397 ┆ 75 ┆ null │
│ 19 ┆ 5 ┆ DL ┆ 12953 ┆ 10397 ┆ -1 ┆ null │
│ 19 ┆ 5 ┆ DL ┆ 11433 ┆ 12953 ┆ -3 ┆ null │
└────────────┴───────────┴─────────┴─────────────────┴───────────────┴──────────┴──────────┘
$ polars-ai input -f examples/datasets/flights.csv ask -q 'What is the average of the first column?'
🤖 AI Response:
use polars::prelude::*;
fn analyze_data(dfs: Vec<DataFrame>) -> Result<DataFrame> {
    let df = &dfs[0];
    let avg_first_column = df
        .select(&[col("DayofMonth")])
        .expect("Column 'DayofMonth' must exist")
        .mean()
        .unwrap()
        .select(&[col("mean")])
        .unwrap();
    let top_carriers = df
        .groupby(&[col("Carrier")])
        .expect("Column 'Carrier' must exist")
        .mean()
        .unwrap()
        .sort(&[col("mean")], false)
        .expect("Column 'mean' must exist")
        .head(Some(5))
        .select(&[col("Carrier")])
        .unwrap();
    let result_df = df
        .join(&top_carriers, &[col("Carrier")], &[col("Carrier")], JoinType::Inner)
        .expect("Column 'Carrier' must exist")
        .sort(&[col("DayofMonth")], false)
        .expect("Column 'DayofMonth' must exist")
        .head(Some(5));
    let final_result = result_df
        .select(&[col("Carrier"), col("DayofMonth")])
        .unwrap();
    Ok(final_result)
}
let result = analyze_data(dfs);
println!("{}", result);
You can now run the Rust code generated for the query above.
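Generated code may not always match the question that was asked, so it can help to cross-check the answer against a hand computation on a small sample. The sketch below uses plain Rust rather than Polars so that it runs without dependencies; `mean` is a hypothetical helper, and the input is only the eight DayofMonth values visible in the preview above, not the full 18-row file.

```rust
/// Mean of a slice of integers; `None` for an empty slice.
fn mean(values: &[i64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let sum: i64 = values.iter().sum();
    Some(sum as f64 / values.len() as f64)
}

fn main() {
    // DayofMonth values of the eight rows visible in the preview
    // (the remaining rows are elided in the CLI output).
    let day_of_month = [19, 19, 19, 19, 19, 19, 19, 19];
    match mean(&day_of_month) {
        Some(m) => println!("mean DayofMonth (visible rows): {m}"),
        None => println!("no rows"),
    }
}
```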
The generated Rust code follows a structured data analysis workflow:
- Prepare: Preprocess and clean the data if required.
- Process: Manipulate the data for analysis (e.g., grouping, filtering, aggregating).
- Analyze: Conduct the analysis.
- Output: Return results in various formats.
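The four stages above can be sketched as separate functions. The example below is an illustration in plain Rust (standard library only), not the code Polars AI generates: the `Flight` struct, the function names, and the sample rows (the eight rows visible in the flights preview) are assumptions chosen to keep the sketch self-contained.

```rust
use std::collections::BTreeMap;

/// One flight record with only the fields used here.
/// `arr_delay` is `None` where the preview shows `null`.
#[derive(Debug, Clone, PartialEq)]
struct Flight {
    carrier: String,
    arr_delay: Option<i64>,
}

/// Prepare: drop rows that have no arrival delay.
fn prepare(rows: &[Flight]) -> Vec<(String, i64)> {
    rows.iter()
        .filter_map(|f| f.arr_delay.map(|d| (f.carrier.clone(), d)))
        .collect()
}

/// Process: group the remaining delays by carrier.
fn process(rows: Vec<(String, i64)>) -> BTreeMap<String, Vec<i64>> {
    let mut groups: BTreeMap<String, Vec<i64>> = BTreeMap::new();
    for (carrier, delay) in rows {
        groups.entry(carrier).or_default().push(delay);
    }
    groups
}

/// Analyze: compute the mean arrival delay per carrier.
/// Every group holds at least one value, so the division is safe.
fn analyze(groups: &BTreeMap<String, Vec<i64>>) -> Vec<(String, f64)> {
    groups
        .iter()
        .map(|(carrier, delays)| {
            let sum: i64 = delays.iter().sum();
            (carrier.clone(), sum as f64 / delays.len() as f64)
        })
        .collect()
}

/// Output: render the results as text lines.
fn output(results: &[(String, f64)]) -> Vec<String> {
    results
        .iter()
        .map(|(carrier, m)| format!("{carrier}: mean ArrDelay = {m:.2}"))
        .collect()
}

/// Sample data: Carrier and ArrDelay of the eight rows visible in the preview.
fn sample_rows() -> Vec<Flight> {
    [Some(1), Some(-8), Some(-15), Some(24), None, None, None, None]
        .into_iter()
        .map(|arr_delay| Flight { carrier: "DL".to_string(), arr_delay })
        .collect()
}

fn main() {
    let results = analyze(&process(prepare(&sample_rows())));
    for line in output(&results) {
        println!("{line}");
    }
}
```

Splitting the stages this way keeps each step small enough to adjust independently, which is the same idea behind editing individual parts of the generated code.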
You can modify the generated code to customize your analysis.
Refer to the examples folder to see how to use Polars AI on your own data. Polars AI generates Rust code that performs exploratory data analysis (EDA) on the data.
We welcome contributions to Polars AI! If you'd like to contribute to this project, please follow these steps:
- Fork the repository on GitHub: click the "Fork" button at the top right of the GitHub repository page.
- Create a new branch for your feature or bug fix:
$ git checkout -b feature-or-bugfix-branch
- Edit the files in your local repository and commit your changes:
$ git add .
$ git commit -m "Your commit message here"
- Push your branch to your forked repository on GitHub:
$ git push origin feature-or-bugfix-branch
- Create a pull request with a clear description of your changes. Visit your forked repository on GitHub, and you'll see an option to create a pull request for the branch you just pushed.
This project is licensed under the MIT License - see the LICENSE file for details.