A data scientist agent that pairs a local LLM with a catalog of reusable pandas tooling. Given tabular data and a written business context, the agent iteratively decides how to process the dataset and drafts hypotheses or follow-up questions.
The tool is designed for data scientists running preliminary analyses, as well as for non-technical users who depend on them but want quick insights without having to pull a data scientist away from other work.
There are three main scenarios:
- EDA: Open-ended exploration of raw data (for example, an unnormalized file dumped from a database) when you want quick insights, plus a sense of what's missing to build a clearer picture.
- Estimation: Beyond the data, you have identified a target to predict; the agent runs classification/regression.
- Evaluating predictions: You already have model predictions and want to spot where the real impact is (and where further attention might be needed).
The mode is inferred from the flags provided; it determines the tools available to the agent, its goal, and its prompts.
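A minimal sketch of how flag-based mode inference could look (the `Mode` enum, the `infer_mode` helper, and the flag names are illustrative assumptions, not the actual implementation):

```python
from enum import Enum
from typing import Optional

class Mode(Enum):
    EDA = "eda"                # open-ended exploration
    ESTIMATION = "estimation"  # classification/regression on a target
    EVALUATION = "evaluation"  # inspect existing predictions

def infer_mode(target: Optional[str] = None,
               predictions: Optional[str] = None) -> Mode:
    """Infer the run mode from which flags the user supplied (hypothetical)."""
    if predictions is not None:
        return Mode.EVALUATION  # a predictions flag implies evaluation
    if target is not None:
        return Mode.ESTIMATION  # a target flag implies estimation
    return Mode.EDA             # otherwise, plain exploration
```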
- cli.py — Typer-powered command-line entry point that streams agent events with Rich panels.
- agents.py — streaming agent loop and LLM client implementations.
- tools.py — registry of mapper/reducer processing tools exposed for LLM selection.
- prompts.py — prompt builders.
- server.py — FastAPI wrapper that runs the agent asynchronously and streams JSON events over HTTP.
- streamlit_app.py — Streamlit app to render live events.
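For intuition, a mapper/reducer tool registry like the one in tools.py might be sketched as follows (the decorator, registry dict, and the two example tools are assumptions for illustration, not the project's actual code):

```python
from typing import Callable, Dict
import pandas as pd

# Registry the LLM picks from: name -> DataFrame-processing function.
TOOLS: Dict[str, Callable[[pd.DataFrame], pd.DataFrame]] = {}

def register(name: str):
    """Decorator that adds a tool to the registry under a given name."""
    def wrap(fn: Callable[[pd.DataFrame], pd.DataFrame]):
        TOOLS[name] = fn
        return fn
    return wrap

@register("drop_constant_columns")  # mapper: DataFrame -> DataFrame
def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    return df.loc[:, df.nunique(dropna=False) > 1]

@register("describe_numeric")  # reducer: DataFrame -> summary frame
def describe_numeric(df: pd.DataFrame) -> pd.DataFrame:
    return df.describe()
```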
- Get the environment and the right Python version:
uv venv --python 3.10 .venv
- Install dependencies:
uv pip install -r requirements.txt
Locally with Ollama
Download the Ollama server from https://ollama.com/download and pull your favourite model.
uv run python cli.py --data=tests/data/winequality.csv --model-name=ollama/granite4:tiny-h --api-base=http://localhost:11434
Locally with llama.cpp
brew install llama.cpp
llama-cli -hf unsloth/granite-4.0-h-tiny-GGUF
llama-server -m ~/Library/Caches/llama.cpp/unsloth_LFM2-8B-A1B-GGUF_LFM2-8B-A1B-Q4_K_M.gguf --port 8080 --threads 6 --ctx-size 65536 --flash-attn 1 --gpu-layers -1 --cont-batching
uv run python cli.py --data=tests/data/winequality.csv --model-name=llamacpp/Lfm2-8B-A1B
Recommended models:
- unsloth/granite-4.0-h-tiny-GGUF
- unsloth/LFM2-8B-A1B-GGUF (an 8B MoE model, but still >50 tokens/second on an M4 with 24 GB)
- unsloth/Phi-4-mini-reasoning-GGUF
- bartowski/LiquidAI_LFM2-1.2B-Tool-GGUF
- ai21labs/AI21-Jamba-Reasoning-3B-GGUF
In the cloud with OpenAI:
export OPENAI_API_KEY="sk-..."
uv run python cli.py --data=tests/data/winequality.csv --context=tests/data/winequality.md --model-name=gpt-4o-mini
Tests (run from the project root directory):
scripts/presubmit.sh
Programmatically
state, llm = agents.get_state_llm(agents.RunSettings(data_file="tests/data/winequality.csv"))
final_hypotheses = agents.run_sync(state, llm)
- Launch the FastAPI server so the agent can execute in the background:
uv run uvicorn server:app --reload --host 0.0.0.0 --port 8000
- Start the Streamlit dashboard to submit runs and observe live updates:
uv run streamlit run streamlit_app.py
The interface exposes inputs for the dataset, business context, and model name, and automatically reconnects to active runs to replay event streams.
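The FastAPI server streams JSON events over HTTP. Assuming a newline-delimited JSON framing (the event shape below is an illustrative assumption, not the server's documented format), a client could parse the stream like this:

```python
import json
from typing import Iterable, Iterator

def parse_event_stream(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one event dict per non-empty line of a newline-delimited JSON stream."""
    for line in lines:
        line = line.strip()
        if line:  # skip keep-alive blank lines
            yield json.loads(line)
```

In practice you would feed it the line iterator of whatever HTTP client you use to connect to the server.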
Q: Why isn’t it implemented as a Pydantic AI agent? A: Pydantic AI assumes a conversational setup with back-and-forth exchanges, which doesn’t fit this case. Here, the workflow is purely tool-driven and follows a strict sequence rather than leaving choices to the model. Most modern agent frameworks let the model decide when and how to use tools; in contrast, we enforce a defined cycle: apply a transformation function, then summarize, then generate hypotheses.
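The enforced cycle can be sketched as follows; the function names, the toy mean summarizer, and the hypothesis format are placeholders, not the project's actual API. The only LLM-driven step is choosing *which* transformation to apply; the order of steps is fixed by the loop:

```python
from typing import Callable, List

def run_cycle(
    data: List[float],
    pick_transform: Callable[[List[float]], Callable[[List[float]], List[float]]],
    steps: int = 3,
) -> List[str]:
    """Fixed loop: transform -> summarize -> hypothesize, in that order."""
    hypotheses: List[str] = []
    for _ in range(steps):
        transform = pick_transform(data)             # 1. LLM picks a tool
        data = transform(data)                       # ...and we apply it
        summary = sum(data) / len(data)              # 2. summarization (toy: mean)
        hypotheses.append(f"mean is {summary:.2f}")  # 3. draft a hypothesis
    return hypotheses
```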