Improve local Ollama fallback for low-resource machines by naraypv · Pull Request #1549 · Graphify-Labs/graphify

naraypv · 2026-06-29T21:26:32Z

Summary

This PR improves Graphify's local semantic extraction path for machines that cannot comfortably run large local models.

Implemented changes:

default local Ollama model policy targets small laptop-safe models: qwen2.5-coder:3b first, then gemma3:4b
routes Ollama extraction through the native /api/chat endpoint so context options such as num_ctx actually apply
adds robust fallback behavior so a failed local model can retry the next local model before falling back to MiniMax when configured
keeps MiniMax as an optional last fallback, not the default workhorse, with controls to cap remote spillover
adds load/daytime controls and defer mode so background Graphify work is less likely to make a daily-driver laptop lag
hardens malformed extraction cleanup so bad LLM fragments do not crash graph construction
updates generated skills/docs and tests for the new fallback policy

Why this may help the community

A lot of users do not have enough RAM/VRAM to run 20B or 30B coding models while also using their machine for normal work. This keeps the local path useful on lower-resource laptops by trying small local models first, failing over cleanly, and only using cloud fallback when the user has configured it. It should make large-project Graphify runs less brittle without forcing everyone into large local model downloads.

Verification

Ran locally before opening this PR:

uv run pytest -q -> 2537 passed, 28 skipped
python3 -m compileall -q graphify
python3 -m tools.skillgen --check
uv lock --check
live Ollama fallback smoke: missing primary model retried with qwen2.5-coder:3b and produced graph.json
graphify update . --no-cluster
machine credential guard passed for staged changes, outgoing commits, and full HEAD

No repo-specific CONTRIBUTING file or pull request template was present, so this targets the upstream default branch v8 and follows the visible project/test/security expectations.

# Conflicts: # README.md # graphify/build.py # graphify/detect.py # graphify/llm.py # graphify/prs.py # graphify/skill-amp.md # graphify/skill-claw.md # graphify/skill-codex.md # graphify/skill-copilot.md # graphify/skill-droid.md # graphify/skill-kilo.md # graphify/skill-kiro.md # graphify/skill-opencode.md # graphify/skill-pi.md # graphify/skill-trae.md # graphify/skill-vscode.md # graphify/skill-windows.md # graphify/skill.md # tests/test_detect.py # tools/skillgen/expected/graphify__skill-amp.md # tools/skillgen/expected/graphify__skill-claw.md # tools/skillgen/expected/graphify__skill-codex.md # tools/skillgen/expected/graphify__skill-copilot.md # tools/skillgen/expected/graphify__skill-droid.md # tools/skillgen/expected/graphify__skill-kilo.md # tools/skillgen/expected/graphify__skill-kiro.md # tools/skillgen/expected/graphify__skill-opencode.md # tools/skillgen/expected/graphify__skill-pi.md # tools/skillgen/expected/graphify__skill-trae.md # tools/skillgen/expected/graphify__skill-vscode.md # tools/skillgen/expected/graphify__skill-windows.md # tools/skillgen/expected/graphify__skill.md # tools/skillgen/fragments/core/core.md # uv.lock

naraypv added 4 commits June 29, 2026 17:01

fix(llm): harden Ollama extraction fallback

a324f47

fix: stabilize production merge

299e1bc

fix: harden ollama fallback and worker limits

b5397a8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Improve local Ollama fallback for low-resource machines#1549

Improve local Ollama fallback for low-resource machines#1549
naraypv wants to merge 4 commits into
Graphify-Labs:v8from
naraypv:production

naraypv commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

naraypv commented Jun 29, 2026

Summary

Why this may help the community

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant