Skip to content

Improve local Ollama fallback for low-resource machines#1549

Open
naraypv wants to merge 4 commits into
Graphify-Labs:v8from
naraypv:production
Open

Improve local Ollama fallback for low-resource machines#1549
naraypv wants to merge 4 commits into
Graphify-Labs:v8from
naraypv:production

Conversation

@naraypv

@naraypv naraypv commented Jun 29, 2026

Copy link
Copy Markdown

Summary

This PR improves Graphify's local semantic extraction path for machines that cannot comfortably run large local models.

Implemented changes:

  • default local Ollama model policy targets small laptop-safe models: qwen2.5-coder:3b first, then gemma3:4b
  • routes Ollama extraction through the native /api/chat endpoint so context options such as num_ctx actually apply
  • adds robust fallback behavior so a failed local model can retry the next local model before falling back to MiniMax when configured
  • keeps MiniMax as an optional last fallback, not the default workhorse, with controls to cap remote spillover
  • adds load/daytime controls and defer mode so background Graphify work is less likely to make a daily-driver laptop lag
  • hardens malformed extraction cleanup so bad LLM fragments do not crash graph construction
  • updates generated skills/docs and tests for the new fallback policy

Why this may help the community

A lot of users do not have enough RAM/VRAM to run 20B or 30B coding models while also using their machine for normal work. This keeps the local path useful on lower-resource laptops by trying small local models first, failing over cleanly, and only using cloud fallback when the user has configured it. It should make large-project Graphify runs less brittle without forcing everyone into large local model downloads.

Verification

Ran locally before opening this PR:

  • uv run pytest -q -> 2537 passed, 28 skipped
  • python3 -m compileall -q graphify
  • python3 -m tools.skillgen --check
  • uv lock --check
  • live Ollama fallback smoke: missing primary model retried with qwen2.5-coder:3b and produced graph.json
  • graphify update . --no-cluster
  • machine credential guard passed for staged changes, outgoing commits, and full HEAD

No repo-specific CONTRIBUTING file or pull request template was present, so this targets the upstream default branch v8 and follows the visible project/test/security expectations.

naraypv added 4 commits June 29, 2026 17:01
# Conflicts:
#	README.md
#	graphify/build.py
#	graphify/detect.py
#	graphify/llm.py
#	graphify/prs.py
#	graphify/skill-amp.md
#	graphify/skill-claw.md
#	graphify/skill-codex.md
#	graphify/skill-copilot.md
#	graphify/skill-droid.md
#	graphify/skill-kilo.md
#	graphify/skill-kiro.md
#	graphify/skill-opencode.md
#	graphify/skill-pi.md
#	graphify/skill-trae.md
#	graphify/skill-vscode.md
#	graphify/skill-windows.md
#	graphify/skill.md
#	tests/test_detect.py
#	tools/skillgen/expected/graphify__skill-amp.md
#	tools/skillgen/expected/graphify__skill-claw.md
#	tools/skillgen/expected/graphify__skill-codex.md
#	tools/skillgen/expected/graphify__skill-copilot.md
#	tools/skillgen/expected/graphify__skill-droid.md
#	tools/skillgen/expected/graphify__skill-kilo.md
#	tools/skillgen/expected/graphify__skill-kiro.md
#	tools/skillgen/expected/graphify__skill-opencode.md
#	tools/skillgen/expected/graphify__skill-pi.md
#	tools/skillgen/expected/graphify__skill-trae.md
#	tools/skillgen/expected/graphify__skill-vscode.md
#	tools/skillgen/expected/graphify__skill-windows.md
#	tools/skillgen/expected/graphify__skill.md
#	tools/skillgen/fragments/core/core.md
#	uv.lock
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant