feat: add centralized retry mechanism for LLM providers #13793

Open
tdakanalis wants to merge 1 commit into infiniflow:main from tdakanalis:feat/centralized-llm-retry-mechanism

Conversation

@tdakanalis tdakanalis commented Mar 25, 2026

What problem does this PR solve?

This PR solves the problem of duplicated and inconsistent retry logic across LLM model implementations.

Background:

Previously, each LLM model class (embedding, CV, rerank, TTS, OCR, speech2text) had its own ad-hoc retry implementation, leading to:

  • Duplicated code across 10+ files
  • Inconsistent retry behavior between different model providers
  • Difficult maintenance when changing retry settings
  • Risk of bugs when retry logic wasn't applied uniformly

Solution:

  • Extract retry logic into a centralized rag/llm/retry.py module with:
    • @retry decorator for sync methods (retries on transient errors, re-raises on failure)
    • @retry_or_fallback decorator for methods that return error tuples
    • is_retryable() - determines if an error is transient (rate limits, server errors)
    • classify_error() - categorizes exceptions into LLMErrorCode for detailed logging
    • async_handle_exception() - shared async error handler
  • Apply decorators consistently across all embedding, CV, rerank, TTS, OCR, and speech2text models
  • Add comprehensive unit tests (76 tests)
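The shape of such a centralized module can be sketched as follows. This is a minimal illustration, not the PR's actual code: the names `retry`, `is_retryable`, and the defaults (`max_retries=5`, `base_delay=2.0`) come from the PR description, but the signatures and the string-matching heuristic in `is_retryable` are assumptions.

```python
import functools
import logging
import random
import time


def is_retryable(exc: Exception) -> bool:
    # Assumed heuristic: treat rate limits and 5xx server errors as transient.
    msg = str(exc).lower()
    return any(token in msg for token in ("rate limit", "429", "500", "502", "503"))


def retry(max_retries: int = 5, base_delay: float = 2.0):
    """Retry a sync method on transient errors; re-raise once attempts are exhausted."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    if not is_retryable(exc) or attempt == max_retries - 1:
                        raise  # permanent error, or out of attempts
                    # Jittered backoff, per the formula quoted in the PR:
                    # base_delay * uniform(10, 150)
                    delay = base_delay * random.uniform(10, 150)
                    logging.warning("retry %d/%d in %.1fs: %s",
                                    attempt + 1, max_retries, delay, exc)
                    time.sleep(delay)
        return wrapper
    return decorator
```

Because the decorator takes its settings as arguments, every model class shares one backoff policy and a single place to change it.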

Behavior preserved:

  • Same default retry settings (max_retries=5, base_delay=2.0)
  • Same retryable error signals (rate limits, 5xx server errors)
  • Same backoff jitter formula (base_delay * uniform(10, 150))
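The quoted jitter formula can be checked numerically: with the default `base_delay=2.0`, each wait lands between 20 and 300 seconds. A quick sketch, assuming the formula is applied once per attempt:

```python
import random


def backoff_delay(base_delay: float = 2.0) -> float:
    # Jitter formula preserved by the PR: base_delay * uniform(10, 150)
    return base_delay * random.uniform(10, 150)
```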

Type of change

  • New Feature (non-breaking change which adds functionality)
  • Refactoring

Extract duplicated retry logic from chat_model.py into a new rag/llm/retry.py
module with decorators (@retry, @retry_or_fallback) for consistent error handling
across all LLM providers (embedding, CV, rerank, TTS, OCR, speech2text).

Key changes:
- New rag/llm/retry.py module with:
  - @retry: decorator for sync methods that retries on transient errors
  - @retry_or_fallback: decorator that returns fallback value on failure
  - is_retryable(): check if exception is transient
  - classify_error(): categorize exceptions into LLMErrorCode
  - async_handle_exception(): shared async error handler
- Updated all LLM model classes to use centralized retry decorators
- Added exception handling to caller sites (parser.py, conversation_app.py)
- Fixed LLMErrorCode.ERROR_MAX_ROUNDS value for consistency
- Added comprehensive unit tests in test/unit_test/rag/llm/test_retry.py

Behavior preserved:
- Same default retry settings (max_retries=5, base_delay=2.0)
- Same retryable error signals (rate limits, 5xx server errors)
- Same backoff jitter formula (base_delay * uniform(10, 150))
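For methods that signal failure with a value rather than an exception, the second decorator named in the commit message can be sketched like this. The name `retry_or_fallback` comes from the PR; its signature, the error-tuple fallback, and the retry loop here are hypothetical:

```python
import functools
import logging


def retry_or_fallback(fallback, max_retries: int = 5):
    """Retry the wrapped method; on total failure, return `fallback` instead of raising."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
            # All attempts failed: log and hand back the caller-supplied fallback.
            logging.error("all %d attempts failed: %s", max_retries, last_exc)
            return fallback
        return wrapper
    return decorator
```

This matches the description's split: `@retry` re-raises for callers that handle exceptions, while `@retry_or_fallback` suits methods that already return error tuples to their callers.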
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. ☯️ refactor Pull request that refactor/refine code 🌈 python Pull requests that update Python code 💞 feature Feature request, pull request that fulfill a new feature. 🧪 test Pull requests that update test cases. labels Mar 25, 2026
@yingfeng yingfeng requested a review from yongtenglei March 25, 2026 14:46
@yingfeng yingfeng added the ci Continue Integration label Mar 25, 2026
@yingfeng yingfeng marked this pull request as draft March 25, 2026 14:46
@yingfeng yingfeng marked this pull request as ready for review March 25, 2026 14:46
codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.72%. Comparing base (24fcd6b) to head (99c1cf6).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #13793   +/-   ##
=======================================
  Coverage   96.72%   96.72%           
=======================================
  Files          10       10           
  Lines         702      702           
  Branches      112      112           
=======================================
  Hits          679      679           
  Misses          5        5           
  Partials       18       18           

