
UN-3034 [FIX] Add retry backoff configuration for LLMWhisperer client#1836

Merged
chandrasekharan-zipstack merged 18 commits into main from fix/llmwhisperer-retry on Mar 12, 2026

Conversation


gaya3-zipstack commented Mar 6, 2026

What

  • Add configurable retry backoff parameters for the LLMWhisperer v2 client to handle transient HTTP errors (429, 5xx)
  • Increment LLMWhisperer client version in sdk1

Why

  • LLMWhisperer API calls can fail due to transient HTTP errors (rate limiting 429, server errors 5xx)
  • Without retry logic, these transient failures cause document extraction to fail unnecessarily
  • Configurable backoff allows tuning retry behavior per deployment environment

How

  • Added three new environment variables: ADAPTER_LLMW_MAX_RETRIES, ADAPTER_LLMW_RETRY_MIN_WAIT, ADAPTER_LLMW_RETRY_MAX_WAIT
  • Added corresponding constants in WhispererEnv and WhispererDefaults
  • Pass retry parameters (max_retries, retry_min_wait, retry_max_wait) to LLMWhispererClientV2 constructor
  • Updated sample.env for prompt-service only (backend sample.env is not changed as the retry env vars are only needed by the prompt-service)
  • Incremented LLMWhisperer client version in sdk1

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins, do not merge the PR without this section filled.)

  • No. The retry parameters have sensible defaults (3 retries, 1s min wait, 60s max wait) that match previous behavior. Existing deployments without these env vars will use the defaults.

Database Migrations

  • None

Env Config

  • ADAPTER_LLMW_MAX_RETRIES (default: 3) - Max retry attempts for transient HTTP errors. Set 0 to disable.
  • ADAPTER_LLMW_RETRY_MIN_WAIT (default: 1.0) - Min backoff wait in seconds between retries
  • ADAPTER_LLMW_RETRY_MAX_WAIT (default: 60.0) - Max backoff wait in seconds between retries
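Under the hood these amount to plain environment lookups with fallbacks; a minimal sketch of the defaulting behavior (illustrative only — the actual constants are defined on WhispererEnv/WhispererDefaults in sdk1):

```python
import os

# Illustrative sketch: read the retry knobs with their documented defaults.
# Variable names mirror the env vars above; the real code lives in sdk1 constants.
MAX_RETRIES = int(os.getenv("ADAPTER_LLMW_MAX_RETRIES", "3"))
RETRY_MIN_WAIT = float(os.getenv("ADAPTER_LLMW_RETRY_MIN_WAIT", "1.0"))
RETRY_MAX_WAIT = float(os.getenv("ADAPTER_LLMW_RETRY_MAX_WAIT", "60.0"))

print(MAX_RETRIES, RETRY_MIN_WAIT, RETRY_MAX_WAIT)
```

With none of the variables set, this yields the defaults (3, 1.0, 60.0), which is what makes the change backward-compatible for existing deployments.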

Relevant Docs

Related Issues or PRs

  • UN-3034

Dependencies Versions

  • Updated llmwhisperer-client version in sdk1

Notes on Testing

  • Verified retry parameters are correctly passed to the LLMWhisperer client constructor
  • Tested with default env values and custom overrides for extraction

Screenshots


Checklist

I have read and understood the Contribution Guidelines.


coderabbitai bot commented Mar 9, 2026

Walkthrough

This PR adds configurable retry and backoff behavior for the LLMWhisperer v2 client by introducing three environment variables, updating the client dependency version, and propagating these configuration parameters through the adapter initialization chain.

Changes

Cohort / File(s) — Summary

  • Environment Configuration — prompt-service/sample.env: Added three new environment variables for LLMWhisperer retry configuration (ADAPTER_LLMW_MAX_RETRIES, ADAPTER_LLMW_RETRY_MIN_WAIT, ADAPTER_LLMW_RETRY_MAX_WAIT) with defaults and inline documentation.
  • Dependency Updates — unstract/sdk1/pyproject.toml: Updated the llmwhisperer-client dependency from >=2.2.1 to >=2.6.2 to support the new retry configuration parameters.
  • Constants and Defaults — unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/constants.py: Added three public constants to the WhispererEnv class and three corresponding attributes to the WhispererDefaults class, exposing retry configuration from environment variables with default values (max_retries: 3, retry_min_wait: 1.0, retry_max_wait: 60.0).
  • Client Integration — unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/helper.py: Updated LLMWhispererClientV2 initialization in make_request to pass the three new retry configuration parameters from defaults.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 3 passed
  • Title check — ✅ Passed: The title clearly and concisely summarizes the main change (adding retry backoff configuration for the LLMWhisperer client), which aligns with all modified files (environment config, constants, helper, and version bump).
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Description check — ✅ Passed: The PR description comprehensively covers all required template sections, with clear details on what changed, why it was needed, how it was implemented, and backward-compatibility assurances.


gaya3-zipstack force-pushed the fix/llmwhisperer-retry branch from 2097c44 to 513b919 on March 9, 2026 08:48

coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/helper.py (1)

110-116: ⚠️ Potential issue | 🟠 Major

Use the exception's status code instead of hard-coding 500.

The LLMWhispererClientException carries a .status_code attribute that reflects the upstream HTTP status. With the new retry logic enabled, this exception path becomes more important. Instead of always mapping it to status_code=500 at line 159, use e.status_code when available so callers can distinguish 429 (throttling) from 5xx (server errors).

```python
except LLMWhispererClientException as e:
    logger.error(f"LLM Whisperer error: {e}")
    raise ExtractorError(
        message=f"LLM Whisperer error: {e}",
        actual_err=e,
        status_code=e.status_code or 500,
    ) from e
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/helper.py`
around lines 110 - 116, The exception handler for LLMWhispererClientException
currently maps all errors to status_code=500; update the handler in helper.py
(the except LLMWhispererClientException block) to pass the exception's actual
status via e.status_code (falling back to 500 if None) when raising
ExtractorError, and keep the existing logger.error and actual_err=e fields
intact so callers can distinguish 429 vs 5xx errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/constants.py`:
- Around line 114-116: Replace the raw import-time int/float conversions for
MAX_RETRIES, RETRY_MIN_WAIT, and RETRY_MAX_WAIT by using validation helpers
(e.g., get_int_env and get_float_env) that read os.getenv(WhispererEnv.*), treat
None or "" as the default, catch/handle non-numeric values, enforce min bounds
(e.g., MAX_RETRIES >= 0, RETRY_MIN_WAIT >= 0.0), and then validate the
relationship RETRY_MAX_WAIT >= RETRY_MIN_WAIT; update the constants assignment
to call these helpers so module import cannot crash and invalid backoff configs
are rejected with clear errors.
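A sketch of what such validation helpers could look like (get_int_env and get_float_env are the reviewer's proposed names, not existing sdk1 helpers; the env var names are the ones this PR introduces):

```python
import os

def get_int_env(name: str, default: int, minimum: int = 0) -> int:
    """Read an int env var; missing/blank falls back to default, invalid or out-of-bounds raises."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError as e:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from e
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value

def get_float_env(name: str, default: float, minimum: float = 0.0) -> float:
    """Read a float env var with the same fallback and bounds behavior."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = float(raw)
    except ValueError as e:
        raise ValueError(f"{name} must be a number, got {raw!r}") from e
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value

# Validated equivalents of the raw import-time conversions,
# plus the min/max relationship check the review asks for.
MAX_RETRIES = get_int_env("ADAPTER_LLMW_MAX_RETRIES", 3)
RETRY_MIN_WAIT = get_float_env("ADAPTER_LLMW_RETRY_MIN_WAIT", 1.0)
RETRY_MAX_WAIT = get_float_env("ADAPTER_LLMW_RETRY_MAX_WAIT", 60.0)
if RETRY_MAX_WAIT < RETRY_MIN_WAIT:
    raise ValueError("ADAPTER_LLMW_RETRY_MAX_WAIT must be >= ADAPTER_LLMW_RETRY_MIN_WAIT")
```

This keeps module import from crashing on an empty string (int("") raises ValueError) while still rejecting genuinely invalid backoff configs with a clear error.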


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 43eccf10-b938-4493-9528-74c02bfca08a

📥 Commits

Reviewing files that changed from the base of the PR and between d11510a and 513b919.

⛔ Files ignored due to path filters (7)
  • platform-service/uv.lock is excluded by !**/*.lock
  • prompt-service/uv.lock is excluded by !**/*.lock
  • unstract/filesystem/uv.lock is excluded by !**/*.lock
  • unstract/sdk1/uv.lock is excluded by !**/*.lock
  • unstract/tool-registry/uv.lock is excluded by !**/*.lock
  • unstract/workflow-execution/uv.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • backend/sample.env
  • prompt-service/sample.env
  • unstract/sdk1/pyproject.toml
  • unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/constants.py
  • unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/helper.py

@gaya3-zipstack
Contributor Author

Looks like some lock files, such as those for workflow-execution and workers, had not been updated in a long time. That could be the reason for such huge file diffs.


pk-zipstack left a comment


LGTM

@github-actions

Test Results

Summary
  • Runner Tests: 11 passed, 0 failed (11 total)

Runner Tests - Full Report
All 11 tests in runner/src/unstract/runner/clients/test_docker.py passed: test_logs, test_cleanup, test_cleanup_skip, test_client_init, test_get_image_exists, test_get_image, test_get_container_run_config, test_get_container_run_config_without_mount, test_run_container, test_get_image_for_sidecar, test_sidecar_container.



greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR adds configurable retry backoff parameters (max_retries, retry_min_wait, retry_max_wait) to the LLMWhispererClientV2 constructor in sdk1 to gracefully handle transient HTTP errors (429, 5xx), and bumps the llmwhisperer-client minimum version from 2.2.1 to 2.6.2 which exposes these new constructor parameters.

  • New env vars ADAPTER_LLMW_MAX_RETRIES, ADAPTER_LLMW_RETRY_MIN_WAIT, and ADAPTER_LLMW_RETRY_MAX_WAIT are read at module import time via WhispererDefaults class-level attributes and forwarded directly to the client constructor — consistent with how WAIT_TIMEOUT and LOGGING_LEVEL are already handled.
  • prompt-service/sample.env is correctly updated; however, backend/sample.env (which also contains an ADAPTER_LLMW_* block) was not updated despite the PR description claiming both files were changed.
  • The WhispererEnv docstring's Attributes section was not updated to document the three new constants.

Confidence Score: 4/5

  • Safe to merge — the logic change is minimal and backward-compatible; the only gap is a missing update to backend/sample.env.
  • The core code change (three new constructor kwargs forwarded from env-driven defaults) is correct, well-isolated, and backward-compatible with sensible defaults. The only concrete gap is backend/sample.env not being updated, which is a documentation/discoverability issue rather than a runtime bug.
  • backend/sample.env — needs the three new retry env var entries added to match prompt-service/sample.env.

Important Files Changed

  • unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/constants.py — Adds MAX_RETRIES, RETRY_MIN_WAIT, and RETRY_MAX_WAIT constants to WhispererEnv and WhispererDefaults; the WhispererEnv docstring is not updated to reflect the new attributes.
  • unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/helper.py — Passes max_retries, retry_min_wait, and retry_max_wait to the LLMWhispererClientV2 constructor; the change is straightforward and correct.
  • unstract/sdk1/pyproject.toml — Bumps the llmwhisperer-client minimum version from 2.2.1 to 2.6.2 and adds a section comment; a straightforward version bump.
  • prompt-service/sample.env — Adds the three new retry env var entries with comments; correct and complete.
  • backend/uv.lock — Lock file regenerated for the bumped llmwhisperer-client dependency; the bulk of the diff is removal of upload-time metadata from package entries (revision 3→1), a routine lock regeneration artifact.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Env as Environment Variables
    participant Defaults as WhispererDefaults
    participant Helper as LLMWhispererHelper
    participant Client as LLMWhispererClientV2
    participant API as LLMWhisperer API

    Env->>Defaults: ADAPTER_LLMW_MAX_RETRIES (default: 3)
    Env->>Defaults: ADAPTER_LLMW_RETRY_MIN_WAIT (default: 1.0s)
    Env->>Defaults: ADAPTER_LLMW_RETRY_MAX_WAIT (default: 60.0s)

    Helper->>Client: LLMWhispererClientV2(max_retries, retry_min_wait, retry_max_wait)
    activate Client

    Helper->>Client: client.whisper(...)
    Client->>API: POST /whisper
    API-->>Client: HTTP 429 / 5xx (transient error)
    Note over Client: exponential backoff retry<br/>(up to max_retries times,<br/>wait between min_wait and max_wait)
    Client->>API: POST /whisper (retry)
    API-->>Client: HTTP 200 OK
    Client-->>Helper: response
    deactivate Client
```
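The clamped exponential backoff noted in the diagram can be sketched as follows (illustrative only — the actual wait strategy is internal to llmwhisperer-client and may differ, e.g. by adding jitter):

```python
def backoff_wait(attempt: int, min_wait: float = 1.0, max_wait: float = 60.0) -> float:
    """Wait before retry `attempt` (0-based): doubles each time, clamped to [min_wait, max_wait]."""
    return max(min_wait, min(max_wait, min_wait * (2 ** attempt)))

# With the PR defaults (min 1.0s, max 60.0s):
waits = [backoff_wait(a) for a in range(7)]
print(waits)  # doubles 1, 2, 4, 8, 16, 32, then caps at 60
```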

Comments Outside Diff (2)

  1. backend/sample.env, line 152 (link)

    Missing retry env vars in backend sample.env

    The PR description states "Updated sample.env files for both backend and prompt-service," but backend/sample.env was not changed in this PR. It contains the same ADAPTER_LLMW_* block as prompt-service/sample.env but the three new retry variables are absent:

    ADAPTER_LLMW_POLL_INTERVAL=30
    ADAPTER_LLMW_MAX_POLLS=1000
    ADAPTER_LLMW_STATUS_RETRIES=5
    # ADAPTER_LLMW_MAX_RETRIES, ADAPTER_LLMW_RETRY_MIN_WAIT, ADAPTER_LLMW_RETRY_MAX_WAIT are missing
    

    This means developers deploying or configuring the backend service using backend/sample.env as a reference won't discover the new tunable retry parameters. Please add the same three entries to backend/sample.env that were added to prompt-service/sample.env.
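The entries to mirror from prompt-service/sample.env would look like this (values are the PR's documented defaults):

```
ADAPTER_LLMW_MAX_RETRIES=3
ADAPTER_LLMW_RETRY_MIN_WAIT=1.0
ADAPTER_LLMW_RETRY_MAX_WAIT=60.0
```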

  2. unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/constants.py, line 37-44 (link)

    WhispererEnv docstring not updated for new env vars

    The class docstring lists WAIT_TIMEOUT and LOG_LEVEL under Attributes, but the three newly added env variables (MAX_RETRIES, RETRY_MIN_WAIT, RETRY_MAX_WAIT) are not mentioned. This makes the docstring incomplete and inconsistent with the actual class state.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/constants.py
Line: 37-44

Comment:
**`WhispererEnv` docstring not updated for new env vars**

The class docstring lists `WAIT_TIMEOUT` and `LOG_LEVEL` under `Attributes`, but the three newly added env variables (`MAX_RETRIES`, `RETRY_MIN_WAIT`, `RETRY_MAX_WAIT`) are not mentioned. This makes the docstring incomplete and inconsistent with the actual class state.

```suggestion
class WhispererEnv:
    """Env variables for LLMWhisperer.

    Can be used to alter behaviour at runtime.

    Attributes:
        WAIT_TIMEOUT: Timeout for the extraction in seconds. Defaults to 300s
        MAX_RETRIES: Max retry attempts for transient HTTP errors. Defaults to 3
        RETRY_MIN_WAIT: Min backoff wait in seconds between retries. Defaults to 1.0
        RETRY_MAX_WAIT: Max backoff wait in seconds between retries. Defaults to 60.0
        LOG_LEVEL: Logging level for the client library. Defaults to INFO
    """
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 6df4273

chandrasekharan-zipstack merged commit 312dba3 into main on Mar 12, 2026
10 checks passed
chandrasekharan-zipstack deleted the fix/llmwhisperer-retry branch on March 12, 2026 08:02

Labels

None yet

Projects

None yet


3 participants