[FIX] Redis Sentinel support for tool containers and sidecars#1858
[FIX] Redis Sentinel support for tool containers and sidecars#1858chandrasekharan-zipstack merged 7 commits intomainfrom
Conversation
MetricsMixin was creating a raw StrictRedis client using REDIS_HOST/PORT directly, which in Sentinel HA mode points to the Sentinel process (port 26379). Sentinel doesn't support the SELECT command, causing ResponseError on every /answer-prompt call. - Replace StrictRedis with create_redis_client() from unstract-core which handles Sentinel discovery via master_for() - Add unstract-core as sdk1 dependency (all services using sdk1 already depend on core, so zero new transitive deps) - Add error handling around set_start_time() and collect_metrics() so Redis failures degrade gracefully instead of crashing requests - Add missing COPY for unstract/core in tool Dockerfiles (classifier, structure, text_extractor) since sdk1 now depends on core Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tool Dockerfiles (classifier, text_extractor, structure) use pip, not uv. Since sdk1 now depends on unstract-core (which isn't on PyPI), pip can't resolve it via [tool.uv.sources]. Adding explicit editable install of core before sdk1 ensures pip finds it locally. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Worker cache modules (cache_backends.py, cache_utils.py) duplicated
sentinel/standalone branching logic for Redis client creation. Both
now use create_redis_client() from unstract-core as the single entry
point, which already handles mode detection internally.
- Add SSL support (ssl, ssl_cert_reqs) to create_redis_client() via
env vars ({prefix}SSL, {prefix}SSL_CERT_REQS), disabled by default
- Remove if-sentinel-else-standalone branching in cache_backends.py
and cache_utils.py — replaced with single create_redis_client() call
- Remove unused os and redis imports from simplified modules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The runner was not forwarding REDIS_SENTINEL_MODE and REDIS_SENTINEL_MASTER_NAME to tool containers and sidecars, causing the sidecar to connect in standalone mode to the Sentinel port (26379) and fail on all data commands (HGETALL, SETEX). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (3)
Summary by CodeRabbit
WalkthroughAdded Redis Sentinel configuration support by introducing two new environment variables ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR fixes a Redis Sentinel crash in sidecar and tool containers by forwarding
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant WE as Workflow Execution Service
participant Runner as Runner Service
participant Tool as Tool Container
participant Sidecar as Sidecar Container
participant Redis as Redis (Sentinel / Standalone)
WE->>WE: get_tool_environment_variables()<br/>reads REDIS_SENTINEL_MODE from WE env<br/>falls back to "False" via `or` default
WE->>Runner: run_container(envs={REDIS_SENTINEL_MODE, REDIS_HOST, ...})
Runner->>Runner: _get_sidecar_container_config()<br/>reads REDIS_SENTINEL_MODE from runner's own os.getenv()<br/>defaults to "False" if not set
Runner->>Tool: launch with envs[REDIS_SENTINEL_MODE] from WE
Runner->>Sidecar: launch with runner's os.getenv(REDIS_SENTINEL_MODE)
Note over Tool,Sidecar: ⚠️ Two independent env sources —<br/>must be configured consistently across both services
Tool->>Redis: create_redis_client()<br/>_is_sentinel_mode() parses REDIS_SENTINEL_MODE
Sidecar->>Redis: create_redis_client()<br/>_is_sentinel_mode() parses REDIS_SENTINEL_MODE
alt Both services have REDIS_SENTINEL_MODE=True
Redis-->>Tool: Sentinel master connection ✅
Redis-->>Sidecar: Sentinel master connection ✅
else Runner env missing REDIS_SENTINEL_MODE=True
Redis-->>Tool: Sentinel master connection ✅
Redis-->>Sidecar: Standalone to Sentinel port → "unknown command" ❌
end
Prompt To Fix All With AIThis is a comment left during a code review.
Path: runner/src/unstract/runner/runner.py
Line: 235-238
Comment:
**Sidecar sentinel config is decoupled from tool container sentinel config**
The sidecar's `REDIS_SENTINEL_MODE` is sourced exclusively from the **runner service's own environment** via `os.getenv()`, while the tool container receives it from the `envs` dict populated by the workflow-execution service (via `tools_utils.py`). These are two separate services that must be configured identically for the fix to hold.
If an operator sets `REDIS_SENTINEL_MODE=True` in the workflow-execution service but not in the runner service (or if the runner pod restarts before its env is updated), the sidecar will silently fall back to standalone mode with the default `"False"` — connecting to port 26379 in standalone mode and reproducing the exact `HGETALL`/`SETEX` "unknown command" crash this PR aims to fix.
The same divergence risk exists for `REDIS_HOST` and `REDIS_PORT`, but those are already long-established; adding two more independently-sourced sentinel vars raises the surface area further.
A more defensive approach would be to forward the sentinel values from the already-populated `envs` dict into the sidecar config inside `run_container()`:
```python
# Inside run_container(), before calling _get_sidecar_container_config()
sidecar_config = self._get_sidecar_container_config(
...,
# Pass sentinel vars from the tool envs so both containers use the same source
sentinel_mode=envs.get("REDIS_SENTINEL_MODE", os.getenv(Env.REDIS_SENTINEL_MODE, "False")),
sentinel_master_name=envs.get("REDIS_SENTINEL_MASTER_NAME", os.getenv(Env.REDIS_SENTINEL_MASTER_NAME, "mymaster")),
)
```
This would make the sidecar's sentinel mode consistent with the tool container's by design, rather than relying on operational discipline across two independently deployed services.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 73b8216 |
unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py
Show resolved
Hide resolved
Patch bump for tools that depend on unstract-core's updated Redis client with Sentinel support: - classifier: 0.0.76 -> 0.0.77 - structure: 0.0.97 -> 0.0.98 - text_extractor: 0.0.72 -> 0.0.73 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|



What
create_redis_client()factory from unstract-coreREDIS_SENTINEL_MODEandREDIS_SENTINEL_MASTER_NAMEenv vars to tool containers and sidecar containersWhy
REDIS_HOSTandREDIS_PORT. The sidecar connects in standalone mode to port 26379 (Sentinel protocol port), which rejects data commands likeHGETALLandSETEXwith "unknown command". This causes the sidecar'sToolExecutionTrackerto fail writing execution status, making the worker conclude the tool container crashed — even though the tool completed successfully.How
runner.py,constants.py): ForwardREDIS_SENTINEL_MODEandREDIS_SENTINEL_MASTER_NAMEfrom runner env to sidecar container config, with sane defaults (False,mymaster)tools_utils.py,constants.py): Pass Sentinel env vars inget_tool_environment_variables()so tool containers (MetricsMixin) can also connect via Sentinelcreate_redis_client()factoryCan this PR break any existing features. If yes, please list possible items. If no, please explain why.
False/mymasterwhen not set, so non-Sentinel environments continue using standalone mode unchanged. Thecreate_redis_client()factory already handles both modes.Database Migrations
Env Config
REDIS_SENTINEL_MODE(default:False) — set toTrueto enable Sentinel mode for tool/sidecar containersREDIS_SENTINEL_MASTER_NAME(default:mymaster) — Sentinel master nameRelevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Checklist
I have read and understood the Contribution Guidelines.