[FIX] Fix Redis Sentinel HA crashes in backend and celery services#1854
[FIX] Fix Redis Sentinel HA crashes in backend and celery services#1854chandrasekharan-zipstack merged 4 commits intomainfrom
Conversation
Fix two bugs introduced in PR #1838 that cause CrashLoopBackOff in Sentinel mode: 1. log_events.py: Nest transport_options under connection_options for KombuManager - KombuManager doesn't accept transport_options directly - connection_options gets unpacked into kombu.Connection() which accepts transport_options 2. base.py: Add missing CONNECTION_POOL_CLASS for Django CACHES in Sentinel mode - SentinelClient.connect() requires CONNECTION_POOL_CLASS set to SentinelConnectionPool - Affects celery-beat and worker-metrics services Both fixes are isolated to if REDIS_SENTINEL_MODE: branch. Standalone Redis mode is unaffected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
Summary by CodeRabbit
WalkthroughAdds a Redis Sentinel connection pool option to Django caching configuration and moves Kombu transport options into a nested Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR fixes two crash-causing regressions introduced by #1838 (Redis Sentinel HA support), both strictly gated behind the Changes:
All three changes are minimal, targeted, and correct. No new environment variables or dependencies are introduced. Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Backend/Worker as Backend / Celery Worker
participant KombuManager as socketio.KombuManager
participant KombuConn as kombu.Connection
participant Sentinel as Redis Sentinel
participant RedisMaster as Redis Master
Note over Backend/Worker: Sentinel mode: REDIS_SENTINEL_MODE=true
Backend/Worker->>KombuManager: KombuManager(url="sentinel://...", connection_options={"transport_options": {"master_name": "mymaster"}})
KombuManager->>KombuConn: kombu.Connection(url, **connection_options)
KombuConn->>Sentinel: Discover master via master_name
Sentinel-->>KombuConn: Redis Master address
KombuConn->>RedisMaster: Connect
Note over Backend/Worker: Django cache (CACHES)
Backend/Worker->>Backend/Worker: SentinelClient.connect() validates CONNECTION_POOL_CLASS == SentinelConnectionPool ✓
Backend/Worker->>Sentinel: Discover master
Sentinel-->>Backend/Worker: Redis Master address
Backend/Worker->>RedisMaster: Cache read/write
Last reviewed commit: 4568ca1 |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/utils/log_events.py`:
- Around line 19-21: The workers/log_consumer Socket.IO/Kombu connection still
passes transport_options at the top level (using _transport_options and
_sentinel_master_name), so update the kwargs construction where the
Socket.IO/Kombu client is created to nest transport options under
connection_options when present; replace the direct top-level inclusion of
_transport_options with a conditional expansion that yields
{"connection_options": {"transport_options": _transport_options}} if
_transport_options is truthy, otherwise {} so Sentinel master options like
{"master_name": _sentinel_master_name} are correctly nested.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5c7d83ad-5764-4861-9f9b-06de539babad
📒 Files selected for processing (2)
backend/backend/settings/base.pybackend/utils/log_events.py
gaya3-zipstack
left a comment
There was a problem hiding this comment.
Seeing a better coding way suggested by coderabbit. Otherwise, no concern.
…pstack/unstract into fix/redis-sentinel-ha-crashes
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|
|
@chandrasekharan-zipstack Have you tested this from local? |
|
Not yet @johnyrahul , as discussed will get this to staging and attempt testing this locally in parallel |
… cache PR #1854 added CONNECTION_POOL_CLASS but missed CONNECTION_FACTORY, causing the default ConnectionFactory to create SentinelConnectionPool via from_url() without the required service_name and sentinel_manager arguments — crashing migrations at import time. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…1855) [FIX] Add missing SentinelConnectionFactory for Django Redis Sentinel cache PR #1854 added CONNECTION_POOL_CLASS but missed CONNECTION_FACTORY, causing the default ConnectionFactory to create SentinelConnectionPool via from_url() without the required service_name and sentinel_manager arguments — crashing migrations at import time. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>



What
transport_optionsunderconnection_optionsinlog_events.pyCONNECTION_POOL_CLASSin Django CACHES Sentinel config inbase.pyWhy
unstract-staging-ha-testnamespace for backend, celery-beat, and worker-metrics pods.How
log_events.py:transport_optionswas passed directly tosocketio.KombuManager(), which doesn't accept it. KombuManager passesconnection_optionstokombu.Connection(), which does accepttransport_options. Fix: nesttransport_optionsunderconnection_options.base.py:django_redis.client.SentinelClient.connect()validates that the connection pool is aSentinelConnectionPool. AddedCONNECTION_POOL_CLASS: redis.sentinel.SentinelConnectionPoolto the CACHES OPTIONS in the Sentinel branch.Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
if REDIS_SENTINEL_MODE:branch. In standalone Redis mode (default),SOCKET_IO_TRANSPORT_OPTIONS = {}(falsy) so the connection_options block is skipped, and the CACHES config uses the standalone branch entirely.Database Migrations
Env Config
REDIS_SENTINEL_MODEcontrols the affected code paths.Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.