
Fix/code quality improvements #51

Closed
dkargatzis wants to merge 4 commits into main from fix/code-quality-improvements

Conversation

@dkargatzis dkargatzis commented Feb 26, 2026

Summary by CodeRabbit

  • New Features

    • Added event filtering and validation to improve webhook processing robustness.
    • Introduced timeout handling for agent evaluations to prevent long-running processes.
    • Added callback URL requirement and timezone-aware timestamp tracking for deployments.
  • Bug Fixes

    • Improved handling of edge cases for pull requests, including draft and merged states.
    • Enhanced error handling and resilience across event processors.
  • Documentation

    • Added comprehensive unit tests for event filtering logic.

- Add centralized event_filter.py module for filtering irrelevant events
  - Filter archived repositories, closed/merged/draft PRs, deleted branches
  - Add FilterResult dataclass (frozen=True for immutability)
  - Add NULL_SHA constant for consistent null SHA handling
- Migrate push.py and pull_request.py from stdlib logging to structlog
- Add event filtering at dispatcher level before handler dispatch
- Add delivery_id parameter to WebhookEvent for better tracing
- Add comprehensive unit tests for event filtering logic
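The filtering described above can be sketched as a minimal module. The names `FilterResult`, `NULL_SHA`, and the frozen dataclass come from this PR's description; the specific payload keys checked below are assumptions about the actual implementation, not the project's code:

```python
from dataclasses import dataclass

# All-zero SHA GitHub sends for branch deletions (constant name from the PR description).
NULL_SHA = "0000000000000000000000000000000000000000"


@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class FilterResult:
    should_process: bool
    reason: str = ""


def should_process_event(event_type: str, payload: dict) -> FilterResult:
    """Return a FilterResult explaining whether a webhook event is relevant."""
    if payload.get("repository", {}).get("archived"):
        return FilterResult(False, "repository is archived")
    if event_type == "push" and payload.get("after") == NULL_SHA:
        return FilterResult(False, "branch deletion push")
    if event_type == "pull_request":
        pr = payload.get("pull_request", {})
        if pr.get("draft") or pr.get("state") == "closed":
            return FilterResult(False, "draft or closed pull request")
    return FilterResult(True)
```

Returning a reason string alongside the boolean lets the dispatcher log why an event was skipped, which matches the structured-logging direction of the rest of the PR.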

Closes #50

Signed-off-by: Dimitris Kargatzis <dkargatzis@gmail.com>
- Migrate deployment_protection_rule.py from stdlib logging to structlog
- Add AGENT_TIMEOUT_SECONDS constant for timeout configuration
- Add defensive getattr() calls for violation object attributes
- Migrate deployment_scheduler.py from stdlib logging to structlog
- Add callback_url to required_fields validation
- Fix falsy checks to is None checks for proper validation
- Fix timestamp handling to use timezone-aware datetime (UTC)
- Improve structured logging with consistent fields
- Fix UP038 isinstance and B011 assert False lint issues

Closes #49

Signed-off-by: Dimitris Kargatzis <dkargatzis@gmail.com>
…rotection

Signed-off-by: Dimitris Kargatzis <dkargatzis@gmail.com>
- Migrate all remaining standard logging calls to structlog across agents, event processors, handlers, and integrations
- Standardize all log messages to use snake_case event names for better log aggregation and observability
- Convert f-string interpolations into kwargs for structured log payloads
- Remove unused variables and fix enum types in core models
- Ensure 100% test coverage stability post-refactor
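The f-string-to-kwargs migration can be sketched with a stand-in helper that mimics the shape structlog produces; `log_event` here is a hypothetical function for illustration only, not the project's API:

```python
import json


def log_event(event: str, **fields) -> str:
    """Emit a structlog-style line: snake_case event name plus key-value fields."""
    return json.dumps({"event": event, **fields}, sort_keys=True)


# Before (stdlib logging): logger.info(f"Enqueued task {task_id} for {repo}")
# After (structlog style): each value becomes a queryable field rather than
# being baked into the message string, which is what makes log aggregation work.
line = log_event("task_enqueued", task_id="abc123", repo_full_name="org/repo")
```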

Signed-off-by: Dimitris Kargatzis <dkargatzis@gmail.com>
@dkargatzis dkargatzis self-assigned this Feb 26, 2026

coderabbitai bot commented Feb 26, 2026

📝 Walkthrough

This pull request migrates logging across the codebase from Python's standard logging module to structlog, introduces event filtering logic, adds delivery tracking via delivery_id to webhooks, expands enums to use StrEnum, updates several public API signatures to include installation_id, and refactors the deployment scheduler with timezone awareness.

Changes

Cohort / File(s) Summary
Agent Logging Migration
src/agents/acknowledgment_agent/agent.py, src/agents/acknowledgment_agent/test_agent.py, src/agents/base.py, src/agents/engine_agent/agent.py, src/agents/feasibility_agent/agent.py, src/agents/feasibility_agent/nodes.py, src/agents/supervisor_agent/agent.py, src/agents/supervisor_agent/nodes.py
Replaced standard logging with structlog across agent modules. Updated log messages from f-strings to structured key-value pairs, removed emoji prefixes, and added contextual fields (timeout, commenter, rule descriptions, error details).
Event Processor Logging & Filtering
src/event_processors/base.py, src/event_processors/check_run.py, src/event_processors/deployment.py, src/event_processors/deployment_review.py, src/event_processors/deployment_status.py, src/event_processors/violation_acknowledgment.py, src/event_processors/pull_request.py, src/event_processors/push.py, src/event_processors/rule_creation.py
Migrated to structlog with standardized logging keys. Added early-exit paths for invalid PR states (closed, merged, draft), deletion events, and null SHAs. Enhanced error handling and structured logging for processing steps and violations.
Deployment Protection & Timeout Handling
src/event_processors/deployment_protection_rule.py
Introduced AGENT_TIMEOUT_SECONDS constant, wrapped agent evaluation with asyncio timeout, added defensive violation extraction with safe attribute access, and replaced log statements with structured keys. On timeout, approves deployment with logged warning instead of failing.
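The timeout-then-approve behavior described here can be sketched as follows. Only the constant name `AGENT_TIMEOUT_SECONDS` comes from the PR; the 30-second value and the `evaluate_with_timeout` wrapper are assumptions for illustration:

```python
import asyncio

AGENT_TIMEOUT_SECONDS = 30  # constant name from the PR; the value is a guess


async def evaluate_with_timeout(evaluate, timeout: float = AGENT_TIMEOUT_SECONDS) -> str:
    """Run an agent evaluation; on timeout, approve instead of failing."""
    try:
        return await asyncio.wait_for(evaluate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Per the walkthrough: a timed-out evaluation approves the deployment
        # with a logged warning rather than blocking it.
        return "approved"
```

Note that failing open on timeout is itself a policy choice worth scrutinizing in review, since a slow agent silently stops gating deployments.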
Core Models & Event Filtering
src/core/models.py, src/core/utils/event_filter.py
Added optional delivery_id parameter to WebhookEvent.__init__. Introduced new event_filter.py module with FilterResult dataclass and should_process_event() function to centralize webhook filtering logic (archived repos, PR states, push validations).
Enum Standardization
src/agents/engine_agent/models.py, src/rules/models.py
Replaced Enum with StrEnum for ValidationStrategy and RuleSeverity, simplifying enum base classes and improving string conversion behavior.
GitHub API & Integration Logging
src/integrations/codeowners.py, src/integrations/contributors.py, src/integrations/github_api.py
Migrated to structlog throughout. Added three new public wrapper methods (get_pr_files, get_pr_reviews, get_pr_checks) that delegate to existing internal implementations for improved API consistency.
Rules & Validation Logging
src/rules/github_provider.py, src/rules/utils.py, src/rules/validators.py
Replaced logging with structlog across rules processing. Updated log messages to structured format with contextual fields (repository, rule_description, exceptions) for better observability.
Webhook Dispatcher & Auth
src/webhooks/auth.py, src/webhooks/dispatcher.py
Migrated to structlog. Added event filtering integration in dispatcher using should_process_event() to skip processing filtered events before handler invocation.
Webhook Handlers
src/webhooks/handlers/check_run.py, src/webhooks/handlers/deployment.py, src/webhooks/handlers/deployment_protection_rule.py, src/webhooks/handlers/deployment_review.py, src/webhooks/handlers/deployment_status.py, src/webhooks/handlers/issue_comment.py, src/webhooks/handlers/pull_request.py, src/webhooks/handlers/push.py
Switched to structlog logging. Updated handler return payloads to include task_id and message fields alongside status. Pull request handler now validates rules.yaml on "opened" actions.
Webhook Router
src/webhooks/router.py
Migrated to structlog. Changed unsupported event handling to raise HTTP 202 response instead of silent no-op, ensuring graceful webhook responses.
Task Queue & Scheduling
src/tasks/task_queue.py, src/tasks/scheduler/deployment_scheduler.py
Updated TaskQueue.enqueue() signature to require installation_id parameter. TaskStatus enum now extends StrEnum and includes new FAILED member. Deployment scheduler refactored with UTC timezone awareness, new callback_url requirement, expanded status reporting with ISO-formatted timestamps.
Application Startup
src/main.py
Replaced logging setup with structlog. Updated deployment scheduler startup to use awaitable approach and structured logging.
Test Coverage
tests/unit/core/test_agent.py, tests/unit/core/test_event_filter.py
Updated agent tests to structlog expectations. Added comprehensive 229-line test suite validating event filtering logic for PR states, push validations, archived repos, and filter constants.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hops through logs with structured grace,
Emoji signs now gone without a trace,
Timezones tamed in UTC shine,
Events filter, schemas align,
Watchflow's refactored—hop, hooray! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check: ❓ Inconclusive. The title 'Fix/code quality improvements' is vague and generic, using non-descriptive terms that don't convey the specific primary changes in the changeset. Resolution: replace with a specific, descriptive title that highlights the main change, such as 'Migrate logging to structlog and add event filtering' or 'Replace standard logging with structlog throughout codebase'.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 80.35%, which meets the required 80.00% threshold.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 18

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (13)
src/webhooks/handlers/deployment_status.py (1)

25-45: ⚠️ Potential issue | 🟠 Major

Boundary logs are missing required telemetry fields and one message remains unstructured.

Line 25, Line 33-35, and Line 45 should emit consistent boundary fields (operation, subject_ids, decision, latency_ms). Also, Line 35 is still an interpolated message, which breaks structured log shape.

Suggested patch
+from time import perf_counter
@@
     async def handle(self, event: WebhookEvent) -> dict[str, Any]:
         """Handle deployment_status events."""
+        started_at = perf_counter()
         payload = event.payload
         repo_full_name = event.repo_full_name
         installation_id = payload.get("installation", {}).get("id")
@@
         if not installation_id:
-            logger.error("no_installation_id_found_in_deploymentstatus", repo_full_name=repo_full_name)
+            logger.error(
+                "deployment_status_rejected_missing_installation_id",
+                operation="webhook.deployment_status.handle",
+                subject_ids={"repo_full_name": repo_full_name, "delivery_id": event.delivery_id},
+                decision="reject",
+                latency_ms=int((perf_counter() - started_at) * 1000),
+            )
             return {"status": "error", "message": "Missing installation ID"}
@@
-        logger.info("enqueuing_deployment_status_event_for", repo_full_name=repo_full_name)
-        logger.info("state", state=state)
-        logger.info(f"   Environment: {deployment.get('environment', 'unknown')}")
+        subject_ids = {
+            "repo_full_name": repo_full_name,
+            "installation_id": installation_id,
+            "delivery_id": event.delivery_id,
+        }
+        logger.info(
+            "deployment_status_enqueue_started",
+            operation="webhook.deployment_status.enqueue",
+            subject_ids=subject_ids,
+            decision="enqueue",
+            latency_ms=int((perf_counter() - started_at) * 1000),
+            state=state,
+            environment=deployment.get("environment", "unknown"),
+        )
@@
-        logger.info("deployment_status_event_enqueued_with_task", task_id=task_id)
+        logger.info(
+            "deployment_status_enqueue_completed",
+            operation="webhook.deployment_status.enqueue",
+            subject_ids=subject_ids,
+            decision="enqueued",
+            latency_ms=int((perf_counter() - started_at) * 1000),
+            task_id=task_id,
+        )

As per coding guidelines, "Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/deployment_status.py` around lines 25 - 45, The
boundary logs in deployment_status handler (the logger.error call and the
logger.info calls that log enqueuing, state, environment, and task_id) must be
converted to structured logs that include the required telemetry fields:
operation (use "deployment_status"), subject_ids (include repo_full_name and
installation_id), decision (use a meaningful value like "accepted" or
"missing_installation"), and latency_ms (set to None or compute if available).
Replace the interpolated f-string logger.info(f"   Environment:...") with a
structured call that passes environment=deployment.get("environment","unknown"),
and ensure the final enqueue log includes task_id alongside those telemetry
fields; for the missing installation error set decision="missing_installation"
and include subject_ids with repo_full_name only.
src/integrations/github_api.py (1)

38-55: ⚠️ Potential issue | 🟠 Major

Boundary logs are missing the required normalized fields.

Updated integration logs still omit operation, subject_ids, decision, and latency_ms, so they won’t satisfy the logging contract consistently for querying/alerts.

As per coding guidelines, "Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms".

Also applies to: 81-85, 163-185

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/integrations/github_api.py` around lines 38 - 55, The boundary logs in
github_api.py (the logger.debug call that returns from
self._token_cache[installation_id] and the logger.info call after creating a
token in the installation-token flow) must be converted to structured logs that
include the required normalized fields: operation (e.g.,
"get_installation_token"), subject_ids (e.g., [installation_id]), decision
(e.g., "cache_hit" / "generated"), and latency_ms; capture start time before the
cache check and compute latency_ms on each terminal log path; also apply the
same structured logging pattern to the other boundary log sites referenced (the
calls around the session POST/response handling and in the methods that cover
ranges ~81-85 and ~163-185) so every entry emitted at these boundaries uses
logger.<level>(..., operation="get_installation_token",
subject_ids=[installation_id], decision="cache_hit" or "generated" or "error",
latency_ms=<ms>) consistently.
src/event_processors/rule_creation.py (1)

56-57: ⚠️ Potential issue | 🟠 Major

Reply-post failures are treated as success in the main result path.

If comment posting fails (Line 122/Line 125), process() still returns success=True on Line 71. This masks user-visible delivery failures and prevents proper remediation/retry signaling.

🛠️ Suggested fix
-            await self._post_result_to_comment(task, feasibility_result)
+            posted = await self._post_result_to_comment(task, feasibility_result)
...
-            return ProcessingResult(success=True, violations=[], api_calls_made=1, processing_time_ms=processing_time)
+            return ProcessingResult(
+                success=posted,
+                violations=[],
+                api_calls_made=1,
+                processing_time_ms=processing_time,
+                error=None if posted else "Failed to post feasibility reply",
+            )
...
-    async def _post_result_to_comment(self, task: Task, feasibility_result):
+    async def _post_result_to_comment(self, task: Task, feasibility_result) -> bool:
...
-                return
+                return False
...
             if result:
                 logger.info("successfully_posted_feasibility_reply_to_issuepr", issue_number=issue_number)
+                return True
             else:
                 logger.error("failed_to_post_feasibility_reply_to", issue_number=issue_number)
+                return False
...
         except Exception as e:
             logger.error("error_posting_feasibility_reply", e=e)
+            return False

As per coding guidelines: "Fail closed for risky decisions; provide actionable remediation in error paths".

Also applies to: 71-71, 120-126

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/rule_creation.py` around lines 56 - 57, The process()
flow currently treats failures from _post_result_to_comment as a success; modify
process() and its callers so that any exception or error returned by
_post_result_to_comment (and the internal reply/post paths) flips the overall
result to failure instead of returning success=True. Concretely, wrap the call
to self._post_result_to_comment(...) in a try/except (or check its returned
status) inside process(), log the error with context, and return or propagate a
Result object with success=False and an informative error message; ensure
similar handling is added where post-to-comment is invoked (e.g., in the
feasibility/result paths) so posting errors are not masked and will trigger
retry/remediation.
src/webhooks/handlers/issue_comment.py (2)

122-133: ⚠️ Potential issue | 🟠 Major

Validation result is posted with the wrong payload type.

_validate_rules_yaml returns a dict-like result, but create_pull_request_comment(..., comment=...) should receive the message string.

Proposed fix
-                    await github_client.create_pull_request_comment(
+                    await github_client.create_pull_request_comment(
                         repo=repo,
                         pr_number=pr_number,
-                        comment=validation_result,
+                        comment=validation_result.get("message", str(validation_result)),
                         installation_id=installation_id,
                     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/issue_comment.py` around lines 122 - 133, The
validation_result returned by _validate_rules_yaml is a dict-like object but
github_client.create_pull_request_comment expects a string; change the code that
calls create_pull_request_comment to pass a string (e.g.,
validation_result.get("message") or serialize the dict with
json.dumps(validation_result, indent=2)) instead of the raw dict; ensure you
handle the case where "message" is missing by falling back to a safe string
(like a joined errors list or the serialized dict) before calling
github_client.create_pull_request_comment.

153-169: ⚠️ Potential issue | 🟠 Major

Do not log raw user comment text or acknowledgment reasons.

These fields are user-supplied and can contain secrets/PII. Log safe metadata (length/hash) instead of raw content.

Safer logging example
-        logger.info("extracting_acknowledgment_reason_from", comment_body=comment_body)
+        logger.info("extracting_acknowledgment_reason_from", comment_length=len(comment_body))
...
-                logger.info("pattern_matched_reason", reason=reason)
+                logger.info("pattern_matched_reason", reason_length=len(reason))

As per coding guidelines "Reject secrets in logs, prompts, or test fixtures."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/issue_comment.py` around lines 153 - 169, The code
currently logs raw user-supplied comment_body and extracted reason (logger.info
calls referencing comment_body and reason); remove those raw-value logs and
instead log only safe metadata: compute and log the comment length and a
non-reversible fingerprint (e.g., sha256 hex digest) for comment_body, and for
the extracted reason log its length and a hash/fingerprint rather than the text.
Update the logger.info calls around extracting_acknowledgment_reason_from, the
initial logger.info that references comment_body, and the
logger.info("pattern_matched_reason", ...) so they record only safe metadata
(length/hash) and not the raw strings; keep the matching logic and variables
(patterns, match, reason) unchanged.
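A minimal fingerprint helper along the lines this comment suggests; `safe_log_fields` is a hypothetical name for illustration, not part of the codebase:

```python
import hashlib


def safe_log_fields(text: str) -> dict:
    """Log-safe metadata for user-supplied text: length plus a non-reversible fingerprint."""
    return {
        "length": len(text),
        # Truncated sha256 digest: enough to correlate identical inputs
        # across log lines without exposing the content itself.
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest()[:12],
    }


fields = safe_log_fields("my secret token is hunter2")
# `fields` carries no raw content, only length and a truncated digest,
# so it is safe to pass as structlog kwargs.
```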
src/agents/acknowledgment_agent/agent.py (1)

189-203: ⚠️ Potential issue | 🟠 Major

Add mandatory low-confidence gating before returning success.

The flow currently treats low-confidence model output as final. Add a threshold gate (< 0.5) to route to human review (or reject).

Proposed confidence gate
-            return AgentResult(
+            if structured_result.confidence < 0.5:
+                return AgentResult(
+                    success=False,
+                    message="Low-confidence acknowledgment evaluation; human review required",
+                    data={
+                        "is_valid": False,
+                        "confidence": structured_result.confidence,
+                        "recommendations": structured_result.recommendations,
+                    },
+                    metadata={"decision": "route_to_human"},
+                )
+
+            return AgentResult(
                 success=True,
                 message=f"Acknowledgment evaluation completed: {structured_result.reasoning}",
                 data={

As per coding guidelines "Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/acknowledgment_agent/agent.py` around lines 189 - 203, The code
currently returns a successful AgentResult regardless of model confidence; add a
gating check using structured_result.confidence and if confidence < 0.5 log a
warning (via logger.warn or logger.warning) and return an AgentResult with
success=False (or a specific status indicating "human_review_required") and an
explanatory message (e.g., "Low confidence — route to human review") while still
including the original data payload (confidence, reasoning, is_valid,
acknowledgable_violations, require_fixes, recommendations); place this check
before the current AgentResult return that uses logger.info and the existing
data fields so low-confidence outputs are routed instead of treated as final.
src/webhooks/handlers/push.py (1)

20-25: ⚠️ Potential issue | 🟡 Minor

Capture X-GitHub-Delivery header to enable delivery-level idempotency.

GitHub sends X-GitHub-Delivery in webhook requests, but the router ignores it. WebhookEvent supports delivery_id, yet the router never extracts or passes it (line 38 of src/webhooks/router.py). Additionally, the deduplication infrastructure exists (_create_event_hash, event_hashes) but is explicitly disabled (event_hash=None at line 125 of src/tasks/task_queue.py).

Without capturing and propagating delivery_id, the system has no defense against duplicate webhook delivery from GitHub, risking duplicate task processing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/push.py` around lines 20 - 25, Extract the
X-GitHub-Delivery header from incoming requests and populate
WebhookEvent.delivery_id, then propagate that delivery_id into
task_queue.enqueue when creating push events (the handler in push.py calls
task_queue.enqueue). Also enable dedupe by computing/using the existing
_create_event_hash/event_hashes machinery instead of passing event_hash=None in
task_queue.enqueue (see task_queue.enqueue and _create_event_hash), e.g., derive
the event_hash from delivery_id (or delivery_id+payload) and pass it through so
the deduplication layer can detect repeated deliveries.
src/webhooks/router.py (1)

38-38: ⚠️ Potential issue | 🟠 Major

Propagate X-GitHub-Delivery into WebhookEvent.

Line 38 creates WebhookEvent without delivery_id; this loses request-level traceability for filtering/dispatch logs and incident correlation.

Proposed fix
-def _create_event_from_request(event_name: str | None, payload: dict) -> WebhookEvent:
+def _create_event_from_request(
+    event_name: str | None,
+    payload: dict[str, Any],
+    delivery_id: str | None,
+) -> WebhookEvent:
@@
-    return WebhookEvent(event_type=event_type, payload=payload)
+    return WebhookEvent(event_type=event_type, payload=payload, delivery_id=delivery_id)
@@
     payload = await request.json()
     event_name = request.headers.get("X-GitHub-Event")
+    delivery_id = request.headers.get("X-GitHub-Delivery")
@@
-        event = _create_event_from_request(event_name, payload)
+        event = _create_event_from_request(event_name, payload, delivery_id)

Also applies to: 58-63

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/router.py` at line 38, The WebhookEvent returned at the end of
the webhook parsing functions is missing the request delivery ID; extract the
"X-GitHub-Delivery" header (e.g., via request.headers.get("X-GitHub-Delivery"))
and pass it into the WebhookEvent constructor as the delivery_id argument where
WebhookEvent(...) is created (the occurrence at the line returning
WebhookEvent(event_type=..., payload=...) and the other creation site around
lines 58-63); ensure both code paths set delivery_id so request-level
traceability is preserved.
src/webhooks/handlers/deployment_review.py (1)

20-24: ⚠️ Potential issue | 🟠 Major

Validate installation_id before enqueueing.

Line 23 passes event.installation_id directly; when missing, this fails downstream and turns a client/input problem into a server error path.

Proposed fix
     async def handle(self, event: WebhookEvent):
         """Handle deployment review events by enqueuing them for background processing."""
         logger.info("enqueuing_deployment_review_event_for", repo_full_name=event.repo_full_name)
 
+        if event.installation_id is None:
+            return {"status": "skipped", "reason": "missing installation_id in webhook payload"}
+
         task_id = await task_queue.enqueue(
             event_type="deployment_review",
             repo_full_name=event.repo_full_name,
             installation_id=event.installation_id,
             payload=event.payload,
         )

As per coding guidelines, "Validate all external inputs".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/deployment_review.py` around lines 20 - 24, The handler
calls task_queue.enqueue with event.installation_id without checking it;
validate that installation_id is present and of expected type before calling
task_queue.enqueue (in the deployment_review handler where task_id is assigned),
and if missing/invalid return a 4xx error or reject the request early (e.g.,
raise/return a BadRequest) instead of enqueuing; update the code path around
task_queue.enqueue/event.installation_id to perform the check and use the
validated value when invoking task_queue.enqueue.
src/tasks/task_queue.py (1)

165-170: ⚠️ Potential issue | 🔴 Critical

Prevent duplicate execution of the same task across workers.

Line 165–170 selects a pending task without reserving it atomically, so two workers can process the same task concurrently.

Proposed fix
 class TaskQueue:
@@
     def __init__(self):
         self.tasks: dict[str, Task] = {}
         self.event_hashes: dict[str, str] = {}  # event_hash -> task_id
         self.running = False
         self.workers = []
+        self._task_lock = asyncio.Lock()
@@
-                # Find pending tasks
-                pending_tasks = [task for task in self.tasks.values() if task.status == TaskStatus.PENDING]
-
-                if pending_tasks:
-                    # Process the oldest task
-                    task = min(pending_tasks, key=lambda t: t.created_at)
-                    await self._process_task(task, worker_name)
+                task = None
+                async with self._task_lock:
+                    pending_tasks = [t for t in self.tasks.values() if t.status == TaskStatus.PENDING]
+                    if pending_tasks:
+                        task = min(pending_tasks, key=lambda t: t.created_at)
+                        task.status = TaskStatus.RUNNING
+                        task.started_at = datetime.now()
+
+                if task:
+                    await self._process_task(task, worker_name)
                 else:
                     # No tasks, wait a bit
                     await asyncio.sleep(1)
@@
     async def _process_task(self, task: Task, worker_name: str):
         """Process a single task."""
         try:
-            task.status = TaskStatus.RUNNING
-            task.started_at = datetime.now()
-
             logger.info("worker_processing_task", worker_name=worker_name, id=task.id)

Also applies to: 184-187

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/task_queue.py` around lines 165 - 170, The code is selecting a
pending task then processing it without reserving it atomically, allowing
multiple workers to pick the same task; modify the selection to perform an
atomic reserve under the task-queue lock: iterate self.tasks while holding the
queue-level asyncio.Lock (or the existing lock used by this class), find the
oldest task with task.status == TaskStatus.PENDING, set task.status =
TaskStatus.RUNNING and record worker_name (e.g. task.worker = worker_name or
task.assigned_to) before releasing the lock, then call await
self._process_task(task, worker_name); apply the same atomic reserve pattern to
the other similar block referenced in the comment (the block at ~184-187) so no
two workers can concurrently start the same Task object.
src/event_processors/pull_request.py (1)

143-149: ⚠️ Potential issue | 🟠 Major

Acknowledgment matching uses the wrong key, so prior acknowledgments may never apply.

previous_acknowledgments is keyed by rule_id (from _get_previous_acknowledgments), but this check uses rule_description, which can cause acknowledged violations to be treated as unacknowledged.

✅ Suggested patch
             for violation in violations:
-                rule_description = violation.get("rule_description", "")
-                if rule_description in previous_acknowledgments:
+                rule_id = violation.get("rule_id", "")
+                if rule_id and rule_id in previous_acknowledgments:
                     # This violation was previously acknowledged
-                    logger.info("violation_previously_acknowledged", rule_description=rule_description)
+                    logger.info("violation_previously_acknowledged", rule_id=rule_id)
                     acknowledgable_violations.append(violation)
                 else:
                     # This violation requires acknowledgment
                     require_acknowledgment_violations.append(violation)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/pull_request.py` around lines 143 - 149, The check for
prior acknowledgments is using rule_description but previous_acknowledgments is
keyed by rule_id; update the loop in the pull request event processor so it
reads the violation's "rule_id" (e.g., rule_id = violation.get("rule_id")) and
checks if that rule_id is in previous_acknowledgments instead of using
rule_description. Also update the logger call in the same block (logger.info) to
include the rule_id for clarity and leave the rest of the behavior (appending to
acknowledgable_violations) unchanged; touch the loop that iterates over
violations in the pull_request processor (the block referencing
previous_acknowledgments, rule_description, and acknowledgable_violations).
src/event_processors/push.py (1)

173-227: ⚠️ Potential issue | 🟠 Major

Check-run creation failures are swallowed, but processing still reports success.

If create_check_run fails, _create_check_run() merely logs and returns, so process() still reports success. This hides a user-visible integration failure.

🛠️ Suggested fix
-    async def _create_check_run(self, task: Task, violations: list[dict[str, Any]]):
+    async def _create_check_run(self, task: Task, violations: list[dict[str, Any]]) -> bool:
@@
             if not sha or sha == NULL_SHA:
                 logger.warning(...)
-                return
+                return False
@@
             if result:
                 logger.info(...)
+                return True
             else:
                 logger.error(...)
+                return False
@@
         except Exception as e:
             logger.error(...)
+            return False

Then in process():

         if violations:
-            await self._create_check_run(task, violations)
+            check_run_ok = await self._create_check_run(task, violations)
             api_calls += 1
+            if not check_run_ok:
+                return ProcessingResult(
+                    success=False,
+                    violations=violations,
+                    api_calls_made=api_calls,
+                    processing_time_ms=int((time.time() - start_time) * 1000),
+                    error="Failed to create check run",
+                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/push.py` around lines 173 - 227, The _create_check_run
block currently swallows create_check_run failures
(github_client.create_check_run) and only logs them, causing process() to report
success; change the behavior so failures are surfaced: either raise the caught
exception from _create_check_run (or return a clear failure boolean) instead of
just logging, and update process() to check the _create_check_run result (or
catch the propagated exception) and mark processing as failed/return an error
accordingly; reference _create_check_run, process, and
github_client.create_check_run when making these changes and preserve existing
logging (logger.error/info) while ensuring failures influence the overall
process outcome.
src/event_processors/deployment_protection_rule.py (1)

177-203: ⚠️ Potential issue | 🟠 Major

Approval/rejection API failures are not propagated to process().

_approve_deployment and _reject_deployment only log on failure and return None, so process() can continue and report an outcome that was never applied upstream.

🛠️ Suggested approach
-    async def _approve_deployment(self, callback_url: str, environment: str, comment: str, installation_id: int):
+    async def _approve_deployment(
+        self, callback_url: str, environment: str, comment: str, installation_id: int
+    ) -> bool:
@@
             if result is None:
                 logger.error(...)
+                return False
             else:
                 logger.info(...)
+                return True
         except Exception as e:
             logger.error(...)
+            return False
@@
-    async def _reject_deployment(
+    async def _reject_deployment(
         self, callback_url: str, environment: str, violations: list[dict[str, Any]], installation_id: int
-    ):
+    ) -> bool:
@@
             if result is None:
                 logger.error(...)
+                return False
             else:
                 logger.info(...)
+                return True
         except Exception as e:
             logger.error(...)
+            return False

Then gate ProcessingResult in process() on these boolean outcomes.

Also applies to: 244-273, 274-307

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/deployment_protection_rule.py` around lines 177 - 203,
The approval/rejection calls (_approve_deployment and _reject_deployment)
swallow errors and return None, so process() continues as if the upstream action
succeeded; update _approve_deployment and _reject_deployment to return a boolean
(True on success, False on failure) or raise on error, then change process() to
check those return values and short-circuit or set ProcessingResult to a failure
when they indicate failure (e.g., after calling _approve_deployment(...,
installation_id) and _reject_deployment(..., installation_id) in process(),
abort further success logging and mark the processing as failed if the call
returned False or raised). Apply the same change pattern to the other
approval/rejection sites noted (the blocks around the other ranges) so the
deployment scheduler branch and final logger lines only run when the API call
succeeded.
🟡 Minor comments (7)
src/rules/github_provider.py-63-63 (1)

63-63: ⚠️ Potential issue | 🟡 Minor

Use logger.exception() for error logging with exceptions.

Same as line 54: use logger.exception() so the full traceback is captured automatically.

🐛 Proposed fix
-            logger.error("error_fetching_rules_for", repository=repository, e=e)
+            logger.exception("error_fetching_rules_for", operation="get_rules", repository=repository)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/github_provider.py` at line 63, Replace the call to logger.error
that passes the exception object (logger.error("error_fetching_rules_for",
repository=repository, e=e)) with logger.exception so the traceback is captured
automatically; locate the logger call in src/rules/github_provider.py (the
logger variable near the handling of repository and exception e), remove the
explicit e argument and call logger.exception with a clear message and
repository context (or pass repository as structured context) so the full
traceback is logged.
src/rules/github_provider.py-54-54 (1)

54-54: ⚠️ Potential issue | 🟡 Minor

Use logger.exception() for error logging with exceptions.

When logging errors with exception context, use logger.exception() instead of logger.error() to automatically capture the full traceback. The static analysis hint TRY400 is valid here.

🐛 Proposed fix
-                    logger.error("error_parsing_rule", rule_description=rule_description, e=e)
+                    logger.exception("error_parsing_rule", rule_description=rule_description)

As per coding guidelines, use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/github_provider.py` at line 54, Replace the logger.error call that
passes the exception object with logger.exception so the traceback is captured;
in the code near where rule parsing fails (look for
logger.error("error_parsing_rule", rule_description=rule_description, e=e) in
src/rules/github_provider.py) change it to
logger.exception("error_parsing_rule", rule_description=rule_description,
operation="parse_rule", subject_ids=[...], decision="error", latency_ms=None)
(keep rule_description, remove e=e since logger.exception captures the
exception) and fill subject_ids/latency_ms with the appropriate variables in
scope (or None/defaults) so the log uses the structured fields operation,
subject_ids, decision, latency_ms.
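For clarity, the behavioral difference between the two calls can be demonstrated with stdlib logging alone (structlog's logger.exception wraps the same mechanism); this snippet is illustrative, not project code:

```python
import io
import logging

# Capture log output in memory so the two calls can be compared.
buf = io.StringIO()
logger = logging.getLogger("demo")
logger.addHandler(logging.StreamHandler(buf))
logger.setLevel(logging.ERROR)

try:
    raise ValueError("bad rule syntax")
except ValueError:
    logger.error("error_parsing_rule")      # message only, no traceback
    logger.exception("error_parsing_rule")  # message plus full traceback

output = buf.getvalue()
# Only the .exception() record carries the stack trace.
print("Traceback (most recent call last):" in output)  # prints True
```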
src/agents/feasibility_agent/agent.py-77-83 (1)

77-83: ⚠️ Potential issue | 🟡 Minor

Include the missing execution context in these structured logs.

Line 77, Line 105, and Line 139 use event keys that imply timing/attempt details but do not emit those fields, which weakens debugging and retry observability.

As per coding guidelines Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms.

Also applies to: 105-105, 139-139

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/feasibility_agent/agent.py` around lines 77 - 83, The structured
logs currently emitted via logger.info for the events
"feasibility_analysis_completed_in_s" and "results_feasible_type_confidence"
must include the required execution context fields; update those logger.info
calls in agent.py (the ones using logger.info and the
"results_feasible_type_confidence" / "feasibility_analysis_completed_in_s" event
keys) to add operation="feasibility_analysis", subject_ids=<use existing subject
id list or request.subject_ids>, decision=result.is_feasible, and
latency_ms=<compute elapsed ms from the analysis start timestamp>. Ensure you
populate subject_ids from the existing request/context variable and compute
latency_ms from the start time used in the analysis so all three log sites
include these structured fields.
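The requested log shape can be sketched with a small helper; structlog accepts arbitrary keyword fields, so the same call works there unchanged. The helper name and the stand-in logger below are illustrative, not part of the codebase:

```python
import time

def log_event(logger_fn, event: str, *, operation: str, subject_ids, decision, started_at: float):
    """Emit a structured event carrying the boundary fields from the guidelines."""
    latency_ms = int((time.time() - started_at) * 1000)
    logger_fn(event, operation=operation, subject_ids=subject_ids,
              decision=decision, latency_ms=latency_ms)
    return latency_ms

# Stand-in for structlog's logger.info: collect the keyword fields.
captured = {}
def fake_info(event, **fields):
    captured.update(event=event, **fields)

start = time.time()
log_event(fake_info, "feasibility_analysis_completed",
          operation="feasibility_analysis",
          subject_ids=["repo:owner/name"], decision=True, started_at=start)
print(sorted(captured))
```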
src/agents/engine_agent/nodes.py-220-221 (1)

220-221: ⚠️ Potential issue | 🟡 Minor

Latency value is computed but discarded.

Proposed fix
-        (time.time() - start_time) * 1000
-        logger.info("llm_evaluation_completed_in_ms")
+        latency_ms = (time.time() - start_time) * 1000
+        logger.info("llm_evaluation_completed", latency_ms=latency_ms)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/engine_agent/nodes.py` around lines 220 - 221, The computed
latency value is discarded; assign (time.time() - start_time) * 1000 to a
variable (e.g., elapsed_ms) and pass it to the logger so the
"llm_evaluation_completed_in_ms" entry includes the actual numeric latency;
update the code around the logger.info call (reference the start_time variable
and logger.info("llm_evaluation_completed_in_ms")) to log the elapsed_ms value
(and optionally include a descriptive message or structured field).
src/agents/engine_agent/nodes.py-157-158 (1)

157-158: ⚠️ Potential issue | 🟡 Minor

Latency value is computed but discarded.

Proposed fix
-        (time.time() - start_time) * 1000
-        logger.info("validator_evaluation_completed_in_ms")
+        latency_ms = (time.time() - start_time) * 1000
+        logger.info("validator_evaluation_completed", latency_ms=latency_ms)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/engine_agent/nodes.py` around lines 157 - 158, The computed
latency expression "(time.time() - start_time) * 1000" is discarded; assign it
to a variable (e.g., elapsed_ms = (time.time() - start_time) * 1000) and include
that value in the logger.info call (e.g.,
logger.info("validator_evaluation_completed_in_ms", elapsed_ms=elapsed_ms) or
logger.info(f"validator_evaluation_completed_in_ms: {elapsed_ms}")) so that the
elapsed time is actually recorded; update the code surrounding start_time and
the existing logger.info("validator_evaluation_completed_in_ms") call
accordingly.
src/agents/engine_agent/nodes.py-103-104 (1)

103-104: ⚠️ Potential issue | 🟡 Minor

Latency value is computed but discarded.

Same issue as above: the latency should be included in the log.

Proposed fix
-        (time.time() - start_time) * 1000
-        logger.info("strategy_selection_completed_in_ms")
+        latency_ms = (time.time() - start_time) * 1000
+        logger.info("strategy_selection_completed", latency_ms=latency_ms)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/engine_agent/nodes.py` around lines 103 - 104, The computed
latency is currently discarded; capture the result of (time.time() - start_time)
* 1000 into a variable (e.g., latency_ms) and include that value in the logger
call that currently reads logger.info("strategy_selection_completed_in_ms") so
the log contains the numeric latency (either as part of the message string or as
a structured field); update the code around the existing start_time and
logger.info usage in nodes.py accordingly.
src/agents/engine_agent/nodes.py-56-57 (1)

56-57: ⚠️ Potential issue | 🟡 Minor

Latency value is computed but discarded - include it in the log.

The time delta is calculated but not used. The log message implies timing info but doesn't include it. This appears to be an incomplete refactoring.

Proposed fix
-        (time.time() - start_time) * 1000
-        logger.info("rule_analysis_completed_in_ms")
+        latency_ms = (time.time() - start_time) * 1000
+        logger.info("rule_analysis_completed", latency_ms=latency_ms)

As per coding guidelines: "Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/engine_agent/nodes.py` around lines 56 - 57, The computed latency
(time.time() - start_time) * 1000 is being discarded and the logger.info call
logger.info("rule_analysis_completed_in_ms") must include it and follow
structured logging fields; compute latency_ms = int((time.time() - start_time) *
1000) and replace the plain log with a structured log call including operation
(e.g., "rule_analysis"), subject_ids (use the variable that holds subject IDs in
this scope), decision (the decision variable in scope), and latency_ms so
logger.info(...) emits these fields rather than just the string; update the code
around start_time and the current logger.info invocation in nodes.py
accordingly.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 267e19f and 3bd266b.

📒 Files selected for processing (44)
  • src/agents/acknowledgment_agent/agent.py
  • src/agents/acknowledgment_agent/test_agent.py
  • src/agents/base.py
  • src/agents/engine_agent/agent.py
  • src/agents/engine_agent/models.py
  • src/agents/engine_agent/nodes.py
  • src/agents/feasibility_agent/agent.py
  • src/agents/feasibility_agent/nodes.py
  • src/agents/supervisor_agent/agent.py
  • src/agents/supervisor_agent/nodes.py
  • src/core/models.py
  • src/core/utils/event_filter.py
  • src/event_processors/base.py
  • src/event_processors/check_run.py
  • src/event_processors/deployment.py
  • src/event_processors/deployment_protection_rule.py
  • src/event_processors/deployment_review.py
  • src/event_processors/deployment_status.py
  • src/event_processors/pull_request.py
  • src/event_processors/push.py
  • src/event_processors/rule_creation.py
  • src/event_processors/violation_acknowledgment.py
  • src/integrations/codeowners.py
  • src/integrations/contributors.py
  • src/integrations/github_api.py
  • src/main.py
  • src/rules/github_provider.py
  • src/rules/models.py
  • src/rules/utils.py
  • src/rules/validators.py
  • src/tasks/scheduler/deployment_scheduler.py
  • src/tasks/task_queue.py
  • src/webhooks/auth.py
  • src/webhooks/dispatcher.py
  • src/webhooks/handlers/check_run.py
  • src/webhooks/handlers/deployment.py
  • src/webhooks/handlers/deployment_protection_rule.py
  • src/webhooks/handlers/deployment_review.py
  • src/webhooks/handlers/deployment_status.py
  • src/webhooks/handlers/issue_comment.py
  • src/webhooks/handlers/pull_request.py
  • src/webhooks/handlers/push.py
  • src/webhooks/router.py
  • tests/unit/core/test_event_filter.py

Comment on lines +88 to 90
logger.info("evaluating_acknowledgment_request_from", commenter=commenter)
logger.info("reason", acknowledgment_reason=acknowledgment_reason)
logger.info(f"🧠 Violations to evaluate: {len(violations)}")

⚠️ Potential issue | 🟠 Major

Remove raw acknowledgment reason and LLM reasoning from logs.

Both fields may contain sensitive content and should not be emitted verbatim. Keep only safe summaries/metadata.

Safer logging example
-            logger.info("reason", acknowledgment_reason=acknowledgment_reason)
+            logger.info("acknowledgment_reason_received", reason_length=len(acknowledgment_reason))
...
-            logger.info("reasoning", reasoning=structured_result.reasoning)
+            logger.info("acknowledgment_reasoning_generated", reasoning_present=bool(structured_result.reasoning))

Based on learnings "Applies to **/*.py : Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)".

Also applies to: 184-187

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/acknowledgment_agent/agent.py` around lines 88 - 90, The logs
currently emit raw sensitive fields (acknowledgment_reason and the LLM's raw
reasoning) — update the logging in the acknowledgment agent to stop printing
verbatim content: in the block using
logger.info("evaluating_acknowledgment_request_from", commenter=commenter) and
logger.info("reason", acknowledgment_reason=acknowledgment_reason) replace the
raw-field logs with non-sensitive metadata (e.g., redacted/summarized flags,
lengths, hash or boolean indicators like has_sensitive_content) and keep only
safe counts such as len(violations); likewise locate the later LLM reasoning
emission (around the LLM response handling at the referenced lines ~184-187) and
log only a sanitized summary or metadata rather than the full LLM output;
reference variables/functions to change: acknowledgment_reason, violations,
commenter, and the LLM response handling code in this module.

except Exception as e:
execution_time = time.time() - start_time
logger.error(f"❌ Feasibility analysis failed: {e}")
logger.error("feasibility_analysis_failed", e=e)

⚠️ Potential issue | 🟠 Major

Avoid logging raw exception objects in agent paths.

Logging e=e can leak sensitive prompt/context data from downstream integrations. Log sanitized metadata (error_type, bounded message) instead.

Proposed diff
-            logger.error("feasibility_analysis_failed", e=e)
+            logger.error(
+                "feasibility_analysis_failed",
+                error_type=type(e).__name__,
+                decision="failed",
+            )
@@
-                logger.error("exception_on_attempt", e=e)
+                logger.error(
+                    "exception_on_attempt",
+                    error_type=type(e).__name__,
+                    attempt=attempt + 1,
+                    max_retries=self.max_retries,
+                )

Based on learnings Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only).

Also applies to: 147-147

🧰 Tools
🪛 Ruff (0.15.2)

[warning] 119-119: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/feasibility_agent/agent.py` at line 119, The current logger.error
call in feasibility_agent.agent.py (logger.error("feasibility_analysis_failed",
e=e)) is logging the raw exception object and may leak sensitive prompt/context
data; update the error logging inside the feasibility analysis path (and the
second occurrence around the other logger.error noted) to log only sanitized
metadata: include error_type (e.__class__.__name__), a bounded/truncated message
(e.g., first N chars) after stripping secrets/PII, and any non-sensitive error
code or flag; remove or avoid passing the raw exception object or full
stack/trace into logger.error and instead store full details only in a secure
store if needed.

Comment on lines 84 to 85
# Set end time
state.end_time = time.time()

⚠️ Potential issue | 🟠 Major

Type mismatch: end_time expects datetime | None, not float.

According to SupervisorState in src/agents/supervisor_agent/models.py, end_time is typed as datetime | None, but time.time() returns a float (Unix timestamp). This will cause a Pydantic validation error.

Proposed fix
+from datetime import datetime
 import time
 
 # ... later in the function ...
 
         # Set end time
-        state.end_time = time.time()
+        state.end_time = datetime.now()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Set end time
state.end_time = time.time()
# Set end time
state.end_time = datetime.now()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agents/supervisor_agent/nodes.py` around lines 84 - 85, The code sets
state.end_time = time.time() but SupervisorState.end_time is typed datetime |
None; replace the float timestamp with a datetime value (e.g., use
datetime.now() or datetime.fromtimestamp(time.time())) so the assigned value
matches the SupervisorState type; update the import to include datetime if
missing and change the assignment in nodes.py (reference:
SupervisorState.end_time and the assignment site setting end_time) to use a
datetime-producing call.

Comment on lines +123 to +142
except TimeoutError:
logger.warning(
"agent_execution_timeout",
operation="process",
subject_ids={"repo": repo_full_name, "deployment_id": deployment_id},
timeout_seconds=AGENT_TIMEOUT_SECONDS,
)
if deployment_callback_url and environment:
await self._approve_deployment(
deployment_callback_url,
environment,
"Deployment approved due to evaluation timeout. Review rules for performance issues.",
installation_id,
)
return ProcessingResult(
success=True,
violations=[],
api_calls_made=1,
processing_time_ms=int((time.time() - start_time) * 1000),
)

⚠️ Potential issue | 🔴 Critical

Timeout handling currently bypasses protection rules (fail-open).

On agent timeout, this path approves deployment and returns success=True. That allows risky deployments through exactly when evaluation is unavailable.

🔒 Suggested fail-closed direction
             except TimeoutError:
                 logger.warning(
                     "agent_execution_timeout",
@@
                 )
-                if deployment_callback_url and environment:
-                    await self._approve_deployment(
-                        deployment_callback_url,
-                        environment,
-                        "Deployment approved due to evaluation timeout. Review rules for performance issues.",
-                        installation_id,
-                    )
+                timeout_violation = {
+                    "rule_description": "Deployment protection evaluation timed out",
+                    "severity": "high",
+                    "message": "Rule engine did not return a decision before timeout.",
+                    "how_to_fix": "Retry deployment and investigate slow rules/agent dependencies.",
+                }
+                if deployment_callback_url and environment:
+                    await self._reject_deployment(
+                        deployment_callback_url, environment, [timeout_violation], installation_id
+                    )
                 return ProcessingResult(
-                    success=True,
-                    violations=[],
+                    success=False,
+                    violations=[timeout_violation],
                     api_calls_made=1,
                     processing_time_ms=int((time.time() - start_time) * 1000),
+                    error="Deployment protection evaluation timed out",
                 )

As per coding guidelines, "Fail closed for risky decisions; provide actionable remediation in error paths".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/deployment_protection_rule.py` around lines 123 - 142,
The TimeoutError handler currently "fails open" by calling _approve_deployment
and returning ProcessingResult(success=True) — change this to fail-closed: do
NOT call _approve_deployment on timeout, instead record a clear violation (e.g.,
add a timeout-related entry to the violations list), return ProcessingResult
with success=False, appropriate api_calls_made and processing_time_ms, and if
there is an existing reject method (e.g., _reject_deployment or similar), invoke
it with a descriptive message and installation_id; update the logger.warning
call to include that the decision resulted in a failed evaluation and
remediation was taken.
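The fail-closed direction can be sketched with asyncio.wait_for; the evaluator coroutine, the timeout value, and the result shape below are illustrative stand-ins, not the processor's actual API:

```python
import asyncio

AGENT_TIMEOUT_SECONDS = 0.05  # illustrative; the real constant is configured elsewhere

async def evaluate_rules() -> list[dict]:
    await asyncio.sleep(1)  # stand-in for a slow agent evaluation
    return []

async def process() -> dict:
    try:
        violations = await asyncio.wait_for(evaluate_rules(), timeout=AGENT_TIMEOUT_SECONDS)
    except asyncio.TimeoutError:
        # Fail closed: record a synthetic violation instead of approving.
        violations = [{"rule_description": "evaluation timed out", "severity": "high"}]
        return {"success": False, "violations": violations}
    return {"success": True, "violations": violations}

result = asyncio.run(process())
print(result["success"])  # prints False: the timeout path rejects rather than approves
```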

)
comment_body = self._format_violations_comment(violations)
logger.debug(f"Comment body: {comment_body[:200]}...")
logger.debug("comment_body_preview", body_preview=comment_body[:200])

⚠️ Potential issue | 🟠 Major

Do not log generated comment body previews.

Even truncated comment text can include sensitive repository/user content. Prefer metadata-only logging (length/hash/counts).

🔐 Suggested patch
-            logger.debug("comment_body_preview", body_preview=comment_body[:200])
+            logger.debug(
+                "comment_body_generated",
+                operation="post_violations_to_github",
+                subject_ids={"repo": task.repo_full_name, "pr_number": pr_number},
+                body_length=len(comment_body),
+            )

As per coding guidelines, "Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)".

📝 Committable suggestion


Suggested change
logger.debug("comment_body_preview", body_preview=comment_body[:200])
logger.debug(
"comment_body_generated",
operation="post_violations_to_github",
subject_ids={"repo": task.repo_full_name, "pr_number": pr_number},
body_length=len(comment_body),
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/event_processors/pull_request.py` at line 509, The current logger.debug
call logs a truncated comment body ("comment_body_preview" using
comment_body[:200]) which may expose PII/secrets; replace it by removing any
raw/truncated content and log only safe metadata (e.g., length and a
non-reversible fingerprint) instead. Update the logger.debug invocation in
pull_request.py (the line that uses "comment_body_preview" and comment_body) to
log something like comment_length=len(comment_body) and a hashed fingerprint
(use a secure hash like sha256 on the bytes and shorten the hex for display) so
no raw comment text appears in logs.

# Start the deployment scheduler
asyncio.create_task(get_deployment_scheduler().start_background_scheduler())
logging.info("🚀 Deployment scheduler started")
logger.info("deployment_scheduler_started")

🛠️ Refactor suggestion | 🟠 Major

Structured log missing required fields.

The log statement should include structured fields per coding guidelines: operation, subject_ids, decision, latency_ms. At minimum, include operation for lifecycle events.

As per coding guidelines: "Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms"

♻️ Proposed fix to add required structured fields
-    logger.info("deployment_scheduler_started")
+    logger.info("deployment_scheduler_started", operation="startup", component="deployment_scheduler")
📝 Committable suggestion


Suggested change
logger.info("deployment_scheduler_started")
logger.info("deployment_scheduler_started", operation="startup", component="deployment_scheduler")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.py` at line 65, The plain
logger.info("deployment_scheduler_started") call is missing required structured
fields; update the logging call (the logger.info invocation in main.py where
"deployment_scheduler_started" is logged) to emit structured fields including at
least operation (e.g., operation="deployment_scheduler_started") and ideally
subject_ids, decision, and latency_ms (provide empty list/null or default values
if not available) so the call becomes a structured log emission rather than a
bare message; ensure you pass these as structured key/value fields compatible
with your logger (e.g., logger.info("deployment_scheduler_started", extra={...})
or logger.info({"message":"deployment_scheduler_started", "operation":...,
"subject_ids":..., "decision":..., "latency_ms":...})).

Comment on lines +141 to 146
if created_at is not None:
if isinstance(created_at, int | float):
# Convert timestamp to datetime
created_at = datetime.fromtimestamp(created_at)
# Convert timestamp to datetime with timezone awareness
created_at = datetime.fromtimestamp(created_at, tz=UTC)
age = current_time - created_at
if age > timedelta(days=7):

⚠️ Potential issue | 🟠 Major

Normalize created_at before age arithmetic.

Line 145 subtracts current_time - created_at even when created_at may be a string or naive datetime, which can raise and leave the same deployment retrying forever in subsequent loops.

Proposed fix
-                if created_at is not None:
-                    if isinstance(created_at, int | float):
-                        # Convert timestamp to datetime with timezone awareness
-                        created_at = datetime.fromtimestamp(created_at, tz=UTC)
-                    age = current_time - created_at
+                if created_at is not None:
+                    if isinstance(created_at, int | float):
+                        created_at = datetime.fromtimestamp(created_at, tz=UTC)
+                    elif isinstance(created_at, str):
+                        try:
+                            created_at = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
+                        except ValueError:
+                            deployments_to_remove.append(i)
+                            continue
+                    elif not isinstance(created_at, datetime):
+                        deployments_to_remove.append(i)
+                        continue
+
+                    if created_at.tzinfo is None:
+                        created_at = created_at.replace(tzinfo=UTC)
+
+                    age = current_time - created_at
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/scheduler/deployment_scheduler.py` around lines 141 - 146,
Normalize created_at to a timezone-aware datetime before computing age: inside
the DeploymentScheduler logic where created_at and current_time are used, ensure
created_at (the variable) is parsed when it's a string (e.g., via ISO8601
parse), convert naive datetimes to UTC-aware datetimes (attach UTC tzinfo), and
convert numeric timestamps (int|float) as you already do; if parsing fails or
created_at is missing/invalid, skip computing age or mark the deployment for
safe handling so you don't perform current_time - created_at with incompatible
types. Ensure all branches set created_at to a datetime with tzinfo=UTC before
the age subtraction.

Comment on lines +105 to +109
logger.info("event_hash_debug_for", event_type=event_type)
logger.info(f" Comment ID: {event_data.get('comment_id')}")
logger.info(f" Comment body: {event_data.get('comment_body', '')[:50]}...")
logger.info(f" Comment created at: {event_data.get('comment_created_at')}")
logger.info(f" Event hash: {event_hash}")
logger.info("event_hash", event_hash=event_hash)

⚠️ Potential issue | 🟠 Major

Do not log raw comment content in hash debug logs.

Line 107 logs comment body text, which can leak secrets/PII into logs.

As per coding guidelines, "Reject secrets in logs, prompts, or test fixtures".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/task_queue.py` around lines 105 - 109, The log at the logger.info
call that prints the raw comment body (using event_data.get('comment_body',
'')[:50]) must be removed or replaced to avoid leaking secrets/PII; update the
logging in the task_queue.py block that emits event_hash_debug_for/event_hash to
either log a redacted indicator (e.g., "<redacted>") or a non-sensitive
surrogate such as the comment length or a safe hash of
event_data.get('comment_body') instead of the raw text, keeping the surrounding
logs for event_type, comment_id, comment_created_at, and event_hash intact.
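The redaction the prompt suggests can be sketched as a small helper that derives only non-sensitive surrogates from the body. The helper name and field names are illustrative assumptions, not the project's API:

```python
import hashlib


def safe_comment_fields(comment_body: str) -> dict:
    """Return log-safe surrogates for a comment body (hypothetical helper).

    Logs a truncated SHA-256 digest and the length instead of the raw text,
    so secrets/PII in the body never reach the log stream.
    """
    digest = hashlib.sha256(comment_body.encode("utf-8")).hexdigest()[:12]
    return {"comment_body_sha": digest, "comment_body_len": len(comment_body)}
```

The digest is still stable per body, so it remains useful for the hash-debugging purpose of the surrounding log lines without exposing the content itself.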

Comment on lines 80 to 88
except Exception as e:
logger.exception(f"Error executing handler for event {event.event_type.name}: {e}")
logger.exception(
"handler_error",
operation="dispatch",
subject_ids={"repo": event.repo_full_name},
event_type=event.event_type.name,
error=str(e),
)
return {"status": "error", "reason": str(e)}

⚠️ Potential issue | 🔴 Critical

Do not acknowledge webhook processing after handler failure.

Line 88 returns an error payload instead of re-raising, so the endpoint can still respond 200 and GitHub may not retry failed events.

Proposed fix
         except Exception as e:
             logger.exception(
                 "handler_error",
                 operation="dispatch",
                 subject_ids={"repo": event.repo_full_name},
                 event_type=event.event_type.name,
                 error=str(e),
             )
-            return {"status": "error", "reason": str(e)}
+            raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/dispatcher.py` around lines 80 - 88, The except block in the
webhook dispatch path currently logs the error with logger.exception and then
returns an error payload, which lets the HTTP endpoint respond 200; instead,
after logging you should re-raise the exception so the request fails and GitHub
can retry; update the except in the dispatch handler (the block containing
logger.exception and the return) to remove the return {"status": "error", ...}
and use a bare raise to re-raise the caught exception (preserving traceback) so
the caller/HTTP layer surfaces a non-200 response.
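Why the bare `raise` matters can be shown in isolation. This is a minimal sketch with hypothetical names (`dispatch`, `failing_handler`), not the dispatcher's actual code: the exception propagates with its original traceback, so the HTTP layer can answer non-200 and GitHub redelivers the event.

```python
def dispatch(handler, event):
    """Sketch of the dispatch path: log on failure, then re-raise unchanged."""
    try:
        return handler(event)
    except Exception:
        # In the real code, logger.exception(...) records the traceback here.
        # A bare `raise` then propagates the same exception object, so the
        # caller (the webhook endpoint) surfaces a failure instead of 200.
        raise


def failing_handler(event):
    raise ValueError("boom")
```

Returning `{"status": "error", ...}` instead of re-raising would make the two paths indistinguishable to the endpoint, which is exactly the silent-acknowledgement problem the comment flags.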

Comment on lines +183 to +191
def test_filter_result_frozen():
"""Test that FilterResult is frozen (immutable)."""
result = FilterResult(should_process=True, reason="test")
# Attempting to modify should raise an error
try:
result.should_process = False
raise AssertionError("Should have raised FrozenInstanceError")
except Exception:
pass # Expected

⚠️ Potential issue | 🟠 Major

test_filter_result_frozen can pass even when immutability is broken.

The current `try/except Exception: pass` catches both the expected error and the intentional `AssertionError`, so this test does not actually validate frozen behavior.

✅ Suggested fix
+from dataclasses import FrozenInstanceError
+import pytest
@@
 def test_filter_result_frozen():
     """Test that FilterResult is frozen (immutable)."""
     result = FilterResult(should_process=True, reason="test")
-    # Attempting to modify should raise an error
-    try:
-        result.should_process = False
-        raise AssertionError("Should have raised FrozenInstanceError")
-    except Exception:
-        pass  # Expected
+    with pytest.raises(FrozenInstanceError):
+        result.should_process = False
🧰 Tools
🪛 Ruff (0.15.2)

[warning] 189-189: Abstract raise to an inner function

(TRY301)


[warning] 189-189: Avoid specifying long messages outside the exception class

(TRY003)


[error] 190-191: try-except-pass detected, consider logging the exception

(S110)


[warning] 190-190: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/core/test_event_filter.py` around lines 183 - 191, The test
test_filter_result_frozen currently swallows the AssertionError and wrongly
passes; change it to explicitly assert that assigning to
FilterResult.should_process raises the dataclasses.FrozenInstanceError (or use
pytest.raises(FrozenInstanceError)) instead of a broad try/except. Update the
test to construct FilterResult(should_process=True, reason="test") and then use
an explicit exception assertion targeting FrozenInstanceError (imported from
dataclasses or via pytest.raises) so the test fails if immutability is broken.

@dkargatzis dkargatzis closed this Feb 26, 2026
@dkargatzis dkargatzis deleted the fix/code-quality-improvements branch February 26, 2026 20:58