Skip to content

Conversation

@jakehwll
Copy link
Contributor

@jakehwll jakehwll commented Dec 4, 2025

Related to internal#1139

Continuation of #21074

This implements some RBAC role specificity for dbpurge, ensuring that we follow the least-privileged model for removing data from the database. It is specified as following.

Site: rbac.Permissions(map[string][]policy.Action{
	// DeleteOldWorkspaceAgentLogs
	// DeleteOldWorkspaceAgentStats
	// DeleteOldProvisionerDaemons
	// DeleteOldTelemetryLocks
	// DeleteOldAuditLogConnectionEvents
	// DeleteOldConnectionLogs
	rbac.ResourceSystem.Type: {policy.ActionDelete},
	// DeleteOldNotificationMessages
	rbac.ResourceNotificationMessage.Type: {policy.ActionDelete},
	// ExpirePrebuildsAPIKeys
	// DeleteExpiredAPIKeys
	rbac.ResourceApiKey.Type: {policy.ActionDelete},
	// DeleteOldAIBridgeRecords
	rbac.ResourceAibridgeInterception.Type: {policy.ActionDelete},
}),
Position Pull-request
feat: add prometheus observability metrics for dbpurge
feat: add rbac specificity for dbpurge

@jakehwll jakehwll changed the base branch from main to jakehwll/dbpurge-observability-metrics-prometheus December 4, 2025 06:32
@github-actions github-actions bot added the stale This issue is like stale bread. label Dec 12, 2025
@github-actions
Copy link

github-actions bot commented Dec 12, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@jakehwll jakehwll marked this pull request as ready for review December 12, 2025 10:23
@jakehwll jakehwll requested a review from Emyrk as a code owner December 12, 2025 10:23
Copy link
Contributor

@dannykopping dannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hell yeah 👍
It needs some tests please, though.

{
Identifier: rbac.RoleIdentifier{Name: "dbpurge"},
DisplayName: "DB Purge Daemon",
Site: rbac.Permissions(map[string][]policy.Action{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

Identifier: rbac.RoleIdentifier{Name: "dbpurge"},
DisplayName: "DB Purge Daemon",
Site: rbac.Permissions(map[string][]policy.Action{
rbac.ResourceSystem.Type: {policy.ActionDelete},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunate we need system delete, but I see why

@jakehwll jakehwll removed the stale This issue is like stale bread. label Dec 12, 2025
@mtojek mtojek self-requested a review December 19, 2025 10:05
Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Feel free to merge.

jakehwll added a commit that referenced this pull request Dec 19, 2025
Related to
[`internal#1139`](coder/internal#1139)

This implements some prometheus metrics for records being removed from
the database. Currently we're tracking the following fields being
removed from the DB by this. They're viewable in the
`/api/v2/debug/metrics` endpoint.

* `expired_api_keys`
* `aibridge_records`
* `connection_logs`
* `duration`

```
# HELP coderd_dbpurge_iteration_duration_seconds Duration of each dbpurge iteration in seconds.
# TYPE coderd_dbpurge_iteration_duration_seconds histogram
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="1"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="5"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="10"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="30"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="60"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="300"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="600"} 1
coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="+Inf"} 1
coderd_dbpurge_iteration_duration_seconds_sum{success="true"} 0.014787814
coderd_dbpurge_iteration_duration_seconds_count{success="true"} 1
# HELP coderd_dbpurge_records_purged_total Total number of records purged by type.
# TYPE coderd_dbpurge_records_purged_total counter
coderd_dbpurge_records_purged_total{record_type="aibridge_records"} 0
coderd_dbpurge_records_purged_total{record_type="audit_logs"} 0
coderd_dbpurge_records_purged_total{record_type="connection_logs"} 0
coderd_dbpurge_records_purged_total{record_type="expired_api_keys"} 0
coderd_dbpurge_records_purged_total{record_type="workspace_agent_logs"} 0
```

| Position | Pull-request |
| -------- | ------------ |
| ✅ | [feat: add prometheus observability metrics for
`dbpurge`](#21074) |
| | [feat: add rbac specificity for
`dbpurge`](#21088) |
Base automatically changed from jakehwll/dbpurge-observability-metrics-prometheus to main December 19, 2025 13:21
@jakehwll jakehwll merged commit ea00e72 into main Dec 19, 2025
30 checks passed
@jakehwll jakehwll deleted the jakehwll/dbpurge-rbac-specificity branch December 19, 2025 14:02
@github-actions github-actions bot locked and limited conversation to collaborators Dec 19, 2025
@dannykopping
Copy link
Contributor

@jakehwll the test you added works but it's overfitting.

By that I mean it's basically reimplementing dbpurge by calling all of its methods.

It does check that all the methods called have the appropriate RBAC permissions, but this became stale the moment it was written. When we change dbpurge behaviour (i.e. add new steps) those will either have to be reflected here or there will be a testing blindspot.

Instead, I'd suggest that you extract the doTick func into its own method which returns an error, call it here, and validate that no RBAC error occurred.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants