
Wacht


Distributed uptime monitoring, built in the EU.

Run HTTP, TCP, and DNS checks from multiple probe locations. Quorum-based alerting means you only get paged when a majority of probes agree something is actually down — no false alerts from a single flaky probe.

Status: Early development. Self-hosting works but expect rough edges.

Quickstart

Requirements: Docker, Docker Compose, Git.

git clone https://github.com/tmater/wacht.git
cd wacht

Edit config/server.yaml — provision each probe with its own secret and configure your checks:

probes:
  - id: probe-1
    secret: replace-with-a-strong-secret-1
  - id: probe-2
    secret: replace-with-a-strong-secret-2
  - id: probe-3
    secret: replace-with-a-strong-secret-3
seed_user:
  email: admin@wacht.local
  password: changeme
checks:
  - id: my-site
    type: http
    target: https://example.com
    webhook: https://hooks.example.com/your-webhook-url
  - id: my-db
    type: tcp
    target: db.example.com:5432

By default, the code blocks private and internal targets. The shipped self-host sample configs set allow_private_targets: true, because monitoring Docker, VPN, and RFC1918 services is a common self-hosted use case. For hosted or managed-probe deployments, keep that setting disabled in both the server config and the matching probe config.
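If you do need to monitor internal services, enable the flag explicitly. A minimal fragment (the flag name comes from the text above; remember to set it in each probe config as well as the server):

```yaml
# server.yaml — allow checks against RFC1918 / loopback / link-local targets.
# Only enable this on self-hosted deployments you fully control.
allow_private_targets: true
```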

Edit config/probe-1.yaml, config/probe-2.yaml, config/probe-3.yaml — each probe must use the matching secret provisioned in config/server.yaml:

secret: replace-with-a-strong-secret-1
server: http://server:8080
probe_id: probe-1
heartbeat_interval: 30s

Start everything:

docker compose up -d

The dashboard is available at http://<your-host>:3000.

First login: open the dashboard, sign in with the seed_user credentials (admin@wacht.local / changeme), and change the password immediately. The seed user is only created on first boot, when no users exist yet.

Check types

| Type | Target format | Example              | Notes                                                     |
|------|---------------|----------------------|-----------------------------------------------------------|
| http | URL           | https://example.com  | Checks for a 2xx response                                 |
| tcp  | host:port     | db.example.com:5432  | Checks that a TCP connection can be opened                |
| dns  | hostname      | example.com          | Checks that the hostname resolves to at least one address |

Private, loopback, and link-local targets are blocked unless allow_private_targets: true is enabled on both the server and the probe.

How alerting works

A webhook fires when a strict majority of probes each report a check as down for 2 consecutive failures. Recovery requires that probes reporting the check down no longer form a majority, with 2 consecutive healthy results from the probes that observed the recovery. The webhook fires once per transition (up → down and down → up), deduplicated via an incidents table.

Minimum recommended probe count is 3 — quorum works with 2 but leaves no room for a probe going offline.

Checks run every 30 seconds per probe.

/status marks a probe offline after 90 seconds without heartbeats by default. Override that with probe_offline_after in server.yaml if you want a shorter or longer UI timeout.
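For example, to extend the offline threshold (the 120s value here is illustrative, not a recommendation):

```yaml
# server.yaml — mark probes offline after 120s without a heartbeat
# (the default is 90s)
probe_offline_after: 120s
```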

Webhook payload:

{
  "check_id": "my-site",
  "target": "https://example.com",
  "status": "down",
  "probes_down": 2,
  "probes_total": 3
}

Recovery notifications use the same payload with status set to up.

Webhook URLs must be public HTTP(S) endpoints; loopback, private, and link-local destinations are rejected. Alert delivery is persisted in the database and retried with backoff in the background so result ingestion is not blocked by slow destinations. Delivery is timed out after 5 seconds. If an outage resolves before its down alert can be delivered, that stale opening notification is superseded and the recovery notification becomes the current delivery target instead. Delivery state is visible in incident history.

Status pages

GET /status returns the current state of all checks for the authenticated user. Requests must include a valid session token.

# Log in and capture the session token:
TOKEN=$(curl -s -X POST http://<your-host>:3000/api/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"email":"admin@wacht.local","password":"changeme"}' | jq -r .token)

# Fetch current status:
curl -H "Authorization: Bearer $TOKEN" http://<your-host>:3000/status

Each user also gets one anonymous read-only public page at /public/{slug}. The dashboard exposes that share URL via the Account page, and the backing JSON endpoint is GET /api/public/status/{slug}.

The public page intentionally exposes only check IDs and status state. It does not include raw targets, webhook URLs, probe details, or incident history.

Browser tests

Run the browser suite against a disposable packaged stack:

make browser

That boots the normal nginx + server + Postgres path with a dedicated seed config from config/server.browser.yaml, waits for http://127.0.0.1:13000, runs the Playwright specs in wacht-web/tests/, then tears the stack down.

Override the default browser stack settings if needed:

BROWSER_WEB_PORT=14000 BROWSER_PROJECT=my-wacht-browser make browser

License

AGPL-3.0
