
Per-DVR Health

Each DVR registered in ChannelWatch has its own health endpoint that returns a JSON snapshot of its current state. This is useful for scripting, external monitoring, and diagnosing why a specific DVR is showing as unhealthy.

GET /api/v1/dvrs/<id>/health

Replace <id> with the DVR’s 8-character stable ID (visible in the web UI under Settings > DVRs, or from GET /api/v1/dvrs).

{
  "dvr_id": "a1b2c3d4",
  "dvr_name": "Living Room",
  "connected": true,
  "last_event_at": "2026-04-19T14:32:01Z",
  "last_event_seconds_ago": 47,
  "stale": false,
  "staleness_threshold_seconds": 300,
  "session_state_size": 12,
  "disk_status": "ok",
  "recent_alert_rate": 0.8
}

| Field | Type | Description |
| --- | --- | --- |
| dvr_id | string | Stable 8-character DVR identifier |
| dvr_name | string | User-editable display name |
| connected | boolean | Whether the DVR’s asyncio task is alive |
| last_event_at | ISO 8601 | Timestamp of the most recent SSE event or poll |
| last_event_seconds_ago | number | Seconds since last_event_at |
| stale | boolean | true if last_event_seconds_ago exceeds staleness_threshold_seconds |
| staleness_threshold_seconds | number | Configured threshold (default 300) |
| session_state_size | number | Number of active sessions tracked for this DVR |
| disk_status | string | "ok", "warning", or "critical" based on disk thresholds |
| recent_alert_rate | number | Alerts per minute over the last 5 minutes |

A 503 response means the DVR is either disconnected or stale. The body contains a short error description.
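
For scripting, a check against this endpoint might look like the following Python sketch. It uses only the standard library. The helper names (summarize_health, fetch_health) are hypothetical, and two things are assumptions rather than documented behavior: that a healthy DVR answers with HTTP 200, and that the 503 body is treated as opaque text, since the text above only says it is a short error description.

```python
import json
import urllib.error
import urllib.request


def summarize_health(status: int, body: str) -> str:
    """Turn an HTTP status and response body into a one-line summary."""
    if status == 503:
        # 503 means disconnected or stale; the body is a short error
        # description whose exact format is not assumed here.
        return f"UNHEALTHY: {body.strip()}"
    if status != 200:
        # Assumption: a healthy DVR answers with 200 and the JSON snapshot.
        return f"UNKNOWN: unexpected HTTP status {status}"
    data = json.loads(body)
    return (
        f"OK: {data['dvr_name']} last event "
        f"{data['last_event_seconds_ago']}s ago "
        f"(threshold {data['staleness_threshold_seconds']}s)"
    )


def fetch_health(base_url: str, dvr_id: str, timeout: float = 5.0) -> str:
    """Call GET /api/v1/dvrs/<id>/health and summarize the result."""
    url = f"{base_url}/api/v1/dvrs/{dvr_id}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return summarize_health(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the body is still readable.
        return summarize_health(err.code, err.read().decode("utf-8", "replace"))
```

A call such as fetch_health("http://localhost:8080", "a1b2c3d4") would then print a single status line; the host and port here are placeholders for your own ChannelWatch address.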

ChannelWatch runs a background watchdog coroutine that checks every enabled DVR once every 30 seconds. For each one it verifies:

  1. The DVR’s asyncio task is alive (not crashed or cancelled).
  2. The DVR’s last_event_at is within staleness_threshold_seconds of now.

If either check fails, the watchdog marks the DVR unhealthy. This state is reflected in:

  • The /api/v1/dvrs/<id>/health response (connected: false or stale: true)
  • The /healthz/ready probe (returns 503 until all DVRs are healthy)
  • The Prometheus gauge channelwatch_dvr_last_event_seconds_ago
  • The UI staleness banner (described below)
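
The two watchdog checks and the flags they produce can be sketched as a small pure function. This is an illustration of the logic described above, not ChannelWatch's actual implementation; the function name evaluate_dvr and the extra "healthy" key are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def evaluate_dvr(
    task_alive: bool,
    last_event_at: datetime,
    now: datetime,
    staleness_threshold_seconds: int = 300,
) -> dict:
    """Apply the two watchdog checks and return the resulting health flags."""
    seconds_ago = (now - last_event_at).total_seconds()
    # Stale only when the age exceeds the threshold, matching the
    # description of the "stale" field.
    stale = seconds_ago > staleness_threshold_seconds
    return {
        "connected": task_alive,                # check 1: task is alive
        "stale": stale,                         # check 2: recent event seen
        "healthy": task_alive and not stale,    # hypothetical combined flag
    }


# Example using the timestamp from the sample response above.
last_event = datetime(2026, 4, 19, 14, 32, 1, tzinfo=timezone.utc)
flags = evaluate_dvr(True, last_event, last_event + timedelta(seconds=312))
```

With a 312-second gap and the default 300-second threshold, the DVR is flagged stale even though its task is still alive.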

The watchdog cannot be disabled. It is a core safety feature that prevents the “green UI, dead monitoring” failure mode where the dashboard looks fine but events stopped flowing hours ago.

After a hot reload restarts a DVR task, the watchdog verifies both that the new task is alive and that last_event_at updates within the first 60 seconds. If no event arrives within that window, the watchdog sends a notification through your configured channels so you know the reload did not fully succeed.
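
The post-reload check can be illustrated with a short sketch. The function name reload_verified is hypothetical, and treating the 60-second boundary as inclusive is an assumption; the text above does not specify it.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def reload_verified(
    task_alive: bool,
    restarted_at: datetime,
    last_event_at: Optional[datetime],
    window_seconds: int = 60,
) -> bool:
    """Return True if the restarted task is alive and produced an event in time."""
    if not task_alive or last_event_at is None:
        return False
    # Only an event that arrived after the restart, and inside the window,
    # counts as evidence that the new task is working.
    delta = (last_event_at - restarted_at).total_seconds()
    return 0 <= delta <= window_seconds
```

An event timestamp from before the restart does not count, which is what separates this check from the ordinary staleness test.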

The default staleness threshold is 300 seconds (5 minutes). You can adjust it in Settings > DVRs > Advanced for each DVR individually.

Set it higher if your Channels DVR server has long idle periods with no activity (e.g. overnight when no one is watching). Set it lower if you want faster detection of a stalled connection.

When the watchdog detects a stale DVR, a red banner appears at the top of the ChannelWatch web UI:

DVR ‘Living Room’ has not received events for 312 seconds. Monitoring may be degraded.

The banner includes a Diagnose button that runs channelwatch doctor diagnose and displays the output inline. This gives you a quick path from “something looks wrong” to actionable diagnostic information without leaving the browser.

The banner clears automatically once the DVR resumes sending events and the watchdog confirms it is within the staleness threshold.

To check all DVRs at once, use the DVR list endpoint and inspect the health field:

GET /api/v1/dvrs

Each DVR object in the response includes a health summary. For the full detail on a specific DVR, follow up with the per-DVR health endpoint.
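
As a rough sketch of that two-step workflow, the function below takes the parsed DVR list and returns the per-DVR health paths worth following up. The shape of the list response is not documented here, so the keys used ("id", and a "health" object carrying "connected" and "stale") are assumptions, as is the function name.

```python
def health_paths_to_check(dvrs: list[dict]) -> list[str]:
    """Return per-DVR health paths for every DVR that does not look healthy."""
    paths = []
    for dvr in dvrs:
        dvr_id = dvr.get("id")  # assumed key name for the 8-character ID
        if dvr_id is None:
            continue
        health = dvr.get("health", {})  # assumed shape of the health summary
        # A missing summary or missing flag is treated as unhealthy so it gets a closer look.
        connected = health.get("connected", False)
        stale = health.get("stale", True)
        if not connected or stale:
            paths.append(f"/api/v1/dvrs/{dvr_id}/health")
    return paths
```

Treating a missing health summary as unhealthy is a deliberate choice in this sketch: it errs toward checking a DVR in detail rather than silently skipping it.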