Monitor types¶
Lanby supports two broad categories of monitoring: probes that actively test a service on a schedule, and keepalive heartbeats that expect your service to check in periodically.
flowchart LR
L(Lanby)
P(Probe monitor)
K(Keepalive monitor)
S(Your service)
L -->|checks on schedule| P
P -->|probes| S
S -->|pings in| K
K --> L
Monitor states¶
Every monitor is always in one of these states:
| State | Meaning |
|---|---|
| pending | Newly created, no results yet. No alerts fire. |
| up | Last check passed. |
| degraded | Check passed but response was slow (above slow_threshold_ms), or a TLS certificate is expiring soon. |
| down | Check failed: wrong status, timeout, connection refused, etc. |
| paused | Monitoring suspended. No checks run, no alerts fire. |
| unknown | Monitor exists but no recent data. Occurs after a long offline period. |
Degraded vs down: degraded means the service is reachable but something is worth flagging — high latency, an expiring certificate, or an unexpected but non-fatal response. down means the service failed the check outright. Both states trigger alerts; you can configure separate alert channels for each.
Retries and recovery¶
Retries: Before marking a monitor down, Lanby retries the failing probe up to the configured retry count. This prevents transient blips from firing spurious alerts. A probe must fail retries + 1 consecutive times to transition to down.
Recovery: By default, a single passing check recovers a monitor from down back to up. Set recovery_successes to require multiple consecutive passes before recovery — useful for flappy services.
Recovery interval: When a monitor is down or degraded, checks run every recovery_interval_seconds instead of the normal interval. Set this lower than the normal interval to detect recovery faster.
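The retry and recovery rules above can be sketched as a small state machine. This is only an illustration in Python, not Lanby's implementation: the MonitorState class and the record_result and next_interval helpers are hypothetical names, retries are modelled as consecutive failed results, and the degraded transition is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class MonitorState:
    """Tracks one monitor's status plus its consecutive pass/fail streaks."""
    status: str = "pending"
    consecutive_failures: int = 0
    consecutive_passes: int = 0


def record_result(state, passed, retries=0, recovery_successes=1):
    """Apply one probe result and return the updated state (mutated in place)."""
    if passed:
        state.consecutive_failures = 0
        state.consecutive_passes += 1
        if state.status == "down":
            # Leaving down requires recovery_successes passes in a row.
            if state.consecutive_passes >= recovery_successes:
                state.status = "up"
        else:
            state.status = "up"
    else:
        state.consecutive_passes = 0
        state.consecutive_failures += 1
        # The first failure plus every retry must fail before going down.
        if state.consecutive_failures >= retries + 1:
            state.status = "down"
    return state


def next_interval(state, interval_seconds=60, recovery_interval_seconds=None):
    """Pick the delay before the next check, following the recovery interval rule."""
    if state.status in ("down", "degraded") and recovery_interval_seconds is not None:
        return recovery_interval_seconds
    return interval_seconds
```

With retries set to 2, the first two failures leave the status unchanged and the third moves it to down. With recovery_successes set to 2, a single pass after that is not enough to return to up.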
Probe monitors¶
Probe monitors run on a configured schedule and actively test a target. If the target fails — wrong status code, unreachable port, timeout — Lanby marks it as down and fires an alert.
Probes run from the Lanby platform (for publicly reachable services) or from a relay agent for private network services.
HTTP / HTTPS¶
Sends an HTTP request to a URL and validates the response. The most common probe type.
Configuration¶
| Field | Default | Description |
|---|---|---|
| target | required | Full URL including scheme, e.g. https://mynas.local:8080/health |
| method | GET | HTTP method: GET, HEAD, or POST |
| interval_seconds | 60 | How often to run the probe |
| timeout_seconds | 10 | Request timeout. Probe fails if no response within this time. |
| retries | 0 | Number of additional attempts before marking down |
| recovery_successes | 1 | Consecutive passes needed to recover from down |
| recovery_interval_seconds | (same as interval) | Interval to use while the monitor is down/degraded |
| slow_threshold_ms | (disabled) | Mark as degraded if response takes longer than this |
| expected_status | (any 2xx) | Exact HTTP status code required for success |
| success_http_status_codes | (empty) | List of acceptable HTTP status codes. Overrides expected_status if set. |
| http_body_contains | (disabled) | Response body must contain this substring |
| follow_redirects | true | Whether to follow HTTP redirects |
| max_redirects | 5 | Maximum redirects to follow |
| headers | (empty) | Map of HTTP headers to include in the request |
| ignore_tls_errors | false | Skip TLS certificate verification. Use only for internal services with self-signed certs. |
| check_cert_expiry | false | Alert when the TLS certificate is close to expiry |
| cert_expiry_min_days | 14 | Days before expiry to start alerting (requires check_cert_expiry: true) |
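To make the interaction between these fields concrete, here is a rough Python sketch of how a single HTTP result might be classified. It is an assumption-laden illustration, not Lanby's code: evaluate_http and its arguments (status_code, body, elapsed_ms, cert_days_left) are hypothetical, the config is modelled as a plain dictionary keyed by the field names above, and the order in which the checks are applied is a guess.

```python
def evaluate_http(config, status_code, body, elapsed_ms, cert_days_left=None):
    """Classify one HTTP probe result as "up", "degraded", or "down"."""
    # success_http_status_codes, when set, overrides expected_status.
    allowed = config.get("success_http_status_codes")
    expected = config.get("expected_status")
    if allowed:
        status_ok = status_code in allowed
    elif expected is not None:
        status_ok = status_code == expected
    else:
        status_ok = 200 <= status_code <= 299  # default: any 2xx
    if not status_ok:
        return "down"

    # http_body_contains: the body must include the substring.
    needle = config.get("http_body_contains")
    if needle is not None and needle not in body:
        return "down"

    # slow_threshold_ms: correct but slow responses are degraded.
    slow = config.get("slow_threshold_ms")
    if slow is not None and elapsed_ms > slow:
        return "degraded"

    # Certificate expiry is only considered when check_cert_expiry is true.
    if config.get("check_cert_expiry") and cert_days_left is not None:
        if cert_days_left <= config.get("cert_expiry_min_days", 14):
            return "degraded"

    return "up"


slow_check = {"slow_threshold_ms": 2000}
print(evaluate_http(slow_check, 200, "ok", elapsed_ms=2500))  # degraded

cert_check = {"check_cert_expiry": True, "cert_expiry_min_days": 21}
print(evaluate_http(cert_check, 200, "ok", 120, cert_days_left=20))  # degraded
```

Status and body mismatches are treated as outright failures, while latency and certificate expiry only degrade an otherwise passing check, which matches the degraded and down definitions in the states table.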
Examples¶
Basic health check:
Authenticated API endpoint:
Target: https://myapp.local/api/status
Method: GET
Headers:
  Authorization: Bearer mysecrettoken
  X-Internal: true
Expected status: 200
Body keyword match — check the app is actually up, not just returning 200 from a load balancer:
Self-signed certificate (common for internal services):
Certificate expiry monitoring:
This marks the monitor as degraded 21 days before the cert expires, giving you time to renew before it goes down.
Slow response alerting:
Responses over 2 seconds mark the monitor degraded even if the status code is correct.
Specific status codes — useful for endpoints that return 204 or 401:
TCP port¶
Attempts to open a TCP connection to a host and port. Succeeds if the connection is accepted; fails if refused or timed out. No application-layer handshake — pure connectivity.
Configuration¶
| Field | Default | Description |
|---|---|---|
| target | required | host:port, e.g. 192.168.1.10:5432 or mynas.local:22 |
| interval_seconds | 60 | How often to probe |
| timeout_seconds | 10 | Connection timeout |
| retries | 0 | Retries before marking down |
| recovery_successes | 1 | Passes needed to recover |
| recovery_interval_seconds | (same as interval) | Faster interval while down |
Examples¶
# PostgreSQL on a private server
Target: 192.168.1.10:5432
# SSH availability
Target: mynas.local:22
# Minecraft server
Target: mc.home.arpa:25565
# Home Assistant
Target: homeassistant.local:8123
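For a sense of what a connectivity-only check amounts to, the following Python sketch opens a TCP connection with the standard library and reports whether it was accepted. The tcp_probe function is a hypothetical helper for illustration, not part of Lanby, and it assumes a hostname or IPv4 target in host:port form.

```python
import socket


def tcp_probe(target, timeout_seconds=10):
    """Return True if a TCP connection to "host:port" is accepted in time."""
    host, _, port = target.rpartition(":")
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, int(port)), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling tcp_probe("192.168.1.10:5432") would report only that something accepted the connection on that port. It says nothing about whether PostgreSQL itself is healthy, which is the trade-off of skipping the application-layer handshake.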
ICMP ping¶
Sends ICMP echo requests. The simplest reachability check — useful when no port is guaranteed to be open.
Warning
ICMP ping requires a relay. The Lanby platform runs in cloud environments that block raw ICMP. Additionally, the relay container needs NET_RAW capability — see relay docs.
Configuration¶
| Field | Default | Description |
|---|---|---|
| target | required | Hostname or IP address, e.g. 192.168.1.1 or router.local |
| timeout_seconds | 10 | Wait time for ICMP reply |
| interval_seconds | 60 | How often to ping |
| retries | 0 | Retries before marking down |
Examples¶
# Router/gateway reachability
Target: 192.168.1.1
# Network device with no open ports
Target: 192.168.1.200
# Another machine by hostname
Target: myserver.local
DNS¶
Resolves a DNS name and optionally validates the answer. Useful for detecting broken records, split-horizon mismatches, or unexpected changes.
Configuration¶
| Field | Default | Description |
|---|---|---|
| target | required | Used as dns_host if dns_host is not set |
| dns_host | (target value) | The hostname to resolve |
| dns_type | A | Record type: A, AAAA, CNAME, TXT, NS |
| dns_expect | (disabled) | Substring that must appear in at least one answer record |
| dns_nameserver | (system resolver) | Query a specific nameserver instead, e.g. 192.168.1.1 or 8.8.8.8 |
| interval_seconds | 60 | How often to query |
| timeout_seconds | 5 | Query timeout |
Examples¶
Check a domain resolves at all:
Verify a specific IP is returned (e.g. your internal DNS overrides an external record):
Fails if the answer doesn't contain 192.168.1.50, which is useful for catching split-horizon DNS breaking.
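The dns_expect rule (a substring that must appear in at least one answer record) can be sketched in a few lines of Python. The dns_answer_ok helper is hypothetical, and it assumes the answer records have already been fetched and rendered as strings.

```python
def dns_answer_ok(answers, dns_expect=None):
    """Return True if the query resolved and, when set, dns_expect matches."""
    if not answers:
        return False  # nothing resolved, so the probe fails
    if dns_expect is None:
        return True  # any answer counts when no expectation is set
    # Substring match against each answer record.
    return any(dns_expect in record for record in answers)
```

Because the match is a substring test, an expectation of 192.168.1.5 would also accept an answer of 192.168.1.50. Specifying the full address avoids that kind of accidental match.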
Check your Pi-hole or AdGuard Home is resolving correctly:
Queries your local resolver directly, bypassing the system default.
Monitor a Let's Encrypt TXT record or DKIM selector:
Check CNAME is pointing at the right place:
gRPC health¶
Calls the standard grpc.health.v1.Health/Check RPC and expects a SERVING response. Compatible with any service implementing the gRPC Health Checking Protocol.
Configuration¶
| Field | Default | Description |
|---|---|---|
| target | required | host:port, e.g. myservice.local:50051 |
| grpc_service | (empty: checks overall server health) | Sub-service name to check |
| grpc_tls | false | Use TLS. Set to true for production gRPC services. |
| interval_seconds | 60 | How often to check |
| timeout_seconds | 10 | RPC timeout |
| retries | 0 | Retries before marking down |
Examples¶
Overall server health:
Specific sub-service (e.g. a gRPC service with multiple handlers):
Keepalive monitors¶
Keepalive monitors flip the relationship: your service checks in with Lanby after each run. If check-ins stop arriving, Lanby alerts you.
Full keepalive documentation →
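As a rough illustration of the inverted relationship, a keepalive check boils down to comparing the time since the last check-in against an allowed window. The Python sketch below is not Lanby's logic: is_overdue, expected_every, and grace are hypothetical names chosen for this example, and the actual settings are described in the keepalive documentation.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_checkin, expected_every, grace, now=None):
    """Return True if no check-in arrived within expected_every + grace."""
    now = now or datetime.now(timezone.utc)
    return now - last_checkin > expected_every + grace
```

For an hourly job with a five-minute grace period, a check-in that is 64 minutes old would still count as on time, while one that is 66 minutes old would be treated as missed.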
Planned probe types¶
Browser / synthetic¶
Drives a real browser to load a page and optionally interact with it — clicking buttons, filling forms, asserting text. Catches JavaScript errors, broken logins, and issues that HTTP probes miss entirely.
Requires a relay. Powered by Playwright.
SMTP / mail server¶
Connects to an SMTP server and performs the initial handshake (EHLO). Confirms the mail server is listening and responding — useful for self-hosted mail setups like Mailcow, Maddy, or Postfix.
Ports 25, 465, 587. STARTTLS support planned.
UDP¶
Sends a UDP packet and optionally checks for a response. Useful for game servers, VPN endpoints (WireGuard, OpenVPN), and other connectionless services.
Requires a relay.
SNMP¶
Polls an SNMP OID and checks the returned value against a threshold or expected string. Monitor network switches, routers, NAS devices, and UPSes that speak SNMP but don't expose an HTTP API.
SNMPv1, v2c, v3. Requires a relay.
Push / webhook receiver¶
Receives an inbound webhook payload from an external service (Grafana alerts, GitHub Actions, Uptime Kuma) and converts it into a Lanby notification.
Info
Have a monitor type you need that isn't listed? Reach out — we build based on what self-hosters actually run.
About the name¶
A LANBY — Large Automatic Navigation BuoY — is a floating navigational aid designed to replace crewed lightships. It sits offshore, watches over shipping lanes, and runs without intervention.
That's what Lanby the product is: infrastructure that watches your services quietly and reliably, from the outside, so you don't have to.