Monitoring

How OpsKitty monitors your websites, handles failures, and scales across regions.

Check Intervals

Each monitored endpoint has a configurable check interval that determines how frequently OpsKitty verifies its status. The minimum allowed interval depends on your plan.

Plan	Minimum Interval	Check Mode	Regions
Free	30 minutes	Local	—
Launch	1 minute	Local	—
Growth	30 seconds	Multi-region	3 regions
Pro	20 seconds	Multi-region	11 regions
Scale	15 seconds	Multi-region	29 regions

A global minimum of 30 seconds is enforced regardless of configuration to prevent excessive requests to monitored targets.

Success Criteria

Each check evaluates conditions to determine if an endpoint is healthy.

Default Behavior

When no custom conditions are configured, OpsKitty applies the following defaults:

HTTP status code must be less than 400 (i.e., 2xx or 3xx responses are considered healthy)
The connection must be established successfully

Custom Conditions

You can configure custom conditions using placeholders:

Placeholder	Description	Example
[STATUS]	HTTP status code	[STATUS] == 200
[BODY]	Response body text	[BODY] contains "ok"
[RESPONSE_TIME]	Response time in ms	[RESPONSE_TIME] < 2000
[CONNECTED]	Connection established	[CONNECTED] == true
[CERTIFICATE_EXPIRATION]	Time until TLS certificate expiry, in ms (604800000 ms = 7 days)	[CERTIFICATE_EXPIRATION] > 604800000

All conditions use AND logic — every condition must pass for the check to be considered successful.
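
The AND semantics above can be illustrated with a minimal sketch. This is not OpsKitty's actual parser: the `CheckResult` fields, the `evaluate` and `is_healthy` functions, and the literal-parsing rules are hypothetical, and only the operators that appear in the table (`==`, `<`, `>`, `contains`) are handled.

```python
import operator
import re
from dataclasses import dataclass


@dataclass
class CheckResult:
    # Hypothetical container for one check's measurements.
    status: int
    body: str
    response_time_ms: float
    connected: bool
    certificate_expiration_ms: float


# Map each placeholder from the table to the field it reads.
PLACEHOLDERS = {
    "[STATUS]": lambda r: r.status,
    "[BODY]": lambda r: r.body,
    "[RESPONSE_TIME]": lambda r: r.response_time_ms,
    "[CONNECTED]": lambda r: r.connected,
    "[CERTIFICATE_EXPIRATION]": lambda r: r.certificate_expiration_ms,
}

# Only the operators shown in the examples are supported here.
OPERATORS = {
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "contains": lambda left, right: right in left,
}

CONDITION_RE = re.compile(r"^(\[[A-Z_]+\])\s+(==|<|>|contains)\s+(.+)$")


def parse_literal(text):
    """Turn the right-hand side into a bool, a string, or a number."""
    text = text.strip()
    if text in ("true", "false"):
        return text == "true"
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]
    return float(text)


def evaluate(condition, result):
    """Evaluate a single condition string against one check result."""
    match = CONDITION_RE.match(condition.strip())
    if match is None or match.group(1) not in PLACEHOLDERS:
        raise ValueError(f"cannot parse condition: {condition!r}")
    placeholder, op, literal = match.groups()
    left = PLACEHOLDERS[placeholder](result)
    return OPERATORS[op](left, parse_literal(literal))


def is_healthy(conditions, result):
    """AND logic: the check passes only if every condition passes."""
    return all(evaluate(c, result) for c in conditions)
```

With this sketch, a result with status 200 and a 150 ms response time passes `[STATUS] == 200` together with `[RESPONSE_TIME] < 2000`, but fails as soon as one extra condition such as `[RESPONSE_TIME] < 100` is added.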

Failure Handling & Backoff

When an endpoint fails several checks in a row, OpsKitty applies exponential backoff to reduce load on the target and avoid being blocked by firewalls or WAFs.

How It Works

First 3 failures: check continues at normal interval
After 3 failures: interval doubles with each additional failure
Maximum backoff: 15 minutes (900 seconds)
On recovery (successful check): interval resets to normal immediately

Backoff Progression

Example with a base interval of 60 seconds:

Consecutive Failures	Next Check In	Multiplier
1–3	60s	1x (normal)
4	120s	2x
5	240s	4x
6	480s	8x
7	900s	16x (960s), capped at 15 min
8+	900s	capped at 15 min

This prevents your monitoring from being flagged as abusive traffic while still continuing to check whether the endpoint recovers.
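
The progression in the table can be reproduced with a short sketch. The function name and parameters are hypothetical. The sketch also assumes that backoff never makes checks more frequent than the base interval, which the rules above do not state explicitly.

```python
def next_check_interval(base_interval_s, consecutive_failures,
                        threshold=3, cap_s=900):
    """Return the delay before the next check, in seconds.

    The first `threshold` failures keep the base interval; each further
    failure doubles it, up to `cap_s` (15 minutes).
    """
    if consecutive_failures <= threshold:
        return base_interval_s
    doubled = base_interval_s * 2 ** (consecutive_failures - threshold)
    # Assumption: a base interval longer than the cap is left unchanged.
    return max(base_interval_s, min(doubled, cap_s))


if __name__ == "__main__":
    for failures in range(1, 9):
        print(failures, next_check_interval(60, failures))
```

A successful check corresponds to calling the function with `consecutive_failures` set to 0, which returns the base interval again.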

Multi-Region Monitoring

For Growth, Pro, and Scale plans, OpsKitty checks your endpoints from multiple AWS regions to detect regional outages and provide global uptime visibility.

Region Rotation

For plans with many regions (e.g., Scale with 29 regions), OpsKitty uses region rotation instead of checking from all regions simultaneously:

Each check cycle uses a subset of 5 regions
Regions rotate deterministically so all are covered over multiple cycles
Full global coverage is achieved over ceil(total_regions / 5) cycles
This prevents the target from seeing simultaneous requests from 29 IPs

Coverage Example

For a Scale plan with 29 regions and 60s interval:

Cycle	Regions Checked	Cumulative Coverage
1 (0s)	5 regions	~17%
2 (60s)	5 regions	~34%
3 (120s)	5 regions	~52%
4 (180s)	5 regions	~69%
5 (240s)	5 regions	~86%
6 (300s)	4 regions	100%
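
One way to get the rotation shown in the table is to split the region list into fixed batches of 5 and step through them by cycle number. This is only a sketch of that idea: the function name, the zero-based cycle counter, and the placeholder region names are assumptions, and the source does not say how OpsKitty actually orders its regions.

```python
import math


def regions_for_cycle(regions, cycle, batch_size=5):
    """Return the regions to check in a given cycle (cycle 0 is the first).

    The same cycle number always yields the same batch, and the rotation
    wraps around after ceil(len(regions) / batch_size) cycles.
    """
    if not regions:
        return []
    batches = math.ceil(len(regions) / batch_size)
    start = (cycle % batches) * batch_size
    return regions[start:start + batch_size]


if __name__ == "__main__":
    # Placeholder names, not real region identifiers.
    regions = [f"region-{i:02d}" for i in range(29)]
    covered = set()
    for cycle in range(6):
        batch = regions_for_cycle(regions, cycle)
        covered.update(batch)
        print(cycle + 1, len(batch), f"{len(covered) / len(regions):.0%}")
```

With 29 regions the first five cycles use 5 regions each and the sixth uses the remaining 4, after which the rotation starts over.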

Status Aggregation

When results come back from multiple regions, OpsKitty uses an "any success" strategy:

If any region returns a successful check, the endpoint is marked UP
Only if all regions fail is the endpoint marked DOWN
This avoids false alarms from isolated regional issues
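
The "any success" rule reduces to a single `any()` call. In this sketch the function name and the dictionary shape (region name mapped to a pass/fail boolean) are hypothetical, and treating an empty result set as DOWN is an assumption the source does not cover.

```python
def aggregate_status(region_results):
    """Return "UP" if at least one region succeeded, otherwise "DOWN".

    `region_results` maps a region name to True (check passed) or
    False (check failed).
    """
    # Assumption: no results at all is treated as DOWN.
    return "UP" if any(region_results.values()) else "DOWN"


if __name__ == "__main__":
    print(aggregate_status({"region-a": False, "region-b": True}))
    print(aggregate_status({"region-a": False, "region-b": False}))
```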

Status Changes & Alerts

OpsKitty tracks status transitions and can trigger alerts when an endpoint goes down or recovers.

Status changes (UP → DOWN or DOWN → UP) are recorded as events
Each event includes a timestamp and duration of the previous state
Alerts fire on status change based on your configured alert rules
Backoff applies to monitoring frequency but not to alert delivery
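
The event model above can be sketched as a small tracker that records an event only when the status changes and stores how long the previous state lasted. The class and field names are hypothetical, and timestamps are plain seconds for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class StatusEvent:
    timestamp: float                  # when the transition was observed
    old_status: str
    new_status: str
    previous_state_duration_s: float  # how long the old status lasted


@dataclass
class StatusTracker:
    status: str
    since: float                      # when the current status began
    events: list = field(default_factory=list)

    def observe(self, new_status, now):
        """Record a check result; return an event only on a transition."""
        if new_status == self.status:
            return None
        event = StatusEvent(now, self.status, new_status, now - self.since)
        self.events.append(event)
        self.status = new_status
        self.since = now
        return event
```

A caller would pass each returned event to its alert rules. Repeated DOWN results while backoff is active return `None`, so they produce no additional events.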

Supported Protocols

OpsKitty supports monitoring endpoints using multiple protocols:

HTTP/HTTPS
TCP
UDP
DNS
ICMP (Ping)
TLS
STARTTLS
WebSocket
SSH
SCTP