Monitoring

How OpsKitty monitors your websites, handles failures, and scales across regions.

Check Intervals

Each monitored endpoint has a configurable check interval that determines how frequently OpsKitty verifies its status. The minimum allowed interval depends on your plan.

Plan   | Minimum Interval | Check Mode   | Regions
Free   | 30 minutes       | Local        | -
Launch | 1 minute         | Local        | -
Growth | 30 seconds       | Multi-region | 3 regions
Pro    | 20 seconds       | Multi-region | 11 regions
Scale  | 15 seconds       | Multi-region | 29 regions

A global minimum of 30 seconds is enforced regardless of configuration to prevent excessive requests to monitored targets.

Success Criteria

Each check evaluates conditions to determine if an endpoint is healthy.

Default Behavior

When no custom conditions are configured, OpsKitty applies the following defaults:

  • HTTP status code must be less than 400 (i.e., 2xx or 3xx responses are considered healthy)
  • The connection must be established successfully

Custom Conditions

You can configure custom conditions using placeholders:

Placeholder              | Description            | Example
[STATUS]                 | HTTP status code       | [STATUS] == 200
[BODY]                   | Response body text     | [BODY] contains "ok"
[RESPONSE_TIME]          | Response time in ms    | [RESPONSE_TIME] < 2000
[CONNECTED]              | Connection established | [CONNECTED] == true
[CERTIFICATE_EXPIRATION] | TLS cert expiry (ms)   | [CERTIFICATE_EXPIRATION] > 604800000

Conditions are combined with AND logic: every condition must pass for the check to be considered successful.
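
To make the AND semantics concrete, here is a minimal Python sketch of how placeholder conditions could be evaluated against a check result. It is not OpsKitty's actual parser: the names OPS, CONDITION_RE, parse_literal, evaluate, is_healthy, and DEFAULT_CONDITIONS are hypothetical, the operators beyond those shown in the examples above (==, <, >, contains) are assumptions, and DEFAULT_CONDITIONS is simply the default behavior restated in placeholder syntax.

```python
import operator
import re

# Hypothetical operator table. Only ==, <, > and "contains" appear in the
# documented examples; the remaining operators are assumptions.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
    "contains": lambda left, right: str(right) in str(left),
}

# Matches "[PLACEHOLDER] <operator> <literal>".
CONDITION_RE = re.compile(r"^\[(\w+)\]\s*(==|!=|<=|>=|<|>|contains)\s*(.+)$")

# The documented defaults, restated as conditions (illustrative only).
DEFAULT_CONDITIONS = ["[STATUS] < 400", "[CONNECTED] == true"]


def parse_literal(text):
    """Turn the right-hand side of a condition into a Python value."""
    text = text.strip()
    if text in ("true", "false"):
        return text == "true"
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]
    return int(text)


def evaluate(condition, result):
    """Evaluate one condition string against a dict of placeholder values."""
    match = CONDITION_RE.match(condition.strip())
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, op, literal = match.groups()
    return OPS[op](result[name], parse_literal(literal))


def is_healthy(conditions, result):
    """AND logic: the check passes only if every condition passes."""
    return all(evaluate(condition, result) for condition in conditions)
```

With a result such as {"STATUS": 200, "BODY": "ok", "RESPONSE_TIME": 340, "CONNECTED": True}, a single failing condition (for example [RESPONSE_TIME] < 100) is enough to make is_healthy return False. The example threshold 604800000 for [CERTIFICATE_EXPIRATION] is 7 days expressed in milliseconds (7 × 24 × 60 × 60 × 1000).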

Failure Handling & Backoff

When an endpoint fails consecutively, OpsKitty applies exponential backoff to reduce load on the target and avoid being blocked by firewalls or WAFs.

How It Works

  • First 3 failures: check continues at normal interval
  • After 3 failures: interval doubles with each additional failure
  • Maximum backoff: 15 minutes (900 seconds)
  • On recovery (successful check): interval resets to normal immediately

Backoff Progression

Example with a base interval of 60 seconds:

Consecutive Failures | Next Check In | Multiplier
1–3                  | 60s           | 1x (normal)
4                    | 120s          | 2x
5                    | 240s          | 4x
6                    | 480s          | 8x
7                    | 900s          | 15 min cap
8+                   | 900s          | 15 min cap

This prevents your monitoring from being flagged as abusive traffic while still continuing to check whether the endpoint recovers.
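
The backoff rules can be sketched as a small Python function. This is an illustration of the documented behavior, not OpsKitty's code: the function and constant names are hypothetical, and the handling of base intervals that already exceed the 15-minute cap is an assumption, since the source does not say what happens in that case.

```python
MAX_BACKOFF_SECONDS = 900     # 15-minute cap
FAILURES_BEFORE_BACKOFF = 3   # the first 3 failures keep the normal interval


def next_check_interval(base_interval, consecutive_failures):
    """Return the number of seconds until the next check."""
    if consecutive_failures <= FAILURES_BEFORE_BACKOFF:
        # Covers recovery too: a successful check resets the count to 0.
        return base_interval
    # The interval doubles with each failure beyond the third.
    multiplier = 2 ** (consecutive_failures - FAILURES_BEFORE_BACKOFF)
    backed_off = min(base_interval * multiplier, MAX_BACKOFF_SECONDS)
    # Assumption: backoff never shortens an interval that is already longer
    # than the cap (for example, a 30-minute base interval).
    return max(base_interval, backed_off)
```

For a 60-second base interval, failures 1 through 8 give 60, 60, 60, 120, 240, 480, 900, and 900 seconds. The seventh failure would compute 960 seconds, which the cap reduces to 900.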

Multi-Region Monitoring

For Growth, Pro, and Scale plans, OpsKitty checks your endpoints from multiple AWS regions to detect regional outages and provide global uptime visibility.

Region Rotation

For plans with many regions (e.g., Scale with 29 regions), OpsKitty uses region rotation instead of hitting all regions simultaneously:

  • Each check cycle uses a subset of 5 regions
  • Regions rotate deterministically so all are covered over multiple cycles
  • Full global coverage is achieved over ceil(total_regions / 5) cycles
  • This prevents the target from seeing simultaneous requests from 29 IPs

Coverage Example

For a Scale plan with 29 regions and 60s interval:

Cycle    | Regions Checked | Cumulative Coverage
1 (0s)   | 5 regions       | ~17%
2 (60s)  | 5 regions       | ~34%
3 (120s) | 5 regions       | ~52%
4 (180s) | 5 regions       | ~69%
5 (240s) | 5 regions       | ~86%
6 (300s) | 4 regions       | 100%
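
One way to get this coverage pattern is to split the region list into fixed chunks of 5 and step through them in order. The Python sketch below assumes exactly that. The source only says that rotation is deterministic, so the chunk order, the function names, and the 0-based cycle index (the table counts cycles from 1) are all assumptions.

```python
import math

REGIONS_PER_CYCLE = 5


def regions_for_cycle(regions, cycle):
    """Return the regions to use for a 0-based check cycle."""
    if not regions:
        return []
    # ceil(total_regions / 5) cycles are needed to cover every region once.
    cycles_for_full_coverage = math.ceil(len(regions) / REGIONS_PER_CYCLE)
    start = (cycle % cycles_for_full_coverage) * REGIONS_PER_CYCLE
    return regions[start:start + REGIONS_PER_CYCLE]


def cumulative_coverage(regions, cycles):
    """Fraction of regions seen after the given number of cycles."""
    seen = set()
    for cycle in range(cycles):
        seen.update(regions_for_cycle(regions, cycle))
    return len(seen) / len(regions)
```

With 29 regions, the sixth cycle (index 5) receives the remaining 4 regions, and the seventh cycle wraps around to the first chunk again.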

Status Aggregation

When results come back from multiple regions, OpsKitty uses an "any success" strategy:

  • If any region returns a successful check, the endpoint is marked UP
  • Only if all regions fail is the endpoint marked DOWN
  • This avoids false alarms from isolated regional issues
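
The "any success" rule reduces to a single expression. In this Python sketch, aggregate_status and the shape of region_results are hypothetical, and treating an empty result set as DOWN is an assumption the source does not address.

```python
def aggregate_status(region_results):
    """Combine per-region results into one endpoint status.

    region_results maps a region name to True (check passed) or False.
    """
    # UP if at least one region succeeded; DOWN only if every region failed.
    # Assumption: with no results at all, any() is False, so this reports DOWN.
    return "UP" if any(region_results.values()) else "DOWN"
```

For example, a failure in one region alongside a success in another still yields UP, which is how isolated regional issues avoid triggering an alert.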

Status Changes & Alerts

OpsKitty tracks status transitions and can trigger alerts when an endpoint goes down or recovers.

  • Status changes (UP → DOWN or DOWN → UP) are recorded as events
  • Each event includes a timestamp and duration of the previous state
  • Alerts fire on status change based on your configured alert rules
  • Backoff applies to monitoring frequency but not to alert delivery
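
As a rough illustration of the event model, the Python sketch below records an event only when the status changes and stores how long the previous state lasted. StatusEvent, StatusTracker, and their fields are hypothetical names, not OpsKitty's schema, and timestamps are assumed to be epoch seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StatusEvent:
    timestamp: float          # when the transition was observed
    previous: str             # state before the change, e.g. "UP"
    current: str              # state after the change, e.g. "DOWN"
    previous_duration: float  # seconds spent in the previous state


class StatusTracker:
    def __init__(self, initial_status: str, started_at: float) -> None:
        self.status = initial_status
        self.since = started_at
        self.events: List[StatusEvent] = []

    def observe(self, status: str, now: float) -> Optional[StatusEvent]:
        """Record a check result; return an event only on a status change."""
        if status == self.status:
            return None
        event = StatusEvent(now, self.status, status, now - self.since)
        self.events.append(event)
        self.status, self.since = status, now
        return event
```

A caller would evaluate its alert rules whenever observe returns an event. Because events depend only on transitions, a longer backoff interval delays when a change is detected but does not suppress the alert itself.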

Supported Protocols

OpsKitty supports monitoring endpoints using multiple protocols:

  • HTTP/HTTPS
  • TCP
  • UDP
  • DNS
  • ICMP (Ping)
  • TLS
  • STARTTLS
  • WebSocket
  • SSH
  • SCTP