Monitoring & on-call

On-call you draw, not configure

Hesklo checks your sites and servers around the clock, then runs the escalation policy you draw on a canvas. Wait before paging, branch on how long something's been down, restart a service, and send the all-clear when it recovers.

30s checks · HTTP · ping · TCP · SSH · page once per incident

[Flow diagram: Checkout API monitor (https, every 60s), DOWN for 6m. Down port: Wait 5m, then If / else down > 5m, then Slack #ops. Up port: Email all-clear.]
HTTP(S)
Status & body
Check a URL's response code and that the body contains what it should.
Ping
Host reachable
A plain reachability check for any host or IP you need to keep an eye on.
TCP
Port open
Confirm a database or service port is accepting connections.
SSH
Command exit code
Run a command on a server and treat a clean exit as up.
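As a rough illustration of what two of these checks involve, here is a minimal Python sketch of a TCP port check and an HTTP status-and-body check. It is not Hesklo's implementation; the function names, defaults and timeouts are assumptions made for the example.

```python
import socket
import urllib.error
import urllib.request


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the port as closed.
        return False


def http_ok(url: str, expected_status: int = 200,
            must_contain: str = "", timeout: float = 10.0) -> bool:
    """Return True if the URL answers with the expected status and body text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == expected_status and must_contain in body
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; here only the status code can match.
        return err.code == expected_status and not must_contain
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, TLS error, or timeout.
        return False
```

A call such as `tcp_port_open("db.internal", 5432)` (hypothetical host) would report whether a database port accepts connections.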

Features

Everything that decides whether you sleep through the night.

The hard part of on-call isn't noticing an outage. It's deciding who hears about it, when, and how often. Hesklo puts that decision on a canvas.

Your escalation policy, drawn on a canvas

Drag a monitor onto the board and wire it to waits, branches, schedule gates and notify steps. The whole policy reads at a glance instead of hiding in dropdowns and config files. What you see is exactly what runs.

drag & drop · if / else · wait · repeat · schedule gate

Branch on downtime

Route on whether something is down, up, just recovered, or down longer than you'll tolerate. Page only when it's worth it.

Restart it before you wake up

An HTTP action calls any endpoint mid-flow to restart a service or hit a runbook, so the flow can try a fix before it pages a human.

Recovery alerts, free

Every monitor has a recovery port. Hang a notify step off it and the all-clear sends itself when a service comes back.

SSL expiry watch

For HTTPS monitors, Hesklo tracks the certificate and flags it amber when fewer than three weeks remain and red when fewer than one. No more surprise expiries.
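The amber and red thresholds can be pictured with a small Python sketch. It only classifies an expiry date that has already been read from the certificate; the function name and the "ok" label are assumptions for illustration, not Hesklo's API.

```python
from datetime import datetime, timedelta, timezone


def ssl_status(not_after: datetime, now: datetime) -> str:
    """Classify a certificate by the time left before it expires."""
    remaining = not_after - now
    if remaining < timedelta(weeks=1):
        return "red"    # under one week left (or already expired)
    if remaining < timedelta(weeks=3):
        return "amber"  # under three weeks left
    return "ok"         # hypothetical label for everything else


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(ssl_status(now + timedelta(days=18), now))  # amber
```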

Response times tracked

Every check records latency. Watch the 24-hour trend per monitor to catch the slow creep before it turns into an outage.

Quiet hours and maintenance

Gate alerts to office hours with a schedule, and pause a monitor for planned work so a known outage never pages anyone.

Fewer false alarms

Require several failed checks in a row before a monitor counts as down, and set a flap cooldown so a service bouncing up and down only pages once.
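One way to picture the two settings together is the Python sketch below: a monitor only counts as down after a run of consecutive failures, and a new page is suppressed if the previous one went out within the cooldown window. The class name, defaults and return values are assumptions for illustration; Hesklo's actual behavior may differ in the details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DownDetector:
    failures_required: int = 3      # hypothetical default
    flap_cooldown_s: float = 600.0  # hypothetical default
    _streak: int = 0
    _down: bool = False
    _last_page_at: Optional[float] = None

    def observe(self, ok: bool, now: float) -> Optional[str]:
        """Feed one check result; return "page", "recovered", or None."""
        if ok:
            self._streak = 0
            if self._down:
                self._down = False
                return "recovered"
            return None
        self._streak += 1
        # Already down, or not enough failures in a row yet: stay quiet.
        if self._down or self._streak < self.failures_required:
            return None
        self._down = True
        in_cooldown = (
            self._last_page_at is not None
            and now - self._last_page_at < self.flap_cooldown_s
        )
        if in_cooldown:
            return None  # flapping: the earlier page still covers this
        self._last_page_at = now
        return "page"
```

In this sketch a recovery is still reported during the cooldown; whether to suppress that all-clear as well is a design choice the source does not specify.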

Anatomy of a flow

Four kinds of block. One readable picture.

A flow starts at the monitor and follows the wires you draw. The down port runs your escalation; the up port sends the all-clear.

[Flow diagram: Monitor (HTTP, every 60s) with DOWN and UP ports. Down port: Wait 5m, then If / else down > 5m, then Slack #ops-alerts. Up port: Email all-clear.]

Monitor

The check itself, with a down port and an up port. Everything downstream hangs off one of the two.

Logic — wait, branch, check, schedule

Shape the flow. Hold before paging, split on how long it's been down, re-probe mid-flow, or only let alerts through during set hours.

Notify & act

Send the alert or take an action. Each step points at a connection you set up once and fires at most once per incident.

Recovery path

Wire a notify step to the up port and the all-clear goes out by itself the moment the service is healthy again.

Modules

A small set of blocks. Any policy you need.

Logic blocks shape the flow, notify blocks send the alert. Configure a connection once, then point any notify step at it.

Logic: shape the flow

Seven blocks that decide what happens between a failed check and a page.

If / else branch
Route on down, up, recovered, or down longer than N minutes.
Condition check
Run a fresh probe mid-flow and split on pass or fail.
Wait
Hold for N minutes of continuous downtime before escalating.
Repeat alert
Keep paging every N minutes while down. Resets on recovery.
Schedule gate
Only let alerts through on set days and hours, in your timezone.
Log / annotate
Drop a note on the timeline to mark a point in the escalation.
HTTP action
Call an endpoint to restart, scale or trigger a runbook.
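To make the branch and repeat blocks concrete, here is a hedged Python sketch of the decisions they describe: routing on down, up, recovered, or down longer than N minutes, and deciding when a repeat alert is due. All names and signatures are assumptions for illustration, not Hesklo's internals.

```python
from enum import Enum
from typing import Optional


class Branch(Enum):
    UP = "up"
    RECOVERED = "recovered"
    DOWN = "down"
    DOWN_LONGER = "down_longer"


def route(is_up: bool, was_up: bool, down_since: Optional[float],
          now: float, threshold_minutes: float) -> Branch:
    """Pick the branch an If / else block would follow (times in seconds)."""
    if is_up:
        # Up after being down is a recovery; otherwise it is plain up.
        return Branch.UP if was_up else Branch.RECOVERED
    down_for = now - down_since if down_since is not None else 0.0
    if down_for > threshold_minutes * 60:
        return Branch.DOWN_LONGER
    return Branch.DOWN


def repeat_due(last_sent: Optional[float], now: float,
               every_minutes: float) -> bool:
    """True if a repeat alert should fire: never sent, or interval elapsed."""
    return last_sent is None or now - last_sent >= every_minutes * 60
```

Resetting `last_sent` to `None` on recovery would give the "resets on recovery" behavior described for the repeat block.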
Notify: send the alert

Set a connection up once, then point any number of notify steps at it.

Slack
Post to a channel with an incoming webhook.
Discord & Microsoft Teams
Webhook posts to your team's chat of choice.
Email & SMS
Mail any address, or text a number through Twilio.
PagerDuty
Triggers on down and resolves itself on recovery.
Jira ticket
Open an issue in your project with a filled-in summary.
Webhook
POST a JSON payload anywhere, with an optional bearer token.
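For a sense of what a receiver of the generic webhook would see, the Python sketch below builds a JSON POST with an optional bearer token using only the standard library. The payload fields and the URL are hypothetical; the source does not document the actual payload shape.

```python
import json
import urllib.request
from typing import Optional


def build_webhook_request(url: str, payload: dict,
                          bearer_token: Optional[str] = None
                          ) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a webhook."""
    headers = {"Content-Type": "application/json"}
    if bearer_token:
        headers["Authorization"] = f"Bearer {bearer_token}"
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers,
                                  method="POST")


# Hypothetical payload; field names are assumptions for this example.
req = build_webhook_request(
    "https://example.com/hook",
    {"monitor": "Checkout API", "status": "down"},
    bearer_token="s3cret",
)
```

Sending it would be a call to `urllib.request.urlopen(req, timeout=10)`, ideally wrapped in error handling so a failing receiver does not stall the flow.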

How it works

From nothing to paged the right way in minutes.

01 Add a monitor

Point it at a URL, a host, a host and port, or an SSH command. Set how often to check and how many failures in a row count as down.

02 Draw the escalation

Off the down port, add a wait, a branch and a schedule gate, then the notify steps. Off the up port, send the all-clear.

03 Let it run

Hesklo checks on schedule and follows the flow. Watch live status and 30-day uptime on the overview, and pause for planned work.

See what happened

A public status page and a full history, out of the box.

Every check is recorded. Share uptime with the world on a status page, and keep the incident trail for yourself.

Public status page

Flip a monitor on for your status page and it shows up at your own subdomain, so customers can check for themselves instead of emailing you.

acme.status.hesklo.com · All systems operational
Marketing site 100%
Checkout API 99.4%
Database 100%

Per-monitor history

Open any monitor for 30-day uptime, a 24-hour response-time trend, SSL status and every incident with its cause and duration.

Uptime 30d: 99.82% · Outages: 2 · SSL left: 18d
[Chart: Response time · 24h]

Why Hesklo

Less noise, less to learn.

The policy is the picture

A flow on a canvas is faster to read and hand over than alerting rules scattered across config files and dashboards.

Page once, not every tick

Each step fires once per incident and resets on recovery, so a long outage doesn't bury you in repeat alerts.

Agentless checks

Nothing to install on the things you watch. Hesklo probes over HTTP, ping, TCP and SSH from the outside.

Your data, scoped to you

Monitors, connections and history belong to your account, and integration secrets are encrypted at rest.

Draw your first escalation.

Add a monitor, wire up the flow, and let Hesklo handle the 3 a.m. part.

Open the dashboard