HADES documentation

HADES, the Host-Aware Deployment & Execution System, turns a laptop you own into a small, honest server: declare an app, get a container, a route, and a public link. Host-aware means it treats a laptop's real constraints (sleep, battery, scarce memory, one contended machine) as signals it plans around. It is built for side projects, internal tools, and demos. It is not a datacenter replacement for anything that can't tolerate a machine that occasionally sleeps.

01 Install

curl -fsSL https://tryhades.com/install.sh | sh

The script is idempotent; run it again any time. It will:

  • verify Docker (installs Docker Desktop via Homebrew if missing, starts the engine)
  • install cloudflared for public links (optional; apps degrade to local URLs without it)
  • install Rust via rustup if needed, then build the hades and hadesd binaries
  • place binaries in ~/.hades/bin and add it to your PATH
  • run hades host init: write config, generate an ntfy.sh notification topic for your phone, install the daemon under launchd (starts at login, restarts on crash), and run the doctor
  • offer to join an existing fleet (paste a connect line), interactively or via HADES_HUB=… HADES_TOKEN=… sh for unattended rollouts

Re-running the script is a real update: it refetches source and rebuilds. Only deploying from this machine, not hosting on it? curl -fsSL https://tryhades.com/cli.sh | sh installs the CLI alone.

The doctor gates everything. The daemon refuses deploys while any hard check fails (Docker down, disk low, no egress, port conflicts). hades host doctor prints each failure with its exact remedy.

Manual install

# from a checkout
cargo build --release --workspace
export PATH="$PWD/target/release:$PATH"
hades host init

02 Log in

$ hades login
⚖ you have entered the underworld

  host      127.0.0.1:8786 · hadesd v0.2.0
  doctor    GREEN · ready for deploys
  capacity  6348MB allocatable · 0MB claimed · 0 apps

hades login connects the CLI to a host, verifies the daemon is healthy and the doctor is green, records the session in ~/.hades/session.json, and reports what the machine can carry. Exit code 0 means ready; 2 means the doctor is red; 5 means no daemon; 7 means bad token.

From another machine

The daemon binds loopback, but it publishes its own API through an authenticated control tunnel. On the host, run:

$ hades host connect-info

  on the other machine, run:

    hades login --host https://….trycloudflare.com --token 21c5fc74…

Paste that one line on any machine with the hades CLI and every command (deploy, logs, events, host status) operates the remote host through the tunnel. Every API call requires the bearer token (the API answers 401 without it); the control-tunnel URL changes when the host restarts, the token does not. hades logout returns the CLI to the local host.

Treat connect-info output like a password. Anyone holding the token can deploy containers to your machine. Build contexts and specs travel over HTTPS through the tunnel, so nothing assumes a shared filesystem.

03 First deploy

$ cd my-app
$ hades init                    # scaffolds Hades.toml
$ hades deploy
my-app deployed
  local:  http://my-app.localhost:8787
  public: https://lazy-otter-4242.trycloudflare.com

The CLI packs your build context into a tarball, ships it to the daemon, and the daemon builds the image, starts replicas with hard resource limits, health-gates them, swaps routes, and provisions a tunnel. Deploys are idempotent upserts: deploying an existing name replaces it with zero downtime (new replicas come up and pass health checks before old ones stop). Retries are always safe.
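
The health-gated swap can be sketched in a few lines. This is a toy model, not the daemon's code: RouteTable, upsert, is_healthy and stop are hypothetical names, and discarding the new replicas when the gate fails is an assumption about behavior the text above does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    """Hypothetical in-memory route table: app name -> live replica ids."""
    backends: dict = field(default_factory=dict)


def upsert(routes, app, new_replicas, is_healthy, stop):
    """Replace an app's replicas without a gap in service.

    Old replicas keep serving until every new replica passes its health
    check; only then are routes swapped and the old replicas stopped.
    """
    old = routes.backends.get(app, [])
    if not all(is_healthy(r) for r in new_replicas):
        # Assumed behavior: a failed gate discards the new replicas
        # and leaves the existing route untouched.
        for replica in new_replicas:
            stop(replica)
        return False
    routes.backends[app] = list(new_replicas)  # swap routes first
    for replica in old:
        stop(replica)                          # old replicas stop last
    return True
```

Because the old route is only replaced after the gate passes, running the same deploy twice ends in the same state, which is what makes retries safe in this model.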

How the link gets public: the exposure model. Your machine never opens a port and never needs a static IP or a configurable router. The host runs cloudflared, which dials outbound to Cloudflare's edge and holds the connection open; inbound requests ride back down that tunnel. So it works behind any NAT or home/coffee-shop wifi, and the only thing reachable from the internet is the single app you deployed. Nothing else on your machine is exposed. The trade is that you're trusting Cloudflare's edge to terminate TLS and relay traffic.

Quick-tunnel URLs are cattle. By default apps get a fresh *.trycloudflare.com URL that changes whenever the tunnel restarts. hades url <name> always tells the current truth, and url_changed_at (in --json) lets agents detect staleness. For a stable URL that never rotates, see Custom domains.
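
An agent that caches a link might detect rotation roughly like this. The sketch assumes the JSON printed by hades url <name> --json has top-level url and url_changed_at keys; the exact shape is not documented here, so treat both key paths and the link_is_stale helper as hypothetical.

```python
import json


def link_is_stale(cached_changed_at, url_json):
    """Compare a cached url_changed_at with fresh `hades url --json` output.

    Returns (stale, current_url). The timestamp is treated as an opaque
    value, so any change counts as a rotation.
    """
    current = json.loads(url_json)
    stale = current["url_changed_at"] != cached_changed_at
    return stale, current["url"]
```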

04 Hades.toml

[app]
name = "my-app"            # lowercase [a-z0-9-]
ports = [8000]             # first port receives proxied traffic
replicas = 2
priority = "normal"        # critical | normal | low
max_concurrent_requests = 64

[app.build]                # or:  image = "nginx:alpine"
dockerfile = "Dockerfile"
context = "."

[app.resources]
cpu = 0.5                  # cores, fractional ok
memory = "256mb"           # MANDATORY: admission control needs it

[app.health_check]
path = "/"

[app.power]
on_battery = "run"         # or "pause"

[app.env]
PORT = "8000"

field                    type · default         meaning
name                     string, required       app identity; becomes <name>.localhost and the container labels
image                    string                 registry image. Exactly one of image / [build]
build.dockerfile         string · "Dockerfile"  path within the context
build.context            string · "."           directory tarred and shipped to the daemon (.git, target, node_modules excluded)
ports                    [u16], required        exposed container ports; the first is routed
resources.cpu            float · 1.0            CPU quota in cores (enforced via Docker)
resources.memory         size, required         hard limit, e.g. "256mb", "2gb". The unit of admission control
replicas                 int · 1                identical containers, round-robined by the proxy
priority                 enum · normal          shedding order under pressure: low pauses first, critical never
power.on_battery         enum · run             pause = docker-pause when the host unplugs, resume on AC
max_concurrent_requests  int · ∞                proxy in-flight cap; excess sheds with 503 + Retry-After
health_check.path        string                 HTTP path probed before a replica receives traffic
env                      table                  environment variables
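
The required-field rules in the table can be checked before a deploy ever reaches the daemon. The sketch below validates an already-parsed [app] table (for example the "app" key of what a TOML parser returns). validate and parse_memory_mb are hypothetical helpers, the accepted size units are limited to the two shown in the table, and 1 gb = 1024 mb is an assumption.

```python
import re

NAME_RE = re.compile(r"^[a-z0-9-]+$")
SIZE_RE = re.compile(r"^(\d+)(mb|gb)$")


def parse_memory_mb(size):
    """Convert "256mb" / "2gb" to megabytes (assumes 1 gb = 1024 mb)."""
    match = SIZE_RE.match(size.lower())
    if match is None:
        raise ValueError("bad memory size: %r" % size)
    value, unit = int(match.group(1)), match.group(2)
    return value * 1024 if unit == "gb" else value


def validate(app):
    """Return a list of problems for a parsed [app] table (empty = valid)."""
    problems = []
    if not NAME_RE.match(app.get("name", "")):
        problems.append("name must match [a-z0-9-]")
    if not app.get("ports"):
        problems.append("ports is required")
    # Exactly one of image / build: both present or both absent is an error.
    if ("image" in app) == ("build" in app):
        problems.append("exactly one of image / build is required")
    memory = app.get("resources", {}).get("memory")
    if memory is None:
        problems.append("resources.memory is required")
    else:
        try:
            parse_memory_mb(memory)
        except ValueError as err:
            problems.append(str(err))
    return problems
```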

05 Secrets

$ hades secrets set STRIPE_KEY=sk_live_… --app shop
shop · 1 secret on local (keychain-encrypted) · running replicas restarted
  STRIPE_KEY

$ hades secrets list --app shop      # key names only
$ hades secrets unset STRIPE_KEY --app shop

Secrets live on the host that runs the app and are injected as environment variables at container start, merged over [app.env]. They are never in the manifest, never in git, never in the build context, and never come back out of the API. List returns key names only, and even manifest env values are redacted in API responses. Setting or unsetting restarts running replicas in place (new containers come up and pass health checks before old ones stop), so the change is live immediately.

At rest the store is encrypted with XChaCha20-Poly1305 using a per-host master key kept in the macOS Keychain (service hades-master-key). The key is unlocked with your login session, and no external key service is involved. If the Keychain is unavailable, the store falls back to 0600 file permissions and says so plainly. Secrets are fleet-aware: set them through the hub and they are stored on whichever device the app is placed on. Destroying an app deletes its secrets.

06 Fleet

One machine is a host. Several are a fleet. The host you log into is the hub; other devices join it, and from then on a plain hades deploy is placed on whichever device has the most free declared memory. Replicas round-robin within each device.
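
As a rough model of the placement rule, the sketch below picks the device with the most free declared memory and lets a pin override it. place is a hypothetical helper; the real placement may also weigh health or other signals not described here.

```python
def place(free_mb_by_device, pinned=None):
    """Pick a device for a deploy.

    free_mb_by_device maps device name -> free declared memory in MB.
    A pinned device (as with --device) always wins; otherwise the
    device with the most free declared memory is chosen.
    """
    if pinned is not None:
        return pinned
    return max(free_mb_by_device, key=free_mb_by_device.get)
```

With studio at 11468MB free and macbook at 4121MB free, an unpinned deploy lands on studio in this model.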

Adding a device

# on the hub: prints the join line
$ hades fleet add

# on the new machine: the installer offers this interactively,
# or run it yourself after install:
$ hades host join --hub https://….trycloudflare.com --token 21c5…

Joining requires the hub's bearer token. That is what marks the device as yours. The device also records its hub in its own config, and the hub stores how to reach the device (control URL + token), so it can deploy to it, stream its logs, and poll its health every 30 seconds.

Operating the fleet

$ hades fleet
DEVICE     HEALTH   FREE      ALLOCATABLE  APPS   LAST SEEN
studio     green    11468MB   12700MB      3      14:02:11
macbook    green     4121MB    6348MB      1      14:02:13

$ hades deploy --device studio    # pin placement
$ hades deploy --device local     # force the hub itself
$ hades fleet remove macbook      # apps keep running there

Updating the fleet

Hosts serve their own source (GET /host/src), so updates propagate the same way everything else does: machine to machine. Update the hub (hades update, which rebuilds from its checkout, git, or --from a site URL), then hades fleet update: every device pulls source from the hub, rebuilds, swaps binaries, and restarts its daemon. No registry, no release server.

hades apps list shows a DEVICE column; logs, stats, pause/resume and destroy are proxied through the hub to wherever the app lives.

Slow builds: a deploy placed on a remote device builds over that device's tunnel, and a multi-minute image build can outlast the tunnel stream. For heavy builds (e.g. a full Next.js build), pin with --device local so it builds on the machine you're driving from.

Spreading one app across machines

Placement puts an app on one machine. Spreading runs it on several at once and turns the hub into its load balancer. Start with an app on the hub, then add a device:

$ hades spread dhilan --to studio   # also run dhilan on studio
$ hades gather dhilan --from studio  # stop running it there
$ hades gather dhilan                 # pull it off every device

The hub keeps the app's link and round-robins requests across every instance. A request rides the hub's proxy, and for a remote instance it is forwarded through that device's control tunnel to a small /_relay/<app> endpoint, which hands it to the local container. The hub retains each app's build context, so it can rebuild the app on another machine without the original files.

This is also the failover story. Every instance is just a backend in the route, so:

  • a device going down drops out within one poll cycle, and traffic keeps flowing to the survivors;
  • pausing an instance (by hand or under memory pressure) sheds its backend immediately, so a frozen container returns a fast gateway error instead of hanging the request;
  • under critical memory pressure the hub migrates the lowest-priority app to a machine with room, then pauses the local copy, so pressure on one machine reaches the others instead of taking the app down.
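
The first two failover bullets amount to round-robin over whichever backends are currently live. A minimal sketch, assuming a hypothetical Balancer class and instance names; it is not the proxy's actual data structure:

```python
from itertools import count


class Balancer:
    """Round-robin over an app's instances, skipping shed backends."""

    def __init__(self, backends):
        self.backends = list(backends)  # e.g. ["hub", "studio"]
        self.out = set()                # paused or unreachable instances
        self._turn = count()

    def shed(self, name):
        """Drop a backend from rotation (paused, or missed a health poll)."""
        self.out.add(name)

    def restore(self, name):
        """Put a backend back into rotation."""
        self.out.discard(name)

    def pick(self):
        """Return the next live backend, or None if none are left."""
        live = [b for b in self.backends if b not in self.out]
        if not live:
            return None  # caller answers with a gateway error
        return live[next(self._turn) % len(live)]
```

Shedding takes effect on the very next pick, so traffic moves to the survivors without waiting for a request to time out.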

hades dashboard shows every app and which machines its instances run on, live.

One image per machine: a spread instance is a real container on each device, so the spread path needs each machine on current hades (run hades fleet update first). Per-app secrets do not yet travel to spread instances; set them on each machine that runs the app, or keep secrets on single-placement apps for now.

SSH into a device

$ hades ssh macbook        # opens an ssh:// tunnel, drops you into a shell

Hades opens the road (the device's daemon spawns an ssh:// tunnel); authentication stays plain SSH against that machine's own user accounts, so the hades token alone can't get a shell. Remote Login must be enabled on the device (System Settings → General → Sharing → Remote Login). Needs cloudflared on both ends.

07 Custom domains

Quick-tunnel URLs rotate. There are two ways to a stable URL that survives restarts and reboots. Pick by whether your users should need a Cloudflare account.

A · claim a subdomain (no Cloudflare account)

An operator runs one coordinator for a domain they own; everyone else just claims a name under it. This is how you give other people stable URLs without them ever touching Cloudflare.

$ hades domain claim coolname --app my-app
⚖ coolname.tryhades.com is yours
  https://coolname.tryhades.com        # stable, never rotates

$ hades domain claim @ --app site      # the apex: tryhades.com + www
$ hades domain list
$ hades domain release --app my-app

The host calls the coordinator, which creates a Cloudflare tunnel + DNS record and hands back a connector token; the host runs cloudflared with it and aliases the hostname to the app through its proxy. The URL is backed by a real DNS record, so it survives daemon restarts and reboots (reconciled on boot). The hub stores its domain.coordinator_url + domain.coordinator_secret in config.

Works fleet-wide. When an app runs on a fleet device, the claim is forwarded to that device and the hub hands the device its coordinator URL and secret in the same call, so a device never needs coordinator_url in its own config. The device runs the claim's tunnel pointed at its own proxy, so the stable name resolves to wherever the app actually lives.

One level only. Names are <name>.<domain>, which Cloudflare's free wildcard certificate covers. The apex claim (@) also serves www. One claim per app.

Running the coordinator (operator)

The coordinator is a small workspace binary. It needs a domain on Cloudflare and one API token with two permissions: Account · Cloudflare Tunnel · Edit and Zone · DNS · Edit, scoped to your zone. No per-machine cloudflared tunnel login, just the token. It reads its config from the environment:

CF_API_TOKEN=…            # Tunnel:Edit + DNS:Edit
CF_ACCOUNT_ID=…
HADES_PARENT_DOMAIN=tryhades.com
COORDINATOR_SECRET=…      # hosts present this to claim

$ hades-coordinator       # serves /claim, /claims, /health on :8000

For real use the coordinator must be durable and reachable from every host that claims, including fleet devices. Run it under launchd (a KeepAlive agent reading the token from a 0600 env file) and give it its own stable hostname: create one named tunnel pointing coordinator.<domain> at localhost:8099, run that cloudflared under launchd too, and set every hub's coordinator_url to https://coordinator.<domain>. Now the address never rotates and devices reach it over the internet.

After the domain goes active on Cloudflare, turn on Always Use HTTPS in the zone's SSL settings so http:// auto-upgrades (otherwise browsers flag plain HTTP as "Not Secure"). Universal SSL covers the apex and one level of wildcard automatically.

B · bring your own domain (named tunnel)

If you own a domain and run your own host, skip the coordinator: one named tunnel carries the whole host. Run cloudflared tunnel login once to authorize it, then:

$ hades host domain apps.example.com
  apps     https://<app>.apps.example.com
  api      https://api.apps.example.com

Apps live at <app>.<domain> and the control API at api.<domain>, so even your fleet join lines stop rotating.

08 CLI reference

command                                  what it does
hades login [--host] [--token]           connect to a host (local or remote), verify readiness, record the session
hades logout                             forget the session; commands target the local host again
hades host connect-info                  print the login command another machine uses to control this host
hades host join --hub --token            join this device to a fleet (run on the new device)
hades secrets set K=V… [--app]           set secrets; replicas restart to pick them up
hades secrets list | unset [--app]       key names only / remove keys
hades fleet [list]                       devices with live health and free capacity
hades fleet add                          print what to run on a new machine
hades fleet remove <name>                drop a device from the registry
hades update [--from <url>]              self-update: refresh source (checkout › git › your hub › --from), rebuild, swap binaries, restart the daemon
hades fleet update                       every joined device self-updates, pulling source from this hub
hades ssh <device> [--user]              shell into a fleet device; hades opens an ssh:// tunnel, auth stays plain ssh (Remote Login must be on there)
hades domain claim <name> [--app]        claim a stable https://<name>.<domain> via the coordinator (@ = apex + www); no user Cloudflare account
hades domain release [--app] | list      give up / list claimed domains
hades host domain <domain>               (BYO domain) put the whole host on a named tunnel: apps at <app>.<domain>, API at api.<domain>
hades init                               scaffold a Hades.toml in the current directory
hades deploy [--app] [--dir] [--device]  idempotent upsert: build/pull, health-gate, swap routes, print the link
hades apps list                          all apps with state, replicas, memory, URLs
hades apps logs <name> [--follow]        stream container logs
hades apps stats <name>                  requests, in-flight, shed count, p50/p95 latency, per-replica memory/CPU
hades apps pause | resume <name>         docker-pause (keeps state, frees CPU) / resume
hades apps destroy <name>                remove containers, routes, and tunnel
hades url <name>                         the current public URL (and url_changed_at in --json)
hades events [--follow] [--days]         the host's event stream (NDJSON with --json)
hades notify test                        send a test push through every configured channel
hades host init                          guided idempotent bootstrap (config · ntfy · launchd · doctor)
hades host doctor                        preflight with remedies; red = deploys refused
hades host status                        one screen: capacity vs allocated, power, availability, apps
hades host battery                       battery health + degradation diagnosis
hades host uptime [--days]               availability % and classified downtime windows
hades host ps                            every process HADES owns: daemon, containers, tunnels, with live RSS

09 JSON contract

Every command accepts --json and then prints exactly one JSON object on stdout (streams print NDJSON). Progress and decoration go to stderr, so pipes stay clean:

hades deploy --json | jq -r .app.url

Errors are structured, with stable codes:

{ "error": { "code": "admission_rejected",
             "message": "deploy rejected: requested 102400MB …",
             "detail": { "vm_memory_mb": 7935, "allocatable_mb": 6348,
                         "allocations": [ { "app": "hello", "memory_mb": 128, "replicas": 2 } ] } } }

exit  code                               meaning
0                                        success
1     other / docker / tunnel            generic failure
2     doctor_red                         host not ready; run hades host doctor
3     admission_rejected                 overcommit; detail carries the full resource ledger
4     app_not_found                      no such app
5     daemon_unreachable                 hadesd not running
6     invalid_spec / manifest_not_found  fix the Hades.toml
7     unauthorized                       missing or wrong bearer token; re-run hades host connect-info
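
A script driving the CLI can branch on the exit status and the structured error together. The sketch below assumes the error object arrives on stdout like any other --json result; interpret and EXIT_CODES are hypothetical names, and the table values are copied from the exit-code list above.

```python
import json

EXIT_CODES = {
    0: "success",
    1: "generic failure",
    2: "doctor_red",
    3: "admission_rejected",
    4: "app_not_found",
    5: "daemon_unreachable",
    6: "invalid_spec / manifest_not_found",
    7: "unauthorized",
}


def interpret(returncode, stdout):
    """Return (ok, code, message) for one `--json` invocation.

    Prefers the stable code inside the error object and falls back to
    the exit-status table when no JSON error was printed.
    """
    payload = json.loads(stdout) if stdout.strip() else {}
    error = payload.get("error")
    if returncode == 0 and error is None:
        return True, None, None
    error = error or {}
    code = error.get("code", EXIT_CODES.get(returncode, "unknown"))
    return False, code, error.get("message", "")
```

In practice returncode and stdout would come from something like subprocess.run([...], capture_output=True, text=True).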

10 Events

Everything observable flows through one stream. hades events --follow --json is the live wire. Each line is { "at": …, "type": …, …fields }.

type                          fired when
host_up                       host returns after a downtime window; carries downtime_secs + cause (slept / crashed / rebooted / unknown)
daemon_started                daemon boot; unclean_shutdown true after a crash
app_deployed / app_destroyed  lifecycle; replaced marks upserts
app_paused / app_resumed      manual, memory_pressure, or on_battery (the reason is included)
app_oom_killed                container hit its memory limit; includes the limit and restart count
app_crash_loop                3 kills in 10 minutes; restarts stop, urgent push sent
url_changed                   tunnel re-provisioned; old and new URL included
tunnel_down                   cloudflared died; re-provisioning begins
memory_pressure               host pressure level changed (normal / warn / critical)
on_battery / on_ac            power source transitions
disk_low                      free disk crossed the threshold
deploy_rejected               admission control refused a deploy
doctor_red                    a previously green host failed checks
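
The app_crash_loop rule (3 kills in 10 minutes) is a sliding-window count. A minimal sketch, assuming timestamps in seconds and that a kill exactly 600 seconds old still counts; CrashLoopDetector is a hypothetical name, not the watchdog's real type:

```python
from collections import deque


class CrashLoopDetector:
    """Flags a crash loop once max_kills fall inside window_secs."""

    def __init__(self, max_kills=3, window_secs=600):
        self.max_kills = max_kills
        self.window_secs = window_secs
        self.kills = deque()

    def record_kill(self, at):
        """Record a kill at time `at` (seconds); True means crash loop."""
        self.kills.append(at)
        # Forget kills that have aged out of the window.
        while at - self.kills[0] > self.window_secs:
            self.kills.popleft()
        return len(self.kills) >= self.max_kills
```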

Urgent events (OOM kills, crash loops, host recovery after a crash, doctor red) are pushed to your phone via the ntfy.sh topic generated at install, no account needed. For true host-down alerts, configure a free healthchecks.io ping URL (hades host init --healthchecks-url …): the daemon heartbeats it every minute, and when heartbeats stop, healthchecks.io alerts you, because a dead host can't report its own death.
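
A consumer of hades events --follow --json can apply the same urgency rule itself. The sketch below reads NDJSON lines and keeps the urgent ones; the type names and the cause field come from the table above, while urgent_events and is_urgent are hypothetical helpers.

```python
import json

URGENT_TYPES = {"app_oom_killed", "app_crash_loop", "doctor_red"}


def is_urgent(event):
    """OOM kills, crash loops, doctor red, or recovery after a crash."""
    if event["type"] in URGENT_TYPES:
        return True
    return event["type"] == "host_up" and event.get("cause") == "crashed"


def urgent_events(lines):
    """Yield urgent events from an iterable of NDJSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if is_urgent(event):
            yield event
```

Passing sys.stdin as lines turns this into a filter at the end of a pipe.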

11 Operating the host

Capacity is the VM's, not the Mac's

On macOS, containers live inside the Docker VM. All admission math uses the VM's memory and CPUs; hades host status shows both numbers so you never budget against RAM your containers can't touch. A configurable reserve (default 20%) is kept out of the allocatable pool.
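
The admission arithmetic can be reproduced from the numbers in the JSON contract example: a VM with 7935MB and a 20% reserve leaves 6348MB allocatable. The sketch assumes each allocation claims memory × replicas and that the reserve rounds down; allocatable_mb and admit are hypothetical helpers.

```python
def allocatable_mb(vm_memory_mb, reserve_pct=20):
    """VM memory minus the reserve, rounded down to whole MB."""
    return vm_memory_mb * (100 - reserve_pct) // 100


def admit(vm_memory_mb, allocations, request_mb, replicas=1, reserve_pct=20):
    """Return (accepted, free_mb) for a deploy asking request_mb per replica.

    Assumes every existing allocation claims memory_mb * replicas.
    """
    claimed = sum(a["memory_mb"] * a["replicas"] for a in allocations)
    free = allocatable_mb(vm_memory_mb, reserve_pct) - claimed
    return request_mb * replicas <= free, free
```

Under these assumptions the hello app (128MB × 2 replicas) leaves 6092MB free, so a 102400MB request is rejected while a 256MB × 2 request fits.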

Uptime

The daemon heartbeats to disk every 30s. On any gap it classifies the window by cross-referencing the kernel boot time and pmset sleep history, appends it to the ledger, and pushes "host back, here are the new links". hades host uptime renders availability with causes.
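
One way to picture the classification: find holes in the heartbeat series, then check each hole against boot time and sleep history. Everything below is an assumption made for illustration, including the two-missed-beats tolerance, the order of the checks, and the find_gaps and classify_gap names; the real daemon's rules may differ.

```python
def find_gaps(heartbeats, cadence_secs=30, tolerance=2):
    """Return (start, end) pairs where consecutive beats are too far apart."""
    gaps = []
    for prev, cur in zip(heartbeats, heartbeats[1:]):
        if cur - prev > cadence_secs * tolerance:
            gaps.append((prev, cur))
    return gaps


def classify_gap(start, end, boot_time, sleep_windows, clean_shutdown):
    """Label a downtime window as rebooted / slept / crashed / unknown."""
    if start < boot_time <= end:
        return "rebooted"   # the kernel booted inside the gap
    if any(s < end and e > start for s, e in sleep_windows):
        return "slept"      # a recorded sleep overlaps the gap
    if not clean_shutdown:
        return "crashed"    # daemon died without a clean exit
    return "unknown"
```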

Battery

hades host battery reads weeks of 5-minute telemetry and answers why: capacity trend vs design, cycle burn rate, high-charge dwell (the silent killer for always-plugged hosts), temperature under load, and HADES' own share of CPU during the window, with concrete remedies.

Nothing untracked

Every container carries a hades.app label; every cloudflared PID is registered. hades host ps shows the full tree with live RSS. A reaper kills anything labeled that desired state no longer explains, so crash-orphaned processes cannot accumulate.
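
The reaper's rule reduces to a set difference between what is labeled and what desired state explains. A minimal sketch for containers only, where orphans is a hypothetical helper and the container listing is a plain list of (id, labels) pairs rather than a real Docker API response:

```python
def orphans(desired_apps, running):
    """Return ids of labeled containers that desired state does not explain.

    desired_apps is a set of app names; running is a list of
    (container_id, labels) pairs. Containers without a hades.app
    label are never touched.
    """
    return [
        container_id
        for container_id, labels in running
        if "hades.app" in labels and labels["hades.app"] not in desired_apps
    ]
```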

12 Architecture

crates/
├── hades-core      shared types: AppSpec, manifest, events, errors, config
├── hades-api       the CLI⇄daemon wire contract + client
├── hades-host      bootstrap, doctor, launchd, macOS probes (pmset/ioreg)
├── hades-sentinel  uptime ledger, notifiers, dead-man's switch
├── hades-runtime   Docker via bollard: build-from-tar, limits, OOM detection
├── hades-proxy     Host-header reverse proxy: aliases, replicas, shed caps
├── hades-tunnel    cloudflared: quick tunnels, named tunnels, ssh tunnels
├── hadesd          the daemon: API, reconcile, watchdog, policy engine
├── hades-cli       the `hades` binary
└── hades-coordinator  operator subdomain service (Cloudflare-backed claims)

  • Event bus as spine. Reconcile loop, watchdog, power monitor and tunnel supervisor publish events; notifications, the policy engine, the JSONL ledger and /events consume them. The policy engine is a pure function (event, app states) → actions, unit-tested without Docker.
  • Tunnel topology. By default each app gets its own cloudflared quick tunnel pointed at the proxy, and the scraped hostname is registered as an alias route. Stable URLs use the same alias mechanism over a coordinator-issued or named tunnel. The proxy stays the single ingress, routing every hostname by Host header.
  • Fleet. A hub holds device control URLs + tokens, polls health, places deploys by free memory, and proxies app commands to wherever an app lives. Devices re-register with their hub whenever their tunnel URL rotates (self-heal). Hosts serve their own source, so updates propagate hub→device with no registry.
  • Reconcile. Desired state is a JSON file; on restart the daemon converges reality to it (containers, routes, tunnels) and reports the downtime honestly.
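
To make "pure function (event, app states) → actions" concrete, here is a toy policy in that shape. The decide function, the level and paused_by fields, and the choice to pause one app per critical event are all assumptions for illustration; only the event types, priorities, and on_battery values come from this document.

```python
PRIORITY_RANK = {"low": 0, "normal": 1}  # critical apps are never shed


def decide(event, apps):
    """Map one event plus current app states to a list of (action, app).

    No I/O and no Docker calls: the same inputs always give the same
    actions, which is what makes this testable in isolation.
    """
    running = [a for a in apps if a["state"] == "running"]
    if event["type"] == "on_battery":
        return [("pause", a["name"]) for a in running
                if a["on_battery"] == "pause"]
    if event["type"] == "on_ac":
        return [("resume", a["name"]) for a in apps
                if a["state"] == "paused" and a.get("paused_by") == "on_battery"]
    if event["type"] == "memory_pressure" and event.get("level") == "critical":
        sheddable = sorted(
            (a for a in running if a["priority"] != "critical"),
            key=lambda a: PRIORITY_RANK[a["priority"]],
        )
        return [("pause", sheddable[0]["name"])] if sheddable else []
    return []
```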

13 Configuration

~/.hades/config.toml, written by hades host init:

key                                           default           meaning
api_port                                      8786              daemon API (loopback)
proxy_port                                    8787              the ingress; local URLs are <app>.localhost:8787
reserve_pct                                   20                % of VM memory kept out of the admission budget
heartbeat_secs                                30                uptime-ledger heartbeat cadence
pressure_warn_mb / pressure_critical_mb       1024 / 512        available-memory thresholds for shedding
disk_min_free_gb                              5.0               doctor red below this
auth_token                                    generated         bearer token required on every API call; what remote logins present
notify.ntfy_topic                             generated         your phone's push channel; keep it secret
notify.mac_notifications                      true              also notify the local notification center
notify.healthchecks_url                                         dead-man's switch ping URL
fleet.hub_url / fleet.hub_token                                 written by hades host join: which hub owns this device
keep_awake                                    true              hold a caffeinate power assertion so the host never idle-sleeps while hosting
domain.name / domain.tunnel_name                                set by hades host domain: BYO-domain named tunnel
domain.coordinator_url / .coordinator_secret                    which coordinator hades domain claim uses, and the shared secret
source_dir                                    set by installer  where this host's source lives, rebuilt by hades update and served to devices at /host/src
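
The two pressure thresholds map available memory onto the three levels carried by memory_pressure events. A minimal sketch using the defaults above; pressure_level is a hypothetical helper, and treating the thresholds as strict lower bounds is an assumption.

```python
def pressure_level(available_mb, warn_mb=1024, critical_mb=512):
    """Classify available host memory as normal / warn / critical."""
    if available_mb < critical_mb:
        return "critical"
    if available_mb < warn_mb:
        return "warn"
    return "normal"
```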

State lives under ~/.hades/: desired state and the process registry in state/, the uptime and event ledgers in ledger/, battery and resource telemetry in metrics/, daemon logs in logs/.