Runbook: tunnel-recovery
When https://warehouse.caseymanos.com is down or misbehaving.
Architecture (as of 2026-05-05)
The site has two independent processes; either can be down independently.
- uvicorn: com.casey.warehouse-ui launchd job. Always-on, auto-restarts on crash, starts at login. Sources ~/.warehouse_secrets.sh for env vars before starting. Runs at 127.0.0.1:8765.
- cloudflared: currently still foreground / kbui-driven (terminal process). Routes warehouse.caseymanos.com → localhost:8765.
This split is intentional — uvicorn is the harder one to keep alive (env vars, secrets, port), so it gets launchd. cloudflared is one binary with one config file and survives terminal restarts cheaply, so the kbui flow is fine for now.
Quick triage
# Is the tunnel reachable from the public internet?
curl -I https://warehouse.caseymanos.com
# Expected: 302 redirect to Cloudflare Access challenge
# Is uvicorn running locally? (launchd-managed)
launchctl list | grep com.casey.warehouse-ui
# Expected: <PID> 0 com.casey.warehouse-ui (PID = number, exit-code = 0)
# Or check the port directly:
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8765/
# Expected: 200
# Is cloudflared running?
ps aux | grep cloudflared | grep -v grep
# Expected: one cloudflared tunnel ... run warehouse line
# Is the tunnel registered with Cloudflare?
cloudflared tunnel list
# Should show "warehouse" with id edae87c6-61b1-4f84-a776-87d8732693ea
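The first check above can be folded into a small helper that maps the public status code to the scenario to read next. This is a minimal sketch: the function name `diagnose_public_status` is hypothetical, and the mapping covers only the codes this runbook mentions, plus `000`, which curl reports when it gets no HTTP response at all.

```shell
# Map the public HTTP status code to the runbook scenario it most likely indicates.
# Hypothetical helper; the messages paraphrase the sections of this runbook.
diagnose_public_status() {
  case "$1" in
    302)     echo "ok: Cloudflare Access challenge is being served" ;;
    502|504) echo "uvicorn not responding: see the 502 / 504 scenario" ;;
    000)     echo "no HTTP response: check DNS and the local network" ;;
    *)       echo "unexpected status $1: check whether cloudflared is running" ;;
  esac
}

# Usage (needs network access):
# diagnose_public_status "$(curl -s -o /dev/null -w '%{http_code}' https://warehouse.caseymanos.com)"
```

Keeping the mapping in a pure function means it can be exercised without touching the network or the tunnel.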
Common scenarios
Browser shows Cloudflare Access challenge but PIN never arrives
PIN delivery by email can be slow (up to 1 min) or land in spam, so check the spam folder first. If nothing has arrived after 2 min:
- Browser → click "Resend code"
- Check Cloudflare Access policy in dashboard:
- Zero Trust → Access → Applications → warehouse
- Policy: email auth, allow only [email protected]
- Verify the email address matches what's in the policy
Browser shows "Tunnel offline" / Cloudflare 1033
cloudflared isn't running. Restart it:
# Foreground (terminal stays attached, easy to see logs):
cloudflared tunnel --config ~/.cloudflared/config.yml run warehouse
# Or background:
nohup cloudflared tunnel --config ~/.cloudflared/config.yml run warehouse \
> /tmp/cloudflared.log 2>&1 &
disown
kbui and ~/garmin-warehouse/scripts/run_ui.sh are helpers that predate the
launchd-managed uvicorn. They still work, but they redundantly try to start
uvicorn (which launchd already manages). Use them only if uvicorn is somehow
offline and you want a one-shot dev session.
Browser shows 502 / 504 from Cloudflare
cloudflared is up but uvicorn isn't responding. Check:
curl http://localhost:8765/
# If it times out or the connection is refused: uvicorn died.
# If it responds: cloudflared can't reach localhost:8765.
If uvicorn is down, restart via launchd:
# Force-restart in place (preferred):
launchctl kickstart -k gui/$(id -u)/com.casey.warehouse-ui
# Or full unload/load:
launchctl unload ~/Library/LaunchAgents/com.casey.warehouse-ui.plist
launchctl load ~/Library/LaunchAgents/com.casey.warehouse-ui.plist
# Check it's actually back:
launchctl list | grep com.casey.warehouse-ui
tail -20 ~/garmin-warehouse/logs/uvicorn.err.log
If uvicorn keeps crash-looping (the PID column changes on every check), look at the err log for the actual exception. KeepAlive + ThrottleInterval=10 means launchd restarts the process at most once per 10s, so even a crash on every startup is throttled and you won't fork-bomb yourself.
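To confirm a crash loop without eyeballing the PID column, two samples can be compared. A minimal sketch, assuming `launchctl list` prints PID, last exit status, and label as whitespace-separated columns; the helper names `pid_for_label` and `pid_changed` are hypothetical.

```shell
# Print the PID column for one label from `launchctl list` output on stdin.
# launchctl prints "-" in that column when the job is loaded but not running.
pid_for_label() {
  awk -v label="$1" '$3 == label { print $1 }'
}

# Succeeds (exit 0) when two PID samples differ, i.e. the job restarted in between.
pid_changed() {
  [ "$1" != "$2" ]
}

# Usage on the Mac: sample twice, further apart than ThrottleInterval (10s).
# first=$(launchctl list | pid_for_label com.casey.warehouse-ui)
# sleep 15
# second=$(launchctl list | pid_for_label com.casey.warehouse-ui)
# pid_changed "$first" "$second" && echo "restarted: check uvicorn.err.log"
```

A single changed PID can also come from a manual kickstart, so treat one difference as a hint and repeated differences as a crash loop.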
If cloudflared can't reach uvicorn while both are running:
- Check that the ~/.cloudflared/config.yml ingress rule points at the correct port (http://localhost:8765)
- Check no firewall is blocking localhost (rare on Mac)
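The ingress check in the list above can be done without opening the file. A minimal sketch, assuming config.yml uses the usual cloudflared layout with one `service:` key per ingress rule; the function name `ingress_services` is hypothetical.

```shell
# Print the target of every "service:" key in a cloudflared config file,
# whether the key starts a list item ("- service: ...") or follows a hostname.
ingress_services() {
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*service:[[:space:]]*//p' "$1"
}

# Usage:
# ingress_services ~/.cloudflared/config.yml
# One of the printed lines should be http://localhost:8765.
#
# cloudflared can also check the file itself:
# cloudflared tunnel --config ~/.cloudflared/config.yml ingress validate
```

The sed filter only shows what is written in the file. It does not prove the running process loaded it, since cloudflared reads config only at startup.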
cloudflared won't start: cert.pem missing or invalid
ls -la ~/.cloudflared/cert.pem
# If missing or 0 bytes:
cloudflared tunnel login
# Browser opens, log in to CF account, cert.pem gets written.
cloudflared won't start: tunnel JWT missing
ls ~/.cloudflared/
# Expected: cert.pem + edae87c6-61b1-4f84-a776-87d8732693ea.json (the JWT)
# If JWT missing:
cloudflared tunnel token --cred-file ~/.cloudflared/edae87c6-61b1-4f84-a776-87d8732693ea.json warehouse
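Both credential checks (cert.pem and the tunnel JSON) can be run in one pass. A minimal sketch; the function name `check_tunnel_files` is hypothetical, and "valid" here only means present and non-empty, not cryptographically checked.

```shell
# Report which of the two credential files are missing or empty.
# $1 = cloudflared directory, $2 = tunnel ID. Returns non-zero if any file is bad.
check_tunnel_files() {
  rc=0
  for f in "$1/cert.pem" "$1/$2.json"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
# check_tunnel_files ~/.cloudflared edae87c6-61b1-4f84-a776-87d8732693ea \
#   && echo "credential files present"
```

A clean result rules out the two "won't start" scenarios above. A revoked tunnel still passes this check, so a start failure with both files present points at the revoked / lost scenario instead.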
DNS not resolving
If the hostname has the wrong CNAME or no record at all:
cloudflared tunnel route dns warehouse warehouse.caseymanos.com
# Re-creates the CNAME automatically in caseymanos.com zone.
Tunnel JSON revoked / lost
If the tunnel's JWT is gone and you can't recover:
# Delete and recreate (loses tunnel ID, requires DNS update):
cloudflared tunnel delete warehouse
cloudflared tunnel create warehouse
# Note the new tunnel ID, update ~/.cloudflared/config.yml
cloudflared tunnel route dns warehouse warehouse.caseymanos.com
# Restart tunnel
uvicorn missing env vars (Telegram card grey on /status)
The launchd plist sources ~/.warehouse_secrets.sh at startup. If env vars
go missing (TELEGRAM_BOT_TOKEN, RESEND_API_KEY, etc.), the secrets file
is the place to update, not the plist itself.
ls -la ~/.warehouse_secrets.sh
# Expected: mode 600, owned by you, ~750 bytes
# If the mode is wrong (a missing file has to be recreated from the ~/.zshrc exports first):
chmod 600 ~/.warehouse_secrets.sh
# Edit to add/rotate a secret:
vim ~/.warehouse_secrets.sh
# Then force uvicorn to re-source it:
launchctl kickstart -k gui/$(id -u)/com.casey.warehouse-ui
The secrets file is mirrored from ~/.zshrc exports — not auto-synced. If
you rotate TELEGRAM_BOT_TOKEN in zshrc, also update the secrets file.
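Since the two files drift silently, a quick comparison after any rotation can catch a stale copy. A minimal sketch, assuming both files define secrets as plain `export NAME=value` lines; the helper names `exported_value` and `secret_in_sync` are hypothetical.

```shell
# Print the value from the last `export NAME=...` line for NAME ($1) in FILE ($2).
exported_value() {
  sed -n "s/^export $1=//p" "$2" | tail -n 1
}

# Succeeds when NAME ($1) is exported in both files ($2, $3) with the same value.
# The values are only compared here, never echoed to the terminal.
secret_in_sync() {
  a=$(exported_value "$1" "$2")
  b=$(exported_value "$1" "$3")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage:
# for name in TELEGRAM_BOT_TOKEN RESEND_API_KEY; do
#   secret_in_sync "$name" ~/.zshrc ~/.warehouse_secrets.sh || echo "out of sync: $name"
# done
```

A variable missing from either file counts as out of sync, which matches the "env vars go missing" symptom. Remember the kickstart afterwards, since uvicorn only re-sources the file on restart.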
Runtime-as-service (current state)
| Component | Mechanism | Auto-restart? |
|---|---|---|
| uvicorn (warehouse UI) | launchd com.casey.warehouse-ui | yes (KeepAlive) |
| cloudflared tunnel | foreground / kbui | no — restart manually |
| OTQCheckinAgent worker | Cloudflare Workers (cron + webhook) | yes (Cloudflare-managed) |
cloudflared could also move to launchd if the foreground experience starts
to bite (e.g. it dies during long sleep cycles). Pattern is straightforward:
copy com.casey.warehouse-ui.plist, swap the ProgramArguments to
cloudflared tunnel --config ~/.cloudflared/config.yml run warehouse, drop
the secrets-source line, set RunAtLoad + KeepAlive.
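The pattern above could look roughly like the sketch below, which renders a plist from scratch rather than copying the existing one (whose contents are not shown in this runbook). The label `com.casey.warehouse-tunnel`, the function name `render_tunnel_plist`, and the output path are all assumptions. Paths are passed in fully expanded because launchd does not perform shell tilde expansion on ProgramArguments.

```shell
# Print a launchd plist for the tunnel to stdout.
# $1 = absolute path to the cloudflared binary, $2 = home directory (no "~").
# The label below is hypothetical; pick one that matches your naming.
render_tunnel_plist() {
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.casey.warehouse-tunnel</string>
  <key>ProgramArguments</key>
  <array>
    <string>$1</string>
    <string>tunnel</string>
    <string>--config</string>
    <string>$2/.cloudflared/config.yml</string>
    <string>run</string>
    <string>warehouse</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
EOF
}

# Usage on the Mac:
# render_tunnel_plist "$(command -v cloudflared)" "$HOME" \
#   > ~/Library/LaunchAgents/com.casey.warehouse-tunnel.plist
# plutil -lint ~/Library/LaunchAgents/com.casey.warehouse-tunnel.plist
# launchctl load ~/Library/LaunchAgents/com.casey.warehouse-tunnel.plist
```

Stop any foreground cloudflared before loading the job, otherwise two processes run the same tunnel.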
Cloudflare Access policy quirks
- Policy changes can take ~30s to propagate
- "Service tokens" exist for non-browser access but aren't configured here — only Casey's email gets through
- If you need to allow another email temporarily: dashboard → Zero Trust → Access → Applications → warehouse → Policies → edit
cloudflared doesn't hot-reload config.yml
Symptom: You edited ~/.cloudflared/config.yml (e.g. added a new
ingress rule) but the new behavior isn't live.
Cause: cloudflared reads config only at process startup. Config-file edits are silently ignored until you restart the process.
Fix: Kill + restart cloudflared.
# Find the pid:
ps aux | grep cloudflared | grep -v grep
# Or just pkill:
pkill -f "cloudflared tunnel"
sleep 2
# Restart (foreground):
cloudflared tunnel --config ~/.cloudflared/config.yml run warehouse
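Because a stale config is silent, one way to notice it is to record when the tunnel was last started and compare that against the config file's modification time. A minimal sketch: the marker file and the function name `config_is_stale` are hypothetical and not part of the existing setup.

```shell
# Succeeds when CONFIG ($1) was modified more recently than MARKER ($2),
# i.e. the running cloudflared predates the latest config edit.
# If the marker does not exist, find fails quietly and the config is reported as not stale.
config_is_stale() {
  [ -n "$(find "$1" -newer "$2" 2>/dev/null)" ]
}

# Usage: touch the marker whenever you (re)start the tunnel, then check later.
# touch ~/.cloudflared/.last-start        # hypothetical marker file
# config_is_stale ~/.cloudflared/config.yml ~/.cloudflared/.last-start \
#   && echo "config.yml changed since last start: restart cloudflared"
```

This only works if the marker is touched on every start, so it fits best inside whatever wrapper launches the tunnel.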
Note: as of 2026-05-04, the tunnel only routes one thing —
warehouse.caseymanos.com → localhost:8765. The docs site is on
Cloudflare Pages, not the tunnel — see ADR 006.
If you find yourself adding ingress rules to expose more local
services, ask whether they should actually live on Pages / Workers
instead.
Related
- systems/cloudflare-stack.md
- systems/status-dashboard.md: /status page is the live view of these layers
- reference/secrets.md
- Memory: cloudflare_setup.md