Table of Contents
- Garage Integration
- Overview
- Requirements
- Setup — new server
- Setup — existing enrolled agent
- Configuration
- Verifying it works
- Available commands
- Informational
- Admin-plane mutations
- Alias management
- Internal
- Customer bucket provisioning (long-running)
- Data-plane operations (long-running)
- Aliases
- Scheduled state
- Manifest contract (Storm-side reconciler)
- Customer bucket provisioning
- garage_provision_customer_bucket
- garage_delete_provisioned_bucket
- garage_provision_additional_key
- garage_rotate_customer_key
- Data-plane operations
- Security
- Troubleshooting
Garage Integration
Storm Pulse supports first-class integration with Garage S3 nodes. When enabled, the agent automatically collects Garage cluster state, reports it to the dashboard alongside system metrics, and exposes whitelisted commands for managing buckets and keys - all without opening a terminal.
Overview
When Garage integration is enabled, Storm Pulse:
- Collects node status, zone, capacity, and version on every connection
- Reports bucket names, sizes, object counts, aliases, and key permissions as a manifest
- Refreshes Garage state every 30 seconds (configurable)
- Exposes 27 whitelisted commands across info, admin-plane mutations, alias management, customer bucket provisioning, and data-plane operations
- Protects API key secrets - never logged at any level
Requirements
- Garage running as a Docker container (official dxflrs/garage image)
- Container accessible via docker exec from the operator's admin user (the agent runs against rootless dockerd)
- /opt/garage/garage.toml present (or a custom path; see below)
Setup — new server
If you are setting up Storm Pulse on a fresh Garage node, run stormpulse init from /opt/garage/:
cd /opt/garage
stormpulse init
The wizard detects /opt/garage/garage.toml automatically and prompts:
Checking for Garage installation...
Found: /opt/garage/garage.toml
Enable Garage integration? [Y/n]:
If you confirm, Garage configuration is written into stormpulse.toml alongside the standard config. No separate step needed.
Setup — existing enrolled agent
If Storm Pulse is already enrolled and running, add Garage integration without re-running the full init wizard:
stormpulse garage init
The wizard auto-detects your Garage installation and prompts for confirmation:
Garage installation detected at /opt/garage/garage.toml
Container name [garaged]:
Garage binary [/garage]:
Docker binary [/usr/bin/docker]:
State push interval seconds [30]:
Enable Garage integration? [Y/n]: y
[garage] section written to ~/.config/stormpulse/stormpulse.toml
Restart stormpulse now? [Y/n]: y
Press Enter to accept defaults. The container name is auto-detected from your docker-compose.yml.
Use --force to overwrite an existing [garage] section:
stormpulse garage init --force
Use --garage-config if your Garage config is not in a standard location:
stormpulse garage init --garage-config /custom/path/garage.toml
Configuration
The [garage] section added to ~/.config/stormpulse/stormpulse.toml:
[garage]
enabled = true
container_name = "garaged" # Docker container name
garage_binary = "/garage" # Path to garage binary inside the container
docker_binary = "/usr/bin/docker" # Absolute path to docker on the host
config_path = "/opt/garage/garage.toml" # Used for detection only
state_push_interval_seconds = 30 # How often Garage state is refreshed (manifest cadence)
Detection scan paths (checked in order if --garage-config not specified):
1. /opt/garage/garage.toml
2. /etc/garage/garage.toml
3. ./garage.toml
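The scan order above can be pictured as a first-match lookup. The following is only a sketch: the function name `detect_garage_config` and its parameters are hypothetical, and it assumes that an explicit `--garage-config` path bypasses the scan entirely.

```python
from pathlib import Path
from typing import Iterable, Optional

# Scan order copied from the documentation above.
DEFAULT_SCAN_PATHS = (
    "/opt/garage/garage.toml",
    "/etc/garage/garage.toml",
    "./garage.toml",
)


def detect_garage_config(
    explicit: Optional[str] = None,
    scan_paths: Iterable[str] = DEFAULT_SCAN_PATHS,
) -> Optional[Path]:
    """Return the first existing config file, or None if nothing is found."""
    if explicit is not None:
        # Assumption: an explicit path skips the scan and is never substituted.
        candidate = Path(explicit)
        return candidate if candidate.is_file() else None
    for raw in scan_paths:
        candidate = Path(raw)
        if candidate.is_file():
            return candidate
    return None
```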
Verifying it works
After restart, check the agent logs:
journalctl --user -u stormpulse -f
You should see:
INFO Garage node detected, collecting initial state
INFO Sent register (v0.1.1)
On the dashboard, the Garage node's server record will include cluster state — zone, capacity, data available, bucket list, and version.
Available commands
Admin-plane commands run via docker exec <container> /garage <subcommand> with absolute paths and shell=False. Data-plane and provisioning commands are handled directly by the agent — see Customer bucket provisioning and Data-plane operations below for details.
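To illustrate the "absolute paths and shell=False" rule, here is a minimal sketch of how a whitelisted command might be turned into an argv list. The `ALLOWED` table, `build_argv`, and `run` are hypothetical names, not the agent's actual code; the defaults mirror the [garage] config values shown earlier.

```python
import subprocess
from typing import List, Sequence

# Hypothetical whitelist: command name -> garage CLI subcommand tokens.
ALLOWED = {
    "garage_status": ("status",),
    "garage_bucket_list": ("bucket", "list"),
    "garage_bucket_info": ("bucket", "info"),
}


def build_argv(
    command: str,
    args: Sequence[str] = (),
    docker_binary: str = "/usr/bin/docker",
    container_name: str = "garaged",
    garage_binary: str = "/garage",
) -> List[str]:
    """Build the docker exec argv for a whitelisted command."""
    if command not in ALLOWED:
        raise ValueError(f"command not whitelisted: {command}")
    return [docker_binary, "exec", container_name, garage_binary,
            *ALLOWED[command], *args]


def run(command: str, args: Sequence[str] = (), timeout: float = 30.0):
    # An argv list with shell=False means no shell ever parses parameter values.
    return subprocess.run(build_argv(command, args), shell=False,
                          capture_output=True, text=True, timeout=timeout)
```

Because the arguments are passed as a list, a bucket name containing shell metacharacters reaches Garage as a single literal argument.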
All command names are prefixed garage_. The tables below omit the prefix for readability.
Informational
| Command | Description | Params |
|---|---|---|
| status | Show node status and health | none |
| stats | Show cluster statistics | none |
| bucket_list | List all buckets | none |
| bucket_info | Show bucket details | bucket_name |
| key_list | List all API keys | none |
Admin-plane mutations
Confirmation defaults to No. Commands marked Yes prompt the dashboard for explicit confirmation before dispatch.
| Command | Description | Params | Confirm |
|---|---|---|---|
| bucket_create | Create a new bucket | bucket_name | No |
| bucket_delete | Delete a bucket | bucket_name | Yes |
| key_create | Create a new API key (returns secret in stdout) | key_name | No |
| key_delete | Delete an API key | key_id | Yes |
| bucket_allow | Grant full access to a bucket for a key | bucket_name, key_id | No |
| bucket_allow_rw | Grant read+write access | bucket_name, key_id | No |
| bucket_allow_ro | Grant read-only access | bucket_name, key_id | No |
| bucket_deny | Revoke all access to a bucket for a key | bucket_name, key_id | Yes |
| bucket_website_allow | Enable static website hosting | bucket_name, index_document (default index.html), error_document (default 404.html) | No |
| bucket_website_deny | Disable static website hosting | bucket_name | Yes |
Alias management
See Aliases below for what local vs. global aliases mean and how they appear in the manifest.
| Command | Description | Params | Confirm |
|---|---|---|---|
| bucket_alias_global_add | Attach a global alias to a bucket | bucket_name (UUID or existing alias), new_alias | No |
| bucket_alias_global_remove | Detach a global alias | alias_name | Yes |
| bucket_alias_local_add | Attach a local alias (scoped to a key) | key_id, bucket_name, new_alias | No |
| bucket_alias_local_remove | Detach a local alias | key_id, alias_name | Yes |
Internal
| Command | Description | Long-running |
|---|---|---|
| refresh | Trigger immediate state collection and metrics push | No |
Customer bucket provisioning (long-running)
These commands orchestrate multi-step Garage operations with rollback on failure. See Customer bucket provisioning for the orchestration model.
| Command | Description | Params | Confirm | Sensitive output |
|---|---|---|---|---|
| provision_customer_bucket | Create bucket + admin key + local alias atomically | display_name, key_name_admin | No | Yes |
| delete_provisioned_bucket | Delete bucket, all local aliases, and orphaned keys | bucket_id | Yes | No |
| provision_additional_key | Create a tier-specific key for an existing provisioned bucket | new_key_name, bucket_id, local_alias, key_tier (rw or ro) | No | Yes |
| rotate_customer_key | Create a new key, transfer permissions, delete old key | old_key_id, new_key_name, bucket_id, local_alias, key_tier (all, rw, or ro) | No | Yes |
Data-plane operations (long-running)
These commands talk directly to the local Garage S3 endpoint via the agent's built-in SigV4 client (no docker exec, no boto3). They require S3 credentials in the params. See Data-plane operations.
| Command | Description | Confirm | Sensitive output |
|---|---|---|---|
| bucket_clear | Bulk-delete every object in a bucket | Yes | Yes |
| bucket_set_cors | Set CORS rules on a bucket | No | Yes |
| walk_bucket_stats | Count objects and bytes under a prefix | No | Yes |
Aliases
Garage has two kinds of bucket aliases. Both appear in the manifest; both are first-class in the command surface.
Global aliases are cluster-wide names for a bucket. Every global alias is unique across the entire Garage cluster. Customers can reach a bucket by global alias on any key that has permission. Use bucket_alias_global_add / bucket_alias_global_remove to manage them.
Local aliases are per-key names. A local alias documents on key A can coexist with a totally different bucket called documents on key B. Local aliases are what Storm Cellar uses for per-customer naming — the customer's bucket appears as display_name from their own key, without that name being claimed cluster-wide. Use bucket_alias_local_add / bucket_alias_local_remove to manage them.
In the manifest:
- Every bucket entry carries a bucket_id (the 16-char Garage UUID). This is the join key for dashboard reconciliation; aliases are not unique across tenants and cannot be used for tenant attribution.
- The per-bucket aliases field lists both global and local aliases. Local aliases include the key_id they're scoped to.
Garage's orphan rule: a bucket must have at least one alias (global or local) at all times. You cannot remove the last alias from a live bucket. delete_provisioned_bucket handles this automatically (attaches a temporary global alias if needed, then deletes the bucket through that reference); plain bucket_delete does not — try to delete a bucket with no aliases via bucket_delete and Garage refuses.
Scheduled state
Garage state is collected once on connection (included in the register payload) and refreshed every state_push_interval_seconds thereafter (default 30s). Each refresh runs garage status, garage stats, garage key list, and garage bucket info for each bucket, and includes the result in the next metrics.push payload.
Bucket state includes size, object count, key permissions, website hosting status (website_access, website_index_document, website_error_document), and quotas.
All keys are included in state — both bucket-linked keys (with permissions) and unlinked keys. The top-level keys list contains every key by ID and name; per-bucket key references include permissions.
On-demand refresh
After a mutation (bucket create, key create, etc.), the dashboard can dispatch garage_refresh to trigger an immediate state collection. The agent collects fresh state, sends a command.result confirming success, then immediately sends a metrics.push with the updated Garage data. This avoids waiting up to 30 seconds for the next scheduled refresh.
Long-running commands in the garage group also trigger this auto-refresh on success — see Manifest contract below.
Manifest contract (Storm-side reconciler)
The dashboard side (Storm Cellar) treats this Garage-state payload as a manifest and the agent-reported view as the source of truth. Storm-side CustomerBucket / CustomerKey rows are projections of what the manifest last reported. The full design lives in the dashboard repo at _architecture/specs/storm-pulse-manifest-foundation.md.
What this means for the Pulse contract:
- The state collector is load-bearing. Bucket/key/permission/alias data must be complete and accurate per push. A bucket Garage has but the manifest omits will be reconciled away on the dashboard side. A bucket the manifest reports but Garage doesn't will be flagged as a divergence on the dashboard.
- bucket_id (16-char Garage UUID) is the join key for tenant attribution on the dashboard side. It must be present per bucket entry. Aliases are not unique across tenants and cannot be used as join keys.
- The per-bucket key list is also load-bearing. The dashboard's per-key reconciler joins on key_id per bucket. A key Storm has on a bucket that the manifest's per-bucket key list omits will be reconciled away (subject to a 30s grace window for in-flight rotations). This catches force-revokes and ops-side key deletes.
- Cadence default is 30s. state_push_interval_seconds defaults to 30 in prompt_garage_values. Bypass-path operations (internal admin actions, direct Garage CLI) reconcile in ≤30s. Older deployments initialized with the 300 default should be retoggled to 30.
- Auto-refresh after long-running commands is implemented in agent._post_success_hook. Every successful long-running command in the garage group triggers an immediate metrics.push carrying fresh state. Customer-initiated ops reconcile in <1s.
- No new envelope is required. The existing metrics.push already carries the manifest shape. The Storm-side reconciler consumes it via cellar_relay.relay_customer_metrics.
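The join-key and grace-window rules above can be sketched as two small set operations. This is not the Storm-side reconciler itself: the function names, the dict shapes, and the reading of the grace window as "time since the stored row was last touched" are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def reconcile_buckets(
    manifest_buckets: Iterable[dict],
    stored_bucket_ids: Set[str],
) -> Tuple[List[str], List[str]]:
    """Join on bucket_id only; aliases are never used as join keys.

    Returns (to_remove, unknown): ids stored on the dashboard but absent
    from the manifest, and ids the manifest reports that are not stored.
    """
    reported = {b["bucket_id"] for b in manifest_buckets}
    to_remove = sorted(stored_bucket_ids - reported)
    unknown = sorted(reported - stored_bucket_ids)
    return to_remove, unknown


def keys_to_remove(
    stored_keys: Dict[str, float],
    reported_key_ids: Set[str],
    now: float,
    grace_seconds: float = 30.0,
) -> List[str]:
    """stored_keys maps key_id -> timestamp the row was last touched (assumed)."""
    return sorted(
        key_id
        for key_id, touched in stored_keys.items()
        if key_id not in reported_key_ids and now - touched >= grace_seconds
    )
```

A key created by an in-flight rotation stays on the dashboard until the grace window has elapsed, even if one manifest push misses it.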
Customer bucket provisioning
Storm Cellar provisions customer buckets through four long-running commands that orchestrate multi-step Garage operations. Each is a single dispatchable unit, but internally runs a sequence of garage CLI calls and rolls back on partial failure. The orchestration model exists because no single Garage primitive does what Cellar needs (e.g. "create a bucket with a tenant-scoped local alias and an admin key in one atomic action"), and because partial state from a half-failed multi-step would diverge from the manifest.
garage_provision_customer_bucket
Creates a bucket, attaches a local alias scoped to a new admin key, and grants the key full access. The output secret rides in the result payload (sensitive output flag is set).
Internal step order:
1. Create bucket with a throwaway global alias (Garage requires an alias to exist; a throwaway lets us avoid claiming a customer name globally).
2. Create the admin key with key_name_admin.
3. Grant the admin key all permissions on the bucket.
4. Attach the local alias display_name (scoped to the admin key); this is the customer-facing reference.
5. Remove the throwaway global alias.
On failure at any step, the orchestrator rolls back already-completed steps in reverse and reports failure_reason naming the failed step (e.g. admin_key_create_failed, unalias_throwaway_failed). If rollback itself fails, failure_reason="rollback_failed" and the dashboard surfaces a manual-cleanup state.
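The rollback behaviour described above follows a common do/undo pattern. The sketch below shows that pattern in isolation; `run_with_rollback` and the `(name, do, undo)` step tuple are hypothetical, and the `<step>_failed` naming is inferred from the example failure reasons rather than taken from the agent's source.

```python
from typing import Callable, List, Tuple

# (step name, action, compensating action)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> dict:
    """Run steps in order; on failure, undo completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        name, do, _undo = step
        try:
            do()
        except Exception as exc:
            try:
                for _name, _do, undo in reversed(done):
                    undo()
            except Exception:
                # Partial state remains; the dashboard needs manual cleanup.
                return {"ok": False, "failure_reason": "rollback_failed"}
            return {"ok": False, "failure_reason": f"{name}_failed",
                    "error": str(exc)}
        done.append(step)
    return {"ok": True}
```

The failed step itself is not undone, only the steps that completed before it, which keeps each compensating action simple.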
garage_delete_provisioned_bucket
Deletes a provisioned bucket, all its local aliases, and any keys that no longer have access to other buckets.
Internal step order:
1. bucket info <bucket_id>: enumerate existing aliases and keys with permissions.
2. If no global alias exists, attach a temporary one (required by Garage's "every bucket must have an alias" rule before delete is permitted).
3. Detach every local alias via bucket unalias --local <key> <alias>.
4. Delete the bucket via the temporary or existing global alias reference.
5. For each key found in step 1, check key info; if the key now has zero buckets, delete it. Shared keys (still attached to other buckets) are left alone.
Failed key deletes in step 5 are logged as manual_cleanup_required and do not fail the overall command. The bucket is gone; the orphaned key is a minor cleanup item, not a divergence.
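Step 5 is best-effort by design, which a short sketch makes concrete. The helper below is illustrative only: `cleanup_orphaned_keys` and its two callbacks are hypothetical stand-ins for whatever wraps key info and key delete in the agent.

```python
from typing import Callable, Dict, Iterable, List


def cleanup_orphaned_keys(
    key_ids: Iterable[str],
    buckets_for_key: Callable[[str], List[str]],
    delete_key: Callable[[str], None],
) -> Dict[str, List[str]]:
    """Delete keys with zero remaining buckets; never raise on delete errors."""
    report: Dict[str, List[str]] = {
        "deleted": [], "kept": [], "manual_cleanup_required": []}
    for key_id in key_ids:
        if buckets_for_key(key_id):
            # Shared key: still attached to another bucket, leave it alone.
            report["kept"].append(key_id)
            continue
        try:
            delete_key(key_id)
            report["deleted"].append(key_id)
        except Exception:
            # The bucket is already gone, so this does not fail the command.
            report["manual_cleanup_required"].append(key_id)
    return report
```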
garage_provision_additional_key
Creates a tier-specific key (rw or ro) and attaches it to an already-provisioned bucket with the customer's local alias name.
Internal step order:
1. Create the key with new_key_name.
2. Grant bucket_allow_rw or bucket_allow_ro permissions on bucket_id (per key_tier).
3. Attach local_alias scoped to the new key.
Rollback on failure unwinds in reverse. Failure reasons: invalid_key_tier, new_key_create_failed, new_key_permission_grant_failed, new_key_alias_attach_failed, rollback_failed.
garage_rotate_customer_key
Replaces an existing key with a freshly created one, transfers permissions and the local alias, then deletes the old key. Used when a customer regenerates credentials.
Internal step order:
1. Create the new key with new_key_name.
2. Grant permissions per key_tier (all mirrors the old key's tier, otherwise rw or ro).
3. Attach local_alias scoped to the new key.
4. Delete the old key.
Failure reasons: new_key_create_failed, new_key_permission_grant_failed, new_key_alias_attach_failed, old_key_delete_failed, rollback_failed.
Race window: between step 3 and step 4, both old and new keys briefly have access to the bucket. The dashboard's manifest reconciler tolerates this with a 30s grace window (see Manifest contract).
Data-plane operations
Three long-running commands talk directly to the local Garage S3 endpoint rather than the admin CLI: garage_bucket_clear, garage_bucket_set_cors, and garage_walk_bucket_stats. They share a purpose-built SigV4 S3 client at stormpulse/garage/s3.py — stdlib + cryptography only, no boto3 (a 30MB dependency for what amounts to a handful of HTTP operations). They also share an envelope pattern: customer-controlled S3 credentials ride in the command.request params, are used for the job's lifetime, and never persist. See Security for the secret-handling contract.
All three follow the long-running lifecycle (see Protocol Specification — Long-running commands) and all three accept these five base params:
| Param | Description |
|---|---|
| bucket_name | Bucket to operate on |
| s3_endpoint | Garage S3 endpoint URL (e.g. http://localhost:3900) |
| region | S3 region for SigV4 signing |
| access_key_id | Customer S3 access key ID |
| secret_access_key | Customer S3 secret access key |
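To show why region and secret_access_key are both required, here is the standard SigV4 signing-key derivation using only the standard library. It is a sketch of the public AWS Signature Version 4 algorithm, not an excerpt from stormpulse/garage/s3.py; the function names are hypothetical.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(
    secret_access_key: str,
    date_stamp: str,  # UTC date in YYYYMMDD form
    region: str,
    service: str = "s3",
) -> bytes:
    """Derive the SigV4 signing key: an HMAC chain over date, region, service."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```

The derived key is scoped to one day, one region, and one service, so a mismatched region param typically surfaces as a signature error rather than a connection error.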
garage_bucket_clear
Bulk-deletes every object in a bucket. Garage's CLI does not expose a "clear bucket" primitive — every clear is a series of S3 DeleteObject calls.
Lifecycle:
1. command.progress (stage "starting"): credential pre-flight via HeadBucket. Bad credentials produce an immediate terminal command.result with failure_reason="auth_failed" before any delete is attempted.
2. command.progress (stage "starting"): full paginated list to compute total. Until listing finishes, total is null.
3. command.progress (stage "running"): once per batch of 1000 deleted objects, with current advancing toward total.
4. command.progress (stage "finalizing"): summary computation.
5. command.result: terminal. Carries summary fields at the top of the payload.
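The progress sequence can be modelled as a generator. This sketch covers only the success path and replaces the S3 calls with an in-memory list; `clear_progress_events` and the event dict layout are assumptions, with only the stage names and the 1000-object batch size taken from the lifecycle description.

```python
from typing import Dict, Iterator, List

BATCH_SIZE = 1000


def clear_progress_events(
    keys: List[str], batch_size: int = BATCH_SIZE
) -> Iterator[Dict]:
    """Yield progress events in the order the lifecycle describes."""
    # Pre-flight (HeadBucket in the real flow): total is not known yet.
    yield {"stage": "starting", "current": 0, "total": None}
    # Listing in progress: total stays null until the listing finishes.
    yield {"stage": "starting", "current": 0, "total": None}
    total = len(keys)  # stands in for the full paginated listing
    current = 0
    for start in range(0, total, batch_size):
        batch = keys[start:start + batch_size]
        # A real client would issue the delete request for `batch` here.
        current += len(batch)
        yield {"stage": "running", "current": current, "total": total}
    yield {"stage": "finalizing", "current": current, "total": total}
```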
Terminal payload extras:
| Field | Type | Description |
|---|---|---|
| deleted_count | int | Objects successfully deleted. |
| failed_count | int | Objects that failed to delete (zero on full success). |
| errors | array | Up to 10 per-object errors. Each entry has Key, Code, Message. Truncated for wire-payload sanity. |
| duration_seconds | float | Wall-clock duration of the job. |
| error | string | Human-readable failure summary (only present on failure). |
Failure modes:
| failure_reason | When | Counts behavior |
|---|---|---|
| auth_failed | HeadBucket returned 403 / SignatureDoesNotMatch / InvalidAccessKeyId. No delete attempted. | All counts are 0. Dashboard rate-limiter increments. |
| partial_failure | DeleteObjects reported per-object errors. The bucket was partially cleared. | deleted_count and failed_count reflect what actually happened. Dashboard leaves bucket DB stats untouched; customer retries. |
| os_error | List or Delete failed at HTTP level (network, server error). | Counts reflect work completed before the error. |
| agent_disconnected | Set by the dashboard when the agent's WebSocket closes mid-job. The agent itself emits no terminal result on disconnect; cancelled jobs die silently. | Dashboard's responsibility, not the agent's. |
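As a rough picture of how the terminal fields and the partial_failure case fit together, the sketch below assembles a payload from raw counts. `summarize_clear` is a hypothetical helper; only the field names and the cap of 10 errors come from the tables above, and the wording of the error string is invented.

```python
from typing import Dict, List


def summarize_clear(
    deleted: int,
    errors: List[Dict[str, str]],
    duration_seconds: float,
    max_errors: int = 10,
) -> Dict:
    """Build the terminal payload extras for a clear job."""
    payload: Dict = {
        "deleted_count": deleted,
        "failed_count": len(errors),
        # Truncate the list but keep the true count in failed_count.
        "errors": errors[:max_errors],
        "duration_seconds": duration_seconds,
    }
    if errors:
        payload["failure_reason"] = "partial_failure"
        payload["error"] = f"{len(errors)} object(s) failed to delete"
    return payload
```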
A clear that fails partway through is naturally idempotent under retry: re-running on the same bucket continues from whatever objects remain. The agent does not persist intermediate state — there is no resume-from-checkpoint, just retry-from-scratch (which is cheap because list-and-delete is the same code path).
garage_bucket_set_cors
Sets CORS rules on a bucket via S3 PutBucketCors. Required for browser-side uploads from custom domains.
Extra param:
| Param | Description |
|---|---|
| origins | JSON array of origin strings (e.g. ["https://example.com", "https://*.example.com"]). Pattern allows the bracket/quote/wildcard set needed for valid CORS origins. |
Failure modes: auth_failed (HeadBucket rejected creds before the PUT), os_error (PutBucketCors failed at HTTP level).
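For reference, PutBucketCors takes an XML body, so the origins param has to be translated before the request is sent. The sketch below shows one way to do that with the standard library. `build_cors_xml` is a hypothetical name, and the default methods and headers are placeholders; this document does not say which ones the agent actually sends.

```python
import json
import xml.etree.ElementTree as ET
from typing import Sequence


def build_cors_xml(
    origins_json: str,
    methods: Sequence[str] = ("GET", "PUT", "POST"),  # assumed defaults
    allowed_headers: Sequence[str] = ("*",),          # assumed default
) -> bytes:
    """Turn the JSON origins param into an S3 CORSConfiguration document."""
    origins = json.loads(origins_json)
    if (not isinstance(origins, list) or not origins
            or not all(isinstance(o, str) and o for o in origins)):
        raise ValueError("origins must be a non-empty JSON array of strings")
    root = ET.Element("CORSConfiguration")
    rule = ET.SubElement(root, "CORSRule")
    for origin in origins:
        ET.SubElement(rule, "AllowedOrigin").text = origin
    for method in methods:
        ET.SubElement(rule, "AllowedMethod").text = method
    for header in allowed_headers:
        ET.SubElement(rule, "AllowedHeader").text = header
    return ET.tostring(root, encoding="utf-8")
```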
garage_walk_bucket_stats
Counts objects and sums bytes under a key prefix. Used by Cellar to report per-prefix storage usage to customers without pulling per-object stats out of Garage's metrics surface.
Extra params:
| Param | Description |
|---|---|
| prefix | Key prefix to walk. Empty string ("") walks the whole bucket. |
| max_objects | Cap on the number of objects to count. Default "100000". If the prefix has more, the walk stops early and truncated=true in the result. |
Terminal payload extras:
| Field | Type | Description |
|---|---|---|
| count | int | Object count under the prefix. |
| bytes | int | Sum of Size for all objects walked. |
| truncated | bool | true if count reached max_objects before the prefix was exhausted. Dashboard treats truncated results as lower bounds. |
| duration_seconds | float | Wall-clock duration of the walk. |
| error | string | Human-readable failure summary (only present on failure). |
Failure modes: auth_failed, os_error.
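The count, bytes, and truncated fields relate to each other as in the sketch below, which walks already-fetched listing pages instead of calling S3. `walk_stats` is a hypothetical name, and each object is assumed to be a dict with a `Size` entry; the string-typed max_objects param would be converted with `int()` before this point.

```python
from typing import Dict, Iterable, List


def walk_stats(pages: Iterable[List[Dict]], max_objects: int = 100000) -> Dict:
    """Count objects and sum sizes, stopping early at max_objects."""
    count = 0
    total_bytes = 0
    for page in pages:
        for obj in page:
            if count >= max_objects:
                # More objects exist than we are allowed to count.
                return {"count": count, "bytes": total_bytes, "truncated": True}
            count += 1
            total_bytes += obj["Size"]
    return {"count": count, "bytes": total_bytes, "truncated": False}
```

A prefix holding exactly max_objects objects is reported with truncated set to false, because the cap is only tripped when a further object is seen.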
Security
garage_key_create returns the new API key's secret in command.result.stdout. This secret:
- Is never logged at any level (DEBUG, INFO, WARNING, ERROR) by the agent
- Travels over mTLS — encrypted in transit
- Is displayed once by the dashboard and never stored
When a key is created from the dashboard, the secret is shown once in the sidebar. Closing the sidebar discards it permanently. Save it immediately.
Customer secrets in data-plane command params
The three data-plane commands — garage_bucket_clear, garage_bucket_set_cors, and garage_walk_bucket_stats — carry a customer-controlled S3 secret in their command.request params. This is required because they hit the data plane, not the admin plane, and the agent does not hold customer S3 credentials at rest.
Storm Pulse handles these secrets as follows:
- They travel in the HMAC-signed command.request envelope, encrypted in transit by mTLS.
- The agent constructs a GarageS3Client from the param, uses it for the job's lifetime, and drops the reference when the function returns. Python's GC reclaims the memory.
- The secret is never written to disk, never logged, and never appears in the terminal command.result payload.
- Every command in this family sets sensitive_output = true, so any future addition of stdout to the result will be filtered from agent logs.
- The secret_access_key param's regex pattern (.+) accepts any non-empty string. The agent does not validate secret format; that's the dashboard's responsibility before dispatch.
New long-running commands that need the same pattern should follow this approach (params + sensitive_output = true + per-job client construction) rather than introducing standing credentials in the agent's config.
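One way to keep a param-borne secret out of logs is to redact it before any params dict reaches a log call. The sketch below is an assumption about how that could look, not a description of the agent's logging code; `SENSITIVE_PARAMS` and `redact_params` are hypothetical names.

```python
from typing import Any, Dict

# Hypothetical set of param names that must never be logged verbatim.
SENSITIVE_PARAMS = frozenset({"secret_access_key"})


def redact_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params that is safe to pass to a logger."""
    return {
        name: ("[REDACTED]" if name in SENSITIVE_PARAMS else value)
        for name, value in params.items()
    }
```

The original dict is left untouched, so the job can still build its per-job client from the real secret while every log line sees only the redacted copy.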
Troubleshooting
| Symptom | Check |
|---|---|
| "Garage node detected" not in logs | Is /opt/garage/garage.toml present? Is enabled = true in [garage]? |
| Garage state missing from dashboard | Check state_push_interval_seconds; state refreshes on schedule, not immediately |
| Commands fail with not_found | Is the garaged container running? Is container_name correct in config? |
| garage_key_create returns empty secret | Secret was already displayed and discarded; delete the key and create a new one |
| Garage state shows stale data | Dispatch garage_refresh from the dashboard, or restart stormpulse to force re-collection on register |
| Data-plane command returns auth_failed | Customer's S3 access key was wrong, was revoked, or doesn't have permission on the bucket. Verify via key info <key_id>. |
| delete_provisioned_bucket returns bucket_not_empty | Bucket has objects. Dispatch bucket_clear first, then retry delete. |
| Provisioning command returns rollback_failed | A multi-step orchestration failed and the rollback also failed. Partial state exists in Garage; inspect with bucket info / key list and clean up manually. Dashboard will surface a divergence on the next manifest push. |
| delete_provisioned_bucket logs manual_cleanup_required for a key | A key delete failed during step 5 (orphan-key cleanup) but the bucket itself is gone. The orphaned key has no bucket access, so it is safe to ignore, or delete with garage_key_delete. |