Deployment Rings
Deployment rings are how AutoCom safely runs different module versions for different tenants on the same shared platform. Each ring is an independent slice of the cluster (its own pods, its own pinned module versions) and tenants are routed to exactly one ring at a time.
This page explains the design, the operator runbook, and the trade-offs. For background on the underlying versioning system that rings sit on top of, read Module Versioning Overview first.
The Problem Rings Solve
PHP class loading is process-global. Once Composer's autoloader binds Modules\Orders\OrderController to a file, that mapping is locked for the life of the PHP process — and Octane processes live for hours, serving thousands of requests across many tenants. You physically cannot have two versions of the same class in the same process.
So per-request, per-tenant version swapping is not a thing in PHP. The constraint has to move to a higher boundary: different processes for different versions.
That's what rings are. Each ring is its own set of api/horizon/nginx pods running its own pinned module versions. Tenants are routed to a specific ring's pods at the ingress layer based on a single column in the central database.
Architecture
┌──────────────────────────────────┐
│ Shared (single instance) │
│ ├── PostgreSQL (per-tenant DBs) │
│ ├── Redis │
│ ├── MinIO / S3 │
│ ├── Frontend (one Next.js image) │
│ ├── Docs │
│ └── indian-post (microservice) │
└──────────────────────────────────┘
▲
│ central DB has tenants.module_ring_id
│
┌─────────────────┬───────┴────────┬─────────────────┐
│ │ │ │
ring: stable          ring: edge            ring: canary          ring: experimental
─────────────         ─────────────         ─────────────         ─────────────
k8s namespace:        k8s namespace:        k8s namespace:        k8s namespace:
  autocom               autocom-edge          autocom-canary        autocom-experimental
manifest:             manifest:             manifest:             manifest:
  stable.lock.json      edge.lock.json        canary.lock.json      experimental.lock.json
Pods:                 Pods:                 Pods:                 Pods:
  api × N               api × M               api × 2               api × 1
  horizon × N           horizon × M           horizon × 1           horizon × 1
  nginx × 2             nginx × 2             nginx × 1             nginx × 1
Tenants:              Tenants:              Tenants:              Tenants:
  - acme                - widgets             - test-canary         - internal-qa
  - foo                 - bar
  - …                   - …
Per-ring vs shared services
| Component | Per-ring | Shared | Why |
|---|---|---|---|
| api | ✓ | | Loads module code at boot — needs ring-specific code |
| horizon | ✓ | | Same Octane workers, same module code |
| nginx | ✓ | | Routes to the ring's api pods specifically |
| frontend | | ✓ | Same UI for everyone; API hostname is ring-aware |
| docs | | ✓ | Same documentation for everyone |
| redis | | ✓ | Cache; tenant-isolated by key prefix |
| postgres (CNPG) | | ✓ | Tenant data; per-tenant DBs inside one cluster |
| indian-post | | ✓ | Stateless microservice |
| minio | | ✓ | Storage; tenant-isolated by bucket prefix |
The shared layer is shared because it's either stateful (DB, MinIO) or version-irrelevant (frontend, docs, microservices). A tenant's data lives in its tenant DB regardless of which ring serves it.
K8s manifest layout
The base manifests are split into two subdirectories so ring overlays can pull in only the per-ring services without duplicating shared infra:
k8s/
├── base/
│ ├── kustomization.yaml ← meta-kustomization that includes both subdirs
│ ├── namespace.yaml
│ ├── per-ring/ ← one set per ring
│ │ ├── kustomization.yaml
│ │ ├── api.yaml
│ │ ├── horizon.yaml
│ │ ├── nginx.yaml ← uses nginxinc/nginx-unprivileged for PSA restricted
│ │ └── network-policies.yaml
│ ├── shared/ ← single instance regardless of ring count
│ │ ├── kustomization.yaml
│ │ ├── frontend.yaml
│ │ ├── docs.yaml
│ │ ├── redis.yaml
│ │ └── indian-post.yaml
│ └── resource-quota.yaml
└── overlays/
├── shared/ ← deploy the shared layer once on cluster bootstrap
│ └── kustomization.yaml
└── rings/
├── _shared-aliases/ ← kustomize component with ExternalName services
│ ├── kustomization.yaml ← bridges non-stable rings to shared services
│ └── aliases.yaml
├── stable/ ← deploys into 'autocom' (no aliases needed)
├── edge/ ← deploys into 'autocom-edge' (uses _shared-aliases)
└── canary/ ← deploys into 'autocom-canary' (uses _shared-aliases)
The existing single-deployment overlays (local, production, staging) reference k8s/base/ and pull in both layers — backward-compatible. Ring overlays reference k8s/base/per-ring/ only.
The _shared-aliases component creates ExternalName services in each non-stable ring's namespace that point at the real services in autocom. So redis, autocom-db-rw, frontend, docs, indian-post, minio etc. are reachable from a canary pod by short name — no app-config changes needed.
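What one of those aliases amounts to can be sketched as a small shell helper that prints a hypothetical ExternalName manifest. The helper name `print_alias` and the exact manifest shape are illustrative; `aliases.yaml` in the component is the source of truth for the real list.

```shell
# Print a hypothetical ExternalName alias: makes <service-name> in a ring
# namespace resolve to the real service in the shared 'autocom' namespace.
print_alias() {  # usage: print_alias <service-name> <ring-namespace>
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
  namespace: $2
spec:
  type: ExternalName
  externalName: $1.autocom.svc.cluster.local
EOF
}

# e.g. print_alias redis autocom-canary | kubectl apply -f -
```

Because the alias is just a DNS-level redirect, a canary pod that asks for `redis` gets a CNAME to `redis.autocom.svc.cluster.local` and never knows it crossed a namespace.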
Environment variables that steer a pod to its ring
Each per-ring pod is shaped by two env vars injected from the ring's overlay configmap-ring.yaml:
| Variable | Purpose | Resolution |
|---|---|---|
| `RING_NAME` | Ring identity reported by `ModuleLoaderService::ringName()` and logged on every module load | Defaults to `stable` if unset |
| `MODULE_LOCK_PATH` | Explicit path to the lock file this pod should read (takes precedence over the per-ring default) | Defaults to `modules/manifests/{RING_NAME}.lock.json`, then falls back to the legacy `modules/manifest.lock.json` |
ModuleLoaderService::ringLockPath() resolves in this order:
1. `env('MODULE_LOCK_PATH')` if set — explicit override wins
2. `modules/manifests/{ringName()}.lock.json` if that file exists
3. `modules/manifest.lock.json` as a final fallback
Set MODULE_LOCK_PATH directly when you need a pod to read a non-default lock file — e.g. to test a promotion before applying it, or to run an ephemeral ring off a one-off manifest.
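The same resolution order, restated as a standalone shell sketch (a hypothetical reimplementation for illustration, not the actual PHP in `ModuleLoaderService`):

```shell
# Mirror of ringLockPath()'s three-step resolution order.
resolve_lock_path() {
  # 1. Explicit override wins.
  if [ -n "${MODULE_LOCK_PATH:-}" ]; then
    echo "$MODULE_LOCK_PATH"
    return
  fi
  # 2. Per-ring manifest, if the file exists on disk.
  ring="${RING_NAME:-stable}"
  if [ -f "modules/manifests/${ring}.lock.json" ]; then
    echo "modules/manifests/${ring}.lock.json"
    return
  fi
  # 3. Legacy single-manifest fallback.
  echo "modules/manifest.lock.json"
}
```

Note that step 2 is conditional on the file existing: a pod booted with `RING_NAME=canary` but no `canary.lock.json` baked into the image silently serves the legacy manifest, which is worth checking when `/api/health` reports an unexpected `lock_path`.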
Hostname Convention
Ring routing happens at the ingress layer based on hostname:
acme.stable.acme.io → stable ring (autocom namespace)
acme.edge.acme.io → edge ring (autocom-edge namespace)
acme.canary.acme.io → canary ring (autocom-canary namespace)
acme.acme.io → backward-compat alias for stable ring
The shorter <tenant>.acme.io form is preserved as a backward-compatible alias for the stable ring so existing tenant URLs keep working. New ring-specific hostnames are issued only when a tenant moves off stable.
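The convention is mechanical enough to express as a one-line helper (`ring_host` is a hypothetical name for illustration, not a real CLI):

```shell
# Map (tenant, ring) to the public hostname per the convention above.
ring_host() {  # usage: ring_host <tenant> <ring>
  if [ "$2" = "stable" ]; then
    echo "$1.acme.io"        # backward-compat short alias for stable
  else
    echo "$1.$2.acme.io"
  fi
}
```

So `ring_host acme canary` yields `acme.canary.acme.io`; the stable ring also answers on the explicit `acme.stable.acme.io` form.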
To verify which ring a request is hitting:
curl http://acme.canary.acme.io/api/health
{
"status": "ok",
"ring": {
"name": "canary",
"lock_path": "modules/manifests/canary.lock.json"
},
"services": { "database": "up", "redis": "up" }
}
The ring.name field in /api/health is the operator's primary sanity check.
Data Model
module_rings table (central DB)
module_rings
├── id
├── name unique short identifier (stable, edge, canary)
├── display_name UI label
├── description
├── manifest_path path to this ring's lock file
├── k8s_namespace which k8s namespace this ring lives in
├── k8s_service in-cluster nginx service hostname
├── hostname_segment ring segment for public URLs (e.g. "canary")
├── promotion_order 0=most conservative, higher=more bleeding edge
├── is_active whether tenants can be assigned here
├── is_default true for the default ring (exactly one)
└── timestamps
Seeded automatically with one row: stable. New rings are created via:
php artisan tinker
# then, at the tinker prompt:
ModuleRing::create([
    'name' => 'edge',
    'display_name' => 'Edge',
    'manifest_path' => 'modules/manifests/edge.lock.json',
    'k8s_namespace' => 'autocom-edge',
    'k8s_service' => 'nginx.autocom-edge.svc.cluster.local',
    'hostname_segment' => 'edge',
    'promotion_order' => 10,
    'is_active' => true,
]);
tenants.module_ring_id column
A foreign key into module_rings. Default for all existing tenants: the stable ring. New tenants default to the ring marked is_default = true.
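At the data level, a ring assignment is a single row update in the central DB. A sketch of the equivalent SQL (the lookup-by-`name` shape and the tenant identifier column are assumptions; in practice use `ring:assign`, which also records the operation history):

```shell
# Emit the illustrative SQL behind a ring assignment. The tenant id
# column name is an assumption for illustration.
assign_sql() {  # usage: assign_sql <tenant-id> <ring-name>
  printf "UPDATE tenants SET module_ring_id = (SELECT id FROM module_rings WHERE name = '%s') WHERE id = '%s';\n" "$2" "$1"
}

# assign_sql acme canary
```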
Operator Runbook
See current state
# All rings + tenant counts
php artisan ring:list
# Detail for a specific ring + the tenants in it
php artisan ring:show stable --tenants
# Detail + the modules pinned in this ring's manifest
php artisan ring:show edge --modules
Move a tenant to a different ring
php artisan ring:assign acme canary
# (interactive confirmation)
# Skip confirmation
php artisan ring:assign acme canary --force
# Add an audit reason (logged to the operation history)
php artisan ring:assign acme canary --reason="opted into beta program"
The change takes effect on the next request from that tenant. In-flight requests on the old ring complete normally. Long-running connections (Reverb websockets) drop and reconnect to the new ring.
Promote a module version from one ring to another
# Promote orders @ whatever-version-edge-has → stable
php artisan ring:promote orders --from=edge --to=stable
# Pin a specific version directly (no source ring lookup)
php artisan ring:promote orders --with-version=1.5.0 --to=edge
# Skip confirmation
php artisan ring:promote orders --from=edge --to=stable --force
This rewrites the destination ring's manifests/<ring>.lock.json file with the new entry. It does NOT trigger any deploy itself — you commit the change to git, push to main, and CI rolls out the destination ring on the next pipeline run.
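The commit-and-push step can be captured in a small helper. This is a hypothetical convenience wrapper, and the commit message is a simplified form of what the CI ring-promote job writes:

```shell
# Ship a rewritten lock file the same way any manifest change ships:
# commit it and push to main; CI redeploys the destination ring.
promote_commit() {  # usage: promote_commit <dest-ring> <module-alias>
  git add "modules/manifests/$1.lock.json"
  git commit -m "promote($1): $2"
  git push gitlab-main main
}

# promote_commit stable orders
```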
Add a new ring
End-to-end runbook for spinning up a new ring (e.g. canary). Steps 1–5 are one-time-per-ring; step 6 is for every tenant move thereafter.
Step 1: Register the ring in the database
kubectl exec -n autocom deployment/api -- php artisan tinker --execute='
\App\Models\ModuleRing::create([
"name" => "canary",
"display_name" => "Canary",
"description" => "Bleeding edge — internal QA tenants only",
"manifest_path" => "modules/manifests/canary.lock.json",
"k8s_namespace" => "autocom-canary",
"k8s_service" => "nginx.autocom-canary.svc.cluster.local",
"hostname_segment" => "canary",
"promotion_order" => 20,
"is_active" => true,
"is_default" => false,
]);
'
Step 2: Initialize the canary lock file
cp modules/manifests/stable.lock.json modules/manifests/canary.lock.json
git add modules/manifests/canary.lock.json
git commit -m "rings: initialize canary lock file from stable"
git push gitlab-main main
Step 3: Create the namespace + copy required secrets
The ring's pods need the same image-pull, app, and OAuth secrets that exist in the autocom namespace:
kubectl create namespace autocom-canary
# Copy gitlab-registry image pull secret
kubectl get secret gitlab-registry -n autocom -o yaml | \
sed 's/namespace: autocom/namespace: autocom-canary/' | \
kubectl apply -f -
# Copy autocom-secrets (APP_KEY, DB credentials, registry token, etc.)
kubectl get secret autocom-secrets -n autocom -o yaml | \
sed 's/namespace: autocom/namespace: autocom-canary/' | \
kubectl apply -f -
# Copy passport-keys for OAuth
kubectl get secret passport-keys -n autocom -o yaml | \
sed 's/namespace: autocom/namespace: autocom-canary/' | \
kubectl apply -f -
# Copy the autocom-config ConfigMap (DB host, Redis host, app settings)
kubectl get configmap autocom-config -n autocom -o yaml | \
sed 's/namespace: autocom/namespace: autocom-canary/' | \
kubectl apply -f -
This copy is one-time per ring. Nothing syncs it. If you rotate
APP_KEY, add a new secret, or update the config map in autocom, the canary ring keeps serving the old values until an operator manually re-copies every object above. There is no auto-sync (no External Secrets, no Sealed Secrets wiring, no Reflector). If you add a second or third ring, write yourself a short shell script that does the copy for all of them at once so you don't forget one during a rotation.
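A minimal version of that script might look like the following (`copy_shared_objects` is a hypothetical helper name; the object list matches the four copies above):

```shell
# Re-copy the shared secrets/configmap from 'autocom' into each ring
# namespace. Run after every rotation, once per non-stable ring.
copy_shared_objects() {  # usage: copy_shared_objects <ring-namespace>...
  src=autocom
  for ns in "$@"; do
    for obj in secret/gitlab-registry secret/autocom-secrets \
               secret/passport-keys configmap/autocom-config; do
      kubectl get "$obj" -n "$src" -o yaml \
        | sed "s/namespace: $src/namespace: $ns/" \
        | kubectl apply -f -
    done
  done
}

# copy_shared_objects autocom-canary autocom-edge
```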
Step 4: Apply the canary ring overlay
kubectl apply -k k8s/overlays/rings/canary
This creates 3 deployments (api, horizon, nginx) tagged with ring: canary, plus the 8 ExternalName service aliases that bridge the canary namespace to the shared services (db, redis, minio, frontend, docs, indian-post) running in autocom.
Step 5: Verify the ring is alive and reporting itself
kubectl rollout status deployment/api -n autocom-canary --timeout=120s
# Hit the canary api directly — should report ring=canary
kubectl exec -n autocom-canary deployment/api -- \
curl -s http://localhost:8000/api/health | jq .ring
# Compare to stable
kubectl exec -n autocom deployment/api -- \
curl -s http://localhost:8000/api/health | jq .ring
If both return the same ring name, the overlay didn't apply correctly. Most common cause: the api pod fell back to env-default stable because RING_NAME wasn't injected — check kubectl describe pod -n autocom-canary -l app=api | grep RING_NAME.
Step 6: Move tenants in
php artisan ring:assign acme canary
# or with audit reason:
php artisan ring:assign acme canary --reason="opted into beta program" --force
The change takes effect on the tenant's next request. No data migration, no downtime, no app restart.
Promote via CI (audit-trail-friendly)
For changes that need an audit trail, use the manual CI promotion pipeline instead of the artisan command:
1. In the GitLab UI, go to `autocommerce/main` → Pipelines → Run Pipeline
2. Set variables:
   - `PROMOTE_ALIAS=orders`
   - `PROMOTE_FROM=canary`
   - `PROMOTE_TO=edge`
3. Click Run
The ring-promote job rewrites the destination manifest, commits with a message like promote(edge): orders 1.5.0 → 1.6.0, and pushes to main. The git history becomes your full audit log.
Ring Promotion Strategy
A typical flow for shipping a module version safely:
Developer ships Orders v1.6.0
│
▼
CI release pipeline produces Orders-1.6.0.zip
│
▼
Operator: ring:promote orders --to=canary --with-version=1.6.0
│
▼
Canary ring deploys → 1-3 internal tenants run it for 24-48h
│
▼
Operator: ring:promote orders --from=canary --to=edge
│
▼
Edge ring deploys → opt-in customers run it for ~1 week
│
▼
Operator: ring:promote orders --from=edge --to=stable
│
▼
Stable ring deploys → all remaining tenants run it
Each promotion step is reversible: roll the destination ring's manifest back, redeploy. The package registry retains every released version forever, so rollback is just a manifest-file edit.
Trade-offs
What rings give you
- Real isolation. A bad version on canary cannot crash, OOM, or starve tenants on stable — they're literally in different OS processes.
- Independent scaling. Hot tenants on one ring don't affect cold tenants on another. Each ring scales on its own metrics.
- Trivially incremental. Day 1 has exactly one ring (stable), which is what you have today. Add a canary ring only when a customer needs it.
- Tenant promotion is one DB row update. Move a tenant from stable → canary → next request lands on the new ring.
- Industry-standard pattern. Microsoft 365 (Insider/Beta/Current/Monthly Enterprise/Semi-Annual), GitHub Enterprise, GitLab.com, Notion, Linear — every major SaaS doing safe rollouts uses rings.
What rings do NOT solve
- Cross-ring data sharing in real time. Tenants on different rings can't directly call each other's modules. They share the same Postgres so their data is reachable, but their code is isolated. This is by design — that's where the safety comes from.
- In-place tenant promotion without a request gap. Moving a tenant from stable → edge takes effect on the next request. There's a tiny window where in-flight requests on the old ring complete normally and new requests start going to the new ring. Fine for everything except long-running websockets, which drop and reconnect.
- Different DB schemas per ring. Tenant DBs are shared across rings. If Orders 1.5.0 adds a column, that column has to be backward-compatible (additive only) so Orders 1.4.0 on the stable ring doesn't choke. This is a real discipline cost — it's how every SaaS doing rolling upgrades operates.
- Frontend versioning. The frontend is single-version. If you need different UI per ring, that's a separate problem (feature flags or per-ring frontend deploys).
- Operational cost. N rings means N×(api+horizon+nginx) pods. Day-1 cost is one ring (zero overhead). Adding a second ring roughly doubles the per-tenant pod budget. Plan capacity accordingly.
Verifying Ring Assignment
The /api/health endpoint exposes the current ring. Operators can verify routing in 5 seconds:
# Hit the same tenant via different hostnames — different rings respond
$ curl http://acme.stable.acme.io/api/health | jq .ring
{ "name": "stable", "lock_path": "modules/manifests/stable.lock.json" }
$ curl http://acme.canary.acme.io/api/health | jq .ring
{ "name": "canary", "lock_path": "modules/manifests/canary.lock.json" }
If both return the same ring name, your ingress routing isn't wired up correctly. Check:
- `kubectl get ingress -A` — is the canary ring's ingress present?
- `kubectl get pods -n autocom-canary` — are the canary pods running?
- `kubectl logs -n autocom-canary deployment/api | grep ringName` — what RING_NAME did the pod boot with?
Quick Reference
# Inspect
php artisan ring:list # all rings
php artisan ring:show stable --tenants --modules # one ring, full detail
# Assign tenants
php artisan ring:assign acme canary
php artisan ring:assign acme canary --reason="opted in"
# Promote module versions across rings
php artisan ring:promote orders --from=canary --to=edge
php artisan ring:promote orders --with-version=1.5.0 --to=edge --force
# Generate per-ring manifests
php artisan module:lock --ring=stable
php artisan module:lock --ring=edge
php artisan module:verify --ring=canary
# Apply ring overlays to the cluster
kubectl apply -k k8s/overlays/rings/stable
kubectl apply -k k8s/overlays/rings/edge
kubectl apply -k k8s/overlays/rings/canary
# Verify which ring a request is hitting
curl http://<tenant>.<ring>.acme.io/api/health | jq .ring
Known Pitfalls
Lessons from the rings rollout that aren't obvious until they bite.
PSA restricted + image filesystem writes
If a cluster enforces Pod Security Admission restricted, every pod must run as a non-root user. Two services in the per-ring manifests hit this and need specific fixes:
- nginx: the default `nginx:alpine` runs its entrypoint as root and chowns `/var/cache/nginx/*` during boot. Under PSA restricted that fails with `chown: Operation not permitted`. `k8s/base/per-ring/nginx.yaml` uses `nginxinc/nginx-unprivileged:alpine` instead (listens on 8080, not 80). Don't swap it back.
- api / horizon (Octane): `backend/Dockerfile.octane` chowns the whole `/app` tree to UID 1000 at build time. The first attempt only chowned `storage/` and `bootstrap/cache/`, which looked complete but missed `/app/public/frankenphp-worker.php` — Octane's `InstallsFrankenPhpDependencies` writes that file on first boot and fails with `Permission denied` otherwise. If you fork the image, keep the `chown -R 1000:1000 /app` line in place.
Octane memory ceiling scales with module count
Horizon loads every module's service provider at boot. With 23 modules (AI, Workflows, WMS, Reseller*, all the themes…) the steady-state footprint peaks around 900Mi–1Gi. The per-ring manifest sets limits.memory: 1536Mi with that headroom; earlier values of 384Mi and 768Mi both tripped OOMKilled during module bootstrap. If you add another ~10 modules, re-profile and bump again.
Secrets don't sync between rings
See the note under Step 3 of "Add a new ring" — every secret and config map you need in a new ring namespace is copied manually and stays stale until you copy it again. This is the single biggest operational gotcha for multi-ring setups. A follow-up to wire External Secrets or the Reflector controller into the shared overlay would close this.
No auto-rollback on failed hooks
php artisan module:install and module:upgrade do NOT auto-rollback if the module's onInstall / onUpgrade hook throws. The content swap commits before the hook runs, so a failed hook leaves the module installed in an inconsistent state. The operator has to manually run module:rollback <alias> after investigating. See Module Lifecycle → Error handling for the full semantics.
See Also
- Module Versioning Workflow — the per-module release cycle that produces the artifacts rings consume
- Module Versioning — Overview & Reference — the full reference for the underlying versioning system
- Module Lifecycle — the `onInstall` / `onUpgrade` hooks that fire during ring promotions