# Performance & Load Testing

## Performance Journey
AutoCom went through a systematic optimization process, benchmarked with k6 at each step:
| Stage | Config | Users | req/s | p50 | p95 |
|---|---|---|---|---|---|
| Baseline | PHP-FPM, 1 Postgres | 50 | 27 | 343ms | 2.44s |
| + Octane | FrankenPHP, 1 Postgres | 50 | 102 | 4.75ms | 12ms |
| + Stress | FrankenPHP, 1 Postgres | 200 | 144 | 59ms | 442ms |
| + 500 users | FrankenPHP, 1 Postgres | 500 | 201 | 933ms | 2.61s |
| + CloudNativePG | FrankenPHP, 3 Postgres | 500 | 456 | 122ms | 1.43s |
Total improvement: 17x throughput, 10x concurrent user capacity.
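The headline multipliers follow directly from the baseline and final rows of the table above; a quick sanity check:

```python
# Baseline vs. final stage from the benchmark table above
baseline_rps, final_rps = 27, 456      # req/s at baseline vs. with CloudNativePG
baseline_users, final_users = 50, 500  # sustained concurrent users

throughput_gain = final_rps / baseline_rps
user_gain = final_users / baseline_users

print(f"throughput: {throughput_gain:.1f}x")   # 16.9x, rounded to 17x
print(f"concurrent users: {user_gain:.0f}x")   # 10x
```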
## What Made the Difference

### Laravel Octane + FrankenPHP (5-10x)
PHP-FPM boots Laravel from scratch on every request — 30-50ms overhead for config loading, service container, module registration. Octane boots once, keeps everything in memory.
```
PHP-FPM: Request → Boot Laravel (30ms) → Your Code (5ms) → Response = 35ms
Octane:  Request → Your Code (5ms) → Response = 5ms
```
See Octane Deployment Guide for setup and trade-offs.
### CloudNativePG Read/Write Split (2.3x)
A single Postgres instance became the bottleneck at 200+ users. CloudNativePG with 1 primary + 2 replicas distributes reads:

- All `SELECT` queries → replicas (load balanced)
- All `INSERT`/`UPDATE`/`DELETE` → primary
- `sticky => true` prevents stale reads within the same request
- The `MarkRecentWrite` middleware forces primary reads for 5 seconds after writes
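In Laravel, this split lives in the connection config. A sketch of the relevant fragment (the `-rw`/`-ro` host names assume CloudNativePG's default service naming for a cluster called `autocom-db`; verify against the actual deployment):

```php
// config/database.php — read/write split with sticky connections (sketch)
'pgsql' => [
    'driver' => 'pgsql',
    'read' => [
        // Replica service: CloudNativePG load-balances across read replicas
        'host' => ['autocom-db-ro'],
    ],
    'write' => [
        // Primary service: all writes go here
        'host' => ['autocom-db-rw'],
    ],
    // After a write, later reads in the same request reuse the write
    // connection, so the request never sees a lagging replica
    'sticky' => true,
    'database' => env('DB_DATABASE', 'autocom'),
    'username' => env('DB_USERNAME'),
    'password' => env('DB_PASSWORD'),
],
```

`sticky` only covers a single request; the `MarkRecentWrite` middleware extends the guarantee across requests by pinning reads to the primary for a short window after a write.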
See Backup & Recovery for CloudNativePG configuration.
### HPA Autoscaling

Horizontal Pod Autoscaler scales API pods from 1 to 5 based on CPU:

```shell
kubectl autoscale deployment api --min=1 --max=5 --cpu-percent=60
```
Under the 500-user stress test, HPA scaled from 1 → 5 pods within 30 seconds.
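The imperative command above can also be expressed as a declarative manifest, which is easier to keep in version control. A sketch (metadata names assumed to match the deployment):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: autocom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```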
## Load Testing with k6

Load tests live in a separate GitLab repo: `autocommerce/load-tests`.
### Setup

```shell
brew install k6
git clone git@gitlab.wexron.io:autocommerce/load-tests.git
cd load-tests
```
### Running Tests

```shell
# Get an auth token
TOKEN=$(kubectl exec -n autocom deployment/api -- php artisan tinker --execute="
\$t = \App\Models\Tenant::first(); tenancy()->initialize(\$t);
echo trim(\App\Models\User::first()->createToken('k6')->accessToken);
" 2>&1 | grep "eyJ" | tr -d '\n\r ')

# Smoke test (1 user, verify everything works)
k6 run scripts/smoke.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

# Load test (50 users, 4 minutes)
k6 run scripts/load.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

# Stress test (ramp to 200 users)
k6 run scripts/stress.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

# Extreme stress (500 users)
k6 run scripts/stress-500.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"
```
### Test Scenarios
| Script | Users | Duration | Purpose |
|---|---|---|---|
| `smoke.js` | 1 | 30s | Verify endpoints work |
| `load.js` | 50 | 4m | Normal production load |
| `stress.js` | 10→200 | 9m | Find degradation point |
| `stress-500.js` | 50→500 | 9m | Extreme stress test |
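A k6 script for a scenario like `stress.js` generally has the following shape; this is an illustrative sketch for the k6 runtime (run with `k6 run`, not Node), and the endpoint path, ramp values, and the tenant header name are assumptions, not copied from the actual scripts:

```javascript
// Sketch of a k6 stress script; runs only under the k6 runtime
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  // Ramp pattern in the spirit of stress.js: climb to 200 VUs, then recover
  stages: [
    { duration: '2m', target: 50 },
    { duration: '5m', target: 200 },
    { duration: '2m', target: 0 },
  ],
  // Stress-test thresholds, matching the Thresholds section of this page
  thresholds: {
    http_req_duration: ['p(95)<2000', 'p(99)<3000'],
    http_req_failed: ['rate<0.1'],
  },
};

export default function () {
  const res = http.get(`${__ENV.BASE_URL}/api/orders`, {
    headers: {
      Authorization: `Bearer ${__ENV.AUTH_TOKEN}`,
      'X-Tenant': __ENV.TENANT_ID, // header name assumed
    },
  });
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
```

k6 exits non-zero when a threshold fails, which is what makes these scripts usable as CI gates.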
### Thresholds
Tests use these default thresholds:
| Metric | Load test | Stress test |
|---|---|---|
| p95 response | < 500ms | < 2000ms |
| p99 response | < 1000ms | < 3000ms |
| Error rate | < 1% | < 10% |
## Optimization Checklist
When performance degrades, check in this order:
- Is Octane running? — `kubectl logs -n autocom deployment/api --tail=1` should show FrankenPHP
- How many workers? — `kubectl exec deployment/api -- php artisan octane:status`
- Is caching working? — check the `X-Cache: HIT` header on GET responses
- Database connections? — `kubectl exec deployment/postgres -- psql -c "SELECT count(*) FROM pg_stat_activity"`
- Replication lag? — `kubectl get cluster autocom-db -n autocom`
- Pod resources? — `kubectl top pods -n autocom`
- HPA scaled? — `kubectl get hpa -n autocom`
## Response Caching

The `CacheApiResponse` middleware automatically caches all GET responses with per-endpoint TTLs:
| Endpoint | TTL | Reason |
|---|---|---|
| `/modules`, `/roles`, `/permissions` | 60-120s | Rarely changes |
| `/orders`, `/customers`, `/products` | 5s | Fresh but absorbs bursts |
| `/notifications`, `/dashboard` | 10s | Semi-real-time |
| `/translate/languages`, `/presets` | 5min-1hr | Static data |
| `/auth/*`, `/install/*` | Never | Mutations/security |
Skip the cache by sending a `Cache-Control: no-cache` header.
## Results Tracking

Load test results live in the same GitLab repo, `autocommerce/load-tests`. Its `RESULTS.md` file documents every run with:
- Date, config, and changes made
- Side-by-side metric comparisons (throughput, p50, p95, error rate)
- Analysis of what improved or degraded
- Next action items
This provides a historical record of performance decisions and their measured impact.
## Infrastructure Stack
| Component | Technology | Purpose |
|---|---|---|
| App server | Laravel Octane + FrankenPHP | Zero-bootstrap API serving |
| Database | CloudNativePG (1 primary + 2 replicas) | Read/write split, auto-failover |
| Cache/Queue | Valkey 8 (Redis-compatible, BSD-3 license) | Sessions, cache, queues |
| Proxy | Nginx | CORS, gzip, reverse proxy to Octane |
| Autoscaling | K8s HPA | 1→5 API pods based on CPU |
| Backups | MinIO + CloudNativePG WAL archiving | 6-hour snapshots, 7-day PITR |
### Why Valkey over Redis?
AutoCom uses Valkey instead of Redis for open-source compliance:
- License: BSD-3 (fully open source) vs Redis's RSAL/SSPL dual license
- Compatibility: Drop-in replacement — same protocol, same commands, same Laravel driver
- Performance: On par with Redis 7, with additional improvements in Valkey 8
- Governance: Linux Foundation project, community-driven