Performance & Load Testing

Performance Journey

AutoCom went through a systematic optimization process, benchmarked with k6 at each step:

| Stage | Config | Users | req/s | p50 | p95 |
|---|---|---|---|---|---|
| Baseline | PHP-FPM, 1 Postgres | 50 | 27 | 343ms | 2.44s |
| + Octane | FrankenPHP, 1 Postgres | 50 | 102 | 4.75ms | 12ms |
| + Stress | FrankenPHP, 1 Postgres | 200 | 144 | 59ms | 442ms |
| + 500 users | FrankenPHP, 1 Postgres | 500 | 201 | 933ms | 2.61s |
| + CloudNativePG | FrankenPHP, 3 Postgres | 500 | 456 | 122ms | 1.43s |

Total improvement: 17x throughput, 10x concurrent user capacity.

What Made the Difference

Laravel Octane + FrankenPHP (5-10x)

PHP-FPM boots Laravel from scratch on every request — 30-50ms of overhead for loading config, building the service container, and registering modules. Octane boots the framework once and keeps it in memory.

PHP-FPM:  Request → Boot Laravel (30ms) → Your Code (5ms) → Response = 35ms
Octane:   Request → Your Code (5ms) → Response = 5ms
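
Octane with FrankenPHP is started with a single artisan command — a sketch, with illustrative worker count and port (see the deployment guide for AutoCom's actual flags):

```shell
# Boot Laravel once, then serve requests from resident workers
php artisan octane:start --server=frankenphp --host=0.0.0.0 --port=8000 --workers=4
```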

See Octane Deployment Guide for setup and trade-offs.

CloudNativePG Read/Write Split (2.3x)

A single Postgres instance became the bottleneck above 200 concurrent users. CloudNativePG with 1 primary + 2 replicas distributes reads:

  • All SELECT queries → replicas (load balanced)
  • All INSERT/UPDATE/DELETE → primary
  • sticky => true prevents stale reads within the same request
  • MarkRecentWrite middleware forces primary reads for 5 seconds after writes
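
In Laravel, this split lives in config/database.php as separate read/write hosts. A minimal sketch — the service names follow CloudNativePG's -rw/-ro convention for the autocom-db cluster, but the env keys and defaults here are illustrative, not AutoCom's actual config:

```php
// config/database.php — sketch of a read/write split connection.
'pgsql' => [
    'driver' => 'pgsql',
    'read' => [
        // CloudNativePG's read-only service load-balances across replicas
        'host' => [env('DB_READ_HOST', 'autocom-db-ro')],
    ],
    'write' => [
        // Primary-only service for INSERT/UPDATE/DELETE
        'host' => [env('DB_WRITE_HOST', 'autocom-db-rw')],
    ],
    // After a write, keep reading from the primary for the rest of the request
    'sticky' => true,
    'database' => env('DB_DATABASE', 'autocom'),
    'username' => env('DB_USERNAME'),
    'password' => env('DB_PASSWORD'),
],
```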

See Backup & Recovery for CloudNativePG configuration.

HPA Autoscaling

Horizontal Pod Autoscaler scales API pods from 1 to 5 based on CPU:

kubectl autoscale deployment api --min=1 --max=5 --cpu-percent=60

Under the 500-user stress test, HPA scaled from 1 → 5 pods within 30 seconds.
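
The declarative equivalent of that kubectl command might look like this (a sketch; deployment and namespace names are taken from the command above):

```yaml
# hpa.yaml — declarative form of `kubectl autoscale deployment api`
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: autocom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```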

Load Testing with k6

Load tests live in a separate GitLab repo: autocommerce/load-tests.

Setup

brew install k6
git clone git@gitlab.wexron.io:autocommerce/load-tests.git
cd load-tests

Running Tests

# Get an auth token
TOKEN=$(kubectl exec -n autocom deployment/api -- php artisan tinker --execute="
\$t = \App\Models\Tenant::first(); tenancy()->initialize(\$t);
echo trim(\App\Models\User::first()->createToken('k6')->accessToken);
" 2>&1 | grep "eyJ" | tr -d '\n\r ')

# Smoke test (1 user, verify everything works)
k6 run scripts/smoke.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

# Load test (50 users, 4 minutes)
k6 run scripts/load.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

# Stress test (ramp to 200 users)
k6 run scripts/stress.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

# Extreme stress (500 users)
k6 run scripts/stress-500.js -e BASE_URL=http://localhost:30080 -e TENANT_ID=demo -e AUTH_TOKEN="$TOKEN"

Test Scenarios

| Script | Users | Duration | Purpose |
|---|---|---|---|
| smoke.js | 1 | 30s | Verify endpoints work |
| load.js | 50 | 4m | Normal production load |
| stress.js | 10→200 | 9m | Find degradation point |
| stress-500.js | 50→500 | 9m | Extreme stress test |

Thresholds

Tests use these default thresholds:

| Metric | Load test | Stress test |
|---|---|---|
| p95 response | < 500ms | < 2000ms |
| p99 response | < 1000ms | < 3000ms |
| Error rate | < 1% | < 10% |
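
A k6 scenario encoding the stress profile and thresholds above might look like this — a sketch only, since the real scripts live in the load-tests repo; the /api/products path and X-Tenant header name are assumptions:

```javascript
// stress.js sketch — ramp 10 → 200 users with the stress-test thresholds.
// BASE_URL, TENANT_ID and AUTH_TOKEN are passed via k6's -e flags.
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 10 },   // warm up
    { duration: '5m', target: 200 },  // ramp to peak
    { duration: '2m', target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<2000', 'p(99)<3000'],
    http_req_failed: ['rate<0.10'],
  },
};

export default function () {
  const res = http.get(`${__ENV.BASE_URL}/api/products`, {
    headers: {
      Authorization: `Bearer ${__ENV.AUTH_TOKEN}`,
      'X-Tenant': __ENV.TENANT_ID, // header name is an assumption
    },
  });
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
```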

Optimization Checklist

When performance degrades, check in this order:

  1. Is Octane running? — kubectl logs -n autocom deployment/api --tail=1 should show FrankenPHP
  2. How many workers? — kubectl exec deployment/api -- php artisan octane:status
  3. Is caching working? — Check X-Cache: HIT header on GET responses
  4. Database connections? — kubectl exec deployment/postgres -- psql -c "SELECT count(*) FROM pg_stat_activity"
  5. Replication lag? — kubectl get cluster autocom-db -n autocom
  6. Pod resources? — kubectl top pods -n autocom
  7. HPA scaled? — kubectl get hpa -n autocom

Response Caching

The CacheApiResponse middleware automatically caches GET responses with per-endpoint TTLs:

| Endpoint | TTL | Reason |
|---|---|---|
| /modules, /roles, /permissions | 60-120s | Rarely changes |
| /orders, /customers, /products | 5s | Fresh but absorbs bursts |
| /notifications, /dashboard | 10s | Semi-real-time |
| /translate/languages, /presets | 5min-1hr | Static data |
| /auth/*, /install/* | Never | Mutations/security |

Skip the cache by sending a Cache-Control: no-cache header.
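
Checking the cache from the command line might look like this (the /api/products path is illustrative; TOKEN and BASE_URL as in the k6 setup above):

```shell
# A repeated GET should return X-Cache: HIT
curl -sD - -o /dev/null "$BASE_URL/api/products" \
  -H "Authorization: Bearer $TOKEN" | grep -i x-cache

# Bypass the cache explicitly
curl -sD - -o /dev/null "$BASE_URL/api/products" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Cache-Control: no-cache" | grep -i x-cache
```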

Results Tracking

Load test results are tracked alongside the scripts in the autocommerce/load-tests repo.

The RESULTS.md file documents every run with:

  • Date, config, and changes made
  • Side-by-side metric comparisons (throughput, p50, p95, error rate)
  • Analysis of what improved or degraded
  • Next action items

This provides a historical record of performance decisions and their measured impact.

Infrastructure Stack

| Component | Technology | Purpose |
|---|---|---|
| App server | Laravel Octane + FrankenPHP | Zero-bootstrap API serving |
| Database | CloudNativePG (1 primary + 2 replicas) | Read/write split, auto-failover |
| Cache/Queue | Valkey 8 (Redis-compatible, BSD-3 license) | Sessions, cache, queues |
| Proxy | Nginx | CORS, gzip, reverse proxy to Octane |
| Autoscaling | K8s HPA | 1→5 API pods based on CPU |
| Backups | MinIO + CloudNativePG WAL archiving | 6-hour snapshots, 7-day PITR |

Why Valkey over Redis?

AutoCom uses Valkey instead of Redis for open-source compliance:

  • License: BSD-3 (fully open source) vs Redis's RSAL/SSPL dual license
  • Compatibility: Drop-in replacement — same protocol, same commands, same Laravel driver
  • Performance: On par with Redis 7, with additional improvements in Valkey 8
  • Governance: Linux Foundation project, community-driven
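
Because Valkey speaks the Redis protocol, Laravel's stock Redis driver needs no code changes — only the host. A sketch (the service name "valkey" is an assumption):

```shell
# .env — point the existing Redis driver at the Valkey service;
# the cache, queue and session drivers stay set to "redis"
REDIS_HOST=valkey
REDIS_PORT=6379
```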