Status
Accepted -- In active development. PostgreSQL migration completed February 2026, replacing the initial better-sqlite3 prototype database.
Context
Torn.com is a text-based MMORPG with a deep faction warfare system. Factions coordinate wars, organized crimes, chain attacks, and internal economics through the game's web interface and API. The game provides a comprehensive API, but no built-in tools for the operational complexity that competitive factions deal with daily:
- War tracking is manual. During ranked wars, leaders alt-tab between the game and spreadsheets to track who hit whom, calculate respect earned, and determine fair reward payouts. Wars can last hours with hundreds of attacks.
- Organized crime coordination is fragile. OCs require specific member slots filled with the right stats. Leaders track readiness via Discord pings and memory. Members get missed, OCs expire, rewards are lost.
- Financial accounting does not exist. Post-war, leaders promise item payouts based on contribution. Calculating who earned what, fetching current market prices, and distributing rewards is a multi-hour manual process prone to errors and disputes.
- Member analytics are scattered. Understanding who is active, who is contributing, and who needs attention requires manually pulling data from multiple API endpoints and cross-referencing timestamps.
The Torn API provides all the raw data. What is missing is a system that polls it, aggregates it, and presents it as operational intelligence.
Requirements
- Real-time war scoreboard -- Live attack feed during ranked wars with per-member stats, updated within 30 seconds of an attack landing
- Organized crime management -- Track OC slots, member readiness, and completion status with notifications when OCs are ready to initiate
- Automated payout calculation -- Post-war reward distribution based on configurable contribution formulas with market price lookups
- Member analytics -- Activity tracking, contribution history, and performance trends over time
- Multi-faction isolation -- Multiple factions can use the platform without seeing each other's data
- Role-based access control -- Permissions mapped to in-game faction positions, not a separate permission system
- Discord integration -- Webhook notifications for war events, OC readiness, and chain milestones
- Audit trail -- Log who changed what, especially for payout modifications and permission changes
Constraints
- The Torn API enforces a hard rate limit of 100 requests per minute per API key
- API keys are the only authentication mechanism -- there are no OAuth flows or session tokens from the game
- Members provide their personal API keys, which must be stored securely (they grant access to sensitive in-game data)
- The operator (me) runs this for my faction and potentially allied factions -- it is not a SaaS product, but must support multi-tenant isolation
- Runs on a single VPS behind the existing Cloudflare Tunnel + Traefik + Authentik stack
- Budget is homelab-scale -- no managed database services or paid infrastructure
Decision
Build a full-stack web application using Next.js 16 with the App Router, PostgreSQL for persistence, Redis for caching and rate limiting, and BullMQ for background job processing. Authentication uses Torn API key verification against the game's API, with JWT sessions for subsequent requests. Background workers handle all API polling independently from the web server process.
Architecture Overview
Browser
|
v
Cloudflare Tunnel
|
v
Traefik + Authentik
|
v
+-------------------------------------------+
| Next.js App Server |
| (App Router, API routes, SSR, RSC) |
+-----+-----------+----------------+--------+
| | |
v v v
+---------+ +---------+ +-------------+
|Postgres | | Redis | | BullMQ |
| 17 | | 7 | | Queues |
|(primary | |(cache, | |(jobs, retry,|
| store) | | rate | | scheduling) |
| | | limits, | | |
| | | sessions)| | |
+---------+ +---------+ +------+------+
|
+----------------+----------------+
| | |
v v v
+-----------+ +------------+ +-----------+
| Torn API | |Aggregation | | Discord |
| Worker | | Worker | | Worker |
| (polling, | | (rollups, | |(webhooks, |
| dedup, | | rankings) | | alerts) |
| ingest) | | | | |
+-----------+ +------------+ +-----------+
|
v
Torn.com API
(100 req/min/key)
Component Responsibilities
| Component | Responsibility | Runtime |
|---|---|---|
| Next.js App Server | SSR, API routes, authentication, RBAC enforcement, frontend rendering | Node.js 22 (App Router) |
| PostgreSQL 17 | Primary data store for all persistent state -- factions, users, wars, attacks, OCs, payouts, audit log | Containerized |
| Redis 7 | API response caching, rate limit token buckets, JWT session store, BullMQ queue backend | Containerized |
| Torn API Worker | Polls Torn API for member data, war attacks, OC status, market prices; deduplicates and persists | BullMQ worker process |
| Aggregation Worker | Daily rollups at 3 AM, per-member stat aggregation, ranking calculations | BullMQ worker process |
| Discord Worker | Sends webhook notifications for war events, OC readiness, chain milestones | BullMQ worker process |
| Prisma ORM | Schema definition, migrations, type-safe database queries, seed data | Build-time + runtime |
Data Model
Core Entities
| Entity | Purpose | Key Relationships |
|---|---|---|
| Faction | Tenant boundary -- all queries scoped by faction ID | Has many Users, RankedWars, OrganizedCrimes |
| User | Faction member with encrypted API key and assigned permissions | Belongs to Faction, has many WarAttacks |
| RankedWar | A tracked war instance with metadata, scores, and lifecycle state | Belongs to Faction, has many WarAttacks, WarMemberStats |
| WarAttack | Individual attack record (denormalized from API) with dedup via timestamp | Belongs to RankedWar |
| WarMemberStats | Aggregated per-member stats for a specific war | Belongs to RankedWar, references User |
| WarReward | Item rewards logged by leaders post-war | Belongs to RankedWar |
| WarPayout | Calculated payout per member based on contribution formula | Belongs to RankedWar, references User |
| OrganizedCrime | OC instance with type, slots, readiness, and outcome | Belongs to Faction |
| OCParticipant | Member assigned to an OC slot with readiness status | Belongs to OrganizedCrime, references User |
| ChainReport | Chain tracking session with duration and hit count | Belongs to Faction |
| ChainMemberStats | Per-member contribution to a chain | Belongs to ChainReport, references User |
| RankConfig | RBAC configuration mapping in-game positions to permissions | Belongs to Faction |
| DiscordConfig | Webhook URLs and notification preferences per faction | Belongs to Faction |
| AuditLog | Immutable log of sensitive actions with actor, action, and diff | Belongs to Faction |
Multi-Tenant Isolation
Every database query is scoped by faction ID. This is enforced at the Prisma client level through middleware that injects a where: { factionId } clause on all read operations and validates faction ownership on all writes. There is no cross-faction data access path.
Faction A Faction B
+-----------------------+ +-----------------------+
| Users (A) | | Users (B) |
| Wars (A) | | Wars (B) |
| Attacks (A) | | Attacks (B) |
| OCs (A) | | OCs (B) |
| Audit Log (A) | | Audit Log (B) |
+-----------------------+ +-----------------------+
| |
+----------+ +-----------+
| |
v v
+----------------+
| PostgreSQL |
| (shared DB, |
| logical |
| isolation) |
+----------------+
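The scoping rule above can be sketched as a pure function: inject the caller's faction ID into every query's filter so a client-supplied faction ID can never widen the result set. The names here (`scopeToFaction`, `QueryArgs`) are illustrative; the real enforcement lives in Prisma client middleware.

```typescript
// Hypothetical sketch of tenant scoping, not the actual middleware code.
type QueryArgs = { where?: Record<string, unknown>; [k: string]: unknown };

// Merge the authenticated faction ID over whatever `where` the caller built.
function scopeToFaction(args: QueryArgs, factionId: number): QueryArgs {
  return { ...args, where: { ...(args.where ?? {}), factionId } };
}

// A query that tries to read another faction's wars is silently re-scoped:
const unsafe = { where: { factionId: 999, status: "active" } };
const scoped = scopeToFaction(unsafe, 42);
console.log(scoped.where); // { factionId: 42, status: "active" }
```

Because the faction ID is spread last, it always overrides any value the caller provided.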
Authentication and Authorization
Authentication Flow
User Next.js Torn API
| | |
| POST /api/auth | |
| { apiKey } | |
|--------------------->| |
| | GET /user?key=... |
| |----------------------->|
| | { player_id, name, |
| | faction } |
| |<-----------------------|
| | |
| | Encrypt API key |
| | (AES-256-GCM) |
| | Store/update user |
| | Issue JWT pair |
| | |
| { accessToken, | |
| refreshToken } | |
|<---------------------| |
API keys are encrypted with AES-256-GCM using a unique initialization vector per key before storage. After the initial authentication, the key is never sent back to the frontend. The encrypted key is only decrypted server-side when the background workers need to make Torn API calls on behalf of the user.
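The encrypt-with-unique-IV step can be sketched with Node's built-in crypto module. The storage format here (hex-encoded `iv:tag:ciphertext`) and the function names are assumptions for illustration, not the application's actual format.

```typescript
// Sketch of AES-256-GCM API key encryption using node:crypto (format is hypothetical).
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

function encryptApiKey(apiKey: string, masterKey: Buffer): string {
  const iv = randomBytes(12); // unique IV per key, as GCM requires
  const cipher = createCipheriv("aes-256-gcm", masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(apiKey, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((b) => b.toString("hex")).join(":");
}

function decryptApiKey(stored: string, masterKey: Buffer): string {
  const [iv, tag, ciphertext] = stored.split(":").map((h) => Buffer.from(h, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", masterKey, iv);
  decipher.setAuthTag(tag); // tag is verified in final(); tampering throws there
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Round trip with a 32-byte master key (in production, loaded from an env variable):
const masterKey = randomBytes(32);
const stored = encryptApiKey("abc123torn", masterKey);
console.log(decryptApiKey(stored, masterKey) === "abc123torn"); // true
```

GCM's authentication tag means a tampered ciphertext fails loudly at decryption rather than yielding garbage.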
RBAC Model
The permission system maps directly to in-game faction positions rather than inventing a separate role hierarchy. This means the access model is immediately intuitive to faction leaders who already understand the game's permission structure.
17 granular permissions cover all sensitive operations:
| Permission | Controls |
|---|---|
| war.view | View war scoreboard and attack history |
| war.manage | Start/stop war tracking, modify war settings |
| war.payout | View and modify payout calculations |
| oc.view | View organized crime status and slots |
| oc.manage | Assign members to OC slots, initiate OCs |
| chain.view | View chain reports and member stats |
| chain.manage | Start/stop chain tracking |
| member.view | View member profiles and activity |
| member.manage | Modify member settings, force-refresh data |
| finance.view | View faction financial summaries |
| finance.manage | Log rewards, approve payouts |
| discord.view | View Discord webhook configuration |
| discord.manage | Configure webhook URLs, notification rules |
| config.view | View faction settings and rank configuration |
| config.manage | Modify rank-permission mappings |
| audit.view | View audit log |
| admin | Full access, manage faction registration |
Faction leaders configure which in-game position maps to which permissions via the RankConfig entity. Middleware on every API route checks the requesting user's permissions before proceeding.
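The position-to-permission lookup can be sketched as a simple map plus a check helper. The position names and the `hasPermission` function are illustrative; the real mapping is stored per faction in the RankConfig entity and evaluated by route middleware.

```typescript
// Hypothetical sketch of the RankConfig lookup, not the actual middleware.
type Permission = "war.view" | "war.manage" | "war.payout" | "admin";

// Example mapping a leader might configure (position names are made up):
const rankConfig: Record<string, Permission[]> = {
  Leader: ["admin"],
  "Co-leader": ["war.view", "war.manage", "war.payout"],
  Member: ["war.view"],
};

function hasPermission(position: string, needed: Permission): boolean {
  const perms = rankConfig[position] ?? [];
  return perms.includes("admin") || perms.includes(needed); // admin implies all
}

console.log(hasPermission("Leader", "war.payout")); // true
console.log(hasPermission("Member", "war.manage")); // false
```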
Background Workers
Torn API Worker
The largest worker at approximately 910 lines. Responsible for all communication with the Torn API.
Polling Schedule:
| Endpoint | Interval | Rationale |
|---|---|---|
| Members list | 5 minutes | Activity detection, stat tracking |
| War attacks (attacksfull) | 30 seconds | Near-real-time war scoreboard |
| OC status | 2 minutes | Readiness notifications |
| Market prices | 15 minutes | Payout value calculations |
Rate Limiting:
The Torn API allows 100 requests per minute per key. The worker implements a token bucket algorithm backed by Redis:
Redis Key: ratelimit:{apiKey}
Algorithm: Token bucket (100 tokens, refill 100/min)
Before each API call:
1. MULTI
2. Check token count
3. Decrement if available
4. EXEC
5. If no tokens: delay until next refill window
Multiple factions share the same worker process but each faction's polling uses its own designated API key, keeping rate limits isolated per key.
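The bucket logic can be sketched with an in-memory stand-in for the Redis-backed version above (the real implementation runs the check and decrement inside a MULTI/EXEC transaction so concurrent workers cannot race on the same key).

```typescript
// In-memory token bucket sketch; the production version lives in Redis.
class TokenBucket {
  private tokens: number;

  constructor(
    private capacity = 100,
    private refillPerMs = 100 / 60_000, // 100 tokens per minute
    private lastRefill = Date.now(),
  ) {
    this.tokens = capacity;
  }

  tryConsume(now = Date.now()): boolean {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // proceed with the API call
    }
    return false; // caller delays until the next refill window
  }
}

// With a frozen clock, exactly the 100 req/min budget is allowed:
const bucket = new TokenBucket(100, 100 / 60_000, 0);
let allowed = 0;
for (let i = 0; i < 120; i++) if (bucket.tryConsume(0)) allowed++;
console.log(allowed); // 100
```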
Incremental Data Fetching:
War attack data uses timestamp-based deduplication. Each poll records the latest attack timestamp. The next poll requests only attacks after that timestamp. This avoids reprocessing thousands of duplicate records during a multi-hour war.
Poll N: GET attacksfull?from={lastTimestamp} -> 12 new attacks
Poll N+1: GET attacksfull?from={newTimestamp} -> 3 new attacks
Poll N+2: GET attacksfull?from={newTimestamp} -> 0 new attacks (quiet period)
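The ingest step behind those polls can be sketched as a filter-and-advance over the timestamp cursor. `Attack` and `ingestPoll` are illustrative names, not the worker's actual API, and the cursor never moves backward even on an empty poll.

```typescript
// Hypothetical sketch of timestamp-based dedup for incoming attack batches.
interface Attack {
  code: string;      // Torn's unique attack identifier
  timestamp: number; // epoch seconds of the attack
}

function ingestPoll(batch: Attack[], lastSeen: number): { fresh: Attack[]; cursor: number } {
  // Drop anything at or before the cursor (already ingested on a prior poll).
  const fresh = batch.filter((a) => a.timestamp > lastSeen);
  // Advance the cursor to the newest attack seen; an empty poll leaves it unchanged.
  const cursor = fresh.reduce((max, a) => Math.max(max, a.timestamp), lastSeen);
  return { fresh, cursor };
}

// A poll whose response overlaps already-ingested data keeps only the new record:
const poll = ingestPoll(
  [{ code: "a1", timestamp: 100 }, { code: "a2", timestamp: 105 }],
  100, // cursor from the previous poll
);
console.log(poll.fresh.length, poll.cursor); // 1 105
```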
Aggregation Worker
Runs daily at 3 AM UTC. Rolls up raw attack data into per-member statistics, calculates rankings, and generates summary records for the analytics dashboard. Approximately 461 lines.
Operations:
- Per-member daily activity rollups
- War contribution rankings (respect earned, attacks made, defends)
- Chain participation percentages
- Trend calculations (7-day, 30-day rolling averages)
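The rolling-average trend calculation can be sketched over daily rollup values, assuming one entry per day ordered oldest to newest (the function name and shape are illustrative).

```typescript
// Hypothetical sketch of the 7-day rolling average over daily rollups.
// Early days average over however many values exist so far.
function rollingAverage(daily: number[], window = 7): number[] {
  return daily.map((_, i) => {
    const slice = daily.slice(Math.max(0, i - window + 1), i + 1);
    return slice.reduce((sum, v) => sum + v, 0) / slice.length;
  });
}

console.log(rollingAverage([10, 20, 30], 7)); // [ 10, 15, 20 ]
```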
Discord Worker
Listens on the BullMQ queue for notification events and dispatches them as Discord webhook messages. Approximately 358 lines.
Notification Types:
- War started/ended with final scores
- Attack milestones (every 50 attacks, or configurable threshold)
- OC ready to initiate
- Chain bonus hit milestones
- Payout report published
Each faction configures its own webhook URL and which notification types it wants. The worker respects Discord's rate limits (30 messages per channel per minute).
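That per-channel limit can be sketched as a sliding-window check; the real worker queues and delays messages rather than dropping them, and the names here are illustrative.

```typescript
// Hypothetical sliding-window sketch of the 30 msgs/channel/minute limit.
const WINDOW_MS = 60_000;
const LIMIT = 30;
const sent = new Map<string, number[]>(); // channel -> send timestamps

function canSend(channel: string, now: number): boolean {
  // Keep only timestamps still inside the window, then check capacity.
  const recent = (sent.get(channel) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= LIMIT) {
    sent.set(channel, recent);
    return false; // caller should delay, not drop
  }
  recent.push(now);
  sent.set(channel, recent);
  return true;
}

let ok = 0;
for (let i = 0; i < 40; i++) if (canSend("war-room", 0)) ok++;
console.log(ok); // 30
```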
Deployment Topology
All components run as Docker containers on a VPS behind the existing infrastructure stack:
VPS Host (behind Cloudflare Tunnel)
├── Traefik (reverse proxy, TLS)
├── Authentik (identity, SSO)
└── Docker Compose: c2-faction-hq
├── app (Next.js, non-root, node:22-alpine)
├── worker (BullMQ workers, same image, different entrypoint)
├── postgres (PostgreSQL 17, persistent volume)
├── redis (Redis 7, AOF persistence)
└── migrate (Prisma migrations, runs once on deploy)
Docker Configuration
Multi-stage builds keep the production image lean:
Stage 1: deps (install node_modules)
Stage 2: build (next build, prisma generate)
Stage 3: runtime (node:22-alpine, non-root user, copy build artifacts only)
The worker service uses the same Docker image as app but overrides the entrypoint to run the BullMQ worker process instead of the Next.js server. This keeps the build pipeline simple -- one image, two services.
The migrate service runs prisma migrate deploy on startup and exits. Docker Compose depends_on with health checks ensures Postgres is ready before migrations run, and the app waits for migrations to complete before accepting traffic.
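The ordering described above might look like the following compose fragment. This is a hypothetical sketch, not the project's actual file; service names and healthcheck details are assumptions.

```yaml
# Illustrative fragment: Postgres must be healthy before migrations run,
# and migrations must finish before the app starts accepting traffic.
services:
  postgres:
    image: postgres:17
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
      interval: 5s
      retries: 10
  migrate:
    command: ["npx", "prisma", "migrate", "deploy"]
    depends_on:
      postgres:
        condition: service_healthy
  app:
    depends_on:
      migrate:
        condition: service_completed_successfully
```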
Trust Boundaries
Boundary 1: Browser to Next.js
All client-server communication uses HTTPS (terminated at Traefik). The Next.js API layer enforces:
- JWT validation on every authenticated request
- CSRF protection via Origin header checking on all state-changing requests
- Rate limiting at 5 attempts per minute per IP on the login endpoint
- Input validation with Zod schemas on all API payloads
Boundary 2: Next.js to PostgreSQL
The application connects to PostgreSQL over the Docker internal network. Prisma parameterizes all queries, preventing SQL injection. The database is not exposed on any external port.
Boundary 3: Workers to Torn API
Background workers decrypt user API keys from the database to make Torn API calls. The decrypted key exists only in memory for the duration of the API call. Keys are never logged, never cached in Redis, and never written to disk unencrypted.
Boundary 4: Workers to Discord
Discord webhook URLs are stored in the database per faction. The Discord worker posts to these URLs over HTTPS. Webhook URLs are treated as secrets -- they are not exposed to non-admin users in the frontend.
Boundary 5: Multi-Tenant Data Isolation
Faction ID scoping is the primary isolation mechanism. Every query goes through Prisma middleware that enforces faction boundaries. There is no API endpoint that accepts a raw faction ID from the client -- the faction ID is always derived from the authenticated user's JWT claims.
Security Posture
What is hardened
- API keys encrypted at rest with AES-256-GCM (unique IV per key, encryption key from environment variable)
- API keys never returned to the frontend after initial authentication
- JWT access tokens expire in 24 hours, refresh tokens in 7 days
- CSRF validation on all state-changing requests (Origin header check)
- Login rate limiting: 5 attempts per minute per IP
- Auth middleware on all API routes -- no unprotected state-changing endpoints
- Security headers: Content-Security-Policy, X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Strict-Transport-Security with preload
- Docker containers run as non-root user
- PostgreSQL and Redis listen only on Docker internal network
- Audit logging for all sensitive operations (permission changes, payout modifications, API key rotations)
- Prisma parameterized queries (no raw SQL string concatenation)
What is monitored
- Authentication events (success, failure, rate limit hits) logged to application log
- Torn API rate limit consumption tracked in Redis (alerts if consistently above 80% utilization)
- BullMQ job failure rates and retry counts
- Container health checks with restart policies
- Traefik access logs capture all HTTP traffic patterns
What is not yet hardened
- Automated database backups are not in place (planned: pg_dump CronJob to encrypted off-site storage)
- No horizontal scaling -- single app instance, single worker instance (acceptable for current faction size, would need connection pooling and job partitioning to scale)
- Test coverage is minimal (planned: integration tests for auth flow, RBAC enforcement, and payout calculation logic)
- No WAF rules specific to the application (relies on Cloudflare's generic WAF rules upstream)
- Secrets rotation is manual (planned: automated rotation for JWT signing keys and encryption keys with re-encryption migration)
Alternatives Considered
SolidStart or Remix instead of Next.js
Both were evaluated. SolidStart is what I use for my portfolio site and is excellent for content-heavy, animation-rich pages. Remix has strong data loading patterns.
Chose Next.js because:
- The App Router's server component model maps well to a data-heavy dashboard with frequent server-side data fetching
- The middleware system provides a clean integration point for auth and RBAC on every route
- The ecosystem for complex dashboard applications is more mature -- better library support for data tables, charts, and form handling
- Server Actions simplify mutation patterns for the many form-heavy pages (payout editing, OC assignment, rank configuration)
This is a CRUD-heavy operational tool, not a marketing site. Framework maturity and ecosystem depth matter more than rendering performance here.
Raw SQL or Drizzle instead of Prisma
Raw SQL with a query builder (like Kysely) or Drizzle ORM were considered.
Chose Prisma because:
- The schema file serves as living documentation of the data model -- 14 entities with complex relationships are easier to reason about in Prisma schema format than in raw migration files
- Type-safe queries catch relationship errors at build time, not at runtime
- The migration system handles schema evolution cleanly across development and production
- Middleware hooks provide a natural place for multi-tenant query scoping
- The tradeoff is query performance -- Prisma generates more complex SQL than hand-written queries. For this application's data volume (thousands of records, not millions), this is acceptable.
Cron jobs instead of BullMQ
The simplest approach would be node-cron scheduling API polls inside the Next.js process.
Chose BullMQ because:
- Reliability. If a polling job fails (Torn API down, network timeout), BullMQ retries with configurable backoff. Cron jobs just silently skip the interval.
- Observability. BullMQ provides job status, failure reasons, processing times, and retry counts. Cron jobs are fire-and-forget.
- Isolation. Workers run in a separate process from the web server. A slow or stuck aggregation job cannot block API responses or SSR rendering.
- Concurrency control. BullMQ manages concurrent job execution. Preventing two instances of the same poll from running simultaneously is built into the queue semantics, not bolted on with lock files.
Polling from Next.js API routes instead of separate workers
An alternative was to trigger Torn API polling from API route handlers -- either on a schedule via Vercel Cron (if deployed there) or via client-initiated requests.
Chose separate worker process because:
- API routes have execution time limits in most deployment environments
- Polling multiple factions with overlapping schedules requires persistent process state (last poll timestamps, rate limit counters)
- A crashed web server should not stop background data collection, and vice versa
- Resource contention between SSR rendering and heavy data processing is eliminated by process isolation
Consequences
Positive
- Real-time operational intelligence. War leaders see a live scoreboard instead of manually checking the game. Decisions about who to encourage, who is underperforming, and when to push for bonus rewards are informed by data, not guesswork.
- Fair payout calculations. Contribution-based formulas with market price lookups replace manual spreadsheet math. Disputes decrease because the calculation is transparent and auditable.
- Automated coordination. OC readiness notifications and chain milestone alerts reduce the "herding cats" problem. Members know when they are needed without leaders pinging them individually.
- Security-first API key handling. Encrypted storage, no frontend exposure, and audit logging mean members can trust the platform with their API keys -- which is the single biggest adoption barrier for any Torn third-party tool.
- Multi-tenant from day one. Adding allied factions is a configuration change, not an architecture change.
Negative
- Operational complexity. Five containers (app, worker, postgres, redis, migrate) is more to operate than a single SQLite-backed application. The PostgreSQL migration added deployment complexity but was necessary for concurrent access and proper query capabilities.
- Torn API dependency. The entire platform is only as reliable as the Torn API. Downtime, rate limit changes, or API deprecations directly impact the application. No mitigation exists beyond caching stale data and graceful degradation.
- Single-instance limitation. The current architecture does not horizontally scale. If multiple large factions generate enough load to saturate one VPS, the path forward requires connection pooling (PgBouncer), worker partitioning, and potentially a second app instance behind a load balancer. None of this is in place.
- Minimal test coverage. The rapid development pace prioritized features over tests. The RBAC system, payout calculations, and auth flow are the highest-risk untested code paths.
Neutral
- PostgreSQL over SQLite was the right move. The migration from better-sqlite3 to PostgreSQL added short-term effort but resolved concurrent access issues, enabled proper full-text search, and provides a path to automated backups and replication if needed.
- Tailwind CSS 4 handles the dashboard UI efficiently. The utility-first approach suits a data-dense application where custom design is less important than consistent, functional layouts.
- BullMQ's Redis dependency was already present for caching and sessions. Adding queue semantics to an existing Redis instance added no new infrastructure.
Revision History
| Date | Change |
|---|---|
| 2026-02-17 | Initial architecture decision, documenting post-PostgreSQL migration state |