Status
Accepted -- In active development. PostgreSQL migration completed February 2026, replacing the initial better-sqlite3 prototype database.
Context
Torn.com is a text-based MMORPG with a deep faction warfare system. Factions coordinate wars, organized crimes, chain attacks, and internal economics through the game's web interface and API. The game provides a comprehensive API, but no built-in tools for the operational complexity that competitive factions deal with daily:
- War tracking is manual. During ranked wars, leaders alt-tab between the game and spreadsheets to track who hit whom, calculate respect earned, and determine fair reward payouts. Wars can last hours with hundreds of attacks.
- Organized crime coordination is fragile. OCs require specific member slots filled with the right stats. Leaders track readiness via Discord pings and memory. Members get missed, OCs expire, rewards are lost.
- Financial accounting does not exist. Post-war, leaders promise item payouts based on contribution. Calculating who earned what, fetching current market prices, and distributing rewards is a multi-hour manual process prone to errors and disputes.
- Member analytics are scattered. Understanding who is active, who is contributing, and who needs attention requires manually pulling data from multiple API endpoints and cross-referencing timestamps.
The Torn API provides all the raw data. What is missing is a system that polls it, aggregates it, and presents it as operational intelligence.
Requirements
- Real-time war scoreboard -- Live attack feed during ranked wars with per-member stats, updated within 30 seconds of an attack landing
- Organized crime management -- Track OC slots, member readiness, and completion status with notifications when OCs are ready to initiate
- Automated payout calculation -- Post-war reward distribution based on configurable contribution formulas with market price lookups
- Member analytics -- Activity tracking, contribution history, and performance trends over time
- Multi-faction isolation -- Multiple factions can use the platform without seeing each other's data
- Role-based access control -- Permissions mapped to in-game faction positions, not a separate permission system
- Discord integration -- Webhook notifications for war events, OC readiness, and chain milestones
- Audit trail -- Log who changed what, especially for payout modifications and permission changes
Constraints
- The Torn API enforces a hard rate limit of 100 requests per minute per API key
- API keys are the only authentication mechanism -- there are no OAuth flows or session tokens from the game
- Members provide their personal API keys, which must be stored securely (they grant access to sensitive in-game data)
- The operator (me) runs this for my faction and potentially allied factions -- it is not a SaaS product, but must support multi-tenant isolation
- Runs on a single VPS behind the existing Cloudflare Tunnel + Traefik + Authentik stack
- Budget is homelab-scale -- no managed database services or paid infrastructure
Decision
Build a full-stack web application using Next.js 16 with the App Router, PostgreSQL for persistence, Redis for caching and rate limiting, and BullMQ for background job processing. Authentication uses Torn API key verification against the game's API, with JWT sessions for subsequent requests. Background workers handle all API polling independently from the web server process.
Architecture Overview
Browser
|
v
Cloudflare Tunnel
|
v
Traefik + Authentik
|
v
+-------------------------------------------+
| Next.js App Server |
| (App Router, API routes, SSR, RSC) |
+-----+-----------+----------------+--------+
| | |
v v v
+---------+ +---------+ +-------------+
|Postgres | | Redis | | BullMQ |
| 17 | | 7 | | Queues |
|(primary | |(cache, | |(jobs, retry,|
| store) | | rate | | scheduling) |
| | | limits, | | |
| | | sessions)| | |
+---------+ +---------+ +------+------+
|
+----------------+----------------+
| | |
v v v
+-----------+ +------------+ +-----------+
| Torn API | |Aggregation | | Discord |
| Worker | | Worker | | Worker |
| (polling, | | (rollups, | |(webhooks, |
| dedup, | | rankings) | | alerts) |
| ingest) | | | | |
+-----------+ +------------+ +-----------+
|
v
Torn.com API
(100 req/min/key)
Component Responsibilities
| Component | Responsibility | Runtime |
|---|---|---|
| Next.js App Server | SSR, API routes, authentication, RBAC enforcement, frontend rendering | Node.js 22 (App Router) |
| PostgreSQL 17 | Primary data store for all persistent state -- factions, users, wars, attacks, OCs, payouts, audit log | Containerized |
| Redis 7 | API response caching, rate limit token buckets, JWT session store, BullMQ queue backend | Containerized |
| Torn API Worker | Polls Torn API for member data, war attacks, OC status, market prices; deduplicates and persists | BullMQ worker process |
| Aggregation Worker | Daily rollups at 3 AM, per-member stat aggregation, ranking calculations | BullMQ worker process |
| Discord Worker | Sends webhook notifications for war events, OC readiness, chain milestones | BullMQ worker process |
| Prisma ORM | Schema definition, migrations, type-safe database queries, seed data | Build-time + runtime |
Data Model
Core Entities
| Entity | Purpose | Key Relationships |
|---|---|---|
| Faction | Tenant boundary -- all queries scoped by faction ID | Has many Users, RankedWars, OrganizedCrimes |
| User | Faction member with encrypted API key and assigned permissions | Belongs to Faction, has many WarAttacks |
| RankedWar | A tracked war instance with metadata, scores, and lifecycle state | Belongs to Faction, has many WarAttacks, WarMemberStats |
| WarAttack | Individual attack record (denormalized from API) with dedup via timestamp | Belongs to RankedWar |
| WarMemberStats | Aggregated per-member stats for a specific war | Belongs to RankedWar, references User |
| WarReward | Item rewards logged by leaders post-war | Belongs to RankedWar |
| WarPayout | Calculated payout per member based on contribution formula | Belongs to RankedWar, references User |
| OrganizedCrime | OC instance with type, slots, readiness, and outcome | Belongs to Faction |
| OCParticipant | Member assigned to an OC slot with readiness status | Belongs to OrganizedCrime, references User |
| ChainReport | Chain tracking session with duration and hit count | Belongs to Faction |
| ChainMemberStats | Per-member contribution to a chain | Belongs to ChainReport, references User |
| RankConfig | RBAC configuration mapping in-game positions to permissions | Belongs to Faction |
| DiscordConfig | Webhook URLs and notification preferences per faction | Belongs to Faction |
| AuditLog | Immutable log of sensitive actions with actor, action, and diff | Belongs to Faction |
Multi-Tenant Isolation
Every database query is scoped by faction ID. This is enforced at the Prisma client level through middleware that injects a where: { factionId } clause on all read operations and validates faction ownership on all writes. There is no cross-faction data access path.
Faction A Faction B
+-----------------------+ +-----------------------+
| Users (A) | | Users (B) |
| Wars (A) | | Wars (B) |
| Attacks (A) | | Attacks (B) |
| OCs (A) | | OCs (B) |
| Audit Log (A) | | Audit Log (B) |
+-----------------------+ +-----------------------+
| |
+----------+ +-----------+
| |
v v
+----------------+
| PostgreSQL |
| (shared DB, |
| logical |
| isolation) |
+----------------+
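The scoping rule above can be sketched as a pure function: inject the caller's faction ID into every query's filter so a client-supplied faction ID can never widen the result set. The names here (`scopeToFaction`, `QueryArgs`) are illustrative; the real enforcement lives in Prisma client middleware.

```typescript
// Hypothetical sketch of tenant scoping, not the actual middleware code.
type QueryArgs = { where?: Record<string, unknown>; [k: string]: unknown };

// Merge the authenticated faction ID over whatever `where` the caller built.
function scopeToFaction(args: QueryArgs, factionId: number): QueryArgs {
  return { ...args, where: { ...(args.where ?? {}), factionId } };
}

// A query that tries to read another faction's wars is silently re-scoped:
const unsafe = { where: { factionId: 999, status: "active" } };
const scoped = scopeToFaction(unsafe, 42);
console.log(scoped.where); // { factionId: 42, status: "active" }
```

Because the faction ID is spread last, it always overrides any value the caller provided.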
Authentication and Authorization
Authentication Flow
User Next.js Torn API
| | |
| POST /api/auth | |
| { apiKey } | |
|--------------------->| |
| | GET /user?key=... |
| |----------------------->|
| | { player_id, name, |
| | faction } |
| |<-----------------------|
| | |
| | Encrypt API key |
| | (AES-256-GCM) |
| | Store/update user |
| | Issue JWT pair |
| | |
| { accessToken, | |
| refreshToken } | |
|<---------------------| |
API keys are encrypted with AES-256-GCM using a unique initialization vector per key before storage. After the initial authentication, the key is never sent back to the frontend. The encrypted key is only decrypted server-side when the background workers need to make Torn API calls on behalf of the user.
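The encrypt-with-unique-IV step can be sketched with Node's built-in crypto module. The storage format here (hex-encoded `iv:tag:ciphertext`) and the function names are assumptions for illustration, not the application's actual format.

```typescript
// Sketch of AES-256-GCM API key encryption using node:crypto (format is hypothetical).
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

function encryptApiKey(apiKey: string, masterKey: Buffer): string {
  const iv = randomBytes(12); // unique IV per key, as GCM requires
  const cipher = createCipheriv("aes-256-gcm", masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(apiKey, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((b) => b.toString("hex")).join(":");
}

function decryptApiKey(stored: string, masterKey: Buffer): string {
  const [iv, tag, ciphertext] = stored.split(":").map((h) => Buffer.from(h, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", masterKey, iv);
  decipher.setAuthTag(tag); // tag is verified in final(); tampering throws there
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Round trip with a 32-byte master key (in production, loaded from an env variable):
const masterKey = randomBytes(32);
const stored = encryptApiKey("abc123torn", masterKey);
console.log(decryptApiKey(stored, masterKey) === "abc123torn"); // true
```

GCM's authentication tag means a tampered ciphertext fails loudly at decryption rather than yielding garbage.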
RBAC Model
The permission system maps directly to in-game faction positions rather than inventing a separate role hierarchy. This means the access model is immediately intuitive to faction leaders who already understand the game's permission structure.
17 granular permissions cover all sensitive operations:
| Permission | Controls |
|---|---|
| war.view | View war scoreboard and attack history |
| war.manage | Start/stop war tracking, modify war settings |
| war.payout | View and modify payout calculations |
| oc.view | View organized crime status and slots |
| oc.manage | Assign members to OC slots, initiate OCs |
| chain.view | View chain reports and member stats |
| chain.manage | Start/stop chain tracking |
| member.view | View member profiles and activity |
| member.manage | Modify member settings, force-refresh data |
| finance.view | View faction financial summaries |
| finance.manage | Log rewards, approve payouts |
| discord.view | View Discord webhook configuration |
| discord.manage | Configure webhook URLs, notification rules |
| config.view | View faction settings and rank configuration |
| config.manage | Modify rank-permission mappings |
| audit.view | View audit log |
| admin | Full access, manage faction registration |
Faction leaders configure which in-game position maps to which permissions via the RankConfig entity. Middleware on every API route checks the requesting user's permissions before proceeding.
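The position-to-permission lookup can be sketched as a simple map plus a check helper. The position names and the `hasPermission` function are illustrative; the real mapping is stored per faction in the RankConfig entity and evaluated by route middleware.

```typescript
// Hypothetical sketch of the RankConfig lookup, not the actual middleware.
type Permission = "war.view" | "war.manage" | "war.payout" | "admin";

// Example mapping a leader might configure (position names are made up):
const rankConfig: Record<string, Permission[]> = {
  Leader: ["admin"],
  "Co-leader": ["war.view", "war.manage", "war.payout"],
  Member: ["war.view"],
};

function hasPermission(position: string, needed: Permission): boolean {
  const perms = rankConfig[position] ?? [];
  return perms.includes("admin") || perms.includes(needed); // admin implies all
}

console.log(hasPermission("Leader", "war.payout")); // true
console.log(hasPermission("Member", "war.manage")); // false
```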
Background Workers
Torn API Worker
The largest worker at approximately 910 lines. Responsible for all communication with the Torn API.
Polling Schedule:
| Endpoint | Interval | Rationale |
|---|---|---|
| Members list | 5 minutes | Activity detection, stat tracking |
| War attacks (attacksfull) | 30 seconds | Near-real-time war scoreboard |
| OC status | 2 minutes | Readiness notifications |
| Market prices | 15 minutes | Payout value calculations |
Rate Limiting:
The Torn API allows 100 requests per minute per key. The worker implements a token bucket algorithm backed by Redis:
Redis Key: ratelimit:{apiKey}
Algorithm: Token bucket (100 tokens, refill 100/min)
Before each API call:
1. MULTI
2. Check token count
3. Decrement if available
4. EXEC
5. If no tokens: delay until next refill window
Multiple factions share the same worker process but each faction's polling uses its own designated API key, keeping rate limits isolated per key.
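The bucket logic can be sketched with an in-memory stand-in for the Redis-backed version above (the real implementation runs the check and decrement inside a MULTI/EXEC transaction so concurrent workers cannot race on the same key).

```typescript
// In-memory token bucket sketch; the production version lives in Redis.
class TokenBucket {
  private tokens: number;

  constructor(
    private capacity = 100,
    private refillPerMs = 100 / 60_000, // 100 tokens per minute
    private lastRefill = Date.now(),
  ) {
    this.tokens = capacity;
  }

  tryConsume(now = Date.now()): boolean {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // proceed with the API call
    }
    return false; // caller delays until the next refill window
  }
}

// With a frozen clock, exactly the 100 req/min budget is allowed:
const bucket = new TokenBucket(100, 100 / 60_000, 0);
let allowed = 0;
for (let i = 0; i < 120; i++) if (bucket.tryConsume(0)) allowed++;
console.log(allowed); // 100
```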
Incremental Data Fetching:
War attack data uses timestamp-based deduplication. Each poll records the latest attack timestamp. The next poll requests only attacks after that timestamp. This avoids reprocessing thousands of duplicate records during a multi-hour war.
Poll N: GET attacksfull?from={lastTimestamp} -> 12 new attacks
Poll N+1: GET attacksfull?from={newTimestamp} -> 3 new attacks
Poll N+2: GET attacksfull?from={newTimestamp} -> 0 new attacks (quiet period)
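The ingest step behind those polls can be sketched as a filter-and-advance over the timestamp cursor. `Attack` and `ingestPoll` are illustrative names, not the worker's actual API, and the cursor never moves backward even on an empty poll.

```typescript
// Hypothetical sketch of timestamp-based dedup for incoming attack batches.
interface Attack {
  code: string;      // Torn's unique attack identifier
  timestamp: number; // epoch seconds of the attack
}

function ingestPoll(batch: Attack[], lastSeen: number): { fresh: Attack[]; cursor: number } {
  // Drop anything at or before the cursor (already ingested on a prior poll).
  const fresh = batch.filter((a) => a.timestamp > lastSeen);
  // Advance the cursor to the newest attack seen; an empty poll leaves it unchanged.
  const cursor = fresh.reduce((max, a) => Math.max(max, a.timestamp), lastSeen);
  return { fresh, cursor };
}

// A poll whose response overlaps already-ingested data keeps only the new record:
const poll = ingestPoll(
  [{ code: "a1", timestamp: 100 }, { code: "a2", timestamp: 105 }],
  100, // cursor from the previous poll
);
console.log(poll.fresh.length, poll.cursor); // 1 105
```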
Aggregation Worker
Runs daily at 3 AM UTC. Rolls up raw attack data into per-member statistics, calculates rankings, and generates summary records for the analytics dashboard. Approximately 461 lines.
Operations:
- Per-member daily activity rollups
- War contribution rankings (respect earned, attacks made, defends)
- Chain participation percentages
- Trend calculations (7-day, 30-day rolling averages)
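The rolling-average trend calculation can be sketched over daily rollup values, assuming one entry per day ordered oldest to newest (the function name and shape are illustrative).

```typescript
// Hypothetical sketch of the 7-day rolling average over daily rollups.
// Early days average over however many values exist so far.
function rollingAverage(daily: number[], window = 7): number[] {
  return daily.map((_, i) => {
    const slice = daily.slice(Math.max(0, i - window + 1), i + 1);
    return slice.reduce((sum, v) => sum + v, 0) / slice.length;
  });
}

console.log(rollingAverage([10, 20, 30], 7)); // [ 10, 15, 20 ]
```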
Discord Worker
Listens on the BullMQ queue for notification events and dispatches them as Discord webhook messages. Approximately 358 lines.
Notification Types:
- War started/ended with final scores
- Attack milestones (every 50 attacks, or configurable threshold)
- OC ready to initiate
- Chain bonus hit milestones
- Payout report published
Each faction configures its own webhook URL and which notification types it wants. The worker respects Discord's rate limits (30 messages per channel per minute).
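That per-channel limit can be sketched as a sliding-window check; the real worker queues and delays messages rather than dropping them, and the names here are illustrative.

```typescript
// Hypothetical sliding-window sketch of the 30 msgs/channel/minute limit.
const WINDOW_MS = 60_000;
const LIMIT = 30;
const sent = new Map<string, number[]>(); // channel -> send timestamps

function canSend(channel: string, now: number): boolean {
  // Keep only timestamps still inside the window, then check capacity.
  const recent = (sent.get(channel) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= LIMIT) {
    sent.set(channel, recent);
    return false; // caller should delay, not drop
  }
  recent.push(now);
  sent.set(channel, recent);
  return true;
}

let ok = 0;
for (let i = 0; i < 40; i++) if (canSend("war-room", 0)) ok++;
console.log(ok); // 30
```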
Deployment Topology
All components run as Docker containers on a VPS behind the existing infrastructure stack:
VPS Host (behind Cloudflare Tunnel)
├── Traefik (reverse proxy, TLS)
├── Authentik (identity, SSO)
└── Docker Compose: c2-faction-hq
├── app (Next.js, non-root, node:22-alpine)
├── worker (BullMQ workers, same image, different entrypoint)
├── postgres (PostgreSQL 17, persistent volume)
├── redis (Redis 7, AOF persistence)
└── migrate (Prisma migrations, runs once on deploy)
Docker Configuration
Multi-stage builds keep the production image lean:
Stage 1: deps (install node_modules)
Stage 2: build (next build, prisma generate)
Stage 3: runtime (node:22-alpine, non-root user, copy build artifacts only)
The worker service uses the same Docker image as app but overrides the entrypoint to run the BullMQ worker process instead of the Next.js server. This keeps the build pipeline simple -- one image, two services.
The migrate service runs prisma migrate deploy on startup and exits. Docker Compose depends_on with health checks ensures Postgres is ready before migrations run, and the app waits for migrations to complete before accepting traffic.
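The ordering described above might look like the following compose fragment. This is a hypothetical sketch, not the project's actual file; service names and healthcheck details are assumptions.

```yaml
# Illustrative fragment: Postgres must be healthy before migrations run,
# and migrations must finish before the app starts accepting traffic.
services:
  postgres:
    image: postgres:17
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
      interval: 5s
      retries: 10
  migrate:
    command: ["npx", "prisma", "migrate", "deploy"]
    depends_on:
      postgres:
        condition: service_healthy
  app:
    depends_on:
      migrate:
        condition: service_completed_successfully
```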
Trust Boundaries
Boundary 1: Browser to Next.js
All client-server communication uses HTTPS (terminated at Traefik). The Next.js API layer enforces:
- JWT validation on every authenticated request
- CSRF protection via Origin header checking on all state-changing requests
- Rate limiting at 5 attempts per minute per IP on the login endpoint
- Input validation with Zod schemas on all API payloads
Boundary 2: Next.js to PostgreSQL
The application connects to PostgreSQL over the Docker internal network. Prisma parameterizes all queries, preventing SQL injection. The database is not exposed on any external port.
Boundary 3: Workers to Torn API
Background workers decrypt user API keys from the database to make Torn API calls. The decrypted key exists only in memory for the duration of the API call. Keys are never logged, never cached in Redis, and never written to disk unencrypted.
Boundary 4: Workers to Discord
Discord webhook URLs are stored in the database per faction. The Discord worker posts to these URLs over HTTPS. Webhook URLs are treated as secrets -- they are not exposed to non-admin users in the frontend.
Boundary 5: Multi-Tenant Data Isolation
Faction ID scoping is the primary isolation mechanism. Every query goes through Prisma middleware that enforces faction boundaries. There is no API endpoint that accepts a raw faction ID from the client -- the faction ID is always derived from the authenticated user's JWT claims.
Security Posture
What is hardened
- API keys encrypted at rest with AES-256-GCM (unique IV per key, encryption key from environment variable)
- API keys never returned to the frontend after initial authentication
- JWT access tokens expire in 24 hours, refresh tokens in 7 days
- CSRF validation on all state-changing requests (Origin header check)
- Login rate limiting: 5 attempts per minute per IP
- Auth middleware on all API routes -- no unprotected state-changing endpoints
- Security headers: Content-Security-Policy, X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Strict-Transport-Security with preload
- Docker containers run as non-root user
- PostgreSQL and Redis listen only on Docker internal network
- Audit logging for all sensitive operations (permission changes, payout modifications, API key rotations)
- Prisma parameterized queries (no raw SQL string concatenation)
What is monitored
- Authentication events (success, failure, rate limit hits) logged to application log
- Torn API rate limit consumption tracked in Redis (alerts if consistently above 80% utilization)
- BullMQ job failure rates and retry counts
- Container health checks with restart policies
- Traefik access logs capture all HTTP traffic patterns
What is not yet hardened
- Automated database backups are not in place (planned: pg_dump CronJob to encrypted off-site storage)
- No horizontal scaling -- single app instance, single worker instance (acceptable for current faction size, would need connection pooling and job partitioning to scale)
- Test coverage is minimal (planned: integration tests for auth flow, RBAC enforcement, and payout calculation logic)
- No WAF rules specific to the application (relies on Cloudflare's generic WAF rules upstream)
- Secrets rotation is manual (planned: automated rotation for JWT signing keys and encryption keys with re-encryption migration)
Alternatives Considered
SolidStart or Remix instead of Next.js
Both were evaluated. SolidStart is what I use for my portfolio site and is excellent for content-heavy, animation-rich pages. Remix has strong data loading patterns.
Chose Next.js because:
- The App Router's server component model maps well to a data-heavy dashboard with frequent server-side data fetching
- The middleware system provides a clean integration point for auth and RBAC on every route
- The ecosystem for complex dashboard applications is more mature -- better library support for data tables, charts, and form handling
- Server Actions simplify mutation patterns for the many form-heavy pages (payout editing, OC assignment, rank configuration)
This is a CRUD-heavy operational tool, not a marketing site. Framework maturity and ecosystem depth matter more than rendering performance here.
Raw SQL or Drizzle instead of Prisma
Raw SQL with a query builder (like Kysely) or Drizzle ORM were considered.
Chose Prisma because:
- The schema file serves as living documentation of the data model -- 14 entities with complex relationships are easier to reason about in Prisma schema format than in raw migration files
- Type-safe queries catch relationship errors at build time, not at runtime
- The migration system handles schema evolution cleanly across development and production
- Middleware hooks provide a natural place for multi-tenant query scoping
- The tradeoff is query performance -- Prisma generates more complex SQL than hand-written queries. For this application's data volume (thousands of records, not millions), this is acceptable.
Cron jobs instead of BullMQ
The simplest approach would be node-cron scheduling API polls inside the Next.js process.
Chose BullMQ because:
- Reliability. If a polling job fails (Torn API down, network timeout), BullMQ retries with configurable backoff. Cron jobs just silently skip the interval.
- Observability. BullMQ provides job status, failure reasons, processing times, and retry counts. Cron jobs are fire-and-forget.
- Isolation. Workers run in a separate process from the web server. A slow or stuck aggregation job cannot block API responses or SSR rendering.
- Concurrency control. BullMQ manages concurrent job execution. Preventing two instances of the same poll from running simultaneously is built into the queue semantics, not bolted on with lock files.
Polling from Next.js API routes instead of separate workers
An alternative was to trigger Torn API polling from API route handlers -- either on a schedule via Vercel Cron (if deployed there) or via client-initiated requests.
Chose separate worker process because:
- API routes have execution time limits in most deployment environments
- Polling multiple factions with overlapping schedules requires persistent process state (last poll timestamps, rate limit counters)
- A crashed web server should not stop background data collection, and vice versa
- Resource contention between SSR rendering and heavy data processing is eliminated by process isolation
Consequences
Positive
- Real-time operational intelligence. War leaders see a live scoreboard instead of manually checking the game. Decisions about who to encourage, who is underperforming, and when to push for bonus rewards are informed by data, not guesswork.
- Fair payout calculations. Contribution-based formulas with market price lookups replace manual spreadsheet math. Disputes decrease because the calculation is transparent and auditable.
- Automated coordination. OC readiness notifications and chain milestone alerts reduce the "herding cats" problem. Members know when they are needed without leaders pinging them individually.
- Security-first API key handling. Encrypted storage, no frontend exposure, and audit logging mean members can trust the platform with their API keys -- which is the single biggest adoption barrier for any Torn third-party tool.
- Multi-tenant from day one. Adding allied factions is a configuration change, not an architecture change.
Negative
- Operational complexity. Five containers (app, worker, postgres, redis, migrate) is more to operate than a single SQLite-backed application. The PostgreSQL migration added deployment complexity but was necessary for concurrent access and proper query capabilities.
- Torn API dependency. The entire platform is only as reliable as the Torn API. Downtime, rate limit changes, or API deprecations directly impact the application. No mitigation exists beyond caching stale data and graceful degradation.
- Single-instance limitation. The current architecture does not horizontally scale. If multiple large factions generate enough load to saturate one VPS, the path forward requires connection pooling (PgBouncer), worker partitioning, and potentially a second app instance behind a load balancer. None of this is in place.
- Minimal test coverage. The rapid development pace prioritized features over tests. The RBAC system, payout calculations, and auth flow are the highest-risk untested code paths.
Neutral
- PostgreSQL over SQLite was the right move. The migration from better-sqlite3 to PostgreSQL added short-term effort but resolved concurrent access issues, enabled proper full-text search, and provides a path to automated backups and replication if needed.
- Tailwind CSS 4 handles the dashboard UI efficiently. The utility-first approach suits a data-dense application where custom design is less important than consistent, functional layouts.
- BullMQ's Redis dependency was already present for caching and sessions. Adding queue semantics to an existing Redis instance added no new infrastructure.
Revision History
| Date | Change |
|---|---|
| 2026-02-17 | Initial architecture decision, documenting post-PostgreSQL migration state |