70. Unified Worker for All Consumers

Status: Accepted
Date: 2025-07-06

Context

Our system uses BullMQ for many different asynchronous tasks across various modules (Admin, Apollo, Minerva, etc.). Each of these tasks requires a "consumer" or "worker" process to listen to a queue and execute the job. We need to decide on a deployment strategy for these consumers. We could deploy each consumer as a separate microservice, or we could group them together.

Decision

We will adopt a Unified Worker (or "Fat Worker") deployment model. We will create a single, comprehensive mercury-worker application whose sole purpose is to host and run all the queue consumers from across the entire Mercury system.

This worker application will import the necessary modules and providers from all other parts of the application to provide the context needed for each consumer to do its job. This single mercury-worker artifact will be the engine for all background job processing.
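
As a rough illustration of the composition described above, the following TypeScript sketch models the unified worker as a single registry that collects the consumers exposed by each module. It deliberately avoids importing BullMQ or NestJS so that it stays self-contained. `ConsumerModule`, `UnifiedWorker`, and the queue names are hypothetical and are not taken from the Mercury codebase; in the real application each registered entry would presumably be backed by a BullMQ worker listening on that queue.

```typescript
// Hypothetical shapes: a job has an id and a payload; a processor handles one job.
type Job = { id: string; data: unknown };
type Processor = (job: Job) => Promise<unknown>;

// Each feature module exposes the consumers it owns, keyed by queue name.
interface ConsumerModule {
  name: string;
  consumers: Record<string, Processor>;
}

// The unified worker composes every module's consumers into one registry.
class UnifiedWorker {
  private readonly processors = new Map<string, Processor>();

  constructor(modules: ConsumerModule[]) {
    for (const mod of modules) {
      for (const queue of Object.keys(mod.consumers)) {
        // Two modules claiming the same queue is treated as a wiring error.
        if (this.processors.has(queue)) {
          throw new Error(`Queue "${queue}" registered twice (module ${mod.name})`);
        }
        this.processors.set(queue, mod.consumers[queue]);
      }
    }
  }

  // All queue names this worker would listen to, in alphabetical order.
  queues(): string[] {
    return Array.from(this.processors.keys()).sort();
  }

  // Route a job to the consumer registered for its queue.
  async handle(queue: string, job: Job): Promise<unknown> {
    const processor = this.processors.get(queue);
    if (!processor) {
      throw new Error(`No consumer registered for queue "${queue}"`);
    }
    return processor(job);
  }
}

// Illustrative modules only; the queue names are made up.
const adminModule: ConsumerModule = {
  name: "admin",
  consumers: { "admin.audit-export": async (job) => `exported ${job.id}` },
};
const apolloModule: ConsumerModule = {
  name: "apollo",
  consumers: { "apollo.sync": async (job) => `synced ${job.id}` },
};

const worker = new UnifiedWorker([adminModule, apolloModule]);
```

Adding a consumer then means adding it to its owning module; the worker's top-level wiring stays a flat list of modules.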

Consequences

Positive:

  • Simplified Deployment & Operations: We only have to deploy, monitor, and scale one type of worker process instead of dozens of different ones. This substantially reduces our operational overhead.
  • Resource Efficiency: A single, larger worker process can often be more resource-efficient (in terms of memory and CPU) than many small, separate processes, because consumers running in the same process share common dependencies and connections.
  • Centralized Management: All background processing logic is co-located and managed in a single place, making it easier to reason about the system's overall capacity and behavior.

Negative:

  • Tightly Coupled Deployment: All consumers are bundled together. A change to one consumer requires a restart of the entire worker process, which temporarily stops processing for all other tasks. A bug in one consumer (like a memory leak) can affect all other consumers running in the same process.
  • Single Point of Failure: If the unified worker process crashes, all background job processing across the entire system stops.
  • Complex Dependency Management: The unified worker has a large number of dependencies, as it needs to import modules from all over the application. This can make its dependency graph complex.

Mitigation:

  • High Availability Deployment: The mercury-worker will be deployed as a container with multiple replicas. If one instance crashes, the others continue processing jobs. Because BullMQ keeps queued jobs in Redis rather than in the worker's memory, waiting jobs are not lost during a brief outage of one worker; a job that was in progress on the crashed instance is expected to be picked up again through BullMQ's stalled-job handling.
  • Robust Error Handling: Each consumer will be wrapped in its own try/catch block so that an error thrown while processing one job is contained and does not crash the entire worker process. We will also enforce strict quality gates, including unit and integration tests for each consumer.
  • Modular Imports: We will leverage NestJS's modular architecture. The worker application will simply be a composition of modules, keeping the top-level wiring clean, even if the underlying dependency graph is large.
  • Monitoring: We will closely monitor the CPU and memory usage of each worker process so that issues such as memory leaks are detected early.
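
The per-consumer try/catch described under Robust Error Handling could look roughly like the sketch below. It is a minimal, dependency-free illustration, not the project's actual wrapper: `isolate`, `JobOutcome`, and the reporting callback are hypothetical names. The sketch converts a failure into a returned outcome so the isolation is easy to see; with BullMQ itself, a processor that throws is marked as failed by the library, so a real wrapper would more likely log the error and rethrow it so that retries still apply.

```typescript
type Job = { id: string; data: unknown };
type Processor = (job: Job) => Promise<unknown>;
type JobOutcome = { ok: true; value: unknown } | { ok: false; error: string };
type ErrorReporter = (queue: string, jobId: string, error: Error) => void;

// Wrap a consumer so that a thrown error or rejected promise is reported
// and turned into a failed outcome instead of escaping into the process.
function isolate(
  queue: string,
  processor: Processor,
  report: ErrorReporter
): (job: Job) => Promise<JobOutcome> {
  return async (job: Job): Promise<JobOutcome> => {
    try {
      return { ok: true, value: await processor(job) };
    } catch (err) {
      // Normalise non-Error throwables so reporting always gets an Error.
      const error = err instanceof Error ? err : new Error(String(err));
      report(queue, job.id, error);
      return { ok: false, error: error.message };
    }
  };
}

// Illustrative usage: the consumer fails for job "2" only.
const reported: string[] = [];
const safeSync = isolate(
  "apollo.sync", // made-up queue name
  async (job) => {
    if (job.id === "2") {
      throw new Error("boom");
    }
    return `synced ${job.id}`;
  },
  (queue, jobId, error) => {
    reported.push(`${queue}:${jobId}:${error.message}`);
  }
);
```

A wrapper like this only contains errors that surface as exceptions or rejected promises. Resource problems such as memory leaks are not caught by it, which is why the monitoring item remains necessary.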