84. PostgreSQL and Redis for Data Storage
Status: Accepted Date: 2025-07-06
Context
The Mercury system needs to store two main types of data:
- Core business data: Structured, relational data that requires transactional integrity, such as user accounts, trading positions, tournament results, and system configuration. Data integrity is critical for these entities.
- Volatile/Cache data: Temporary or ephemeral data used for caching, session management, and background job queueing. This data needs to be accessed with very low latency, but some data loss is acceptable in the event of a crash.
We need to choose the right database technologies for these two distinct use cases.
Decision
We will adopt a dual-database architecture using PostgreSQL as our primary relational database and Redis as our in-memory data store.
-
PostgreSQL will be the single source of truth for all core, transactional data. We will use it for its robustness, reliability, support for complex queries, and strong ACID guarantees. We will interact with it through an ORM (MikroORM) to provide a type-safe interface.
-
Redis will be used for all non-persistent or semi-persistent data that requires high performance. Its primary roles will be:
- Caching: Caching the results of expensive queries or frequently accessed data.
- Queueing: Serving as the backend for our BullMQ job queue system (
adr://queue-based-processing). - Session Storage: Storing user session information for fast retrieval.
Consequences
Positive:
- Best Tool for the Job: This approach uses each database for what it excels at. PostgreSQL provides data integrity and powerful querying, while Redis provides extremely high performance for volatile data.
- Robust and Mature Technologies: Both PostgreSQL and Redis are industry-standard, battle-tested, and have huge ecosystems of tools and documentation.
- Clear Separation of Concerns: The roles of the two databases are clear. Core business logic interacts with PostgreSQL; performance and infrastructure concerns are handled by Redis.
- Improved Performance: Offloading caching and queueing to Redis significantly reduces the load on the primary PostgreSQL database, allowing it to perform better for its core task of transactional data management.
Negative:
- Increased Operational Overhead: We now have two database systems to deploy, monitor, manage, and back up, which increases operational complexity compared to a single-database solution.
- Data Synchronization Issues: When caching data from PostgreSQL in Redis, there is a potential for the cache to become stale if not invalidated correctly.
Mitigation:
- Managed Services: We will use managed cloud services (e.g., Amazon RDS for PostgreSQL, ElastiCache for Redis) to offload the majority of the operational burden of managing these databases.
- Systematic Cache Invalidation: We will implement a systematic approach to cache invalidation. For example, any service that updates data in PostgreSQL will be responsible for explicitly invalidating the corresponding Redis cache keys.
- Clear Guidelines: We will have clear development guidelines on what data belongs in which database. Only data that is truly ephemeral or a performance-critical cache belongs in Redis. Everything else must reside in PostgreSQL.