nuc/.artifacts/2026-02-01_17-15_liquidgym-database-engines.md
Alejandro Gutiérrez 390eda1595 Initial commit - NUC server configuration and docs
2026-02-01 20:49:20 +00:00


LiquidGym Database Engines Reference

Date: 2026-02-01 17:15
Context: Reference guide for LiquidGym's multi-engine SQL testing infrastructure

Overview

LiquidGym is a multi-database testing environment for verifying that analytical queries return the same results on different database engines. Running the same checks against every engine is how query generation is kept engine-agnostic.
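As a rough illustration of what "work identically" can mean in practice, the sketch below compares two result sets that have already been exported to text files (for example one CSV per engine). The function name `results_match` and the export step are assumptions for illustration, not part of LiquidGym. Rows are sorted before comparison because engines are free to return rows in different orders when a query has no ORDER BY.

```sh
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when two exported result files
# contain the same rows, ignoring row order.
results_match() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2
    # Byte-wise sort so both files are ordered the same way on any locale.
    LC_ALL=C sort "$1" > "$a_sorted"
    LC_ALL=C sort "$2" > "$b_sorted"
    if cmp -s "$a_sorted" "$b_sorted"; then
        status=0
    else
        status=1
    fi
    rm -f "$a_sorted" "$b_sorted"
    return "$status"
}
```

A call such as `results_match postgres.csv clickhouse.csv && echo same` would then report whether the two exports agree. Differences in number formatting or NULL rendering between engines would need to be normalized before this kind of comparison is meaningful.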

Engine Tiers

Core (Always Started)

| Engine | Image | Port | Purpose |
| --- | --- | --- | --- |
| PostgreSQL 16 | postgres:16 | 5433 | Primary test database with sample datasets |
| CloudBeaver | dbeaver/cloudbeaver | 8978 | Web-based database UI |

Tier 1: Essential Engines

Different SQL dialects for cross-engine testing.

| Engine | Image | Port | Description |
| --- | --- | --- | --- |
| ClickHouse | clickhouse/clickhouse-server | 8123 (HTTP), 9000 (Native) | Column-oriented OLAP database. Extremely fast for analytics on billions of rows. Used by Cloudflare, Uber, eBay. Best for: logs, metrics, time-series analytics. |
| MySQL 8 | mysql:8 | 3306 | World's most popular open-source RDBMS. Tests MySQL-specific SQL dialect. |

Tier 2: Distributed & Specialized

| Engine | Image | Port | Description |
| --- | --- | --- | --- |
| Trino | trinodb/trino | 8084 | Distributed SQL query engine. Queries data across multiple sources (Postgres, S3, Kafka) with a single SQL statement. Has no storage of its own; it is purely a query layer. |
| StarRocks | starrocks/allin1-ubuntu | 9030 (MySQL), 8030 (HTTP) | MPP analytics database. Sub-second queries on large datasets. Powers BI dashboards. Fork of Apache Doris with performance improvements. |
| TimescaleDB | timescale/timescaledb:latest-pg16 | 5434 | PostgreSQL extension for time-series data. Auto-partitions by time. Well suited to IoT, metrics, and events. Uses familiar Postgres SQL. |

Tier 3: Advanced/Specialized

| Engine | Image | Port | Description |
| --- | --- | --- | --- |
| Apache Doris | apache/doris:doris-all-in-one-2.1.0 | 9031 (MySQL), 8031 (HTTP) | Real-time analytical database. MySQL-compatible. Good for real-time dashboards and ad-hoc queries. |
| Apache Druid | apache/druid:26.0.0 | 8888 | Real-time OLAP for sub-second slice-and-dice analytics. Powers Airbnb, Netflix, Alibaba dashboards. Best for: high-concurrency, low-latency queries. |
| Apache Spark | apache/spark:3.5.0 | 7077 (Master), 8085 (UI) | Distributed compute engine for big data. ML pipelines, ETL, batch processing. Overkill for small datasets. |
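The port assignments in the tables above can be captured in a small lookup so test scripts do not hard-code them. This is a minimal sketch: the engine keys (`postgres`, `clickhouse`, and so on) and the host `127.0.0.1` are assumptions for illustration, and only the primary port of each engine is returned. Since StarRocks and Doris are listed with MySQL-protocol ports, the second helper prints host and port arguments that a standard `mysql` client accepts (`-h`, `-P`).

```sh
#!/bin/sh
# Primary host port per engine, copied from the tier tables.
# The engine keys are hypothetical labels chosen for this sketch.
engine_port() {
    case "$1" in
        postgres)    echo 5433 ;;
        clickhouse)  echo 8123 ;;  # HTTP interface; native protocol is 9000
        mysql)       echo 3306 ;;
        trino)       echo 8084 ;;
        starrocks)   echo 9030 ;;  # MySQL protocol; HTTP is 8030
        timescaledb) echo 5434 ;;
        doris)       echo 9031 ;;  # MySQL protocol; HTTP is 8031
        druid)       echo 8888 ;;
        spark)       echo 7077 ;;  # master port; UI is 8085
        *) echo "unknown engine: $1" >&2; return 1 ;;
    esac
}

# Print mysql-client host/port arguments for engines that speak the
# MySQL wire protocol. Assumes ports are published on localhost.
mysql_client_args() {
    case "$1" in
        mysql|starrocks|doris)
            printf '%s %s %s %s\n' -h 127.0.0.1 -P "$(engine_port "$1")" ;;
        *) echo "not a MySQL-protocol engine: $1" >&2; return 1 ;;
    esac
}
```

For example, `mysql $(mysql_client_args starrocks) -u root` would target StarRocks, where the `root` user is itself an assumption about the container defaults.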

Observability Stack

| Tool | Image | Port | Description |
| --- | --- | --- | --- |
| Grafana | grafana/grafana | 3005 | Visualization & dashboards. Query any data source, create alerts. Login: admin/liquidgym |
| Prometheus | prom/prometheus | 9090 | Metrics collection & alerting. Scrapes metrics from all engines. |
| Redis | redis:7-alpine | 6379 | In-memory cache. Used for session storage, caching query results. |

Usage

cd ~/Desktop/liquidgym/infra

# Start core only (Postgres + CloudBeaver)
docker compose up -d

# Start with Tier 1 engines (+ ClickHouse, MySQL)
docker compose --profile tier1 up -d

# Start with Tier 2 engines (+ Trino, StarRocks, TimescaleDB)
docker compose --profile tier2 up -d

# Start with Tier 3 engines (+ Doris, Druid, Spark)
docker compose --profile tier3 up -d

# Start observability stack (+ Prometheus, Grafana, Redis)
docker compose --profile observability up -d

# Start everything
docker compose --profile all up -d

# Load sample datasets
docker compose --profile loader up
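The commands above enable one profile at a time, and this document does not say whether, for example, the `tier2` profile also starts the Tier 1 engines. Docker Compose accepts the `--profile` flag more than once, so a specific combination can be requested explicitly. The helper below is a minimal sketch that only prints the resulting command. The function name `compose_up_cmd` is hypothetical, while the profile names are the ones used above.

```sh
#!/bin/sh
# Print (do not run) a compose command that enables every profile
# passed as an argument. With no arguments it prints the core-only command.
compose_up_cmd() {
    cmd="docker compose"
    for profile in "$@"; do
        cmd="$cmd --profile $profile"
    done
    echo "$cmd up -d"
}
```

Running `compose_up_cmd tier1 observability` prints `docker compose --profile tier1 --profile observability up -d`, which can be reviewed and then executed from `~/Desktop/liquidgym/infra`.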

Sample Datasets

| Dataset | Description | Tables |
| --- | --- | --- |
| Northwind | Classic MS Access sample - orders, products, customers | 14 |
| Pagila | DVD rental store (PostgreSQL port of Sakila) | 29 |
| Chinook | Digital media store - artists, albums, tracks | 11 |
| AdventureWorks | Microsoft sample - sales, HR, production | 68 |
| Employees | Large HR dataset with 300K+ employee records | 6 |
| LEGO | LEGO sets, parts, themes, colors | 8 |
| Netflix | Netflix titles catalog | 1 |

When to Use Each Engine

| Use Case | Recommended Engine |
| --- | --- |
| General OLTP | PostgreSQL, MySQL |
| Analytics on large datasets | ClickHouse, StarRocks |
| Time-series / IoT | TimescaleDB |
| Real-time dashboards | Druid, Doris |
| Query across multiple DBs | Trino |
| Big data / ML pipelines | Spark |
| Caching | Redis |

Resource Requirements

| Profile | RAM | CPU | Disk |
| --- | --- | --- | --- |
| Core | 1 GB | 1 | 1 GB |
| + Tier 1 | 6 GB | 2 | 3 GB |
| + Tier 2 | 10 GB | 4 | 5 GB |
| + Tier 3 | 16 GB+ | 6+ | 10 GB+ |
| + Observability | +2 GB | +1 | +1 GB |
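A quick way to check whether a machine can host a given setup is to turn the RAM column into a small calculation. The sketch below rests on one reading of the table, which is an assumption: the Core and Tier rows are cumulative totals (so the Tier 2 figure already includes Core and Tier 1), while the Observability row is an increment added on top. The function name `ram_estimate_gb` is hypothetical.

```sh
#!/bin/sh
# Estimated RAM in GB for a tier, optionally plus the observability stack.
# Assumes tier rows are cumulative totals and observability is additive.
ram_estimate_gb() {
    case "$1" in
        core)  base=1 ;;
        tier1) base=6 ;;
        tier2) base=10 ;;
        tier3) base=16 ;;  # listed as 16GB+, so treat this as a lower bound
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
    if [ "${2:-}" = "observability" ]; then
        base=$((base + 2))
    fi
    echo "$base"
}
```

Under that reading, `ram_estimate_gb tier2 observability` gives 12, meaning roughly 12 GB for Core through Tier 2 plus Grafana, Prometheus, and Redis.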

NUC Migration Status

The following services have been migrated to the NUC and no longer need local volumes:

| Service | NUC Location | Status |
| --- | --- | --- |
| PostgreSQL (datasets) | 192.168.1.3:5433 | Migrated |
| MySQL | 192.168.1.3:3306 | Migrated |

The remaining Tier 1-3 engines stay local-only for development testing.
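To point clients at the migrated services, the addresses from the table can be wrapped in small helpers. This is a sketch only: the user and database names are placeholders supplied by the caller, because this document does not list credentials for the NUC instances, and both function names are hypothetical.

```sh
#!/bin/sh
# Build a PostgreSQL connection URI for the NUC-hosted datasets instance.
# $1 = user (placeholder), $2 = database name (placeholder).
nuc_postgres_uri() {
    echo "postgresql://$1@192.168.1.3:5433/$2"
}

# Print mysql-client arguments for the NUC-hosted MySQL instance.
# $1 = user (placeholder).
nuc_mysql_args() {
    printf '%s %s %s %s %s %s\n' -h 192.168.1.3 -P 3306 -u "$1"
}
```

With these, `psql "$(nuc_postgres_uri someuser pagila)"` or `mysql $(nuc_mysql_args someuser) -p` would connect to the NUC, assuming the account exists and the host is reachable on the local network.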

  • LiquidGym project: ~/Desktop/liquidgym/infra/
  • Docker Compose: ~/Desktop/liquidgym/infra/docker-compose.yml
  • Datasets: ~/Desktop/liquidgym/infra/datasets/