CloudBeaver database manager guide, Ecija intranet deployment, Gitea-Coolify auto-deploy and integration docs, monitoring setup with presentation, remote access guide, security architecture, and Turbostarter deployment procedure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6.2 KiB
6.2 KiB
NUC Monitoring & Recovery Setup
Date: 2026-02-02 22:20 Context: Complete monitoring stack deployment with auto-recovery and remote access
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ MONITORING STACK │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ scrape ┌─────────────┐ │
│ │ OpenWrt │◄─────────────│ Prometheus │ │
│ │ :9100 │ │ :9091 │ │
│ └─────────────┘ └──────┬──────┘ │
│ │ │
│ ┌─────────────┐ scrape │ ┌─────────────┐ │
│ │ NUC Node │◄───────────────────┤ │ Alertmanager│ │
│ │ Exporter │ │ │ :9093 │ │
│ │ :9100 │ ┌─────┴────┐ └──────┬──────┘ │
│ └─────────────┘ │ Grafana │ │ │
│ │ :3333 │ ▼ │
│ └──────────┘ ┌───────────┐ │
│ │ntfy Bridge│ │
│ │ :9095 │ │
│ └─────┬─────┘ │
│ │ │
│ ▼ │
│ ntfy.sh/nuc-watchdog │
└─────────────────────────────────────────────────────────────────────┘
Access URLs
| Service | Local URL | Remote URL | Credentials |
|---|---|---|---|
| Grafana | http://192.168.1.3:3333 | https://alezmad-nuc.tail58f5ad.ts.net | admin / nucmonitoring |
| Prometheus | http://192.168.1.3:9091 | - | - |
| Alertmanager | http://192.168.1.3:9093 | - | - |
| OpenWrt Metrics | http://192.168.1.1:9100 | - | - |
Prometheus Targets
| Job | Target | Scrape Interval |
|---|---|---|
prometheus |
localhost:9090 | 15s |
nuc-node |
192.168.1.3:9100 | 15s |
openwrt |
192.168.1.1:9100 | 30s |
Alert Rules
NUC Alerts (/opt/monitoring/alert_rules.yml)
| Alert | Condition | Severity |
|---|---|---|
| NUCDown | up{job="nuc-node"} == 0 for 1m |
critical |
| HighCPULoad | CPU > 80% for 5m | warning |
| HighMemoryUsage | Memory > 85% for 5m | warning |
| DiskSpaceLow | Disk > 85% for 5m | warning |
OpenWrt Alerts
| Alert | Condition | Severity |
|---|---|---|
| OpenWrtDown | up{job="openwrt"} == 0 for 1m |
critical |
| OpenWrtHighLoad | Load > 2 for 5m | warning |
Auto-Recovery Layers
| Layer | Component | Action | Trigger |
|---|---|---|---|
| 1 | OpenWrt Monitor | WoL packet | HTTP+Ping fail (3x) |
| 2 | Hardware Watchdog | Auto-reboot | System freeze |
| 3 | Kernel Panic | Auto-reboot (10s) | Kernel panic |
| 4 | WoL | Wake from power off | Manual or script |
Configuration Files
NUC (/opt/monitoring/)
prometheus.yml- Prometheus configurationalert_rules.yml- Alert rulesalertmanager.yml- Alertmanager → ntfy bridgealertmanager-ntfy-bridge.py- Webhook translator
OpenWrt (/opt/)
monitor-nuc.sh- Health check daemon (0.5Hz)
Systemd Services
prometheus-node-exporter.service- NUC metricsalertmanager-ntfy-bridge.service- Alert translator
Notifications
ntfy Topic: nuc-watchdog
Subscribe via:
- App: ntfy (iOS/Android)
- Web: https://ntfy.sh/nuc-watchdog
Alert Sources:
- OpenWrt watchdog (direct to ntfy.sh)
- Alertmanager → bridge → ntfy.sh
Remote Access
Tailscale (NUC)
- Funnel URL: https://alezmad-nuc.tail58f5ad.ts.net
- Exposes: Grafana (:3333)
WireGuard (OpenWrt)
- Endpoint: 5.224.196.245:51820
- VPN Subnet: 10.10.10.0/24
- Config:
~/wireguard/home-vpn.conf
Grafana Setup
Data Source: Prometheus (http://prometheus:9090)
Imported Dashboards:
- Node Exporter Full (ID: 1860)
Maintenance Commands
# Check Prometheus targets
curl -s http://192.168.1.3:9091/api/v1/targets | jq '.data.activeTargets[].health'
# Check active alerts
curl -s http://192.168.1.3:9093/api/v2/alerts | jq '.[].labels.alertname'
# Restart monitoring stack
ssh nuc "docker restart prometheus-r0wg4gwoow44kkkc8skc4kwg alertmanager-r0wg4gwoow44kkkc8skc4kwg grafana-r0wg4gwoow44kkkc8skc4kwg"
# Check OpenWrt monitor
ssh root@192.168.1.1 "ps | grep monitor"
# Test alert
curl -X POST http://192.168.1.3:9093/api/v2/alerts -H "Content-Type: application/json" \
-d '[{"labels":{"alertname":"TestAlert","severity":"warning"},"annotations":{"summary":"Test"}}]'
# Manual ntfy test
curl -d "Test message" https://ntfy.sh/nuc-watchdog
Container UUIDs (Coolify)
| Service | UUID |
|---|---|
| Monitoring Stack | r0wg4gwoow44kkkc8skc4kwg |
Related
- OpenWrt NUC Monitor:
/opt/monitor-nuc.sh - Kernel panic config:
/etc/sysctl.d/99-auto-reboot.conf - WireGuard config:
~/wireguard/home-vpn.conf