mirror of
https://github.com/r3bo0tbx1/tor-guard-relay.git
synced 2026-04-06 00:32:04 +02:00
Major refactor of Docker Compose configurations and tooling enhancements. - ✨ Add `gen-auth` script for generating Tor Control Port credentials - 🐳 Refactor Docker Compose templates: - Add native healthcheck configurations to all relay/bridge files - Standardize security capabilities (drop ALL, add SETUID/SETGID) - Remove verbose comments to streamline template usage - Update volume definitions for better data persistence - 🔧 Update base dependencies: - Alpine Linux -> 3.23.0 - Golang -> 1.25.5-alpine - 🧹 Standardize ENV variable names across all configurations
456 lines
11 KiB
Markdown
456 lines
11 KiB
Markdown
# 📊 Monitoring Guide
|
|
|
|
Guide to monitoring your Tor Guard Relay with external tools. The >=v1.1.1 ultra-optimized build (~20 MB) does not include built-in Prometheus metrics endpoints, but provides multiple alternatives for monitoring.
|
|
|
|
---
|
|
|
|
## 📋 Overview
|
|
|
|
**What Changed in v1.1.1:**
|
|
- ❌ Removed built-in `metrics-http` server (to reduce image size)
|
|
- ❌ Removed Python-based dashboard
|
|
- ✅ Kept `health` tool for JSON status output
|
|
- ✅ Kept `status` tool for human-readable output
|
|
- ✅ Enhanced logging for external monitoring integration
|
|
|
|
**Monitoring Options:**
|
|
1. Docker health checks (built-in)
|
|
2. `health` tool with external scrapers
|
|
3. Log file monitoring
|
|
4. External Prometheus exporters
|
|
5. Cloud monitoring services
|
|
|
|
---
|
|
|
|
## 🚀 Quick Start Options
|
|
|
|
### Option 1: Docker Health Checks (Simplest)
|
|
|
|
Built-in Docker health checks automatically monitor relay status:
|
|
|
|
```bash
|
|
# Check health status
|
|
docker inspect --format='{{.State.Health.Status}}' tor-relay
|
|
|
|
# Get health history (requires jq on host)
|
|
docker inspect --format='{{json .State.Health}}' tor-relay | jq
|
|
```
|
|
|
|
**Healthcheck Configuration:**
|
|
```yaml
|
|
# Already included in all compose templates
|
|
healthcheck:
|
|
test: ["CMD-SHELL", "/usr/local/bin/healthcheck.sh"]
|
|
interval: 10m
|
|
timeout: 15s
|
|
start_period: 30s
|
|
retries: 3
|
|
```
|
|
|
|
**Monitoring:**
|
|
- Use `docker events` to watch health changes
|
|
- Integrate with Docker monitoring tools (Portainer, Netdata, etc.)
|
|
- Alert on `health_status: unhealthy` events
|
|
|
|
---
|
|
|
|
### Option 2: Health Tool with External Scraper
|
|
|
|
Use the `health` tool's JSON output with your monitoring system:
|
|
|
|
**Setup Simple HTTP Wrapper:**
|
|
```bash
|
|
# Create simple health endpoint with netcat
|
|
while true; do
|
|
HEALTH=$(docker exec tor-relay health)
|
|
echo -e "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n$HEALTH" | nc -l -p 9100
|
|
done
|
|
```
|
|
|
|
**Or use a proper wrapper script:**
|
|
```python
|
|
#!/usr/bin/env python3
|
|
from flask import Flask, jsonify
|
|
import subprocess
|
|
import json
|
|
|
|
app = Flask(__name__)
|
|
|
|
@app.route('/health')
|
|
def health():
|
|
result = subprocess.run(
|
|
['docker', 'exec', 'tor-relay', 'health'],
|
|
capture_output=True,
|
|
text=True
|
|
)
|
|
return jsonify(json.loads(result.stdout))
|
|
|
|
@app.route('/metrics')
|
|
def metrics():
|
|
health_data = subprocess.run(
|
|
['docker', 'exec', 'tor-relay', 'health'],
|
|
capture_output=True,
|
|
text=True
|
|
)
|
|
data = json.loads(health_data.stdout)
|
|
|
|
# Convert to Prometheus format
|
|
metrics = f"""# HELP tor_relay_up Relay is running
|
|
# TYPE tor_relay_up gauge
|
|
tor_relay_up {{nickname="{data['nickname']}"}} {1 if data['status'] == 'up' else 0}
|
|
|
|
# HELP tor_relay_bootstrap_percent Bootstrap completion
|
|
# TYPE tor_relay_bootstrap_percent gauge
|
|
tor_relay_bootstrap_percent {{nickname="{data['nickname']}"}} {data['bootstrap']}
|
|
|
|
# HELP tor_relay_errors Error count
|
|
# TYPE tor_relay_errors gauge
|
|
tor_relay_errors {{nickname="{data['nickname']}"}} {data['errors']}
|
|
"""
|
|
return metrics, 200, {'Content-Type': 'text/plain; charset=utf-8'}
|
|
|
|
if __name__ == '__main__':
|
|
app.run(host='0.0.0.0', port=9100)
|
|
```
|
|
|
|
**Prometheus Configuration:**
|
|
```yaml
|
|
scrape_configs:
|
|
- job_name: 'tor-relay'
|
|
static_configs:
|
|
- targets: ['localhost:9100']
|
|
scrape_interval: 60s
|
|
```
|
|
|
|
---
|
|
|
|
### Option 3: Log File Monitoring
|
|
|
|
Monitor Tor logs directly for events and errors:
|
|
|
|
**Filebeat Configuration:**
|
|
```yaml
|
|
# filebeat.yml
|
|
filebeat.inputs:
|
|
- type: log
|
|
enabled: true
|
|
paths:
|
|
- /var/lib/docker/volumes/tor-guard-logs/_data/notices.log
|
|
fields:
|
|
service: tor-relay
|
|
multiline:
|
|
pattern: '^\['
|
|
negate: true
|
|
match: after
|
|
|
|
output.elasticsearch:
|
|
hosts: ["localhost:9200"]
|
|
index: "tor-relay-logs-%{+yyyy.MM.dd}"
|
|
```
|
|
|
|
**Promtail Configuration (for Loki):**
|
|
```yaml
|
|
# promtail-config.yml
|
|
server:
|
|
http_listen_port: 9080
|
|
|
|
positions:
|
|
filename: /tmp/positions.yaml
|
|
|
|
clients:
|
|
- url: http://localhost:3100/loki/api/v1/push
|
|
|
|
scrape_configs:
|
|
- job_name: tor-relay
|
|
static_configs:
|
|
- targets:
|
|
- localhost
|
|
labels:
|
|
job: tor-relay
|
|
__path__: /var/lib/docker/volumes/tor-guard-logs/_data/*.log
|
|
```
|
|
|
|
**Key Log Patterns to Monitor:**
|
|
```bash
|
|
# Bootstrap complete
|
|
grep "Bootstrapped 100%" /var/log/tor/notices.log
|
|
|
|
# ORPort reachability
|
|
grep "Self-testing indicates your ORPort is reachable" /var/log/tor/notices.log
|
|
|
|
# Errors
|
|
grep "\[err\]" /var/log/tor/notices.log
|
|
|
|
# Warnings
|
|
grep "\[warn\]" /var/log/tor/notices.log
|
|
|
|
# Bandwidth self-test
|
|
grep "bandwidth self-test...done" /var/log/tor/notices.log
|
|
```
|
|
|
|
---
|
|
|
|
### Option 4: External Prometheus Exporters
|
|
|
|
Use dedicated Tor exporters that parse Tor control port:
|
|
|
|
**Option A: tor_exporter**
|
|
```bash
|
|
# Install tor_exporter
|
|
docker run -d --name tor-exporter \
|
|
--network host \
|
|
ghcr.io/atx/prometheus-tor_exporter:latest \
|
|
--tor.control-address=127.0.0.1:9051
|
|
|
|
# Add to Prometheus
|
|
scrape_configs:
|
|
- job_name: 'tor-exporter'
|
|
static_configs:
|
|
- targets: ['localhost:9099']
|
|
```
|
|
|
|
**Option B: Custom exporter from health tool**
|
|
|
|
See Option 2 above for Python Flask example.
|
|
|
|
---
|
|
|
|
### Option 5: Cloud Monitoring Services
|
|
|
|
**DataDog:**
|
|
```yaml
|
|
# datadog.yaml
|
|
logs:
|
|
- type: file
|
|
path: /var/lib/docker/volumes/tor-guard-logs/_data/notices.log
|
|
service: tor-relay
|
|
source: tor
|
|
|
|
checks:
|
|
http_check:
|
|
instances:
|
|
- name: tor-relay-health
|
|
url: http://localhost:9100/health
|
|
timeout: 5
|
|
```
|
|
|
|
**New Relic:**
|
|
```yaml
|
|
integrations:
|
|
- name: nri-docker
|
|
env:
|
|
DOCKER_API_VERSION: v1.40
|
|
interval: 60s
|
|
|
|
- name: nri-flex
|
|
config:
|
|
name: tor-relay-health
|
|
apis:
|
|
- event_type: TorRelayHealth
|
|
commands:
|
|
- run: docker exec tor-relay health
|
|
split: none
|
|
```
|
|
|
|
---
|
|
|
|
## 📊 Monitoring Metrics
|
|
|
|
**Available from `health` tool:**
|
|
```json
|
|
{
|
|
"status": "up|down|error",
|
|
"pid": 123,
|
|
"uptime": "2d 14h 30m",
|
|
"bootstrap": 0-100,
|
|
"reachable": "true|false",
|
|
"errors": 0,
|
|
"nickname": "MyRelay",
|
|
"fingerprint": "ABCD..."
|
|
}
|
|
```
|
|
|
|
**Key Metrics to Track:**
|
|
- `status` - Relay health (up/down/error)
|
|
- `bootstrap` - Connection progress (0-100%)
|
|
- `reachable` - ORPort accessibility
|
|
- `errors` - Error count
|
|
- `uptime` - Relay uptime
|
|
|
|
**From Logs:**
|
|
- Bootstrap events
|
|
- ORPort reachability tests
|
|
- Bandwidth usage
|
|
- Connection counts (if exit relay)
|
|
- Warning and error messages
|
|
|
|
---
|
|
|
|
## 🔔 Alerting
|
|
|
|
### Simple Shell Script Alert
|
|
```bash
|
|
#!/bin/bash
|
|
# check-tor-relay.sh
|
|
# Requires: jq installed on host (apt install jq / brew install jq)
|
|
|
|
STATUS=$(docker exec tor-relay health | jq -r '.status')
|
|
BOOTSTRAP=$(docker exec tor-relay health | jq -r '.bootstrap')
|
|
ERRORS=$(docker exec tor-relay health | jq -r '.errors')
|
|
|
|
if [ "$STATUS" != "up" ]; then
|
|
echo "CRITICAL: Tor relay is $STATUS"
|
|
# Send alert via email, Slack, Discord, etc.
|
|
curl -X POST https://hooks.slack.com/services/YOUR/WEBHOOK \
|
|
-d "{\"text\": \"Tor relay is $STATUS\"}"
|
|
exit 2
|
|
fi
|
|
|
|
if [ "$BOOTSTRAP" -lt 100 ]; then
|
|
echo "WARNING: Bootstrap at $BOOTSTRAP%"
|
|
exit 1
|
|
fi
|
|
|
|
if [ "$ERRORS" -gt 0 ]; then
|
|
echo "WARNING: $ERRORS errors detected"
|
|
exit 1
|
|
fi
|
|
|
|
echo "OK: Relay healthy, $BOOTSTRAP% bootstrapped"
|
|
exit 0
|
|
```
|
|
|
|
**Run with cron:**
|
|
```cron
|
|
*/5 * * * * /usr/local/bin/check-tor-relay.sh
|
|
```
|
|
|
|
### Docker Events Alert
|
|
```bash
|
|
#!/bin/bash
|
|
# watch-docker-health.sh
|
|
# Requires: jq installed on host (apt install jq / brew install jq)
|
|
|
|
docker events --filter 'event=health_status' --format '{{json .}}' | while read event; do
|
|
STATUS=$(echo $event | jq -r '.Actor.Attributes."health_status"')
|
|
CONTAINER=$(echo $event | jq -r '.Actor.Attributes.name')
|
|
|
|
if [ "$STATUS" = "unhealthy" ]; then
|
|
echo "ALERT: $CONTAINER is unhealthy"
|
|
# Send notification
|
|
fi
|
|
done
|
|
```
|
|
|
|
---
|
|
|
|
## 🏗️ Example: Complete Monitoring Stack
|
|
|
|
**Docker Compose with external monitoring:**
|
|
|
|
```yaml
|
|
version: '3.8'
|
|
|
|
services:
|
|
tor-relay:
|
|
image: r3bo0tbx1/onion-relay:latest
|
|
container_name: tor-relay
|
|
network_mode: host
|
|
volumes:
|
|
- ./relay.conf:/etc/tor/torrc:ro
|
|
- tor-data:/var/lib/tor
|
|
- tor-logs:/var/log/tor
|
|
healthcheck:
|
|
test: ["CMD", "tor", "--verify-config", "-f", "/etc/tor/torrc"]
|
|
interval: 10m
|
|
timeout: 15s
|
|
retries: 3
|
|
|
|
# Health exporter (Python wrapper)
|
|
health-exporter:
|
|
image: python:3.11-slim
|
|
container_name: tor-health-exporter
|
|
network_mode: host
|
|
volumes:
|
|
- ./health-exporter.py:/app/exporter.py
|
|
- /var/run/docker.sock:/var/run/docker.sock:ro
|
|
command: python /app/exporter.py
|
|
depends_on:
|
|
- tor-relay
|
|
|
|
# Prometheus
|
|
prometheus:
|
|
image: prom/prometheus:latest
|
|
container_name: prometheus
|
|
ports:
|
|
- "9090:9090"
|
|
volumes:
|
|
- ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
|
|
- prometheus-data:/prometheus
|
|
command:
|
|
- '--config.file=/etc/prometheus/prometheus.yml'
|
|
- '--storage.tsdb.retention.time=30d'
|
|
|
|
# Grafana
|
|
grafana:
|
|
image: grafana/grafana:latest
|
|
container_name: grafana
|
|
ports:
|
|
- "3000:3000"
|
|
environment:
|
|
- GF_SECURITY_ADMIN_PASSWORD=admin
|
|
volumes:
|
|
- grafana-data:/var/lib/grafana
|
|
|
|
volumes:
|
|
tor-data:
|
|
tor-logs:
|
|
prometheus-data:
|
|
grafana-data:
|
|
```
|
|
|
|
---
|
|
|
|
## 💡 Tips & Best Practices
|
|
|
|
1. **Use `health` tool** - JSON output perfect for automation
|
|
2. **Monitor Docker health** - Built-in, no extra tools needed
|
|
3. **Alert on status changes** - Watch for `status != "up"`
|
|
4. **Track bootstrap** - New relays take 5-15 minutes
|
|
5. **Monitor logs** - Tor logs are comprehensive and informative
|
|
6. **External exporters** - Use tor_exporter for detailed metrics
|
|
7. **Keep it simple** - Don't over-complicate monitoring
|
|
|
|
---
|
|
|
|
## 📚 Related Documentation
|
|
|
|
- [Tools Reference](./TOOLS.md) - Built-in diagnostic tools
|
|
- [Deployment Guide](./DEPLOYMENT.md) - Installation and configuration
|
|
- [Performance Guide](./PERFORMANCE.md) - Optimization tips
|
|
|
|
---
|
|
|
|
## ❓ FAQ
|
|
|
|
**Q: Why was the built-in metrics endpoint removed?**
|
|
A: To achieve the ultra-small image size (~20 MB). The metrics server required Python, Flask, and other dependencies (~25+ MB). External monitoring is more flexible anyway.
|
|
|
|
**Q: Can I still use Prometheus?**
|
|
A: Yes! Use the Python wrapper example above, or a dedicated Tor exporter like `prometheus-tor_exporter`.
|
|
|
|
**Q: What's the simplest monitoring option?**
|
|
A: Docker health checks + the `health` tool. No additional infrastructure needed.
|
|
|
|
**Q: How do I monitor multiple relays?**
|
|
A: Run the health exporter on each host, or use log aggregation (Loki, ELK, Datadog).
|
|
|
|
**Q: Where can I find the logs?**
|
|
A:
|
|
- Inside container: `/var/log/tor/notices.log`
|
|
- On host: `/var/lib/docker/volumes/tor-guard-logs/_data/notices.log`
|
|
- Via docker: `docker logs tor-relay`
|
|
|
|
---
|
|
|
|
**Last Updated:** December 2025 | **Version:** 1.1.3 |