mirror of https://github.com/r3bo0tbx1/tor-guard-relay.git synced 2026-04-06 00:32:04 +02:00

Files

rE-Bo0t.bx1 a28ce0a4e6 ✨ feat: add gen-auth tool and refactor compose templates

Major refactor of Docker Compose configurations and tooling enhancements.

- ✨ Add `gen-auth` script for generating Tor Control Port credentials
- 🐳 Refactor Docker Compose templates:
  - Add native healthcheck configurations to all relay/bridge files
  - Standardize security capabilities (drop ALL, add SETUID/SETGID)
  - Remove verbose comments to streamline template usage
  - Update volume definitions for better data persistence
- 🔧 Update base dependencies:
  - Alpine Linux -> 3.23.0
  - Golang -> 1.25.5-alpine
- 🧹 Standardize ENV variable names across all configurations

2025-12-05 04:37:19 +08:00

14 KiB

Raw Blame History

⚡ Performance Tuning & Optimization - Tor Guard Relay

Complete guide to optimizing CPU, memory, bandwidth, and network performance for your Tor relay.

Performance Baseline
CPU Optimization
Memory Management
Bandwidth Optimization
Network Tuning
Monitoring & Metrics
Benchmarking
Troubleshooting

Performance Baseline

System Requirements by Relay Tier

Tier	CPU	RAM	Bandwidth	Use Case
Entry	1 core	512 MB	10–50 Mbps	Home lab, testing
Standard	2 cores	1–2 GB	50–500 Mbps	Production guard relay
High-Capacity	4+ cores	4+ GB	500+ Mbps	High-traffic relay
Enterprise	8+ cores	8+ GB	1 Gbps+	Multiple relays

Expected Resource Usage (Steady State)

Resource	Entry	Standard	High-Cap	Notes
CPU	5–15%	10–25%	20–40%	Varies by traffic
Memory	80–150 MB	200–400 MB	500+ MB	Increases with connections
Bandwidth	5–50 Mbps	50–500 Mbps	500+ Mbps	Depends on limits
Disk I/O	Light	Moderate	Heavy	Monitor during bootstrap

CPU Optimization

1. Allocate CPU Cores

By default, Tor uses all available cores. Restrict or optimize as needed.

Check Current Allocation

# View Tor config
docker exec guard-relay grep -i numcpus /etc/tor/torrc

# View system CPUs
docker exec guard-relay nproc

Configure CPU Cores in relay.conf

# Use specific number of cores (example: 4 cores)
NumCPUs 4

# Or auto-detect (default, recommended)
NumCPUs 0

For Docker Compose

services:
  tor-guard-relay:
    # ... other config
    deploy:
      resources:
        limits:
          cpus: '4.0'  # Limit to 4 cores
        reservations:
          cpus: '2.0'  # Reserve 2 cores minimum

2. CPU Prioritization

Ensure Tor gets fair CPU scheduling.

# View current CPU usage
docker stats guard-relay --no-stream

# Show detailed CPU metrics
docker exec guard-relay ps aux | grep tor

3. Disable Unnecessary Features

# Disable directory service (if not needed)
# DirPort 0

# Keep SOCKS disabled (we're a relay, not a client)
SocksPort 0

# Disable bridge operation (if running guard relay)
BridgeRelay 0

4. Optimize Connection Handling

# Maximum simultaneous connections
# Default usually fine, but can tune:
# MaxClientCircuitsPending 100

# Connection timeout (default 15 minutes)
# CircuitIdleTimeout 900

Memory Management

1. Monitor Memory Usage

# Real-time memory monitoring
docker stats guard-relay

# View memory trends over 1 hour
watch -n 60 'docker exec guard-relay ps aux | grep tor | grep -v grep'

# Historical memory usage
docker exec guard-relay cat /proc/meminfo

2. Set Memory Limits in Docker Compose

services:
  tor-guard-relay:
    deploy:
      resources:
        limits:
          memory: 2G        # Hard limit
        reservations:
          memory: 1G        # Guaranteed allocation

3. Configure Tor Memory Settings

# MaxMemInQueues - Maximum total memory for circuit queues
# Default: 512 MB (usually fine)
MaxMemInQueues 512 MB

# When memory hits threshold, new circuits rejected
# Prevents OOM (out of memory) crashes

4. Handle Memory Leaks

Monitor for gradual increase:

#!/bin/bash
# Save as: /usr/local/bin/monitor-memory-growth.sh

CONTAINER="guard-relay"
INTERVAL=300  # 5 minutes

while true; do
  MEMORY=$(docker exec "$CONTAINER" ps aux | \
    grep '[t]or ' | awk '{print $6}' | head -1)
  
  echo "$(date): Memory = ${MEMORY}KB"
  sleep $INTERVAL
done

Run and observe for 24 hours:

/usr/local/bin/monitor-memory-growth.sh | tee /tmp/memory-log.txt

# Analyze growth rate
tail -20 /tmp/memory-log.txt

Bandwidth Optimization

1. Understand Bandwidth Limits

# Average bandwidth (sustained rate)
RelayBandwidthRate 100 MBytes

# Burst bandwidth (temporary spikes)
RelayBandwidthBurst 200 MBytes

2. Set Realistic Limits

Calculate your limits based on ISP:

Available Bandwidth: 1000 Mbps (ISP plan)
Usable for Tor: 50% (leave headroom for other services)
= 500 Mbps

Convert to MBytes/s: 500 Mbps ÷ 8 = 62.5 MBytes/s

Recommended:
- RelayBandwidthRate 50 MBytes
- RelayBandwidthBurst 100 MBytes

3. Bandwidth Accounting

Limit total monthly traffic:

# Monthly accounting window
# Starts on the 1st at UTC midnight
AccountingStart month 1 00:00

# Maximum data (upload + download combined)
AccountingMax 1000 GB

4. Monitor Actual Bandwidth Usage

# Real-time bandwidth stats
docker exec guard-relay tail -f /var/log/tor/notices.log | grep "bandwidth"

# Historical bandwidth usage
docker exec guard-relay grep "bandwidth" /var/log/tor/notices.log | tail -20

5. Optimize for Your Network

For Home Networks

# Conservative settings for residential connections
RelayBandwidthRate 10 MBytes
RelayBandwidthBurst 20 MBytes

For VPS with Unmetered Bandwidth

# Maximize contribution
RelayBandwidthRate 500 MBytes
RelayBandwidthBurst 1000 MBytes

For Datacenters with Traffic Shaping

# Match provider limits
RelayBandwidthRate 100 MBytes  # ISP limit
RelayBandwidthBurst 150 MBytes

Network Tuning

1. Enable IPv6 (if available)

In relay.conf:

# Dual-stack support
ORPort 9001
ORPort [::]:9001

# Directory port for IPv6
DirPort 9030

Verify IPv6 is working:

docker exec guard-relay curl -6 -s https://icanhazip.com
# Should return IPv6 address

docker exec guard-relay curl -4 -s https://icanhazip.com
# Should return IPv4 address

2. Optimize TCP Settings

On the host system (for Docker host):

# Increase TCP connection backlog
sudo sysctl -w net.core.somaxconn=65535

# Increase listen queue length
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=65535

# Enable TCP keepalives
sudo sysctl -w net.ipv4.tcp_keepalives_intvl=60

# Make permanent
echo "net.core.somaxconn=65535" | sudo tee -a /etc/sysctl.conf
echo "net.ipv4.tcp_max_syn_backlog=65535" | sudo tee -a /etc/sysctl.conf

3. Firewall Optimization

Ensure firewall rules don't throttle traffic:

# UFW example
sudo ufw status

# High performance rules
sudo iptables -I INPUT -p tcp --dport 9001 -j ACCEPT

# Save rules
sudo iptables-save > /etc/iptables/rules.v4

4. DNS Performance

Configure Tor to use fast DNS:

# Use Google DNS (example)
ServerDNSListenAddress 127.0.0.1:53
ServerDNSResolvConfFile /etc/resolv.conf

Verify DNS resolution is fast:

# Test DNS response time
time docker exec guard-relay tor --resolve example.com

Monitoring & Metrics

=v1.1.1 uses external monitoring with the health JSON API for minimal image size and maximum security.

1. JSON Health API

Get relay metrics via the health tool:

# Get full health status (raw JSON)
docker exec guard-relay health

# Parse with jq (requires jq on host)
docker exec guard-relay health | jq .

# Check specific metrics
docker exec guard-relay health | jq .bootstrap      # Bootstrap percentage
docker exec guard-relay health | jq .reachable      # ORPort reachability
docker exec guard-relay health | jq .uptime_seconds # Uptime

Example JSON output:

{
  "status": "up",
  "bootstrap": 100,
  "reachable": true,
  "fingerprint": "1234567890ABCDEF...",
  "nickname": "MyRelay",
  "uptime_seconds": 86400
}

2. Prometheus Integration (External)

Use the health tool with Prometheus node_exporter textfile collector:

Create metrics exporter script:

#!/bin/bash
# /usr/local/bin/tor-metrics-exporter.sh
# Requires: jq on host (apt install jq / brew install jq)

HEALTH=$(docker exec guard-relay health)

echo "$HEALTH" | jq -r '
  "tor_bootstrap_percent \(.bootstrap)",
  "tor_reachable \(if .reachable then 1 else 0 end)",
  "tor_uptime_seconds \(.uptime_seconds // 0)"
' > /var/lib/node_exporter/textfile_collector/tor.prom

Run via cron every 5 minutes:

chmod +x /usr/local/bin/tor-metrics-exporter.sh
crontab -e
*/5 * * * * /usr/local/bin/tor-metrics-exporter.sh

3. Set Up Prometheus Scraping

prometheus.yml:

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node_exporter'  # Scrapes textfile collector
    static_configs:
      - targets: ['localhost:9035']
    metrics_path: '/metrics'

4. Create Grafana Dashboard

Key metrics to track:

# Bandwidth rates
rate(tor_relay_bytes_read_total[5m])
rate(tor_relay_bytes_written_total[5m])

# Connection counts
tor_relay_connections

# CPU usage
rate(process_cpu_seconds_total[5m])

# Memory usage
process_resident_memory_bytes / 1024 / 1024

Benchmarking

Baseline Test (New Relay)

Run after initial bootstrap to establish baseline.

#!/bin/bash
# Save as: /usr/local/bin/benchmark-relay.sh

CONTAINER="guard-relay"
DURATION=300  # 5 minutes

echo "=== Tor Relay Benchmark ==="
echo "Duration: $DURATION seconds"
echo ""

# Capture initial state
MEM_START=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $6}')
CPU_START=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $3}')

echo "Starting metrics..."
echo "Initial Memory: ${MEM_START}KB"
echo "Initial CPU: ${CPU_START}%"
echo ""

# Run for duration
sleep $DURATION

# Capture final state
MEM_END=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $6}')
CPU_END=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $3}')

# Bandwidth
BW_READ=$(docker exec $CONTAINER grep "bandwidth" /var/log/tor/notices.log | tail -1)
BW_WRITE=$(docker logs $CONTAINER 2>&1 | grep "bandwidth" | tail -1)

echo "=== Results ==="
echo "Memory Delta: $(( MEM_END - MEM_START ))KB"
echo "CPU Usage: ${CPU_END}%"
echo "Last Bandwidth Report:"
echo "  Read: $BW_READ"
echo "  Write: $BW_WRITE"
echo ""
echo "Timestamp: $(date)"

Run benchmark:

chmod +x /usr/local/bin/benchmark-relay.sh
/usr/local/bin/benchmark-relay.sh

Compare Against Benchmarks

Metric	Entry	Standard	High-Cap
5-min avg CPU	<15%	10–25%	20–40%
5-min avg MEM	<200 MB	200–500 MB	500+ MB
Active Connections	<100	100–500	500–2000
Bootstrap Time	10–30 min	10–30 min	10–30 min

Troubleshooting

High CPU Usage

Symptoms: CPU consistently >50%

Diagnosis:

# Check if relay is under heavy load
docker stats guard-relay --no-stream

# View top processes inside container
docker exec guard-relay ps aux --sort=-%cpu

# Check Tor config for tuning issues
docker exec guard-relay grep -E "NumCPUs|MaxClientCircuitsPending" /etc/tor/torrc

Solutions:

# Limit CPU cores
NumCPUs 2  # Instead of auto

# Reduce allowed circuits
MaxClientCircuitsPending 50  # Default is usually 100

High Memory Usage

Symptoms: Memory >75% of limit, or constantly increasing

Diagnosis:

# Check memory trend
docker exec guard-relay free -h

# Look for memory leak signs in logs
docker logs guard-relay 2>&1 | grep -i "memory\|oom"

# Check MaxMemInQueues setting
docker exec guard-relay grep MaxMemInQueues /etc/tor/torrc

Solutions:

# Reduce max in-flight data
MaxMemInQueues 256 MB  # More conservative

# Or increase if system has capacity
MaxMemInQueues 1024 MB  # If you have 8+ GB RAM

Low Bandwidth Usage

Symptoms: Bandwidth well below configured limits

Diagnosis:

# Check configured limits
docker exec guard-relay grep "RelayBandwidth" /etc/tor/torrc

# Check actual usage
docker logs guard-relay 2>&1 | grep "Average"

# Verify ORPort is reachable
docker exec guard-relay status | grep "reachable"
# Or use JSON health check
docker exec guard-relay health | jq .reachable

Solutions:

Give relay time to build reputation (2–8 weeks for full capacity)
Increase bandwidth limits if you have capacity
Check firewall isn't limiting traffic
Verify network connectivity is stable

Connection Pool Exhaustion

Symptoms: "Too many open files" errors

Diagnosis:

# Check file descriptor usage
docker exec guard-relay cat /proc/sys/fs/file-max
docker exec guard-relay ulimit -n

Solutions:

# Increase container file descriptor limit
docker run -d \
  --ulimit nofile=65535:65535 \
  # ... other options
  r3bo0tbx1/onion-relay:latest

Best Practices

✅ DO

✅ Monitor metrics continuously - Use Prometheus + Grafana
✅ Start conservative, scale gradually - Begin with lower bandwidth limits
✅ Test configuration changes - Benchmark before/after
✅ Keep logs rotating - Prevent disk fill
✅ Plan for peak load - Size hardware for bursts, not average
✅ Document your settings - Know why you tuned each parameter

❌ DON'T

❌ Don't max out bandwidth day 1 - New relays need reputation first
❌ Don't ignore resource limits - OOM kills are hard to debug
❌ Don't tune blindly - Always measure, then adjust
❌ Don't forget IPv6 - Half the network could be IPv6

Reference

Key Configuration Parameters:

# CPU
NumCPUs 4

# Memory
MaxMemInQueues 512 MB

# Bandwidth
RelayBandwidthRate 100 MBytes
RelayBandwidthBurst 200 MBytes

# Connections
MaxClientCircuitsPending 100

# Network
ORPort 9001
ORPort [::]:9001
DirPort 9030

Quick Performance Checklist:

CPU allocation set appropriately
Memory limits configured
Bandwidth limits realistic
IPv6 enabled (if available)
Metrics enabled for monitoring
Prometheus scraping configured
Alerts set for resource thresholds
Baseline benchmarks recorded

14 KiB Raw Blame History Unescape Escape

⚡ Performance Tuning & Optimization - Tor Guard Relay

Table of Contents

Performance Baseline

System Requirements by Relay Tier

Expected Resource Usage (Steady State)

CPU Optimization

1. Allocate CPU Cores

Check Current Allocation

Configure CPU Cores in relay.conf

For Docker Compose

2. CPU Prioritization

3. Disable Unnecessary Features

4. Optimize Connection Handling

Memory Management

1. Monitor Memory Usage

2. Set Memory Limits in Docker Compose

3. Configure Tor Memory Settings

4. Handle Memory Leaks

Bandwidth Optimization

1. Understand Bandwidth Limits

2. Set Realistic Limits

3. Bandwidth Accounting

4. Monitor Actual Bandwidth Usage

5. Optimize for Your Network

For Home Networks

For VPS with Unmetered Bandwidth

For Datacenters with Traffic Shaping

Network Tuning

1. Enable IPv6 (if available)

2. Optimize TCP Settings

3. Firewall Optimization

4. DNS Performance

Monitoring & Metrics

1. JSON Health API

2. Prometheus Integration (External)

3. Set Up Prometheus Scraping

4. Create Grafana Dashboard

Benchmarking

Baseline Test (New Relay)

Compare Against Benchmarks

Troubleshooting

High CPU Usage

High Memory Usage

Low Bandwidth Usage

Connection Pool Exhaustion

Best Practices

✅ DO

❌ DON'T

Reference

Support

14 KiB

Raw Blame History