# ⚡ Performance Tuning & Optimization - Tor Guard Relay Complete guide to optimizing CPU, memory, bandwidth, and network performance for your Tor relay. --- ## Table of Contents - [Performance Baseline](#performance-baseline) - [CPU Optimization](#cpu-optimization) - [Memory Management](#memory-management) - [Bandwidth Optimization](#bandwidth-optimization) - [Network Tuning](#network-tuning) - [Monitoring & Metrics](#monitoring--metrics) - [Benchmarking](#benchmarking) - [Troubleshooting](#troubleshooting) --- ## Performance Baseline ### System Requirements by Relay Tier | Tier | CPU | RAM | Bandwidth | Use Case | |------|-----|-----|-----------|----------| | **Entry** | 1 core | 512 MB | 10–50 Mbps | Home lab, testing | | **Standard** | 2 cores | 1–2 GB | 50–500 Mbps | Production guard relay | | **High-Capacity** | 4+ cores | 4+ GB | 500+ Mbps | High-traffic relay | | **Enterprise** | 8+ cores | 8+ GB | 1 Gbps+ | Multiple relays | ### Expected Resource Usage (Steady State) | Resource | Entry | Standard | High-Cap | Notes | |----------|-------|----------|----------|-------| | CPU | 5–15% | 10–25% | 20–40% | Varies by traffic | | Memory | 80–150 MB | 200–400 MB | 500+ MB | Increases with connections | | Bandwidth | 5–50 Mbps | 50–500 Mbps | 500+ Mbps | Depends on limits | | Disk I/O | Light | Moderate | Heavy | Monitor during bootstrap | --- ## CPU Optimization ### 1. Allocate CPU Cores By default, Tor uses all available cores. Restrict or optimize as needed. #### Check Current Allocation ```bash # View Tor config docker exec guard-relay grep -i numcpus /etc/tor/torrc # View system CPUs docker exec guard-relay nproc ``` #### Configure CPU Cores in relay.conf ```conf # Use specific number of cores (example: 4 cores) NumCPUs 4 # Or auto-detect (default, recommended) NumCPUs 0 ``` #### For Docker Compose ```yaml services: tor-guard-relay: # ... other config deploy: resources: limits: cpus: '4.0' # Limit to 4 cores reservations: cpus: '2.0' # Reserve 2 cores minimum ``` ### 2. CPU Prioritization Ensure Tor gets fair CPU scheduling. ```bash # View current CPU usage docker stats guard-relay --no-stream # Show detailed CPU metrics docker exec guard-relay ps aux | grep tor ``` ### 3. Disable Unnecessary Features ```conf # Disable directory service (if not needed) # DirPort 0 # Keep SOCKS disabled (we're a relay, not a client) SocksPort 0 # Disable bridge operation (if running guard relay) BridgeRelay 0 ``` ### 4. Optimize Connection Handling ```conf # Maximum simultaneous connections # Default usually fine, but can tune: # MaxClientCircuitsPending 100 # Connection timeout (default 15 minutes) # CircuitIdleTimeout 900 ``` --- ## Memory Management ### 1. Monitor Memory Usage ```bash # Real-time memory monitoring docker stats guard-relay # View memory trends over 1 hour watch -n 60 'docker exec guard-relay ps aux | grep tor | grep -v grep' # Historical memory usage docker exec guard-relay cat /proc/meminfo ``` ### 2. Set Memory Limits in Docker Compose ```yaml services: tor-guard-relay: deploy: resources: limits: memory: 2G # Hard limit reservations: memory: 1G # Guaranteed allocation ``` ### 3. Configure Tor Memory Settings ```conf # MaxMemInQueues - Maximum total memory for circuit queues # Default: 512 MB (usually fine) MaxMemInQueues 512 MB # When memory hits threshold, new circuits rejected # Prevents OOM (out of memory) crashes ``` ### 4. Handle Memory Leaks **Monitor for gradual increase:** ```bash #!/bin/bash # Save as: /usr/local/bin/monitor-memory-growth.sh CONTAINER="guard-relay" INTERVAL=300 # 5 minutes while true; do MEMORY=$(docker exec "$CONTAINER" ps aux | \ grep '[t]or ' | awk '{print $6}' | head -1) echo "$(date): Memory = ${MEMORY}KB" sleep $INTERVAL done ``` Run and observe for 24 hours: ```bash /usr/local/bin/monitor-memory-growth.sh | tee /tmp/memory-log.txt # Analyze growth rate tail -20 /tmp/memory-log.txt ``` --- ## Bandwidth Optimization ### 1. Understand Bandwidth Limits ```conf # Average bandwidth (sustained rate) RelayBandwidthRate 100 MBytes # Burst bandwidth (temporary spikes) RelayBandwidthBurst 200 MBytes ``` ### 2. Set Realistic Limits **Calculate your limits based on ISP:** ``` Available Bandwidth: 1000 Mbps (ISP plan) Usable for Tor: 50% (leave headroom for other services) = 500 Mbps Convert to MBytes/s: 500 Mbps ÷ 8 = 62.5 MBytes/s Recommended: - RelayBandwidthRate 50 MBytes - RelayBandwidthBurst 100 MBytes ``` ### 3. Bandwidth Accounting **Limit total monthly traffic:** ```conf # Monthly accounting window # Starts on the 1st at UTC midnight AccountingStart month 1 00:00 # Maximum data (upload + download combined) AccountingMax 1000 GB ``` ### 4. Monitor Actual Bandwidth Usage ```bash # Real-time bandwidth stats docker exec guard-relay tail -f /var/log/tor/notices.log | grep "bandwidth" # Historical bandwidth usage docker exec guard-relay grep "bandwidth" /var/log/tor/notices.log | tail -20 ``` ### 5. Optimize for Your Network #### For Home Networks ```conf # Conservative settings for residential connections RelayBandwidthRate 10 MBytes RelayBandwidthBurst 20 MBytes ``` #### For VPS with Unmetered Bandwidth ```conf # Maximize contribution RelayBandwidthRate 500 MBytes RelayBandwidthBurst 1000 MBytes ``` #### For Datacenters with Traffic Shaping ```conf # Match provider limits RelayBandwidthRate 100 MBytes # ISP limit RelayBandwidthBurst 150 MBytes ``` --- ## Network Tuning ### 1. Enable IPv6 (if available) **In relay.conf:** ```conf # Dual-stack support ORPort 9001 ORPort [::]:9001 # Directory port for IPv6 DirPort 9030 ``` **Verify IPv6 is working:** ```bash docker exec guard-relay curl -6 -s https://icanhazip.com # Should return IPv6 address docker exec guard-relay curl -4 -s https://icanhazip.com # Should return IPv4 address ``` ### 2. Optimize TCP Settings **On the host system (for Docker host):** ```bash # Increase TCP connection backlog sudo sysctl -w net.core.somaxconn=65535 # Increase listen queue length sudo sysctl -w net.ipv4.tcp_max_syn_backlog=65535 # Enable TCP keepalives sudo sysctl -w net.ipv4.tcp_keepalives_intvl=60 # Make permanent echo "net.core.somaxconn=65535" | sudo tee -a /etc/sysctl.conf echo "net.ipv4.tcp_max_syn_backlog=65535" | sudo tee -a /etc/sysctl.conf ``` ### 3. Firewall Optimization **Ensure firewall rules don't throttle traffic:** ```bash # UFW example sudo ufw status # High performance rules sudo iptables -I INPUT -p tcp --dport 9001 -j ACCEPT # Save rules sudo iptables-save > /etc/iptables/rules.v4 ``` ### 4. DNS Performance **Configure Tor to use fast DNS:** ```conf # Use Google DNS (example) ServerDNSListenAddress 127.0.0.1:53 ServerDNSResolvConfFile /etc/resolv.conf ``` Verify DNS resolution is fast: ```bash # Test DNS response time time docker exec guard-relay tor --resolve example.com ``` --- ## Monitoring & Metrics >=v1.1.1 uses **external monitoring** with the `health` JSON API for minimal image size and maximum security. ### 1. JSON Health API Get relay metrics via the `health` tool: ```bash # Get full health status (raw JSON) docker exec guard-relay health # Parse with jq (requires jq on host) docker exec guard-relay health | jq . # Check specific metrics docker exec guard-relay health | jq .bootstrap # Bootstrap percentage docker exec guard-relay health | jq .reachable # ORPort reachability docker exec guard-relay health | jq .uptime # Uptime ``` **Example JSON output:** ```json { "status": "up", "pid": 1, "uptime": "1-00:00:00", "bootstrap": 100, "reachable": "true", "errors": 0, "fingerprint": "1234567890ABCDEF...", "nickname": "MyRelay" } ``` ### 2. Prometheus Integration (External) Use the `health` tool with Prometheus node_exporter textfile collector: **Create metrics exporter script:** ```bash #!/bin/bash # /usr/local/bin/tor-metrics-exporter.sh # Requires: jq on host (apt install jq / brew install jq) HEALTH=$(docker exec guard-relay health) echo "$HEALTH" | jq -r ' "tor_bootstrap_percent \(.bootstrap)", "tor_reachable \(if .reachable == "true" then 1 else 0 end)" ' > /var/lib/node_exporter/textfile_collector/tor.prom ``` **Run via cron every 5 minutes:** ```bash chmod +x /usr/local/bin/tor-metrics-exporter.sh crontab -e */5 * * * * /usr/local/bin/tor-metrics-exporter.sh ``` ### 3. Set Up Prometheus Scraping **prometheus.yml:** ```yaml global: scrape_interval: 15s scrape_configs: - job_name: 'node_exporter' # Scrapes textfile collector static_configs: - targets: ['localhost:9035'] metrics_path: '/metrics' ``` ### 4. Create Grafana Dashboard **Key metrics to track:** ```promql # Bandwidth rates rate(tor_relay_bytes_read_total[5m]) rate(tor_relay_bytes_written_total[5m]) # Connection counts tor_relay_connections # CPU usage rate(process_cpu_seconds_total[5m]) # Memory usage process_resident_memory_bytes / 1024 / 1024 ``` --- ## Benchmarking ### Baseline Test (New Relay) Run after initial bootstrap to establish baseline. ```bash #!/bin/bash # Save as: /usr/local/bin/benchmark-relay.sh CONTAINER="guard-relay" DURATION=300 # 5 minutes echo "=== Tor Relay Benchmark ===" echo "Duration: $DURATION seconds" echo "" # Capture initial state MEM_START=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $6}') CPU_START=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $3}') echo "Starting metrics..." echo "Initial Memory: ${MEM_START}KB" echo "Initial CPU: ${CPU_START}%" echo "" # Run for duration sleep $DURATION # Capture final state MEM_END=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $6}') CPU_END=$(docker exec $CONTAINER ps aux | grep '[t]or ' | awk '{print $3}') # Bandwidth BW_READ=$(docker exec $CONTAINER grep "bandwidth" /var/log/tor/notices.log | tail -1) BW_WRITE=$(docker logs $CONTAINER 2>&1 | grep "bandwidth" | tail -1) echo "=== Results ===" echo "Memory Delta: $(( MEM_END - MEM_START ))KB" echo "CPU Usage: ${CPU_END}%" echo "Last Bandwidth Report:" echo " Read: $BW_READ" echo " Write: $BW_WRITE" echo "" echo "Timestamp: $(date)" ``` Run benchmark: ```bash chmod +x /usr/local/bin/benchmark-relay.sh /usr/local/bin/benchmark-relay.sh ``` ### Compare Against Benchmarks | Metric | Entry | Standard | High-Cap | |--------|-------|----------|----------| | **5-min avg CPU** | <15% | 10–25% | 20–40% | | **5-min avg MEM** | <200 MB | 200–500 MB | 500+ MB | | **Active Connections** | <100 | 100–500 | 500–2000 | | **Bootstrap Time** | 10–30 min | 10–30 min | 10–30 min | --- ## Troubleshooting ### High CPU Usage **Symptoms:** CPU consistently >50% **Diagnosis:** ```bash # Check if relay is under heavy load docker stats guard-relay --no-stream # View top processes inside container docker exec guard-relay ps aux --sort=-%cpu # Check Tor config for tuning issues docker exec guard-relay grep -E "NumCPUs|MaxClientCircuitsPending" /etc/tor/torrc ``` **Solutions:** ```conf # Limit CPU cores NumCPUs 2 # Instead of auto # Reduce allowed circuits MaxClientCircuitsPending 50 # Default is usually 100 ``` ### High Memory Usage **Symptoms:** Memory >75% of limit, or constantly increasing **Diagnosis:** ```bash # Check memory trend docker exec guard-relay free -h # Look for memory leak signs in logs docker logs guard-relay 2>&1 | grep -i "memory\|oom" # Check MaxMemInQueues setting docker exec guard-relay grep MaxMemInQueues /etc/tor/torrc ``` **Solutions:** ```conf # Reduce max in-flight data MaxMemInQueues 256 MB # More conservative # Or increase if system has capacity MaxMemInQueues 1024 MB # If you have 8+ GB RAM ``` ### Low Bandwidth Usage **Symptoms:** Bandwidth well below configured limits **Diagnosis:** ```bash # Check configured limits docker exec guard-relay grep "RelayBandwidth" /etc/tor/torrc # Check actual usage docker logs guard-relay 2>&1 | grep "Average" # Verify ORPort is reachable docker exec guard-relay status | grep "reachable" # Or use JSON health check docker exec guard-relay health | jq .reachable ``` **Solutions:** - Give relay time to build reputation (2–8 weeks for full capacity) - Increase bandwidth limits if you have capacity - Check firewall isn't limiting traffic - Verify network connectivity is stable ### Connection Pool Exhaustion **Symptoms:** "Too many open files" errors **Diagnosis:** ```bash # Check file descriptor usage docker exec guard-relay cat /proc/sys/fs/file-max docker exec guard-relay ulimit -n ``` **Solutions:** ```bash # Increase container file descriptor limit docker run -d \ --ulimit nofile=65535:65535 \ # ... other options r3bo0tbx1/onion-relay:latest ``` --- ## Best Practices ### ✅ DO - ✅ **Monitor metrics continuously** - Use Prometheus + Grafana - ✅ **Start conservative, scale gradually** - Begin with lower bandwidth limits - ✅ **Test configuration changes** - Benchmark before/after - ✅ **Keep logs rotating** - Prevent disk fill - ✅ **Plan for peak load** - Size hardware for bursts, not average - ✅ **Document your settings** - Know why you tuned each parameter ### ❌ DON'T - ❌ **Don't max out bandwidth day 1** - New relays need reputation first - ❌ **Don't ignore resource limits** - OOM kills are hard to debug - ❌ **Don't tune blindly** - Always measure, then adjust - ❌ **Don't forget IPv6** - Half the network could be IPv6 --- ## Reference **Key Configuration Parameters:** ```conf # CPU NumCPUs 4 # Memory MaxMemInQueues 512 MB # Bandwidth RelayBandwidthRate 100 MBytes RelayBandwidthBurst 200 MBytes # Connections MaxClientCircuitsPending 100 # Network ORPort 9001 ORPort [::]:9001 DirPort 9030 ``` **Quick Performance Checklist:** - [ ] CPU allocation set appropriately - [ ] Memory limits configured - [ ] Bandwidth limits realistic - [ ] IPv6 enabled (if available) - [ ] Metrics enabled for monitoring - [ ] Prometheus scraping configured - [ ] Alerts set for resource thresholds - [ ] Baseline benchmarks recorded --- ## Support - 📖 [Backup Guide](./BACKUP.md) - 🚀 [Deployment Guide](./DEPLOYMENT.md) - 🐛 [Report Issues](https://github.com/r3bo0tbx1/tor-guard-relay/issues) - 💬 [Tor Performance Forum](https://forum.torproject.org/c/relay-operators)