Fix: Prevent cache memory leak with maxKeys limit and conditional caching #451

Closed
opened 2026-04-05 17:10:54 +02:00 by MrUnknownDE · 0 comments
Owner

Originally created by @djcrafts on 12/21/2025

Summary

Fixes #2120 by preventing unbounded cache growth through three key improvements:

  1. Adding a maxKeys limit to the NodeCache instance
  2. Skipping cache writes for unsuccessful lookups
  3. Adding periodic cache statistics logging for monitoring

Problem

After upgrading to 1.13.1, Pangolin was experiencing memory leaks, especially when GeoIP (MaxMind) database was configured. With ~10k requests per day, the cache would grow unbounded, eventually causing OOM kills on memory-constrained systems.

Root causes:

  1. NodeCache had no maxKeys limit, allowing unlimited growth
  2. GeoIP/ASN lookups were cached even when they returned undefined (e.g., when MaxMind DB not configured)
  3. No visibility into cache performance made the issue hard to diagnose

Solution

1. Add maxKeys limit (cache.ts)

export const cache = new NodeCache({
    stdTTL: 3600,
    checkperiod: 120,
    maxKeys: 10000  // Prevents unbounded growth
});
  • Sets maximum of 10,000 cached keys
  • Uses LRU eviction when limit is reached
  • With ~10k requests/day and 5min TTL, provides ample headroom

2. Skip caching undefined values

Before:

cachedCountryCode = await getCountryCodeForIp(ip);
cache.set(geoIpCacheKey, cachedCountryCode, 300); // Caches undefined too

After:

cachedCountryCode = await getCountryCodeForIp(ip);
if (cachedCountryCode) {  // Only cache successful lookups
    cache.set(geoIpCacheKey, cachedCountryCode, 300);
}

3. Add cache monitoring

setInterval(() => {
    const stats = cache.getStats();
    logger.debug(
        `Cache stats - Keys: ${stats.keys}, Hits: ${stats.hits}, Misses: ${stats.misses}, Hit rate: ...`
    );
}, 300000); // Every 5 minutes

Testing

  • TypeScript compilation successful (no type errors)
  • No functional changes to lookup logic
  • Backward compatible with existing cache usage
  • Cache statistics will be visible in debug logs

Impact

  • Memory usage: Capped at ~10k keys instead of unbounded growth
  • Performance: No degradation; LRU ensures frequently accessed IPs remain cached
  • Monitoring: Cache stats in logs help identify issues early
  • Undefined caching: Prevents wasting cache space on failed lookups
  • Addresses issue raised by @Ragnaruk about lack of cache stats logging
  • Addresses issue raised by @nath1416 about OOM kills on 1GB VPS
  • ASN lookup caching already had conditional caching; now GeoIP matches that pattern
*Originally created by @djcrafts on 12/21/2025* ### Summary Fixes #2120 by preventing unbounded cache growth through three key improvements: 1. Adding a `maxKeys` limit to the NodeCache instance 2. Skipping cache writes for unsuccessful lookups 3. Adding periodic cache statistics logging for monitoring ### Problem After upgrading to 1.13.1, Pangolin was experiencing memory leaks, especially when GeoIP (MaxMind) database was configured. With ~10k requests per day, the cache would grow unbounded, eventually causing OOM kills on memory-constrained systems. **Root causes:** 1. NodeCache had no `maxKeys` limit, allowing unlimited growth 2. GeoIP/ASN lookups were cached even when they returned `undefined` (e.g., when MaxMind DB not configured) 3. No visibility into cache performance made the issue hard to diagnose ### Solution #### 1. Add maxKeys limit (cache.ts) ```typescript export const cache = new NodeCache({ stdTTL: 3600, checkperiod: 120, maxKeys: 10000 // Prevents unbounded growth }); ``` - Sets maximum of 10,000 cached keys - Uses LRU eviction when limit is reached - With ~10k requests/day and 5min TTL, provides ample headroom #### 2. Skip caching undefined values **Before:** ```typescript cachedCountryCode = await getCountryCodeForIp(ip); cache.set(geoIpCacheKey, cachedCountryCode, 300); // Caches undefined too ``` **After:** ```typescript cachedCountryCode = await getCountryCodeForIp(ip); if (cachedCountryCode) { // Only cache successful lookups cache.set(geoIpCacheKey, cachedCountryCode, 300); } ``` #### 3. Add cache monitoring ```typescript setInterval(() => { const stats = cache.getStats(); logger.debug( `Cache stats - Keys: ${stats.keys}, Hits: ${stats.hits}, Misses: ${stats.misses}, Hit rate: ...` ); }, 300000); // Every 5 minutes ``` ### Testing - ✅ TypeScript compilation successful (no type errors) - ✅ No functional changes to lookup logic - ✅ Backward compatible with existing cache usage - ✅ Cache statistics will be visible in debug logs ### Impact - **Memory usage**: Capped at ~10k keys instead of unbounded growth - **Performance**: No degradation; LRU ensures frequently accessed IPs remain cached - **Monitoring**: Cache stats in logs help identify issues early - **Undefined caching**: Prevents wasting cache space on failed lookups ### Related - Addresses issue raised by @Ragnaruk about lack of cache stats logging - Addresses issue raised by @nath1416 about OOM kills on 1GB VPS - ASN lookup caching already had conditional caching; now GeoIP matches that pattern
Sign in to join this conversation.
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github/pangolin#451