feat(Logs): add Phase 5 optimizations for ClickHouse storage and query performance

Nawaz Dhandala
2026-03-12 11:26:10 +00:00
parent 981f609b84
commit 46c9f36578


@@ -302,6 +302,260 @@ These changes directly improve the daily log investigation experience for every
10. **Phase 3.3** - Drop Filters (moderate effort, cost optimization)
11. **Phase 4.x** - Patterns, Shortcuts, Data Scrubbing (future)
## Phase 5: ClickHouse Storage & Query Optimizations (P0) — Performance Foundation
These optimizations address fundamental storage and indexing gaps in the `LogItem` table that directly impact search speed, data correctness, and operational cost. They should be prioritized alongside or before Phase 1 since search UX improvements are only as fast as the underlying queries.
### Current Schema
```
Table: LogItem
Engine: MergeTree
Partition: sipHash64(projectId) % 16
Primary/Sort Keys: (projectId, time, serviceId)
```
| Column | ClickHouse Type | Notes |
|--------|----------------|-------|
| projectId | String | Tenant ID |
| serviceId | String | Service ID |
| time | DateTime | **Second precision only** |
| timeUnixNano | Int128 | Nanosecond timestamp (not in sort key) |
| severityText | String | Trace/Debug/Info/Warning/Error/Fatal |
| severityNumber | Int32 | OTEL severity number (0-24) |
| attributes | JSON | Flexible key-value store |
| attributeKeys | Array(String) | Pre-extracted keys for discovery |
| traceId | String | Optional trace correlation |
| spanId | String | Optional span correlation |
| body | String | Log message text |
### 5.1 Add Skip Indexes for Full-Text Search on `body` (Critical)
**Current**: `body ILIKE '%text%'` in `LogAggregationService.ts:244` performs a full scan of every row in the matching partitions. For a table with billions of rows, this is extremely slow.
**Target**: Add a token-based bloom filter index so ClickHouse can skip granules that definitely don't contain the search term.
**Implementation**:
- Add a `tokenbf_v1` index on the `body` column in the `Log` model definition at `Common/Models/AnalyticsModels/Log.ts`
- Extend `AnalyticsBaseModel` and `StatementGenerator` to support skip index definitions in the CREATE TABLE statement
- Migration to add the index to existing tables via `ALTER TABLE LogItem ADD INDEX`
```sql
-- Target DDL addition
INDEX idx_body body TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4
```
**Expected improvement**: 10-100x faster text search on `body` for selective queries (terms that appear in a small fraction of granules). Non-selective queries (very common terms) still benefit from reduced I/O.
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (add index definition)
- `Common/Models/AnalyticsModels/AnalyticsBaseModel/AnalyticsBaseModel.ts` (support skip index metadata)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit INDEX clause in CREATE TABLE)
- `Worker/DataMigrations/` (new migration to add index to existing table)
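As a rough illustration of the `StatementGenerator` change, the sketch below shows one way skip-index metadata could be carried on the model and rendered into `INDEX` clauses for the CREATE TABLE statement. The `SkipIndex` shape and `toIndexClauses` helper are hypothetical, not the existing OneUptime API.

```typescript
// Hypothetical skip-index metadata a model could declare.
interface SkipIndex {
  name: string; // e.g. "idx_body"
  column: string; // column the index covers
  type: string; // raw ClickHouse index type expression
  granularity: number;
}

// Renders the INDEX clauses appended after the column list in CREATE TABLE.
function toIndexClauses(indexes: ReadonlyArray<SkipIndex>): string {
  return indexes
    .map(
      (i: SkipIndex) =>
        `INDEX ${i.name} ${i.column} TYPE ${i.type} GRANULARITY ${i.granularity}`,
    )
    .join(",\n");
}

const clause: string = toIndexClauses([
  {
    name: "idx_body",
    column: "body",
    type: "tokenbf_v1(10240, 3, 0)",
    granularity: 4,
  },
]);
// clause is exactly the DDL line shown above
```

The same mechanism would serve 5.2's bloom filter and set indexes, since they differ only in the `type` expression.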
### 5.2 Add Skip Indexes for `traceId`, `spanId`, `severityText` (Critical)
**Current**: Filtering by traceId or spanId (the primary way to correlate logs with distributed traces) requires scanning all data within the projectId+time range. Severity filtering has the same problem.
**Target**: Add bloom filter and set indexes so ClickHouse can skip irrelevant granules.
**Implementation**:
```sql
-- Bloom filters for high-cardinality string columns
INDEX idx_trace_id traceId TYPE bloom_filter(0.01) GRANULARITY 1
INDEX idx_span_id spanId TYPE bloom_filter(0.01) GRANULARITY 1
-- Set index for low-cardinality severity (only ~7 distinct values)
INDEX idx_severity severityText TYPE set(10) GRANULARITY 4
```
- Add these index definitions to the `Log` model
- Create a data migration to apply indexes to existing tables
**Expected improvement**:
- traceId/spanId lookups: 50-1000x faster (bloom filter skips nearly all granules for a specific trace)
- severity filtering: 2-5x faster (set index skips granules that don't contain the target severity)
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (add index definitions)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit INDEX clauses)
- `Worker/DataMigrations/` (new migration)
### 5.3 Upgrade `time` Column to `DateTime64(9)` (High)
**Current**: The `time` column uses ClickHouse `DateTime` which has **1-second granularity**. Logs within the same second from the same service are stored in arbitrary order. The `timeUnixNano` field (Int128) preserves nanosecond precision but is not in the sort key, so it cannot be used for sub-second ordering.
**Target**: Use `DateTime64(9)` (nanosecond precision) so the sort key naturally orders logs at sub-second resolution.
**Implementation**:
- Change the `time` column type from `TableColumnType.Date` to a new `TableColumnType.DateTime64` in the Log model
- Add `DateTime64` support to `StatementGenerator` and the ClickHouse type mapping in `Statement.ts`
- Update ingestion code in `OtelLogsIngestService.ts` to write DateTime64-compatible timestamps
- Migration: `ALTER TABLE LogItem MODIFY COLUMN time DateTime64(9)`. Note that this is not a metadata-only change: `DateTime` is stored as 32-bit seconds and `DateTime64(9)` as 64-bit ticks, so ClickHouse rewrites existing parts as a mutation. Plan for the rewrite cost on large tables.
- Consider whether the `timeUnixNano` column can be deprecated after this change, since `time` would then carry the same precision
**Impact**: Correct sub-second log ordering. Currently, logs from a burst of activity within the same second may appear in the wrong order.
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (change column type)
- `Common/Types/AnalyticsDatabase/TableColumnType.ts` (add DateTime64 type)
- `Common/Server/Utils/AnalyticsDatabase/Statement.ts` (add DateTime64 mapping)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (handle DateTime64 in CREATE/SELECT)
- `Telemetry/Services/OtelLogsIngestService.ts` (write DateTime64 timestamps)
- `Worker/DataMigrations/` (new migration)
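For the ingestion side, a minimal sketch of converting an OTEL `timeUnixNano` value into a `DateTime64(9)`-compatible literal is below. The function name and output format are assumptions for illustration, not the existing `OtelLogsIngestService` code.

```typescript
// Convert nanoseconds-since-epoch (bigint) into a "YYYY-MM-DD HH:MM:SS.nnnnnnnnn"
// string that ClickHouse accepts for a DateTime64(9) column.
function toDateTime64String(timeUnixNano: bigint): string {
  const seconds: bigint = timeUnixNano / 1_000_000_000n;
  const nanos: bigint = timeUnixNano % 1_000_000_000n;
  const iso: string = new Date(Number(seconds) * 1000)
    .toISOString()
    .slice(0, 19) // "YYYY-MM-DDTHH:MM:SS"
    .replace("T", " ");
  // Append the nanosecond fraction, zero-padded to 9 digits.
  return `${iso}.${nanos.toString().padStart(9, "0")}`;
}

const ts: string = toDateTime64String(1_773_314_770_000_000_123n);
// ts === "2026-03-12 11:26:10.000000123"
```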
### 5.4 Add TTL for Automatic Data Retention (High)
**Current**: The ingestion service tracks `dataRetentionInDays` per service but there is **no TTL clause** on the ClickHouse table. Data is never automatically deleted, leading to unbounded storage growth.
**Target**: ClickHouse-native TTL so old data is automatically dropped.
**Implementation**:
- Add a TTL clause to the LogItem table definition:
```sql
TTL time + INTERVAL 30 DAY DELETE
```
- For per-tenant or per-service retention, options include:
- A single global TTL (simplest) with a background job that deletes data beyond the service-specific retention
- A `retentionDate` column populated at ingest time (`time + dataRetentionInDays`) with `TTL retentionDate DELETE`
- Extend `AnalyticsBaseModel` to support TTL configuration
- Migration to add TTL to existing table: `ALTER TABLE LogItem MODIFY TTL time + INTERVAL 30 DAY DELETE`
**Impact**: Prevents disk exhaustion. Without TTL, the only way to remove old data is a manual `ALTER TABLE ... DELETE` mutation, which is expensive and must be scheduled externally.
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (add TTL config)
- `Common/Models/AnalyticsModels/AnalyticsBaseModel/AnalyticsBaseModel.ts` (TTL support)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit TTL clause)
- `Worker/DataMigrations/` (new migration)
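The per-service retention option above amounts to computing a `retentionDate` at ingest time. A minimal sketch, with illustrative names rather than the actual ingestion code:

```typescript
// Populate a retentionDate column at ingest so that
// `TTL retentionDate DELETE` drops rows on each service's own schedule.
function computeRetentionDate(
  ingestTime: Date,
  dataRetentionInDays: number,
): Date {
  const msPerDay: number = 24 * 60 * 60 * 1000;
  return new Date(ingestTime.getTime() + dataRetentionInDays * msPerDay);
}

const retentionDate: Date = computeRetentionDate(
  new Date("2026-03-12T11:26:10Z"),
  30, // the service-specific dataRetentionInDays
);
// retentionDate.toISOString() === "2026-04-11T11:26:10.000Z"
```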
### 5.5 Fix SQL Construction in `LogAggregationService` (High)
**Current**: `LogAggregationService.appendCommonFilters()` at lines 206-249 constructs SQL via string interpolation for serviceIds, severityTexts, traceIds, and spanIds:
```typescript
// Current - string interpolation with manual escaping
const idStrings = request.serviceIds.map(id => `'${id.toString()}'`);
statement.append(` AND serviceId IN (${idStrings.join(",")})`);
```
While `ObjectID.toString()` is likely safe, severity and trace/span values come from user input and are only protected by a simple `escapeSingleQuotes` function. This is fragile and inconsistent with the parameterized approach used elsewhere.
**Target**: Use parameterized queries for all filter values.
**Implementation**:
- Refactor `appendCommonFilters` to use the `Statement` class's parameterized query support for all IN clauses
- Each value should be a named parameter with proper type annotation
- Remove the `escapeSingleQuotes` helper since it would no longer be needed
**Files to modify**:
- `Common/Server/Services/LogAggregationService.ts` (refactor appendCommonFilters)
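A sketch of the refactor is below, assuming a `Statement` class that accepts named parameters (the real OneUptime `Statement` API may differ). The key property is that values travel out-of-band as typed parameters and never enter the SQL text, so no escaping helper is needed. `{name:String}` is ClickHouse's native named-parameter placeholder syntax.

```typescript
// Build a parameterized IN clause: each value becomes a named, typed
// parameter instead of being spliced into the SQL string.
function buildInClause(
  column: string,
  paramPrefix: string,
  values: ReadonlyArray<string>,
  params: Record<string, string>,
): string {
  const placeholders: Array<string> = values.map((v: string, i: number) => {
    const name: string = `${paramPrefix}${i}`;
    params[name] = v; // value travels out-of-band, never into the SQL text
    return `{${name}:String}`;
  });
  return ` AND ${column} IN (${placeholders.join(", ")})`;
}

const params: Record<string, string> = {};
const sql: string = buildInClause(
  "severityText",
  "sev",
  ["Error", "Fatal"],
  params,
);
// sql:    " AND severityText IN ({sev0:String}, {sev1:String})"
// params: { sev0: "Error", sev1: "Fatal" }
```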
### 5.6 Add ZSTD Compression on `body` Column (Medium)
**Current**: The table uses ClickHouse's default LZ4 compression for all columns. The `body` column is typically the largest column (log messages are verbose text) and is highly compressible.
**Target**: Use ZSTD compression on `body` for better compression ratio.
**Implementation**:
- Add codec specification to the `body` column: `body String CODEC(ZSTD(3))`
- Extend `AnalyticsBaseModel`/`StatementGenerator` to support per-column codec specifications
- Migration: `ALTER TABLE LogItem MODIFY COLUMN body String CODEC(ZSTD(3))`
- Consider ZSTD for `attributes` column as well (JSON is also highly compressible)
**Expected improvement**: 30-50% reduction in storage for the `body` column compared to LZ4, with minimal CPU overhead increase on reads.
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (add codec config)
- `Common/Types/AnalyticsDatabase/TableColumn.ts` (add codec property)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit CODEC clause)
- `Worker/DataMigrations/` (new migration)
### 5.7 Add Projections for Histogram Queries (Medium)
**Current**: `projections: []` is empty. Every histogram query (group by time bucket + severity) and facet query scans raw data and performs the aggregation from scratch.
**Target**: ClickHouse projections that pre-aggregate data for the most common query patterns.
**Implementation**:
- Add a projection for histogram/severity aggregation:
```sql
PROJECTION proj_severity_histogram (
    SELECT
        projectId,
        toStartOfInterval(time, INTERVAL 1 MINUTE) AS minute,
        severityText,
        count() AS cnt
    GROUP BY
        projectId,
        minute,
        severityText
)
```
- Extend the existing `Projection` type at `Common/Types/AnalyticsDatabase/Projection.ts` to support full projection definitions
- Wire projection creation into `StatementGenerator.toTableCreateStatement()`
- Migration to materialize the projection on existing data: `ALTER TABLE LogItem MATERIALIZE PROJECTION proj_severity_histogram`
**Expected improvement**: 5-10x faster histogram queries since ClickHouse reads the pre-aggregated projection instead of scanning raw log rows.
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (define projections)
- `Common/Types/AnalyticsDatabase/Projection.ts` (extend type)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit PROJECTION clause)
- `Worker/DataMigrations/` (new migration to materialize)
### 5.8 Store Missing OpenTelemetry Log Fields (Low)
**Current**: Several standard OTEL log record fields are dropped during ingestion:
- `observedTimeUnixNano` — when the log was collected by the pipeline (useful for measuring ingestion lag)
- `droppedAttributesCount` — signals data loss during collection
- `flags` — log record flags (e.g., W3C trace flags)
**Target**: Preserve these fields for full OTEL compliance and operational debugging.
**Implementation**:
- Add optional columns to the Log model:
- `observedTimeUnixNano` (LongNumber)
- `droppedAttributesCount` (Number, default 0)
- `flags` (Number, default 0)
- Update `OtelLogsIngestService.processLogsAsync()` to extract and store these fields
- Migration to add columns to existing table
**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (add columns)
- `Telemetry/Services/OtelLogsIngestService.ts` (extract additional fields)
- `Worker/DataMigrations/` (new migration)
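The extraction step can be sketched as below, with the defaults this section proposes. The input shape mirrors the OTLP JSON encoding (which serializes 64-bit integers as strings); the types and function are illustrative, not the actual `processLogsAsync()` code.

```typescript
// Shape of the extra fields on a decoded OTLP JSON LogRecord.
interface OtelLogRecord {
  observedTimeUnixNano?: string; // OTLP JSON encodes 64-bit ints as strings
  droppedAttributesCount?: number;
  flags?: number;
}

// Pull out the extra fields, applying the proposed defaults when absent.
function extractExtraFields(record: OtelLogRecord): {
  observedTimeUnixNano: bigint;
  droppedAttributesCount: number;
  flags: number;
} {
  return {
    observedTimeUnixNano: BigInt(record.observedTimeUnixNano ?? "0"),
    droppedAttributesCount: record.droppedAttributesCount ?? 0,
    flags: record.flags ?? 0,
  };
}

const extra = extractExtraFields({
  observedTimeUnixNano: "1773314770000000123",
});
// droppedAttributesCount and flags fall back to 0
```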
### 5.x Performance Impact Summary
| Optimization | Query Pattern Improved | Expected Speedup | Effort |
|-------------|----------------------|-------------------|--------|
| 5.1 Body text index | Full-text search on body | 10-100x | Medium |
| 5.2 traceId/spanId/severity indexes | Trace correlation, severity filter | 50-1000x (trace), 2-5x (severity) | Small |
| 5.3 DateTime64 time column | Sub-second log ordering | Correctness fix | Medium |
| 5.4 TTL data retention | Storage management | Prevents disk exhaustion | Small |
| 5.5 Parameterized SQL | Security hardening | N/A (security fix) | Small |
| 5.6 ZSTD compression | Storage cost | 30-50% less disk for body | Small |
| 5.7 Histogram projections | Histogram and severity aggregation | 5-10x | Medium |
| 5.8 Missing OTEL fields | OTEL compliance | N/A (completeness) | Small |
### 5.x Recommended Sub-Phase Order
1. **5.2** — Skip indexes for traceId/spanId/severity (smallest effort, largest query speedup)
2. **5.1** — Body text index (biggest impact on user-facing search)
3. **5.5** — Parameterized SQL fix (security, small effort)
4. **5.4** — TTL data retention (operational necessity)
5. **5.3** — DateTime64 upgrade (correctness)
6. **5.6** — ZSTD compression (cost optimization)
7. **5.7** — Projections (performance polish)
8. **5.8** — Missing OTEL fields (completeness)
---
## Verification
For each feature: