From 46c9f36578a104678a67a8de21271fe28e7cd19c Mon Sep 17 00:00:00 2001 From: Nawaz Dhandala Date: Thu, 12 Mar 2026 11:26:10 +0000 Subject: [PATCH] feat(Logs): add Phase 5 optimizations for ClickHouse storage and query performance --- Internal/Roadmap/Logs.md | 254 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 254 insertions(+) diff --git a/Internal/Roadmap/Logs.md b/Internal/Roadmap/Logs.md index 7e6ae9c067..5683acdedc 100644 --- a/Internal/Roadmap/Logs.md +++ b/Internal/Roadmap/Logs.md @@ -302,6 +302,260 @@ These changes directly improve the daily log investigation experience for every 10. **Phase 3.3** - Drop Filters (moderate effort, cost optimization) 11. **Phase 4.x** - Patterns, Shortcuts, Data Scrubbing (future) +## Phase 5: ClickHouse Storage & Query Optimizations (P0) — Performance Foundation + +These optimizations address fundamental storage and indexing gaps in the `LogItem` table that directly impact search speed, data correctness, and operational cost. They should be prioritized alongside or before Phase 1 since search UX improvements are only as fast as the underlying queries. 
+ +### Current Schema + +``` +Table: LogItem +Engine: MergeTree +Partition: sipHash64(projectId) % 16 +Primary/Sort Keys: (projectId, time, serviceId) +``` + +| Column | ClickHouse Type | Notes | +|--------|----------------|-------| +| projectId | String | Tenant ID | +| serviceId | String | Service ID | +| time | DateTime | **Second precision only** | +| timeUnixNano | Int128 | Nanosecond timestamp (not in sort key) | +| severityText | String | Trace/Debug/Info/Warning/Error/Fatal | +| severityNumber | Int32 | OTEL severity number (0-24) | +| attributes | JSON | Flexible key-value store | +| attributeKeys | Array(String) | Pre-extracted keys for discovery | +| traceId | String | Optional trace correlation | +| spanId | String | Optional span correlation | +| body | String | Log message text | + +### 5.1 Add Skip Indexes for Full-Text Search on `body` (Critical) + +**Current**: `body ILIKE '%text%'` in `LogAggregationService.ts:244` performs a full scan of every row in the matching partitions. For a table with billions of rows, this is extremely slow. + +**Target**: Add a token-based bloom filter index so ClickHouse can skip granules that definitely don't contain the search term. + +**Implementation**: + +- Add a `tokenbf_v1` index on the `body` column in the `Log` model definition at `Common/Models/AnalyticsModels/Log.ts` +- Extend `AnalyticsBaseModel` and `StatementGenerator` to support skip index definitions in the CREATE TABLE statement +- Migration to add the index to existing tables via `ALTER TABLE LogItem ADD INDEX` + +```sql +-- Target DDL addition +INDEX idx_body body TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4 +``` + +**Expected improvement**: 10-100x faster text search on `body` for selective queries (terms that appear in a small fraction of granules). Non-selective queries (very common terms) still benefit from reduced I/O. 
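One caveat worth verifying during implementation: `tokenbf_v1` accelerates token-level predicates such as `hasToken` and `LIKE` on whole tokens, and a case-insensitive `ILIKE '%text%'` may not be able to use the index at all. The search path may need to rewrite selective terms into `hasToken` form (or store a lowercased copy of `body`). A sketch of how to confirm the index is actually applied (the search term is illustrative):

```sql
-- EXPLAIN indexes = 1 lists which indexes ClickHouse considered
-- and how many granules each one allowed it to skip.
EXPLAIN indexes = 1
SELECT count()
FROM LogItem
WHERE projectId = {projectId: String}
  AND time > now() - INTERVAL 1 HOUR
  AND hasToken(body, 'timeout');
```

If the `idx_body` index does not appear in the output with a reduced granule count, the predicate shape is not one the index supports.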
+ +**Files to modify**: +- `Common/Models/AnalyticsModels/Log.ts` (add index definition) +- `Common/Models/AnalyticsModels/AnalyticsBaseModel/AnalyticsBaseModel.ts` (support skip index metadata) +- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit INDEX clause in CREATE TABLE) +- `Worker/DataMigrations/` (new migration to add index to existing table) + +### 5.2 Add Skip Indexes for `traceId`, `spanId`, `severityText` (Critical) + +**Current**: Filtering by traceId or spanId (the primary way to correlate logs with distributed traces) requires scanning all data within the projectId+time range. Severity filtering has the same problem. + +**Target**: Add bloom filter and set indexes so ClickHouse can skip irrelevant granules. + +**Implementation**: + +```sql +-- Bloom filters for high-cardinality string columns +INDEX idx_trace_id traceId TYPE bloom_filter(0.01) GRANULARITY 1 +INDEX idx_span_id spanId TYPE bloom_filter(0.01) GRANULARITY 1 + +-- Set index for low-cardinality severity (only ~7 distinct values) +INDEX idx_severity severityText TYPE set(10) GRANULARITY 4 +``` + +- Add these index definitions to the `Log` model +- Create a data migration to apply indexes to existing tables + +**Expected improvement**: +- traceId/spanId lookups: 50-1000x faster (bloom filter skips nearly all granules for a specific trace) +- severity filtering: 2-5x faster (set index skips granules that don't contain the target severity) + +**Files to modify**: +- `Common/Models/AnalyticsModels/Log.ts` (add index definitions) +- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit INDEX clauses) +- `Worker/DataMigrations/` (new migration) + +### 5.3 Upgrade `time` Column to `DateTime64(9)` (High) + +**Current**: The `time` column uses ClickHouse `DateTime` which has **1-second granularity**. Logs within the same second from the same service are stored in arbitrary order. 
The `timeUnixNano` field (Int128) preserves nanosecond precision but is not in the sort key, so it cannot be used for sub-second ordering.

**Target**: Use `DateTime64(9)` (nanosecond precision) so the sort key naturally orders logs at sub-second resolution.

**Implementation**:

- Change the `time` column type from `TableColumnType.Date` to a new `TableColumnType.DateTime64` in the Log model
- Add `DateTime64` support to `StatementGenerator` and the ClickHouse type mapping in `Statement.ts`
- Update ingestion code in `OtelLogsIngestService.ts` to write DateTime64-compatible timestamps
- Migration: `ALTER TABLE LogItem MODIFY COLUMN time DateTime64(9)`. Note that this is **not** a cheap metadata-only change: the on-disk representation widens, and because `time` is part of the sort key, ClickHouse restricts type changes on key columns. If the in-place ALTER is rejected, the migration instead needs a new table with the desired schema, a backfill (`INSERT INTO ... SELECT`), and a rename
- Consider whether the `timeUnixNano` column can be deprecated after this change, since `time` would carry the same precision

**Impact**: Correct sub-second log ordering. Currently, logs from a burst of activity within the same second may appear in the wrong order.

**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (change column type)
- `Common/Types/AnalyticsDatabase/TableColumnType.ts` (add DateTime64 type)
- `Common/Server/Utils/AnalyticsDatabase/Statement.ts` (add DateTime64 mapping)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (handle DateTime64 in CREATE/SELECT)
- `Telemetry/Services/OtelLogsIngestService.ts` (write DateTime64 timestamps)
- `Worker/DataMigrations/` (new migration)

### 5.4 Add TTL for Automatic Data Retention (High)

**Current**: The ingestion service tracks `dataRetentionInDays` per service but there is **no TTL clause** on the ClickHouse table. Data is never automatically deleted, leading to unbounded storage growth.

**Target**: ClickHouse-native TTL so old data is automatically dropped.
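One way to realize per-service retention is an ingest-time expiry column; a sketch, where `retentionDate` is a hypothetical column the ingest path would populate as `time + dataRetentionInDays` for the service (the DEFAULT is only a fallback for rows written without it):

```sql
-- Hypothetical per-service retention column; names are illustrative.
ALTER TABLE LogItem
  ADD COLUMN retentionDate DateTime DEFAULT time + INTERVAL 30 DAY;

-- Rows are dropped once retentionDate passes.
ALTER TABLE LogItem
  MODIFY TTL retentionDate DELETE;
```

This keeps retention decisions at write time, so the TTL expression stays trivial and no background deletion job is needed.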
+ +**Implementation**: + +- Add a TTL clause to the LogItem table definition: + ```sql + TTL time + INTERVAL 30 DAY DELETE + ``` +- For per-tenant or per-service retention, options include: + - A single global TTL (simplest) with a background job that deletes data beyond the service-specific retention + - A `retentionDate` column populated at ingest time (`time + dataRetentionInDays`) with `TTL retentionDate DELETE` +- Extend `AnalyticsBaseModel` to support TTL configuration +- Migration to add TTL to existing table: `ALTER TABLE LogItem MODIFY TTL time + INTERVAL 30 DAY DELETE` + +**Impact**: Prevents disk exhaustion. Without TTL, the only way to remove old data is manual `ALTER TABLE DELETE` which is expensive and must be scheduled externally. + +**Files to modify**: +- `Common/Models/AnalyticsModels/Log.ts` (add TTL config) +- `Common/Models/AnalyticsModels/AnalyticsBaseModel/AnalyticsBaseModel.ts` (TTL support) +- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit TTL clause) +- `Worker/DataMigrations/` (new migration) + +### 5.5 Fix SQL Construction in `LogAggregationService` (High) + +**Current**: `LogAggregationService.appendCommonFilters()` at lines 206-249 constructs SQL via string interpolation for serviceIds, severityTexts, traceIds, and spanIds: + +```typescript +// Current - string interpolation with manual escaping +const idStrings = request.serviceIds.map(id => `'${id.toString()}'`); +statement.append(` AND serviceId IN (${idStrings.join(",")})`); +``` + +While `ObjectID.toString()` is likely safe, severity and trace/span values come from user input and are only protected by a simple `escapeSingleQuotes` function. This is fragile and inconsistent with the parameterized approach used elsewhere. + +**Target**: Use parameterized queries for all filter values. 
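ClickHouse supports server-side named parameters of the form `{name: Type}`, which the client binds separately from the query text. The target shape for these filters might look like the following (parameter names are illustrative):

```sql
-- Values travel as bound parameters, never interpolated into the SQL text.
SELECT *
FROM LogItem
WHERE projectId = {projectId: String}
  AND serviceId IN {serviceIds: Array(String)}
  AND severityText IN {severities: Array(String)}
  AND time BETWEEN {startTime: DateTime} AND {endTime: DateTime};
```

With this shape, user-supplied severity, trace, and span values can never change the query structure, regardless of their content.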
+ +**Implementation**: + +- Refactor `appendCommonFilters` to use the `Statement` class's parameterized query support for all IN clauses +- Each value should be a named parameter with proper type annotation +- Remove the `escapeSingleQuotes` helper since it would no longer be needed + +**Files to modify**: +- `Common/Server/Services/LogAggregationService.ts` (refactor appendCommonFilters) + +### 5.6 Add ZSTD Compression on `body` Column (Medium) + +**Current**: The table uses ClickHouse's default LZ4 compression for all columns. The `body` column is typically the largest column (log messages are verbose text) and is highly compressible. + +**Target**: Use ZSTD compression on `body` for better compression ratio. + +**Implementation**: + +- Add codec specification to the `body` column: `body String CODEC(ZSTD(3))` +- Extend `AnalyticsBaseModel`/`StatementGenerator` to support per-column codec specifications +- Migration: `ALTER TABLE LogItem MODIFY COLUMN body String CODEC(ZSTD(3))` +- Consider ZSTD for `attributes` column as well (JSON is also highly compressible) + +**Expected improvement**: 30-50% reduction in storage for the `body` column compared to LZ4, with minimal CPU overhead increase on reads. + +**Files to modify**: +- `Common/Models/AnalyticsModels/Log.ts` (add codec config) +- `Common/Types/AnalyticsDatabase/TableColumn.ts` (add codec property) +- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit CODEC clause) +- `Worker/DataMigrations/` (new migration) + +### 5.7 Add Projections for Histogram Queries (Medium) + +**Current**: `projections: []` is empty. Every histogram query (group by time bucket + severity) and facet query scans raw data and performs the aggregation from scratch. + +**Target**: ClickHouse projections that pre-aggregate data for the most common query patterns. 
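The query shape this targets is the severity histogram, roughly as follows (parameters illustrative). With a matching projection in place, ClickHouse can answer it from the pre-aggregated data, which `EXPLAIN` will show as the projection being chosen:

```sql
-- Histogram of log counts per minute per severity.
SELECT
  toStartOfInterval(time, INTERVAL 1 MINUTE) AS minute,
  severityText,
  count() AS cnt
FROM LogItem
WHERE projectId = {projectId: String}
  AND time >= {startTime: DateTime}
GROUP BY minute, severityText
ORDER BY minute;
```

For the projection to be eligible, the query's GROUP BY keys must align with the projection's aggregation keys, so the bucket interval chosen for the projection constrains which histogram granularities benefit.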
**Implementation**:

- Add a projection for histogram/severity aggregation (an aggregating projection uses `GROUP BY`, and every column it references must appear in its SELECT list):
  ```sql
  PROJECTION proj_severity_histogram (
    SELECT
      projectId,
      toStartOfInterval(time, INTERVAL 1 MINUTE) AS minute,
      severityText,
      count() AS cnt
    GROUP BY projectId, minute, severityText
  )
  ```
- Extend the existing `Projection` type at `Common/Types/AnalyticsDatabase/Projection.ts` to support full projection definitions
- Wire projection creation into `StatementGenerator.toTableCreateStatement()`
- Migration to materialize the projection on existing data: `ALTER TABLE LogItem MATERIALIZE PROJECTION proj_severity_histogram`

**Expected improvement**: 5-10x faster histogram queries since ClickHouse reads the pre-aggregated projection instead of scanning raw log rows.

**Files to modify**:
- `Common/Models/AnalyticsModels/Log.ts` (define projections)
- `Common/Types/AnalyticsDatabase/Projection.ts` (extend type)
- `Common/Server/Utils/AnalyticsDatabase/StatementGenerator.ts` (emit PROJECTION clause)
- `Worker/DataMigrations/` (new migration to materialize)

### 5.8 Store Missing OpenTelemetry Log Fields (Low)

**Current**: Several standard OTEL log record fields are dropped during ingestion:
- `observedTimeUnixNano` — when the log was collected by the pipeline (useful for measuring ingestion lag)
- `droppedAttributesCount` — signals data loss during collection
- `flags` — log record flags (e.g., W3C trace flags)

**Target**: Preserve these fields for full OTEL compliance and operational debugging.
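A sketch of the schema addition and one operational query it enables (the exact ClickHouse types are an implementation choice, not settled here):

```sql
-- Assumed column types; defaults keep existing rows valid.
ALTER TABLE LogItem
  ADD COLUMN observedTimeUnixNano Int128 DEFAULT 0,
  ADD COLUMN droppedAttributesCount UInt32 DEFAULT 0,
  ADD COLUMN flags UInt32 DEFAULT 0;

-- Ingestion lag: how long after emission a log reached the pipeline.
SELECT avg((observedTimeUnixNano - timeUnixNano) / 1e9) AS avgLagSeconds
FROM LogItem
WHERE projectId = {projectId: String}
  AND time > now() - INTERVAL 1 HOUR
  AND observedTimeUnixNano > 0;
```

The lag query is the main operational payoff: a rising `avgLagSeconds` flags a backed-up collection pipeline before users notice stale logs.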
+ +**Implementation**: + +- Add optional columns to the Log model: + - `observedTimeUnixNano` (LongNumber) + - `droppedAttributesCount` (Number, default 0) + - `flags` (Number, default 0) +- Update `OtelLogsIngestService.processLogsAsync()` to extract and store these fields +- Migration to add columns to existing table + +**Files to modify**: +- `Common/Models/AnalyticsModels/Log.ts` (add columns) +- `Telemetry/Services/OtelLogsIngestService.ts` (extract additional fields) +- `Worker/DataMigrations/` (new migration) + +### 5.x Performance Impact Summary + +| Optimization | Query Pattern Improved | Expected Speedup | Effort | +|-------------|----------------------|-------------------|--------| +| 5.1 Body text index | Full-text search on body | 10-100x | Medium | +| 5.2 traceId/spanId/severity indexes | Trace correlation, severity filter | 50-1000x (trace), 2-5x (severity) | Small | +| 5.3 DateTime64 time column | Sub-second log ordering | Correctness fix | Medium | +| 5.4 TTL data retention | Storage management | Prevents disk exhaustion | Small | +| 5.5 Parameterized SQL | Security hardening | N/A (security fix) | Small | +| 5.6 ZSTD compression | Storage cost | 30-50% less disk for body | Small | +| 5.7 Histogram projections | Histogram and severity aggregation | 5-10x | Medium | +| 5.8 Missing OTEL fields | OTEL compliance | N/A (completeness) | Small | + +### 5.x Recommended Sub-Phase Order + +1. **5.2** — Skip indexes for traceId/spanId/severity (smallest effort, largest query speedup) +2. **5.1** — Body text index (biggest impact on user-facing search) +3. **5.5** — Parameterized SQL fix (security, small effort) +4. **5.4** — TTL data retention (operational necessity) +5. **5.3** — DateTime64 upgrade (correctness) +6. **5.6** — ZSTD compression (cost optimization) +7. **5.7** — Projections (performance polish) +8. **5.8** — Missing OTEL fields (completeness) + +--- + ## Verification For each feature: