Mirror of https://github.com/OneUptime/oneuptime.git, synced 2026-04-06 00:32:12 +02:00
feat: adjust chart height calculation for multiple charts and update overflow behavior
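
A minimal, self-contained sketch of the height calculation this commit describes: the available height (component height minus a reserved 100 px) is split evenly across the charts, and a negative result is treated as "no fixed height". The helper name `computeChartHeight` and the `reservedPx` parameter are assumptions made for illustration; in the commit itself the logic is inline in `DashboardChartComponentElement`.

```typescript
// Hypothetical helper mirroring the inline calculation in the diff.
// `computeChartHeight` and `reservedPx` are illustrative names, not part of the codebase.
function computeChartHeight(
  componentHeightInPx: number | undefined,
  numberOfQueryConfigs: number,
  reservedPx: number = 100,
): number | undefined {
  // Fall back to one chart so the division below never divides by zero.
  const numberOfCharts: number = numberOfQueryConfigs || 1;

  // Subtract the reserved space first, then share the rest between charts.
  const heightOfChart: number =
    ((componentHeightInPx || 0) - reservedPx) / numberOfCharts;

  // A negative height means there is no room; let the chart size itself.
  return heightOfChart < 0 ? undefined : heightOfChart;
}
```

With a 500 px component and two query configs, each chart gets 200 px. Because several charts can still exceed the container in practice, switching the wrapper from `overflow-hidden` to `overflow-auto` lets the user scroll rather than losing content.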
@@ -180,8 +180,9 @@ const DashboardChartComponentElement: FunctionComponent<ComponentProps> = (
   );
   }

+  const numberOfCharts: number = queryConfigs.length || 1;
   let heightOfChart: number | undefined =
-  (props.dashboardComponentHeightInPx || 0) - 100;
+  ((props.dashboardComponentHeightInPx || 0) - 100) / numberOfCharts;

   if (heightOfChart < 0) {
   heightOfChart = undefined;
@@ -235,7 +236,7 @@ const DashboardChartComponentElement: FunctionComponent<ComponentProps> = (

   return (
   <div
-  className="w-full h-full overflow-hidden"
+  className="w-full h-full overflow-auto"
   style={{
   opacity: isLoading ? 0.5 : 1,
   transition: "opacity 0.2s ease-in-out",

@@ -1,188 +1,25 @@
|
||||
# OpenTelemetry Profiles: Implementation Roadmap for OneUptime
|
||||
# OpenTelemetry Profiles: Remaining Roadmap
|
||||
|
||||
## Overview
|
||||
|
||||
OpenTelemetry Profiles is the fourth core observability signal (joining traces, metrics, and logs), providing a unified standard for continuous production profiling. As of March 2026, it has reached **Public Alpha** status. This document outlines how OneUptime can add first-class support for ingesting, storing, querying, and visualizing profiling data.
|
||||
|
||||
Reference: https://opentelemetry.io/blog/2026/profiles-alpha/
|
||||
All core phases (ingestion, storage, query API, frontend UI, alerting, docs) are implemented. This document tracks remaining future work items.
|
||||
|
||||
---
|
||||
|
||||
## Why Profiles Matter for OneUptime
|
||||
## Performance Optimization
|
||||
|
||||
- **Complete Observability**: Profiles fill the gap between "what happened" (traces/logs) and "why it was slow" (CPU/memory/allocation hotspots).
|
||||
- **Cross-Signal Correlation**: Profile samples carry `trace_id` and `span_id`, enabling direct linkage from a slow span to the exact flamegraph showing where time was spent.
|
||||
- **Cost Optimization**: Customers can use profiles to identify wasteful code paths and reduce compute costs.
|
||||
- **Competitive Parity**: Major vendors (Datadog, Grafana, Elastic) are actively building OTLP Profiles support.
|
||||
- **Materialized Views**: Pre-aggregate top functions per service per hour for faster queries
- **Server-Side Sampling**: Downsampling for high-volume services to control storage costs
- **Query Caching**: Cache aggregated flamegraph results to reduce ClickHouse load
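
As an illustration of the query-caching item above, here is a minimal in-memory TTL cache keyed by the query parameters. Everything in it is an assumption made for the sketch: the `FlamegraphQuery` fields, the `FlamegraphResultCache` class, and the synchronous `compute` callback (a real ClickHouse query would be asynchronous, and a shared cache such as Redis may be a better fit across multiple instances).

```typescript
// Hypothetical query shape; all field names are assumptions for illustration.
interface FlamegraphQuery {
  projectId: string;
  serviceId: string;
  profileType: string;
  startTimeMs: number;
  endTimeMs: number;
}

interface CacheEntry<T> {
  value: T;
  expiresAtMs: number;
}

// Hypothetical cache class, not part of the OneUptime codebase.
class FlamegraphResultCache<T> {
  private readonly entries: Map<string, CacheEntry<T>> = new Map();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without real waiting.
  public constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Build a stable key by listing the fields in a fixed order.
  private static keyFor(query: FlamegraphQuery): string {
    return JSON.stringify([
      query.projectId,
      query.serviceId,
      query.profileType,
      query.startTimeMs,
      query.endTimeMs,
    ]);
  }

  // Return a fresh cached value if present; otherwise compute and store it.
  public getOrCompute(query: FlamegraphQuery, compute: () => T): T {
    const key: string = FlamegraphResultCache.keyFor(query);
    const nowMs: number = this.now();
    const cached: CacheEntry<T> | undefined = this.entries.get(key);

    if (cached !== undefined && cached.expiresAtMs > nowMs) {
      return cached.value;
    }

    const value: T = compute();
    this.entries.set(key, { value, expiresAtMs: nowMs + this.ttlMs });
    return value;
  }
}
```

Keying on the exact time range only helps when identical queries repeat, for example several users opening the same dashboard. Rounding `startTimeMs` and `endTimeMs` to a bucket before building the key would raise the hit rate at the cost of slightly stale edges.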
|
||||
|
||||
---
|
||||
## Symbolization Pipeline
|
||||
|
||||
## Current Architecture (Context)
|
||||
Symbolization is NOT yet standardized in the OTel Profiles spec. The eBPF agent handles on-target symbolization for Go, and many runtimes provide symbol info at collection time. A dedicated symbolization pipeline (symbol uploads, deferred re-symbolization, object storage) can be added once the spec stabilizes.
|
||||
|
||||
OneUptime already ingests three OTel signals through a consistent pipeline:
|
||||
## Conformance Validation
|
||||
|
||||
```
Client --> gRPC (4317) / HTTP (/otlp/v1/{signal})
  --> OtelRequestMiddleware (protobuf/JSON decode)
  --> TelemetryIngest Middleware (auth)
  --> 202 Accepted (immediate response)
  --> Bull MQ Queue (async)
  --> Ingest Service (batch processing)
  --> ClickHouse (MergeTree tables)
```
|
||||
|
||||
Key files to reference:
- Ingestion endpoints: `Telemetry/API/OTelIngest.ts`
- gRPC server: `Telemetry/GrpcServer.ts`
- Proto files: `Telemetry/ProtoFiles/OTel/v1/`
- Analytics models: `Common/Models/AnalyticsModels/`
- Queue services: `Telemetry/Services/Queue/`
|
||||
|
||||
The Profiles implementation should follow this exact same pattern for consistency.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Protocol & Ingestion Layer ✅ COMPLETE
|
||||
|
||||
**Status**: HTTP endpoint, gRPC service, TelemetryType enum, middleware chain, queue processing, and OTel Collector pipeline are all implemented.
|
||||
|
||||
**Implemented in:**
- HTTP endpoint `POST /otlp/v1/profiles`: `Telemetry/API/OTelIngest.ts`
- gRPC ProfilesService/Export: `Telemetry/GrpcServer.ts`
- TelemetryType.Profile enum: `Common/Types/Telemetry/TelemetryType.ts`
- Queue service: `Telemetry/Services/Queue/ProfilesQueueService.ts`
- Queue handler: `Telemetry/Jobs/TelemetryIngest/ProcessTelemetry.ts`
- OTel Collector profiles pipeline: `OTelCollector/otel-collector-config.template.yaml`
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Data Model & ClickHouse Storage ✅ COMPLETE
|
||||
|
||||
**Status**: Both ClickHouse tables (profile, profile_sample) and database services are implemented with full schemas, ZSTD(3) compression, bloom filter skip indexes, and retention date support.
|
||||
|
||||
**Implemented in:**
- Profile model: `Common/Models/AnalyticsModels/Profile.ts`
- ProfileSample model: `Common/Models/AnalyticsModels/ProfileSample.ts`
- ProfileService: `Common/Server/Services/ProfileService.ts`
- ProfileSampleService: `Common/Server/Services/ProfileSampleService.ts`
- API routes registered in `App/FeatureSet/BaseAPI/Index.ts`
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Ingestion Service ✅ COMPLETE
|
||||
|
||||
**Status**: Full OTLP Profiles ingestion is implemented including dictionary denormalization, inline frame handling, mixed-runtime stack support, trace/span correlation via Link table, stacktrace hashing (SHA256), batch processing, and graceful error handling.
|
||||
|
||||
**Implemented in:**
- Ingest service (835 lines): `Telemetry/Services/OtelProfilesIngestService.ts`
- Queue service: `Telemetry/Services/Queue/ProfilesQueueService.ts`
- Queue handler: `Telemetry/Jobs/TelemetryIngest/ProcessTelemetry.ts`
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Query API ✅ COMPLETE
|
||||
|
||||
**Status**: Flamegraph aggregation, function list, diff flamegraph, and pprof export queries are all implemented.
|
||||
|
||||
**Implemented in:**
- ProfileAggregationService: `Common/Server/Services/ProfileAggregationService.ts`
  - `getFlamegraph()` — Aggregated flamegraph tree from samples
  - `getFunctionList()` — Top functions by selfValue, totalValue, or sampleCount
  - `getDiffFlamegraph()` — Differential flamegraph comparing two time ranges
- API endpoints in `Common/Server/API/TelemetryAPI.ts`:
  - `POST /telemetry/profiles/flamegraph`
  - `POST /telemetry/profiles/function-list`
  - `POST /telemetry/profiles/diff-flamegraph`
  - `GET /telemetry/profiles/:profileId/pprof`
- CRUD routes for profile/profile-sample: `App/FeatureSet/BaseAPI/Index.ts`
- pprof encoder: `Common/Server/Utils/Profile/PprofEncoder.ts`
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Frontend — Profiles UI ✅ COMPLETE
|
||||
|
||||
**Status**: All pages, components, cross-signal integration, and service-level profiles tab are implemented.
|
||||
|
||||
**Implemented in:**
- Pages: `App/FeatureSet/Dashboard/src/Pages/Profiles/` (Index, View/Index, Layout, SideMenu, Documentation)
- Components: `App/FeatureSet/Dashboard/src/Components/Profiles/`
  - ProfileFlamegraph — Interactive flamegraph with frame type color coding and zoom
  - ProfileFunctionList — Top functions table with sorting
  - ProfileTable — Profiles listing with service/type/attribute filters
  - ProfileTypeSelector — Dropdown filter for profile types (cpu, wall, alloc_objects, etc.)
  - ProfileTimeline — Bar chart showing profile sample density over time
  - DiffFlamegraph — Differential flamegraph comparing two time ranges (red=regression, green=improvement)
- Frame type color coding: `App/FeatureSet/Dashboard/src/Utils/ProfileUtil.ts`
- Cross-Signal Integration:
  - "View Profiles for this Trace" link in TraceExplorer span tooltips
  - Service > Profiles tab: `App/FeatureSet/Dashboard/src/Pages/Service/View/Profiles.tsx`
  - Route and side menu wiring for service-level profiles view
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Production Hardening ✅ MOSTLY COMPLETE
|
||||
|
||||
**Status**: Data retention, billing, compression, alerting, and pprof export are implemented. Symbolization pipeline and conformance validation are deferred.
|
||||
|
||||
### 6.1 Data Retention & Billing ✅ COMPLETE
|
||||
- TTL via `retentionDate DELETE` on both Profile and ProfileSample tables
- Billing metering in `Common/Server/Services/TelemetryUsageBillingService.ts`
- ZSTD compression on text columns, bloom filter skip indexes
|
||||
|
||||
### 6.2 Performance Optimization — Partially done
|
||||
- **Compression**: ✅ ZSTD(3) codec applied on stacktrace, labels columns
- **Materialized Views**: ❌ Deferred — pre-aggregate top functions per service per hour
- **Sampling**: ❌ Deferred — server-side downsampling for high-volume services
- **Query Caching**: ❌ Deferred — cache aggregated flamegraph results
|
||||
|
||||
### 6.3 Symbolization Pipeline — ❌ Deferred (Future Work)
|
||||
Symbolization is NOT yet standardized in the OTel Profiles spec. The eBPF agent handles on-target symbolization for Go, and many runtimes provide symbol info at collection time. A dedicated symbolization pipeline (symbol uploads, deferred re-symbolization, object storage) can be added in a future release.
|
||||
|
||||
### 6.4 Alerting & Monitoring Integration ✅ COMPLETE
|
||||
**Implemented in:**
- `MonitorType.Profiles` added to enum: `Common/Types/Monitor/MonitorType.ts`
- `CheckOn.ProfileCount` added: `Common/Types/Monitor/CriteriaFilter.ts`
- `ProfileMonitorResponse`: `Common/Types/Monitor/ProfileMonitor/ProfileMonitorResponse.ts`
- `ProfileMonitorCriteria`: `Common/Server/Utils/Monitor/Criteria/ProfileMonitorCriteria.ts`
- `MonitorStep` updated with `profileMonitor` field: `Common/Types/Monitor/MonitorStep.ts`
- `MonitorCriteriaEvaluator` wired for Profiles: `Common/Server/Utils/Monitor/MonitorCriteriaEvaluator.ts`
- `monitorProfile()` function: `Worker/Jobs/TelemetryMonitor/MonitorTelemetryMonitor.ts`
|
||||
|
||||
### 6.5 pprof Export ✅ COMPLETE
|
||||
**Implemented in:**
- `GET /telemetry/profiles/:profileId/pprof`: `Common/Server/API/TelemetryAPI.ts`
- PprofEncoder utility: `Common/Server/Utils/Profile/PprofEncoder.ts`
  - Reconstructs pprof-compatible JSON from denormalized data, gzip compressed
|
||||
|
||||
### 6.6 Conformance Validation — ❌ Deferred (Future Work)
|
||||
Integrate OTel `profcheck` tool into CI once core profiling features stabilize.
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Documentation & Launch ✅ COMPLETE
|
||||
|
||||
### 7.1 User-Facing Docs ✅ COMPLETE
|
||||
- Comprehensive profiles documentation: `App/FeatureSet/Docs/Content/telemetry/profiles.md`
  - Covers: profile types, setup instructions, instrumentation guides (Alloy, async-profiler, Go pprof, py-spy), OTel Collector config, features, and data retention
|
||||
|
||||
---
|
||||
|
||||
## Summary Timeline
|
||||
|
||||
| Phase | Description | Status |
|-------|-------------|--------|
| 1 | Protocol & Ingestion Layer | ✅ Complete |
| 2 | Data Model & ClickHouse Storage | ✅ Complete |
| 3 | Ingestion Service | ✅ Complete |
| 4 | Query API | ✅ Complete |
| 5 | Frontend — Profiles UI | ✅ Complete |
| 6 | Production Hardening | ✅ Mostly complete (symbolization + conformance deferred) |
| 7 | Documentation & Launch | ✅ Complete |
|
||||
|
||||
**Remaining future work:** Symbolization pipeline (symbol uploads, deferred re-symbolization), materialized views for performance, server-side downsampling, query caching, and OTel profcheck CI integration.
|
||||
|
||||
---
|
||||
|
||||
## Key Risks & Mitigations
|
||||
|
||||
| Risk | Impact | Mitigation |
|
||||
@@ -191,11 +28,8 @@ Integrate OTel `profcheck` tool into CI once core profiling features stabilize.
|
||||
| `v1development` package path will change to `v1` at GA | Proto import path migration | Abstract proto version behind internal types; plan migration script for when GA lands |
| High storage volume from continuous profiling | ClickHouse disk/cost growth | Server-side sampling, aggressive TTL defaults (15 days), ZSTD(3) compression |
| Flamegraph rendering performance with large profiles | Slow UI | Limit to top 10K stacktraces, lazy-load deep frames, pre-aggregate via materialized views |
| Denormalization complexity (batch-scoped dictionary, inline frames, mixed runtimes) | Bugs, data loss | Extensive unit tests with real pprof data, conformance checker validation, test with eBPF agent output |
| Symbolization is not standardized | Unsymbolized frames in flamegraphs | Store build IDs for deferred symbolization; accept eBPF agent's on-target symbolization as baseline |
| Semantic conventions are minimal (only `profile.frame.type`) | Schema may need changes as conventions mature | Keep attribute storage flexible (JSON columns); avoid hardcoding specific attribute names |
| Limited client-side instrumentation maturity | Low adoption | Start with eBPF profiler (no code changes needed), expand as ecosystem matures |
| `original_payload` can be large | Storage bloat | Store on-demand only (when producer sets `original_payload_format`), not by default |
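
The "limit to top 10K stacktraces" mitigation in the table above could look roughly like the sketch below, which keeps only the heaviest stacktraces before a flamegraph is built. The `AggregatedStacktrace` shape and the `limitToTopStacktraces` function are hypothetical names chosen for illustration; in practice the same cut-off might be applied in the ClickHouse query itself with `ORDER BY ... LIMIT`.

```typescript
// Hypothetical shape of an aggregated stacktrace row; field names are assumptions.
interface AggregatedStacktrace {
  stacktraceHash: string;
  totalValue: number;
}

// Keep the `limit` stacktraces with the largest total value.
function limitToTopStacktraces(
  rows: ReadonlyArray<AggregatedStacktrace>,
  limit: number = 10000,
): AggregatedStacktrace[] {
  // Copy before sorting so the caller's array is not mutated.
  const sorted: AggregatedStacktrace[] = [...rows].sort(
    (a: AggregatedStacktrace, b: AggregatedStacktrace) =>
      b.totalValue - a.totalValue,
  );

  // Guard against a negative limit, which `slice` would interpret differently.
  return sorted.slice(0, Math.max(0, limit));
}
```

Dropping the long tail discards some samples, so a UI built on this would likely want to show the discarded total as an "other" bucket rather than silently hiding it.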
|
||||
|
||||
---
|
||||
|
||||
|
||||