# Plan: Bring OneUptime Dashboards to Industry Parity and Beyond

## Context

OneUptime's dashboard implementation provides a 12-column grid layout with drag-and-drop editing, 3 widget types (Chart with Line/Bar, Value, Text with basic formatting), global time range with presets, view/edit modes, role-based permissions, and full-screen support. Dashboard config is stored as a single JSON column. Dashboards can only query OpenTelemetry metrics from ClickHouse.

This plan identifies the remaining gaps vs Grafana, Datadog, and New Relic, and proposes a phased implementation to build a best-in-class dashboard product that leverages OneUptime's unique position as an all-in-one observability + status page platform.

## Completed

The following features have been implemented:

- **12-Column Grid Layout** - Fixed grid with dynamic unit sizing, 60 default rows (expandable)
- **Drag-and-Drop Editing** - Move and resize components with bounds checking
- **Chart Widget** - Line and Bar chart types with single metric query, configurable title/description/legend
- **Value Widget** - Single metric aggregation displayed as large number
- **Text Widget** - Bold/Italic/Underline formatting (no markdown)
- **Global Time Range** - Presets (30min to 3mo) + custom date range picker
- **View/Edit Modes** - Read-only view with full-screen, edit mode with side panel settings
- **Role-Based Permissions** - ProjectOwner, ProjectAdmin, ProjectMember + custom permissions
- **Dashboard CRUD API** - Standard REST API with slug generation
- **Billing Enforcement** - Free plan limited to 1 dashboard
- **Area Chart** (Phase 1.1) - Area and Stacked Area chart types added to ChartType enum and rendered via chart component
- **Table Widget** (Phase 1.1) - New DashboardComponentType.Table with timestamp/value columns, sticky header, configurable max rows
- **Gauge Widget** (Phase 1.1) - SVG semi-circle gauge with threshold-based color coding (green/yellow/red), configurable min/max/thresholds
- **Template Variables** (Phase 1.2) - DashboardVariable type with CustomList, Query, and TextInput types; toolbar dropdown/input selectors; variable changes trigger widget refresh
- **Auto-Refresh** (Phase 1.3) - 7 interval options (5s to 15m), timer pauses in edit mode, pulsing indicator, interval persisted in dashboard config
- **Multiple Queries per Chart** (Phase 1.4) - metricQueryConfigs array support with fallback to single metricQueryConfig; each query rendered as a separate series
- **Markdown Support** (Phase 1.5) - isMarkdown flag on text components; renders LazyMarkdownViewer when enabled, falls back to bold/italic/underline when disabled
- **Threshold / Color Coding** (Phase 1.6) - Warning and critical threshold config on Value and Chart components; Value widget changes background/text color (green → yellow → red) based on thresholds
- **Legend Interaction** (Phase 1.7) - onValueChange enabled on Line, Area, and Bar charts for built-in Tremor legend click-to-toggle filtering
- **Chart Zoom** (Phase 1.8) - Time range zoom stack with Reset Zoom button in toolbar; pushing current range before zoom, popping on reset

## Gap Analysis Summary

| Feature | OneUptime | Grafana | Datadog | New Relic | Priority |
|---------|-----------|---------|---------|-----------|----------|
| Widget types | 5 (Chart, Value, Text, Table, Gauge) | 20+ | 40+ | 15+ | ~~**P0**~~ Done |
| Chart types | 4 (Line, Bar, Area, Stacked Area) | 10+ | 12+ | 10+ | ~~**P0**~~ Done |
| Template variables | 3 types (CustomList, Query, TextInput) | 6+ types | Yes | 3 types | ~~**P0**~~ Done |
| Auto-refresh | 7 intervals (5s–15m) | Configurable | Real-time | Yes | ~~**P0**~~ Done |
| Log panels | None | Yes (Loki) | Yes | Yes (NRQL) | **P0** |
| Trace panels | None | Yes (Tempo) | Yes | Yes | **P0** |
| Table widget | Yes | Yes | Yes | Yes | ~~**P0**~~ Done |
| Multiple queries per chart | Yes (array) | Yes | Yes | Yes | ~~**P0**~~ Done |
| Markdown support | Yes (toggle) | Full markdown | Full markdown | Full markdown | ~~**P0**~~ Done |
| Threshold lines / color coding | Value widget color coding | Yes | Yes | Yes | ~~**P0**~~ Partial |
| Legend interaction (show/hide) | Yes (click toggle) | Yes | Yes | Yes | ~~**P0**~~ Done |
| Chart zoom | Yes (time range stack) | Yes | Yes | Yes | ~~**P0**~~ Done |
| Unified query plugin interface | None | Datasource plugins | Yes | NRQL | **P0** |
| Dashboard linking / drill-down | None | Data links | Yes | Facet linking | **P1** |
| Annotations / event overlays | None | Yes | Yes | Yes (Labs) | **P1** |
| Row/section grouping | None | Collapsible rows | Groups | No | **P1** |
| Public/shared dashboards | None | Yes | Yes | Yes | **P1** |
| JSON import/export | None | Yes | Yes | Yes | **P1** |
| Dashboard versioning | None | Yes | Yes | No | **P1** |
| Alert integration | None | Create from panel + show state | Yes | NRQL alerts | **P1** |
| TV/Kiosk mode | Full-screen only | Kiosk mode | Yes | Auto-cycling | **P1** |
| CSV export | None | Yes | Yes | Yes | **P1** |
| Custom time per widget | None | No | No | No | **P1** |
| Perses/Grafana import | None | N/A | No | No | **P1** |
| AI dashboard creation | None | None | None | None | **P2** |
| Dashboard-as-code SDK | None | Foundation SDK | No | No | **P2** |
| Terraform provider | None | Yes | Yes | Yes | **P2** |

---

## Architecture: Query Plugin Interface & Perses Compatibility

Before implementing features, we should establish a `QueryPlugin` interface that all widget data sources use. This is a foundational architectural change that enables Phase 2 (logs, traces) and Phase 4 (Dashboard-as-Code) cleanly.

### Why Not Adopt Perses Wholesale?

[Perses](https://perses.dev) is a CNCF Sandbox project providing an open dashboard specification and embeddable UI components. We evaluated it as a potential protocol for our dashboard system. The decision is to **selectively borrow patterns** rather than adopt it fully:

**Against full adoption:**
- Our `AggregateBy` API queries ClickHouse directly. Perses assumes Prometheus/PromQL as the primary query language — mapping our ClickHouse aggregation queries into Perses's `PrometheusTimeSeriesQuery` plugin model adds unnecessary indirection.
- Phase 2 (click-to-correlate, cross-signal correlation) is our biggest differentiator. Perses has basic Tempo/Loki plugins but nothing like unified correlation. Adopting their panel model would constrain our ability to build these features.
- Perses is still CNCF Sandbox stage with data model structs marked deprecated in favor of a new `perses/spec` repo. The spec is not yet stable enough to build a product on.
- Maintaining a translation layer between Perses spec and our internal `DashboardViewConfig` format for every feature would add ongoing overhead.

**What we selectively adopt:**

| Perses Concept | Where It Helps | How |
|---|---|---|
| `kind`+`spec` plugin pattern | Phase 1.9 (QueryPlugin), Phase 2.1-2.2 | Formalize widget data sources as plugins instead of hardcoding every widget type |
| Variable model with scoping | Phase 1.2 (Template Variables) | Adopt query-based, list, and text variable types with dashboard → project → global scoping |
| Decoupled layout from panels | Phase 3.4 (Sections) | Separate panel definitions from grid positions to make sections/grouping cleaner |
| Dashboard JSON schema | Phase 3.2 (Import/Export) | Support importing Perses-format dashboards alongside native format for Grafana migration path |

### 1.9 Unified Query Plugin Interface

**Current**: Widgets are hardcoded to query metrics via `MetricQueryConfigData` and the `AggregateBy` API. Adding logs or traces as data sources requires duplicating the entire query path.
**Target**: A `QueryPlugin` interface that abstracts data sources, enabling any widget to query metrics, logs, or traces through a unified contract.

**Design**:

```typescript
// The plugin pattern borrowed from Perses: kind + spec
interface QueryPlugin {
  kind: "MetricQuery" | "LogQuery" | "TraceQuery" | "FormulaQuery";
  spec: MetricQuerySpec | LogQuerySpec | TraceQuerySpec | FormulaQuerySpec;
}

interface MetricQuerySpec {
  metricName: string;
  attributes: JSONObject;
  aggregationType: AggregationType;
  groupBy?: string[];
}

interface LogQuerySpec {
  severityFilter?: SeverityLevel[];
  serviceFilter?: string[];
  bodyContains?: string;
  attributes?: JSONObject;
}

interface TraceQuerySpec {
  serviceFilter?: string[];
  operationFilter?: string[];
  statusFilter?: TraceStatus[];
  minDuration?: Duration;
}

interface FormulaQuerySpec {
  formula: string; // e.g., "a / b * 100"
  queries: Record<string, QueryPlugin>; // named sub-queries
}

// Each widget stores an array of QueryPlugins instead of MetricQueryConfigData
interface DashboardWidgetConfig {
  queries: QueryPlugin[];
  // ... other widget config
}
```

**Benefits**:
- Log stream and trace list widgets (Phase 2.1, 2.2) plug in without new query plumbing
- Cross-signal correlation (Phase 2.3) becomes a multi-query widget with mixed `kind` values
- Formula queries (Phase 1.4) compose naturally across query types
- Future data sources (e.g., external Prometheus, custom APIs) add a new `kind` without touching widget code
- Aligns with Perses's extensibility model without coupling to their spec

**Files to modify**:
- `Common/Types/Dashboard/QueryPlugin.ts` (new - interface definitions)
- `Common/Types/Metrics/MetricsQuery.ts` (refactor to implement MetricQuerySpec)
- `Common/UI/Utils/AnalyticsModelAPI/AnalyticsModelAPI.ts` (add query resolver that dispatches by `kind`)
- `Common/Server/API/BaseAnalyticsAPI.ts` (add unified query endpoint)
- `App/FeatureSet/Dashboard/src/Components/Metrics/Utils/Metrics.ts` (refactor fetchResults to use QueryPlugin)

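
One possible tightening of the `QueryPlugin` design, sketched under assumptions: modelling it as a discriminated union ties each `kind` to exactly one spec shape, so a `MetricQuery` cannot carry a `LogQuerySpec`, and a resolver can dispatch on `kind` with an exhaustive `switch`. The `describeQuery` function and the simplified spec fields below are hypothetical stand-ins rather than existing OneUptime code, and `FormulaQuery` is left out for brevity.

```typescript
// Placeholder types; the real ones would come from the Common package.
type JSONObject = Record<string, unknown>;

interface MetricQuerySpec {
  metricName: string;
  attributes: JSONObject;
  aggregationType: string; // simplified stand-in for AggregationType
  groupBy?: string[];
}

interface LogQuerySpec {
  serviceFilter?: string[];
  bodyContains?: string;
}

interface TraceQuerySpec {
  serviceFilter?: string[];
  minDurationMs?: number; // simplified stand-in for Duration
}

// Discriminated union: each kind is paired with exactly one spec type.
type QueryPlugin =
  | { kind: "MetricQuery"; spec: MetricQuerySpec }
  | { kind: "LogQuery"; spec: LogQuerySpec }
  | { kind: "TraceQuery"; spec: TraceQuerySpec };

// Hypothetical resolver: returns a label for the backend call it would make.
function describeQuery(query: QueryPlugin): string {
  switch (query.kind) {
    case "MetricQuery":
      return `metrics:${query.spec.metricName}:${query.spec.aggregationType}`;
    case "LogQuery":
      return `logs:${(query.spec.serviceFilter ?? []).join(",")}`;
    case "TraceQuery":
      return `traces:${(query.spec.serviceFilter ?? []).join(",")}`;
    default: {
      // Compile-time exhaustiveness check: adding a kind without a case fails here.
      const unreachable: never = query;
      throw new Error(`Unknown query kind: ${JSON.stringify(unreachable)}`);
    }
  }
}
```
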
---

## Phase 1: Foundation (P0) — Close Critical Gaps

These gaps make OneUptime dashboards fundamentally non-competitive. Every major competitor has these.

### 1.1 Add Core Chart Types: Area, Pie, Table, Gauge, Heatmap, Histogram ✅ (Partial)

**Status**: Area, Stacked Area, Table, and Gauge are implemented. Pie, Heatmap, and Histogram enum values are defined but rendering is not yet implemented.
**Current**: Line and Bar only.
**Target**: 8+ chart types covering all standard observability visualization needs.

**Implementation**:

- **Area Chart** (stacked and non-stacked): Extension of line chart with fill. Use existing chart library's area mode
- **Pie/Donut Chart**: For proportional breakdowns (e.g., error distribution by service). New component
- **Table Widget**: Tabular metric data, top-N lists, multi-column display with sortable columns. New component type
- **Gauge Widget**: Circular gauge with configurable min/max/thresholds and color zones. New component
- **Heatmap**: Time on X-axis, value buckets on Y-axis, color intensity for count. Essential for distribution/histogram metrics
- **Histogram**: Bar chart showing value distribution. Important for latency analysis

Each chart type needs:
- A new entry in `DashboardComponentType` or `ChartType` enum
- A rendering component in `Dashboard/Components/`
- Configuration options in the component settings side panel

**Files to modify**:
- `Common/Types/Dashboard/Chart/ChartType.ts` (add Area, Pie, Heatmap, Histogram, Gauge)
- `Common/Types/Dashboard/DashboardComponentType.ts` (add Table, Gauge)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx` (render new types)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardTableComponent.tsx` (new)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardGaugeComponent.tsx` (new)

### 1.2 Template Variables ✅

**Status**: Implemented. DashboardVariable type with CustomList, Query, and TextInput types. Toolbar renders dropdown selectors and text inputs. Variable changes trigger widget refresh via refreshTick.
**Current**: No template variables. Users must create separate dashboards for each service/host/environment.
**Target**: Drop-down variable selectors that dynamically filter all widgets.

**Implementation**:

- Create a `DashboardVariable` type stored in `dashboardViewConfig`:
  - Name, label, type (query-based, custom list, text input)
  - Query-based: runs a ClickHouse query to populate options (e.g., `SELECT DISTINCT service FROM MetricItem WHERE projectId = {pid}`)
  - Custom list: manually defined options
  - Multi-value selection support
- Render variables as dropdown selectors in the dashboard toolbar
- Variables can be referenced in metric queries as `$variable_name`
- When a variable changes, all widgets re-query with the new value
- Support cascading variables (variable B's query depends on variable A's value)
- **Scoping model (from Perses)**: Variables can be defined at dashboard, project, or global scope. Dashboard-level overrides project-level, which overrides global. This lets teams define org-wide variables (e.g., `$environment`) once and reuse across dashboards.

**Files to modify**:
- `Common/Types/Dashboard/DashboardVariable.ts` (new)
- `Common/Types/Dashboard/DashboardViewConfig.ts` (add variables array)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Toolbar/DashboardToolbar.tsx` (render variable dropdowns)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx` (pass variable values to widgets)
- `Common/Server/Services/MetricService.ts` (resolve variable references in queries)

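
The scoping precedence and the `$variable_name` substitution described in this section could look roughly like the sketch below. The function names `resolveVariables` and `substituteVariables` are hypothetical, and the sketch assumes single-valued string variables; multi-value and cascading variables would need more than this.

```typescript
type VariableScope = Record<string, string>;

// Later spreads win: dashboard overrides project, which overrides global.
function resolveVariables(
  globalVars: VariableScope,
  projectVars: VariableScope,
  dashboardVars: VariableScope,
): VariableScope {
  return { ...globalVars, ...projectVars, ...dashboardVars };
}

// Replaces $name tokens in a filter value. Unknown names are left untouched
// so a typo stays visible instead of silently becoming an empty string.
// Note: this is meant for attribute filter values; values that reach a SQL
// query should still be passed as bound parameters, not concatenated.
function substituteVariables(template: string, values: VariableScope): string {
  return template.replace(
    /\$([A-Za-z_][A-Za-z0-9_]*)/g,
    (match: string, name: string): string => {
      const isDefined: boolean = Object.prototype.hasOwnProperty.call(values, name);
      return isDefined ? values[name] : match;
    },
  );
}
```

For example, with a global `$environment` of `prod` overridden at project level by `staging`, a filter of `env=$environment` resolves to `env=staging`.
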
### 1.3 Auto-Refresh ✅

**Status**: Implemented. AutoRefreshInterval enum with 7 options (OFF, 5s, 10s, 30s, 1m, 5m, 15m). Timer management via setInterval/clearInterval, pauses in edit mode, pulsing blue dot indicator, interval persisted in DashboardViewConfig.
**Current**: Data goes stale after initial load.
**Target**: Configurable auto-refresh intervals.

**Implementation**:

- Add auto-refresh dropdown in toolbar with options: Off, 5s, 10s, 30s, 1m, 5m, 15m
- Store selected interval in dashboard config and URL state
- Use `setInterval` to trigger re-fetch on all metric widgets
- Show a subtle refresh indicator when data is being updated
- Pause auto-refresh when the dashboard is in edit mode

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Toolbar/DashboardToolbar.tsx` (add refresh dropdown)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx` (implement refresh timer)
- `Common/Types/Dashboard/DashboardViewConfig.ts` (store refresh interval)

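
A minimal sketch of the timer logic described in this section, assuming the seven options are modelled as a string-literal union. The names `RefreshOption`, `getRefreshDelayMs`, and `startAutoRefresh` are hypothetical and do not reflect the actual `AutoRefreshInterval` enum members.

```typescript
// Hypothetical union mirroring the seven options: Off, 5s, 10s, 30s, 1m, 5m, 15m.
type RefreshOption = "off" | "5s" | "10s" | "30s" | "1m" | "5m" | "15m";

const REFRESH_DELAY_MS: Record<RefreshOption, number | null> = {
  off: null,
  "5s": 5000,
  "10s": 10000,
  "30s": 30000,
  "1m": 60000,
  "5m": 300000,
  "15m": 900000,
};

// Returns the delay to use, or null when no timer should run
// (refresh is off, or the dashboard is being edited).
function getRefreshDelayMs(option: RefreshOption, isEditMode: boolean): number | null {
  if (isEditMode) {
    return null;
  }
  return REFRESH_DELAY_MS[option];
}

// Starts the timer and returns a cleanup function, which fits the
// return value of a React effect.
function startAutoRefresh(
  option: RefreshOption,
  isEditMode: boolean,
  onRefresh: () => void,
): () => void {
  const delayMs: number | null = getRefreshDelayMs(option, isEditMode);
  if (delayMs === null) {
    return (): void => {};
  }
  const handle = setInterval(onRefresh, delayMs);
  return (): void => {
    clearInterval(handle);
  };
}
```

Returning a cleanup function keeps the pause-in-edit-mode rule in one place: re-running the effect when the mode or interval changes clears the old timer and starts a new one only if a delay applies.
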
### 1.4 Multiple Queries per Chart with Formulas ✅ (Partial)

**Status**: Multiple queries implemented via metricQueryConfigs array with fallback to single metricQueryConfig. Each query renders as a separate series. Formula evaluation and dual Y-axis are not yet implemented.
**Current**: Single `MetricQueryConfigData` per chart.
**Target**: Overlay multiple metric series on a single chart for correlation, with cross-query formulas.

**Implementation**:

- Change chart component's data source from single `MetricQueryConfigData` to `QueryPlugin[]` (using the new unified interface)
- Each query gets its own alias and legend entry
- Support `FormulaQuery` plugin kind for cross-query formulas (e.g., `a / b * 100` where `a` and `b` reference other queries by alias)
- Y-axis: support dual Y-axes for metrics with different scales
- Formula evaluation happens server-side to avoid shipping raw data to the client

**Files to modify**:
- `Common/Utils/Dashboard/Components/DashboardChartComponent.ts` (change to QueryPlugin array)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx` (render multiple series)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Canvas/ComponentSettingsSideOver.tsx` (multi-query config UI)
- `Common/Server/Services/FormulaEvaluator.ts` (new - server-side formula evaluation)

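
As an illustration of what server-side formula evaluation might involve, the sketch below evaluates an arithmetic formula such as `a / b * 100` point by point over series that are assumed to be already aligned on the same timestamps. It is a small recursive-descent evaluator written for this document, not the planned `FormulaEvaluator.ts`; the function names are hypothetical, and only `+ - * /`, parentheses, numbers, and query aliases are supported.

```typescript
// Evaluates the token list once, resolving aliases through `lookup`.
function evaluateAt(tokens: string[], lookup: (name: string) => number): number {
  let pos: number = 0;

  function parseExpression(): number {
    let value: number = parseTerm();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const op: string = tokens[pos++];
      const rhs: number = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  function parseTerm(): number {
    let value: number = parseFactor();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const op: string = tokens[pos++];
      const rhs: number = parseFactor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  function parseFactor(): number {
    const token: string | undefined = tokens[pos++];
    if (token === undefined) { throw new Error("Unexpected end of formula"); }
    if (token === "(") {
      const inner: number = parseExpression();
      if (tokens[pos++] !== ")") { throw new Error("Expected ')'"); }
      return inner;
    }
    if (token === "-") { return -parseFactor(); }
    if (/^\d/.test(token)) { return Number(token); }
    if (/^[A-Za-z_]/.test(token)) { return lookup(token); }
    throw new Error(`Unexpected token: ${token}`);
  }

  const result: number = parseExpression();
  if (pos !== tokens.length) { throw new Error("Unexpected trailing tokens"); }
  return result;
}

// Applies the formula to each index of the aligned input series.
// Division by zero follows JavaScript semantics (Infinity or NaN).
function evaluateFormula(formula: string, inputs: Record<string, number[]>): number[] {
  const tokens: string[] = formula.match(/[A-Za-z_]\w*|\d+(?:\.\d+)?|[-+*\/()]/g) ?? [];
  const lengths: number[] = Object.keys(inputs).map((alias: string) => inputs[alias].length);
  const pointCount: number = lengths.length > 0 ? Math.min(...lengths) : 0;
  const output: number[] = [];
  for (let i = 0; i < pointCount; i++) {
    output.push(
      evaluateAt(tokens, (alias: string): number => {
        if (!Object.prototype.hasOwnProperty.call(inputs, alias)) {
          throw new Error(`Unknown query alias: ${alias}`);
        }
        return inputs[alias][i];
      }),
    );
  }
  return output;
}
```

Parsing the formula rather than passing it to `eval` keeps user-supplied input from executing as code, which matters if evaluation runs on the server as proposed.
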
### 1.5 Full Markdown Support for Text Widget ✅

**Status**: Implemented. isMarkdown boolean flag added to DashboardTextComponent. When enabled, renders via LazyMarkdownViewer. When disabled, falls back to existing bold/italic/underline formatting.
**Current**: Only bold, italic, underline formatting.
**Target**: Full markdown rendering including headers, links, lists, code blocks, tables, and images.

**Implementation**:

- Replace the current custom formatting with a markdown renderer (e.g., `react-markdown` or `marked`)
- Support: headers (h1-h6), links, ordered/unordered lists, code blocks with syntax highlighting, tables, images, blockquotes
- Edit mode: raw markdown text area with preview toggle

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardTextComponent.tsx` (replace with markdown renderer)
- `Common/Utils/Dashboard/Components/DashboardTextComponent.ts` (store raw markdown)

### 1.6 Threshold Lines & Color Coding ✅ (Partial)

**Status**: Warning/critical threshold config added to Chart and Value components. Value widget implements color coding (green → yellow → red background/text). Chart threshold reference lines are configured in the data model but visual rendering on charts requires Tremor chart library modifications (deferred).
**Current**: No threshold visualization.
**Target**: Configurable warning/critical thresholds on charts with color-coded regions.

**Implementation**:

- Add threshold configuration to chart settings: value, label, color (default: yellow for warning, red for critical)
- Render horizontal lines on the chart at threshold values
- Optionally fill regions above/below thresholds with translucent color
- For value/billboard widgets: change background color based on which threshold range the value falls in (green/yellow/red)

**Files to modify**:
- `Common/Utils/Dashboard/Components/DashboardChartComponent.ts` (add thresholds config)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx` (render threshold lines)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardValueComponent.tsx` (color coding)

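
The value-widget color rule can be expressed as a small pure function. The sketch below assumes that higher values are worse and that either threshold may be unset; `classifyValue` and `STATE_COLOR` are hypothetical names used for illustration only.

```typescript
type ThresholdState = "ok" | "warning" | "critical";

interface Thresholds {
  warning?: number;
  critical?: number;
}

// Checks critical first so a value above both thresholds is reported as critical.
// Assumption: larger values are worse (e.g. latency, error rate).
function classifyValue(value: number, thresholds: Thresholds): ThresholdState {
  if (thresholds.critical !== undefined && value >= thresholds.critical) {
    return "critical";
  }
  if (thresholds.warning !== undefined && value >= thresholds.warning) {
    return "warning";
  }
  return "ok";
}

// Maps each state to the green/yellow/red scheme described above.
const STATE_COLOR: Record<ThresholdState, string> = {
  ok: "green",
  warning: "yellow",
  critical: "red",
};
```

Metrics where lower is worse (for example, availability) would need an inverted comparison, which this sketch does not cover.
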
### 1.7 Legend Interaction (Show/Hide Series) ✅

**Status**: Implemented. onValueChange callback enabled on Line, Area, and Bar chart components, activating Tremor's built-in activeLegend state management for click-to-toggle series visibility.
**Current**: Legends are display-only.
**Target**: Click legend items to toggle series visibility.

**Implementation**:

- Add click handler on legend items to toggle series visibility
- Clicked-off series should be visually dimmed in the legend and removed from the chart
- Support "isolate" mode: Ctrl+Click shows only that series and hides all others
- Persist toggled state during the session (reset on page reload)

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Metrics/MetricGraph.tsx` (add legend click handlers)

### 1.8 Chart Zoom (Click-Drag Time Selection) ✅

**Status**: Implemented via time range stack. Toolbar shows Reset Zoom button when zoomed in. Current range is pushed to stack before zoom, popped on reset. In-chart brush/drag selection not yet implemented (uses toolbar-driven zoom instead).
**Current**: No zoom capability.
**Target**: Click and drag on a time series chart to zoom into a time range.

**Implementation**:

- Enable brush/selection mode on time series charts
- When user drags to select a range, update the global time range to the selected range
- Show a "Reset zoom" button to return to the previous time range
- Maintain a zoom stack so users can zoom in multiple times and zoom back out

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Metrics/MetricGraph.tsx` (add brush selection)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx` (handle time range updates from zoom)

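
The zoom stack behaviour described in this section (push the current range before zooming, pop on reset) can be sketched as a small class. `ZoomStack` and `TimeRange` are hypothetical names, and timestamps are assumed to be epoch milliseconds. Popping one level at a time lets the user step back through successive zooms; `isZoomed` is what would drive the visibility of the Reset Zoom button.

```typescript
interface TimeRange {
  startMs: number;
  endMs: number;
}

class ZoomStack {
  private history: TimeRange[] = [];
  private current: TimeRange;

  public constructor(initial: TimeRange) {
    this.current = initial;
  }

  // Remember the range being left, then switch to the selected range.
  public zoomTo(range: TimeRange): void {
    this.history.push(this.current);
    this.current = range;
  }

  // Step back one level; a no-op when nothing has been zoomed.
  public zoomOut(): TimeRange {
    const previous: TimeRange | undefined = this.history.pop();
    if (previous !== undefined) {
      this.current = previous;
    }
    return this.current;
  }

  public get isZoomed(): boolean {
    return this.history.length > 0;
  }

  public get range(): TimeRange {
    return this.current;
  }
}
```
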
---

## Phase 2: Observability Integration (P0-P1) — Leverage the Full Platform

This is where OneUptime can differentiate: metrics, logs, and traces in one platform. The `QueryPlugin` interface from Phase 1.9 makes this phase significantly easier — each new signal type is a new `kind` in the plugin system rather than a new query pipeline.

### 2.1 Log Stream Widget

**Current**: Dashboards can only show metrics.
**Target**: Widget that displays a live log stream with filtering.

**Implementation**:

- New `DashboardComponentType.LogStream` widget type
- Uses `QueryPlugin` with `kind: "LogQuery"` — same interface as metric widgets
- Configuration: log query filter, severity filter, service filter, max rows
- Renders as a scrolling log list with severity color coding, timestamp, and body
- Click a log entry to expand and see full details
- Respects dashboard time range and template variables

**Files to modify**:
- `Common/Types/Dashboard/DashboardComponentType.ts` (add LogStream)
- `Common/Utils/Dashboard/Components/DashboardLogStreamComponent.ts` (new - config)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardLogStreamComponent.tsx` (new - rendering)
- `Common/Server/Services/LogQueryResolver.ts` (new - implements QueryPlugin for logs)

### 2.2 Trace List Widget

**Current**: No trace visualization in dashboards.
**Target**: Widget showing a filtered trace list with duration and status.

**Implementation**:

- New `DashboardComponentType.TraceList` widget type
- Uses `QueryPlugin` with `kind: "TraceQuery"` — same interface as metric and log widgets
- Configuration: service filter, operation filter, status filter, min duration
- Renders as a table: trace ID, operation, service, duration, status, timestamp
- Click a row to navigate to the full trace view
- Respects dashboard time range and template variables

**Files to modify**:
- `Common/Types/Dashboard/DashboardComponentType.ts` (add TraceList)
- `Common/Utils/Dashboard/Components/DashboardTraceListComponent.ts` (new)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardTraceListComponent.tsx` (new)
- `Common/Server/Services/TraceQueryResolver.ts` (new - implements QueryPlugin for traces)

### 2.3 Click-to-Correlate Across Signals

**Current**: No cross-signal correlation in dashboards.
**Target**: Click a point on a metric chart to instantly see related logs and traces from that timestamp.

**Implementation**:

- When clicking a data point on a metric chart, open a correlation panel showing:
  - Logs from the same service and time window (+/- 5 minutes around the clicked point)
  - Traces from the same service and time window
  - Filtered by the same template variables
- The correlation panel uses the `QueryPlugin` interface internally: it fires a `LogQuery` and a `TraceQuery` scoped to the clicked timestamp and service context
- The correlation panel appears as a slide-over or split view below the chart
- This is a major differentiator vs Grafana (which requires separate datasources) and ties into OneUptime's all-in-one advantage
- To our knowledge, no competitor, including Perses, offers this level of built-in cross-signal correlation

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx` (add click handler)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/CorrelationPanel.tsx` (new - shows correlated logs/traces)

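
A rough sketch of how a chart click might be turned into the two correlation queries, assuming a symmetric window of 5 minutes on each side of the clicked point. `buildCorrelationQueries` and the `CorrelationQuery` shape are hypothetical, and template-variable filtering is omitted.

```typescript
interface TimeWindow {
  startMs: number;
  endMs: number;
}

interface CorrelationQuery {
  kind: "LogQuery" | "TraceQuery";
  spec: { serviceFilter: string[] };
  timeWindow: TimeWindow;
}

// Builds one log query and one trace query around the clicked timestamp.
// windowMinutes is the half-width: 5 means clickedAt - 5 min to clickedAt + 5 min.
function buildCorrelationQueries(
  clickedAtMs: number,
  serviceName: string,
  windowMinutes: number = 5,
): CorrelationQuery[] {
  const halfWidthMs: number = windowMinutes * 60 * 1000;
  const timeWindow: TimeWindow = {
    startMs: clickedAtMs - halfWidthMs,
    endMs: clickedAtMs + halfWidthMs,
  };
  return [
    { kind: "LogQuery", spec: { serviceFilter: [serviceName] }, timeWindow },
    { kind: "TraceQuery", spec: { serviceFilter: [serviceName] }, timeWindow },
  ];
}
```

Both queries share one window object, so the logs and traces shown in the panel always cover the same interval.
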
### 2.4 Annotations / Event Overlays

**Current**: No event markers on charts.
**Target**: Show deployment events, incidents, and alerts as vertical markers on time series charts.

**Implementation**:

- Query OneUptime's own data for events in the chart's time range:
  - Incidents (from Incident model)
  - Deployments (can be sent as OTLP resource attributes or a custom event API)
  - Alert triggers (from monitor alert history)
- Render as vertical dashed lines with icons on hover
- Color-code by type: red for incidents, blue for deployments, yellow for alerts
- Allow users to add manual annotations (text + timestamp)

**Files to modify**:
- `Common/Types/Dashboard/DashboardAnnotation.ts` (new)
- `App/FeatureSet/Dashboard/src/Components/Metrics/MetricGraph.tsx` (render annotation markers)
- `Common/Server/API/DashboardAnnotationAPI.ts` (new - query events)

### 2.5 Alert Integration

**Current**: No connection between dashboards and alerting.
**Target**: Create alerts from dashboard panels and display alert state on panels.

**Implementation**:

- "Create Alert" button in chart settings that pre-fills a metric monitor with the chart's query
- Show alert state indicator on chart headers (green/yellow/red dot) based on associated monitor status
- Alert status widget: shows a summary of all active alerts with severity and duration

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Canvas/ComponentSettingsSideOver.tsx` (add "Create Alert" button)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx` (show alert state)
- `Common/Types/Dashboard/DashboardComponentType.ts` (add AlertStatus widget type)

### 2.6 SLO/SLI Widget

**Current**: No SLO visualization.
**Target**: Dedicated widget showing SLO status, error budget burn rate, and remaining budget.

**Implementation** (depends on Metrics roadmap Phase 3.2 - SLO/SLI Tracking):

- New `DashboardComponentType.SLO` widget type
- Configuration: select an SLO definition
- Displays: current attainment (%), target (%), error budget remaining (%), burn rate chart
- Color-coded: green (healthy), yellow (burning fast), red (budget exhausted)

**Files to modify**:
- `Common/Types/Dashboard/DashboardComponentType.ts` (add SLO)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardSLOComponent.tsx` (new)

---

## Phase 3: Collaboration & Sharing (P1) — Production Workflows

### 3.1 Public/Shared Dashboards

**Current**: Dashboards require login.
**Target**: Share dashboards with external stakeholders without requiring authentication.

**Implementation**:

- Add `isPublic` flag and `publicAccessToken` to Dashboard model
- Generate a shareable URL with token: `/public/dashboard/{token}`
- Public view is read-only with no editing controls
- Option to restrict public access to specific IP ranges
- Render without the OneUptime navigation chrome

**Files to modify**:
- `Common/Models/DatabaseModels/Dashboard.ts` (add isPublic, publicAccessToken)
- `App/FeatureSet/Dashboard/src/Pages/Public/Dashboard.tsx` (new - public dashboard view)

### 3.2 JSON Import/Export with Perses & Grafana Compatibility

**Current**: No import/export capability.
**Target**: Export dashboards as JSON and re-import for backup, migration, and dashboard-as-code. Support importing Perses and Grafana dashboard formats.

**Implementation**:

- **Native export**: Serialize `dashboardViewConfig` + metadata (name, description, variables) as a JSON file download. Include a schema version for forward compatibility.
- **Perses-compatible export**: Alongside native format, output a Perses-spec-compatible JSON. This gives users interoperability with the CNCF ecosystem without coupling our internals. Map our `QueryPlugin` kinds to Perses panel plugin types where possible.
- **Grafana import**: Perses already has tooling to convert Grafana dashboards to Perses format. By supporting Perses import, we get Grafana migration for free: Grafana → Perses → OneUptime.
- **Import pipeline**: Upload a JSON file → detect format (native, Perses, Grafana) → translate to `DashboardViewConfig` → validate → create dashboard.
- Handle version compatibility (include a schema version in the export)

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Pages/Dashboards/Dashboards.tsx` (add import button)
- `App/FeatureSet/Dashboard/src/Pages/Dashboards/View/Settings.tsx` (add export button)
- `Common/Server/API/DashboardImportExportAPI.ts` (new)
- `Common/Server/Utils/Dashboard/PersesConverter.ts` (new - bidirectional Perses format conversion)
- `Common/Server/Utils/Dashboard/GrafanaConverter.ts` (new - Grafana JSON to native format)

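
The "detect format" step of the import pipeline could start from a few structural heuristics like the ones below. This is only a sketch under assumptions: the `oneuptimeSchemaVersion` marker for native exports is hypothetical (the plan only says a schema version is included), and the Perses and Grafana checks rely on the commonly seen top-level shape of those documents (`kind: "Dashboard"` with a `spec` object for Perses; a numeric `schemaVersion` and a `panels` array for Grafana), which should be verified against their current specifications.

```typescript
type DashboardFormat = "native" | "perses" | "grafana" | "unknown";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Inspects already-parsed JSON and guesses which converter should handle it.
function detectDashboardFormat(parsed: unknown): DashboardFormat {
  if (!isRecord(parsed)) {
    return "unknown";
  }
  // Assumption: native exports carry a hypothetical `oneuptimeSchemaVersion` field.
  if (typeof parsed["oneuptimeSchemaVersion"] === "number") {
    return "native";
  }
  // Heuristic for Perses: top-level kind "Dashboard" plus a spec object.
  if (parsed["kind"] === "Dashboard" && isRecord(parsed["spec"])) {
    return "perses";
  }
  // Heuristic for Grafana: numeric schemaVersion plus a panels array.
  if (typeof parsed["schemaVersion"] === "number" && Array.isArray(parsed["panels"])) {
    return "grafana";
  }
  return "unknown";
}
```

Returning `"unknown"` instead of throwing leaves it to the caller to show a validation error before any translation is attempted.
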
### 3.3 Dashboard Versioning

**Current**: No change history.
**Target**: Track changes to dashboards over time with the ability to view history and revert.

**Implementation**:

- Create `DashboardVersion` model: dashboardId, version number, config snapshot, changedBy, timestamp
- On each save, create a new version entry
- UI: "Version History" in settings showing a list of versions with timestamps and authors
- "Restore" button to revert to a previous version
- Optional: diff view comparing two versions

**Files to modify**:
- `Common/Models/DatabaseModels/DashboardVersion.ts` (new)
- `Common/Server/Services/DashboardService.ts` (create version on save)
- `App/FeatureSet/Dashboard/src/Pages/Dashboards/View/VersionHistory.tsx` (new)

### 3.4 Row/Section Grouping with Decoupled Layout

**Current**: Components are placed freely with no grouping. Panel definitions and grid positions are mixed together in each component.
**Target**: Collapsible rows/sections for organizing related panels, with layout decoupled from panel definitions.

**Implementation**:

- **Decouple layout from panels** (pattern from Perses): Separate panel definitions (what to render) from layout definitions (where to render it). Panels are stored in a `panels` map keyed by ID. Layouts reference panels by `$ref`. This makes it easier to rearrange panels without modifying their query/display config.
- Add a "Section" component type that acts as a collapsible container
- Section has a title bar that can be clicked to collapse/expand
- When collapsed, a section hides all components within its vertical range
- Sections can be nested one level deep
- Migration: existing `DashboardViewConfig` components are automatically split into panel + layout entries on first load

**Files to modify**:
- `Common/Types/Dashboard/DashboardViewConfig.ts` (add panels map + layouts array, deprecate inline component positions)
- `Common/Types/Dashboard/DashboardComponentType.ts` (add Section)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardSectionComponent.tsx` (new)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Canvas/Index.tsx` (handle section collapse, resolve panel refs)

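To make the panel/layout split and the first-load migration more concrete, here is a rough sketch. All type and function names (`LegacyComponent`, `splitComponents`, `resolvePanel`, `moveLayoutItem`) and the `#/panels/<id>` reference format are assumptions made for illustration; the actual `DashboardViewConfig` fields and the exact Perses reference syntax may differ.

```typescript
// Hypothetical legacy shape: display config and grid position mixed in one object.
interface LegacyComponent {
  id: string;
  type: string;
  config: Record<string, unknown>;
  top: number;
  left: number;
  width: number;
  height: number;
}

// What to render.
interface Panel {
  type: string;
  config: Record<string, unknown>;
}

// Where to render it; points at a panel by reference.
interface LayoutItem {
  $ref: string;
  top: number;
  left: number;
  width: number;
  height: number;
}

interface DecoupledConfig {
  panels: Record<string, Panel>;
  layouts: LayoutItem[];
}

const REF_PREFIX = "#/panels/"; // assumed reference format

// Migration: split each legacy component into a panel entry and a layout entry.
function splitComponents(components: LegacyComponent[]): DecoupledConfig {
  const panels: Record<string, Panel> = {};
  const layouts: LayoutItem[] = [];
  for (const c of components) {
    panels[c.id] = { type: c.type, config: c.config };
    layouts.push({
      $ref: REF_PREFIX + c.id,
      top: c.top,
      left: c.left,
      width: c.width,
      height: c.height,
    });
  }
  return { panels, layouts };
}

// Resolve a layout entry back to its panel when rendering the canvas.
function resolvePanel(config: DecoupledConfig, item: LayoutItem): Panel {
  const panel = config.panels[item.$ref.slice(REF_PREFIX.length)];
  if (!panel) {
    throw new Error(`Dangling panel reference: ${item.$ref}`);
  }
  return panel;
}

// Rearranging only touches the layout; the panels map is passed through unchanged.
function moveLayoutItem(
  config: DecoupledConfig,
  panelId: string,
  top: number,
  left: number,
): DecoupledConfig {
  const ref = REF_PREFIX + panelId;
  return {
    panels: config.panels,
    layouts: config.layouts.map((item) =>
      item.$ref === ref ? { ...item, top, left } : item,
    ),
  };
}
```

Because `moveLayoutItem` never touches `panels`, a drag-and-drop move cannot accidentally alter a panel's query or display settings, which is the main benefit the decoupling is meant to deliver.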
### 3.5 TV/Kiosk Mode

**Current**: Full-screen only.
**Target**: Dedicated kiosk mode optimized for wall-mounted monitors with auto-cycling.

**Implementation**:

- Kiosk mode: hides all application chrome (toolbar, navigation) and shows only the dashboard grid; hiding the browser's URL bar additionally requires browser fullscreen or the browser's own kiosk mode
- Auto-cycle: rotate through a list of dashboards at a configurable interval (30s, 1m, 5m)
- Dashboard playlist: define an ordered list of dashboards to cycle through
- Support per-dashboard display duration

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Pages/Dashboards/Kiosk.tsx` (new - kiosk view)
- `Common/Models/DatabaseModels/DashboardPlaylist.ts` (new - playlist model)

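One way to reason about auto-cycling with per-dashboard durations is a pure function that maps elapsed time to the playlist entry that should be on screen. The sketch below assumes hypothetical `Playlist` and `PlaylistEntry` shapes and a non-negative elapsed time; it is not the planned `DashboardPlaylist` model.

```typescript
// Hypothetical playlist shapes for illustration.
interface PlaylistEntry {
  dashboardId: string;
  durationSeconds?: number; // optional per-dashboard override
}

interface Playlist {
  defaultDurationSeconds: number; // e.g. 30, 60 or 300
  entries: PlaylistEntry[];
}

function durationOf(playlist: Playlist, entry: PlaylistEntry): number {
  return entry.durationSeconds ?? playlist.defaultDurationSeconds;
}

// Returns the entry to display `elapsedSeconds` after the kiosk started,
// wrapping around to the first dashboard when the playlist ends.
function entryAt(playlist: Playlist, elapsedSeconds: number): PlaylistEntry {
  const last = playlist.entries[playlist.entries.length - 1];
  if (!last) {
    throw new Error("Playlist is empty");
  }
  const cycleSeconds = playlist.entries.reduce(
    (sum, entry) => sum + durationOf(playlist, entry),
    0,
  );
  if (cycleSeconds <= 0) {
    throw new Error("Playlist durations must add up to a positive number");
  }
  let remaining = elapsedSeconds % cycleSeconds;
  for (const entry of playlist.entries) {
    const duration = durationOf(playlist, entry);
    if (remaining < duration) {
      return entry;
    }
    remaining -= duration;
  }
  return last; // not reached for non-negative elapsedSeconds
}
```

A kiosk page could call `entryAt` from a one-second timer and navigate whenever the returned `dashboardId` changes, which keeps the rotation logic independent of the rendering code.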
### 3.6 CSV Export

**Current**: No data export.
**Target**: Export chart/table data as CSV for offline analysis.

**Implementation**:

- Add "Export CSV" option in chart/table context menu
- Client-side: serialize the current rendered data to CSV format
- Include column headers, timestamps, and values
- Trigger browser download

**Files to modify**:
- `App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx` (add export option)
- `Common/Utils/Dashboard/CSVExport.ts` (new - CSV serialization)

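The client-side serialization step might be sketched as follows. The `Series` and `SeriesPoint` shapes and the function names are assumptions for illustration, not the existing chart data types. The sketch uses one row per timestamp and one column per series, and quotes fields in the style of RFC 4180.

```typescript
// Hypothetical shapes for the data a chart has already rendered.
interface SeriesPoint {
  timestamp: Date;
  value: number;
}

interface Series {
  name: string;
  points: SeriesPoint[];
}

// Quote a field if it contains a comma, quote, or line break; double any inner quotes.
function escapeCsvField(field: string): string {
  return /[",\r\n]/.test(field) ? `"${field.replace(/"/g, '""')}"` : field;
}

function seriesToCsv(series: Series[]): string {
  // Index values by ISO timestamp so series with gaps still line up row by row.
  const valuesByTimestamp = new Map<string, Map<string, number>>();
  for (const s of series) {
    for (const point of s.points) {
      const key = point.timestamp.toISOString();
      let row = valuesByTimestamp.get(key);
      if (!row) {
        row = new Map<string, number>();
        valuesByTimestamp.set(key, row);
      }
      row.set(s.name, point.value);
    }
  }

  const header = ["timestamp", ...series.map((s) => s.name)];
  // ISO 8601 UTC strings sort chronologically when sorted as text.
  const timestamps = Array.from(valuesByTimestamp.keys()).sort();
  const rows = timestamps.map((ts) => {
    const row = valuesByTimestamp.get(ts);
    const cells = series.map((s) => {
      const value = row ? row.get(s.name) : undefined;
      return value === undefined ? "" : String(value);
    });
    return [ts, ...cells];
  });

  return [header, ...rows]
    .map((cells) => cells.map(escapeCsvField).join(","))
    .join("\r\n");
}
```

In the browser, the resulting string could then be wrapped in a `Blob` and attached to a temporary download link to trigger the file download.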
### 3.7 Custom Time Range per Widget

**Current**: All widgets share the global time range.
**Target**: Individual widgets can override the global time range.

**Implementation**:

- Add optional `timeRangeOverride` to each component's config
- When set, the widget uses its own time range instead of the global one
- Show a small clock icon on widgets with custom time ranges
- Configuration in the component settings side panel

**Files to modify**:
- `Common/Utils/Dashboard/Components/DashboardBaseComponent.ts` (add timeRangeOverride)
- `App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx` (pass per-widget time ranges)

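The override rule above is small enough to express as a pair of helpers. This sketch assumes hypothetical `TimeRange` and `WidgetConfig` shapes; only the `timeRangeOverride` field name comes from the plan. Falling back to the global range when the override is invalid is also an assumption, not something the plan specifies.

```typescript
// Hypothetical shapes for illustration; the real component config may differ.
interface TimeRange {
  start: Date;
  end: Date;
}

interface WidgetConfig {
  componentId: string;
  timeRangeOverride?: TimeRange; // optional per-widget override
}

// Use the widget's own range when it is set and well-formed, otherwise the global one.
function effectiveTimeRange(widget: WidgetConfig, globalRange: TimeRange): TimeRange {
  const override = widget.timeRangeOverride;
  if (override && override.start.getTime() < override.end.getTime()) {
    return override;
  }
  return globalRange;
}

// Drives the small clock icon shown on widgets with a custom range.
function hasCustomTimeRange(widget: WidgetConfig): boolean {
  return widget.timeRangeOverride !== undefined;
}
```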
---

## Phase 4: Differentiation (P2-P3) - Surpass Competition

### 4.1 AI-Powered Dashboard Creation

**Current**: Manual dashboard creation only.
**Target**: Natural language dashboard creation: "Show me CPU usage by service for the last 24 hours" auto-creates the right widget.

**Implementation**:

- Natural language input in the "Add Widget" dialog
- AI translates the request into: metric name, aggregation, group by, chart type, time range
- Uses available MetricType metadata to match metric names
- Preview the generated widget before adding to dashboard
- No competitor has done this well yet, making it a potential major differentiator

### 4.2 Pre-Built Dashboard Templates

**Current**: No templates.
**Target**: One-click dashboard templates for common stacks.

**Implementation**:

- Template library: Node.js, Python, Go, Java, Kubernetes, PostgreSQL, Redis, Nginx, MongoDB, etc.
- Auto-detect relevant templates based on ingested telemetry data
- "One-click create" instantiates a full dashboard from the template
- Community template sharing (future)

### 4.3 Auto-Generated Dashboards

**Current**: Users must manually build dashboards.
**Target**: When a service connects, auto-generate a relevant dashboard.

**Implementation**:

- On first telemetry ingest from a new service, analyze the metric names and types
- Auto-create a service dashboard with relevant charts based on detected metrics
- Include golden signals (latency, traffic, errors, saturation) where applicable
- Notify the user and link to the auto-generated dashboard

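As a rough illustration of the metric analysis step, the sketch below groups metric names by golden signal using name patterns. The patterns, `classifyMetric`, and `planServiceDashboard` are all hypothetical; a real implementation would likely also consider metric type and unit rather than names alone.

```typescript
type GoldenSignal = "latency" | "traffic" | "errors" | "saturation";

// Hypothetical name patterns, checked in order; the first match wins.
// "errors" and "latency" come before "traffic" so that names such as
// "http.server.request.duration" are not classified as traffic.
const SIGNAL_PATTERNS: Array<{ signal: GoldenSignal; pattern: RegExp }> = [
  { signal: "errors", pattern: /error|fail|exception/i },
  { signal: "latency", pattern: /duration|latency|response[._]time/i },
  { signal: "traffic", pattern: /request|throughput/i },
  { signal: "saturation", pattern: /cpu|memory|disk|utilization|queue/i },
];

function classifyMetric(metricName: string): GoldenSignal | undefined {
  for (const { signal, pattern } of SIGNAL_PATTERNS) {
    if (pattern.test(metricName)) {
      return signal;
    }
  }
  return undefined;
}

// Group a new service's metric names into the sections of an auto-generated dashboard.
function planServiceDashboard(metricNames: string[]): Record<GoldenSignal, string[]> {
  const plan: Record<GoldenSignal, string[]> = {
    latency: [],
    traffic: [],
    errors: [],
    saturation: [],
  };
  for (const name of metricNames) {
    const signal = classifyMetric(name);
    if (signal) {
      plan[signal].push(name);
    }
  }
  return plan;
}
```

Metrics that match no pattern are simply left out of the plan here; whether they should land in a catch-all section is a product decision this sketch does not make.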
### 4.4 Customer-Facing Dashboards on Status Pages

**Current**: Status pages and dashboards are separate.
**Target**: Embed dashboard widgets on status pages for real-time performance visibility.

**Implementation**:

- Allow selecting specific dashboard widgets to embed on a status page
- Render widgets in read-only mode without internal navigation
- Respect public/private data boundaries (only show metrics the customer should see)
- This is unique to OneUptime: no competitor has integrated observability dashboards with status pages

### 4.5 Dashboard-as-Code SDK (Perses-Compatible)

**Current**: No programmatic dashboard creation.
**Target**: TypeScript SDK for defining dashboards as code, with optional Perses-compatible output.

**Implementation**:

```typescript
// Proposed builder API (sketch only): the `Dashboard` class shown here does not exist yet.
const dashboard = new Dashboard("Service Health")
  // Template variable populated from a query; referenced below as "$service"
  .addVariable("service", { type: "query", query: "SELECT DISTINCT service FROM MetricItem" })
  .addRow("Latency")
  .addChart({ metric: "http.server.duration", aggregation: "p99", groupBy: ["$service"] })
  .addChart({ metric: "http.server.duration", aggregation: "p50", groupBy: ["$service"] })
  .addRow("Throughput")
  .addChart({ metric: "http.server.request.count", aggregation: "rate", groupBy: ["$service"] });

// Output native OneUptime format
dashboard.toJSON();

// Output Perses-compatible format for ecosystem interop
dashboard.toPerses();
```

- SDK generates the JSON config and uses the Dashboard API to create/update
- Git-based provisioning: store dashboard definitions in repo, CI/CD syncs to OneUptime
- `toPerses()` output allows users to share dashboard definitions with teams using Perses or other CNCF-compatible tools
- Perses's CUE SDK patterns can inform our builder API design

### 4.6 Anomaly Detection Overlays

**Current**: No anomaly visualization.
**Target**: AI highlights anomalous data points on charts without manual threshold configuration.

**Implementation** (depends on Metrics roadmap Phase 3.1 - Anomaly Detection):

- Automatically overlay expected range bands (baseline +/- N sigma) on metric charts
- Highlight data points outside the expected range with color indicators
- Click an anomaly to see correlated changes across metrics, logs, and traces

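The "baseline +/- N sigma" band can be illustrated with a simple rolling window. This is only a stand-in: the Metrics roadmap's anomaly detection is not described in this document, so the rolling mean and standard deviation used here, along with the `Point`, `Band`, and `computeBands` names, are assumptions chosen to show the overlay's data shape.

```typescript
// Hypothetical shapes for illustration.
interface Point {
  timestamp: number; // epoch milliseconds
  value: number;
}

interface Band {
  timestamp: number;
  lower: number;
  upper: number;
  isAnomaly: boolean;
}

// For each point after the first `windowSize` points, derive the expected range
// from the mean and population standard deviation of the preceding window.
function computeBands(points: Point[], windowSize: number, sigmas: number): Band[] {
  if (windowSize < 1) {
    throw new Error("windowSize must be at least 1");
  }
  const bands: Band[] = [];
  for (let i = windowSize; i < points.length; i++) {
    const current = points[i];
    if (!current) {
      continue;
    }
    const history = points.slice(i - windowSize, i).map((p) => p.value);
    const mean = history.reduce((sum, v) => sum + v, 0) / windowSize;
    const variance =
      history.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / windowSize;
    const std = Math.sqrt(variance);
    const lower = mean - sigmas * std;
    const upper = mean + sigmas * std;
    bands.push({
      timestamp: current.timestamp,
      lower,
      upper,
      isAnomaly: current.value < lower || current.value > upper,
    });
  }
  return bands;
}
```

A chart could draw `lower` and `upper` as a shaded area behind the series and color the points where `isAnomaly` is true. Note that in this sketch an anomalous point still enters later windows and widens the band, which a production baseline would probably want to avoid.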
### 4.7 Terraform / OpenTofu Provider

**Current**: No infrastructure-as-code support for dashboards.
**Target**: Manage dashboards via Terraform/OpenTofu for GitOps workflows.

**Implementation**:

- Expose dashboard CRUD via a well-documented REST API (already exists)
- Build a Terraform provider that maps dashboard resources to the API
- Support `oneuptime_dashboard`, `oneuptime_dashboard_variable`, and `oneuptime_dashboard_template` resources
- This complements the Dashboard-as-Code SDK (4.5): the SDK is for developers, Terraform is for ops teams

---

## Quick Wins (Can Ship This Week) ✅ All Done

1. ~~**Auto-refresh** - Add a simple `setInterval` refresh with dropdown selector in toolbar~~ ✅
2. ~~**Full markdown for text widget** - Replace custom formatting with a markdown renderer~~ ✅
3. ~~**Legend show/hide** - Add click handler on legend items to toggle series~~ ✅
4. ~~**Stacked area chart** - Simple extension of existing line chart with fill~~ ✅
5. ~~**Chart zoom** - Enable brush selection on time series charts~~ ✅

---

## Recommended Implementation Order

### Phase 0: Architecture Foundation
1. **Phase 1.9** - QueryPlugin interface (enables everything else; do this first)

### Phase 1: Core Features ✅ (Complete; remaining items are partial refinements)
2. ~~**Quick Wins** - Auto-refresh, markdown, legend toggle, stacked area, chart zoom~~ ✅
3. ~~**Phase 1.1** - More chart types (Area, Table, Gauge)~~ ✅ (Pie, Heatmap, Histogram rendering deferred)
4. ~~**Phase 1.2** - Template variables with scoping~~ ✅
5. ~~**Phase 1.4** - Multiple queries per chart~~ ✅ (Formulas deferred)
6. ~~**Phase 1.6** - Threshold lines & color coding~~ ✅ (Value widget done; chart reference lines deferred)

### Phase 2: Platform Leverage (Differentiators)
7. **Phase 2.1** - Log stream widget (leverages all-in-one platform + QueryPlugin)
8. **Phase 2.2** - Trace list widget (leverages all-in-one platform + QueryPlugin)
9. **Phase 2.3** - Click-to-correlate (major differentiator: no competitor has this built-in)
10. **Phase 2.4** - Annotations / event overlays
11. **Phase 2.5** - Alert integration

### Phase 3: Collaboration
12. **Phase 3.1** - Public/shared dashboards
13. **Phase 3.2** - JSON import/export with Perses & Grafana compatibility
14. **Phase 3.4** - Row/section grouping with decoupled layout
15. **Phase 3.5** - TV/Kiosk mode
16. **Phase 3.3** - Dashboard versioning
17. **Phase 2.6** - SLO widget (depends on SLO/SLI from Metrics roadmap)

### Phase 4: Differentiation
18. **Phase 4.2** - Pre-built dashboard templates
19. **Phase 4.3** - Auto-generated dashboards
20. **Phase 4.1** - AI-powered dashboard creation
21. **Phase 4.4** - Customer-facing dashboards on status pages
22. **Phase 4.5** - Dashboard-as-code SDK (Perses-compatible)
23. **Phase 4.7** - Terraform / OpenTofu provider
24. **Phase 4.6** - Anomaly detection overlays

## Verification

For each feature:
1. Unit tests for new widget types, template variable resolution, CSV export logic, QueryPlugin dispatching
2. Integration tests for new API endpoints (annotations, public dashboards, import/export, Perses/Grafana conversion)
3. Manual verification via the dev server at `https://oneuptimedev.genosyn.com/dashboard/{projectId}/dashboards`
4. Visual regression testing for new chart types (ensure correct rendering across browsers)
5. Performance testing: verify dashboards with 20+ widgets and auto-refresh don't cause excessive API load
6. Test template variables with edge cases: empty results, special characters, multi-value selections
7. Verify public dashboards don't leak private data
8. Test Perses/Grafana import with real-world dashboard exports to validate conversion fidelity
9. Test QueryPlugin interface with mixed query types (metric + log + trace) on a single dashboard