Files
oneuptime/Internal/Roadmap/Dashboards.md

38 KiB
Raw Blame History

Plan: Bring OneUptime Dashboards to Industry Parity and Beyond

Context

OneUptime's dashboard implementation provides a 12-column grid layout with drag-and-drop editing, 3 widget types (Chart with Line/Bar, Value, Text with basic formatting), global time range with presets, view/edit modes, role-based permissions, and full-screen support. Dashboard config is stored as a single JSON column. Dashboards can only query OpenTelemetry metrics from ClickHouse.

This plan identifies the remaining gaps vs Grafana, Datadog, and New Relic, and proposes a phased implementation to build a best-in-class dashboard product that leverages OneUptime's unique position as an all-in-one observability + status page platform.

Completed

The following features have been implemented:

  • 12-Column Grid Layout - Fixed grid with dynamic unit sizing, 60 default rows (expandable)
  • Drag-and-Drop Editing - Move and resize components with bounds checking
  • Chart Widget - Line and Bar chart types with single metric query, configurable title/description/legend
  • Value Widget - Single metric aggregation displayed as large number
  • Text Widget - Bold/Italic/Underline formatting (no markdown)
  • Global Time Range - Presets (30min to 3mo) + custom date range picker
  • View/Edit Modes - Read-only view with full-screen, edit mode with side panel settings
  • Role-Based Permissions - ProjectOwner, ProjectAdmin, ProjectMember + custom permissions
  • Dashboard CRUD API - Standard REST API with slug generation
  • Billing Enforcement - Free plan limited to 1 dashboard
  • Area Chart (Phase 1.1) - Area and Stacked Area chart types added to ChartType enum and rendered via chart component
  • Table Widget (Phase 1.1) - New DashboardComponentType.Table with timestamp/value columns, sticky header, configurable max rows
  • Gauge Widget (Phase 1.1) - SVG semi-circle gauge with threshold-based color coding (green/yellow/red), configurable min/max/thresholds
  • Template Variables (Phase 1.2) - DashboardVariable type with CustomList, Query, and TextInput types; toolbar dropdown/input selectors; variable changes trigger widget refresh
  • Auto-Refresh (Phase 1.3) - 7 interval options (5s to 15m), timer pauses in edit mode, pulsing indicator, interval persisted in dashboard config
  • Multiple Queries per Chart (Phase 1.4) - metricQueryConfigs array support with fallback to single metricQueryConfig; each query rendered as a separate series
  • Markdown Support (Phase 1.5) - isMarkdown flag on text components; renders LazyMarkdownViewer when enabled, falls back to bold/italic/underline when disabled
  • Threshold / Color Coding (Phase 1.6) - Warning and critical threshold config on Value and Chart components; Value widget changes background/text color (green → yellow → red) based on thresholds
  • Legend Interaction (Phase 1.7) - onValueChange enabled on Line, Area, and Bar charts for built-in Tremor legend click-to-toggle filtering
  • Chart Zoom (Phase 1.8) - Time range zoom stack with Reset Zoom button in toolbar; pushing current range before zoom, popping on reset

Gap Analysis Summary

Feature OneUptime Grafana Datadog New Relic Priority
Widget types 5 (Chart, Value, Text, Table, Gauge) 20+ 40+ 15+ P0 Done
Chart types 4 (Line, Bar, Area, Stacked Area) 10+ 12+ 10+ P0 Done
Template variables 3 types (CustomList, Query, TextInput) 6+ types Yes 3 types P0 Done
Auto-refresh 7 intervals (5s15m) Configurable Real-time Yes P0 Done
Log panels None Yes (Loki) Yes Yes (NRQL) P0
Trace panels None Yes (Tempo) Yes Yes P0
Table widget Yes Yes Yes Yes P0 Done
Multiple queries per chart Yes (array) Yes Yes Yes P0 Done
Markdown support Yes (toggle) Full markdown Full markdown Full markdown P0 Done
Threshold lines / color coding Value widget color coding Yes Yes Yes P0 Partial
Legend interaction (show/hide) Yes (click toggle) Yes Yes Yes P0 Done
Chart zoom Yes (time range stack) Yes Yes Yes P0 Done
Unified query plugin interface None Datasource plugins Yes NRQL P0
Dashboard linking / drill-down None Data links Yes Facet linking P1
Annotations / event overlays None Yes Yes Yes (Labs) P1
Row/section grouping None Collapsible rows Groups No P1
Public/shared dashboards None Yes Yes Yes P1
JSON import/export None Yes Yes Yes P1
Dashboard versioning None Yes Yes No P1
Alert integration None Create from panel + show state Yes NRQL alerts P1
TV/Kiosk mode Full-screen only Kiosk mode Yes Auto-cycling P1
CSV export None Yes Yes Yes P1
Custom time per widget None No No No P1
Perses/Grafana import None N/A No No P1
AI dashboard creation None None None None P2
Dashboard-as-code SDK None Foundation SDK No No P2
Terraform provider None Yes Yes Yes P2

Architecture: Query Plugin Interface & Perses Compatibility

Before implementing features, we should establish a QueryPlugin interface that all widget data sources use. This is a foundational architectural change that enables Phase 2 (logs, traces) and Phase 4 (Dashboard-as-Code) cleanly.

Why Not Adopt Perses Wholesale?

Perses is a CNCF Sandbox project providing an open dashboard specification and embeddable UI components. We evaluated it as a potential protocol for our dashboard system. The decision is to selectively borrow patterns rather than adopt it fully:

Against full adoption:

  • Our AggregateBy API queries ClickHouse directly. Perses assumes Prometheus/PromQL as the primary query language — mapping our ClickHouse aggregation queries into Perses's PrometheusTimeSeriesQuery plugin model adds unnecessary indirection.
  • Phase 2 (click-to-correlate, cross-signal correlation) is our biggest differentiator. Perses has basic Tempo/Loki plugins but nothing like unified correlation. Adopting their panel model would constrain our ability to build these features.
  • Perses is still CNCF Sandbox stage with data model structs marked deprecated in favor of a new perses/spec repo. The spec is not yet stable enough to build a product on.
  • Maintaining a translation layer between Perses spec and our internal DashboardViewConfig format for every feature would add ongoing overhead.

What we selectively adopt:

Perses Concept Where It Helps How
kind+spec plugin pattern Phase 1.9 (QueryPlugin), Phase 2.1-2.2 Formalize widget data sources as plugins instead of hardcoding every widget type
Variable model with scoping Phase 1.2 (Template Variables) Adopt query-based, list, and text variable types with dashboard → project → global scoping
Decoupled layout from panels Phase 3.4 (Sections) Separate panel definitions from grid positions to make sections/grouping cleaner
Dashboard JSON schema Phase 3.2 (Import/Export) Support importing Perses-format dashboards alongside native format for Grafana migration path

1.9 Unified Query Plugin Interface

Current: Widgets are hardcoded to query metrics via MetricQueryConfigData and the AggregateBy API. Adding logs or traces as data sources requires duplicating the entire query path. Target: A QueryPlugin interface that abstracts data sources, enabling any widget to query metrics, logs, or traces through a unified contract.

Design:

// The plugin pattern borrowed from Perses: kind + spec
interface QueryPlugin {
  kind: "MetricQuery" | "LogQuery" | "TraceQuery" | "FormulaQuery";
  spec: MetricQuerySpec | LogQuerySpec | TraceQuerySpec | FormulaQuerySpec;
}

interface MetricQuerySpec {
  metricName: string;
  attributes: JSONObject;
  aggregationType: AggregationType;
  groupBy?: string[];
}

interface LogQuerySpec {
  severityFilter?: SeverityLevel[];
  serviceFilter?: string[];
  bodyContains?: string;
  attributes?: JSONObject;
}

interface TraceQuerySpec {
  serviceFilter?: string[];
  operationFilter?: string[];
  statusFilter?: TraceStatus[];
  minDuration?: Duration;
}

interface FormulaQuerySpec {
  formula: string;           // e.g., "a / b * 100"
  queries: Record<string, QueryPlugin>;  // named sub-queries
}

// Each widget stores an array of QueryPlugins instead of MetricQueryConfigData
interface DashboardWidgetConfig {
  queries: QueryPlugin[];
  // ... other widget config
}

Benefits:

  • Log stream and trace list widgets (Phase 2.1, 2.2) plug in without new query plumbing
  • Cross-signal correlation (Phase 2.3) becomes a multi-query widget with mixed kind values
  • Formula queries (Phase 1.4) compose naturally across query types
  • Future data sources (e.g., external Prometheus, custom APIs) add a new kind without touching widget code
  • Aligns with Perses's extensibility model without coupling to their spec

Files to modify:

  • Common/Types/Dashboard/QueryPlugin.ts (new - interface definitions)
  • Common/Types/Metrics/MetricsQuery.ts (refactor to implement MetricQuerySpec)
  • Common/UI/Utils/AnalyticsModelAPI/AnalyticsModelAPI.ts (add query resolver that dispatches by kind)
  • Common/Server/API/BaseAnalyticsAPI.ts (add unified query endpoint)
  • App/FeatureSet/Dashboard/src/Components/Metrics/Utils/Metrics.ts (refactor fetchResults to use QueryPlugin)

Phase 1: Foundation (P0) — Close Critical Gaps

These gaps make OneUptime dashboards fundamentally non-competitive. Every major competitor has these.

1.1 Add Core Chart Types: Area, Pie, Table, Gauge, Heatmap, Histogram (Partial)

Status: Area, Stacked Area, Table, and Gauge are implemented. Pie, Heatmap, and Histogram enum values are defined but rendering is not yet implemented. Current: Line and Bar only. Target: 8+ chart types covering all standard observability visualization needs.

Implementation:

  • Area Chart (stacked and non-stacked): Extension of line chart with fill. Use existing chart library's area mode
  • Pie/Donut Chart: For proportional breakdowns (e.g., error distribution by service). New component
  • Table Widget: Tabular metric data, top-N lists, multi-column display with sortable columns. New component type
  • Gauge Widget: Circular gauge with configurable min/max/thresholds and color zones. New component
  • Heatmap: Time on X-axis, value buckets on Y-axis, color intensity for count. Essential for distribution/histogram metrics
  • Histogram: Bar chart showing value distribution. Important for latency analysis

Each chart type needs:

  • A new entry in DashboardComponentType or ChartType enum
  • A rendering component in Dashboard/Components/
  • Configuration options in the component settings side panel

Files to modify:

  • Common/Types/Dashboard/Chart/ChartType.ts (add Area, Pie, Heatmap, Histogram, Gauge)
  • Common/Types/Dashboard/DashboardComponentType.ts (add Table, Gauge)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx (render new types)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardTableComponent.tsx (new)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardGaugeComponent.tsx (new)

1.2 Template Variables

Status: Implemented. DashboardVariable type with CustomList, Query, and TextInput types. Toolbar renders dropdown selectors and text inputs. Variable changes trigger widget refresh via refreshTick. Current: No template variables. Users must create separate dashboards for each service/host/environment. Target: Drop-down variable selectors that dynamically filter all widgets.

Implementation:

  • Create a DashboardVariable type stored in dashboardViewConfig:
    • Name, label, type (query-based, custom list, text input)
    • Query-based: runs a ClickHouse query to populate options (e.g., SELECT DISTINCT service FROM MetricItem WHERE projectId = {pid})
    • Custom list: manually defined options
    • Multi-value selection support
  • Render variables as dropdown selectors in the dashboard toolbar
  • Variables can be referenced in metric queries as $variable_name
  • When a variable changes, all widgets re-query with the new value
  • Support cascading variables (variable B's query depends on variable A's value)
  • Scoping model (from Perses): Variables can be defined at dashboard, project, or global scope. Dashboard-level overrides project-level, which overrides global. This lets teams define org-wide variables (e.g., $environment) once and reuse across dashboards.

Files to modify:

  • Common/Types/Dashboard/DashboardVariable.ts (new)
  • Common/Types/Dashboard/DashboardViewConfig.ts (add variables array)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Toolbar/DashboardToolbar.tsx (render variable dropdowns)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx (pass variable values to widgets)
  • Common/Server/Services/MetricService.ts (resolve variable references in queries)

1.3 Auto-Refresh

Status: Implemented. AutoRefreshInterval enum with 7 options (OFF, 5s, 10s, 30s, 1m, 5m, 15m). Timer management via setInterval/clearInterval, pauses in edit mode, pulsing blue dot indicator, interval persisted in DashboardViewConfig. Current: Data goes stale after initial load. Target: Configurable auto-refresh intervals.

Implementation:

  • Add auto-refresh dropdown in toolbar with options: Off, 5s, 10s, 30s, 1m, 5m, 15m
  • Store selected interval in dashboard config and URL state
  • Use setInterval to trigger re-fetch on all metric widgets
  • Show a subtle refresh indicator when data is being updated
  • Pause auto-refresh when the dashboard is in edit mode

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Dashboard/Toolbar/DashboardToolbar.tsx (add refresh dropdown)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx (implement refresh timer)
  • Common/Types/Dashboard/DashboardViewConfig.ts (store refresh interval)

1.4 Multiple Queries per Chart with Formulas (Partial)

Status: Multiple queries implemented via metricQueryConfigs array with fallback to single metricQueryConfig. Each query renders as a separate series. Formula evaluation and dual Y-axis are not yet implemented. Current: Single MetricQueryConfigData per chart. Target: Overlay multiple metric series on a single chart for correlation, with cross-query formulas.

Implementation:

  • Change chart component's data source from single MetricQueryConfigData to QueryPlugin[] (using the new unified interface)
  • Each query gets its own alias and legend entry
  • Support FormulaQuery plugin kind for cross-query formulas (e.g., a / b * 100 where a and b reference other queries by alias)
  • Y-axis: support dual Y-axes for metrics with different scales
  • Formula evaluation happens server-side to avoid shipping raw data to the client

Files to modify:

  • Common/Utils/Dashboard/Components/DashboardChartComponent.ts (change to QueryPlugin array)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx (render multiple series)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Canvas/ComponentSettingsSideOver.tsx (multi-query config UI)
  • Common/Server/Services/FormulaEvaluator.ts (new - server-side formula evaluation)

1.5 Full Markdown Support for Text Widget

Status: Implemented. isMarkdown boolean flag added to DashboardTextComponent. When enabled, renders via LazyMarkdownViewer. When disabled, falls back to existing bold/italic/underline formatting. Current: Only bold, italic, underline formatting. Target: Full markdown rendering including headers, links, lists, code blocks, tables, and images.

Implementation:

  • Replace the current custom formatting with a markdown renderer (e.g., react-markdown or marked)
  • Support: headers (h1-h6), links, ordered/unordered lists, code blocks with syntax highlighting, tables, images, blockquotes
  • Edit mode: raw markdown text area with preview toggle

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardTextComponent.tsx (replace with markdown renderer)
  • Common/Utils/Dashboard/Components/DashboardTextComponent.ts (store raw markdown)

1.6 Threshold Lines & Color Coding (Partial)

Status: Warning/critical threshold config added to Chart and Value components. Value widget implements color coding (green → yellow → red background/text). Chart threshold reference lines are configured in the data model but visual rendering on charts requires Tremor chart library modifications (deferred). Current: No threshold visualization. Target: Configurable warning/critical thresholds on charts with color-coded regions.

Implementation:

  • Add threshold configuration to chart settings: value, label, color (default: yellow for warning, red for critical)
  • Render horizontal lines on the chart at threshold values
  • Optionally fill regions above/below thresholds with translucent color
  • For value/billboard widgets: change background color based on which threshold range the value falls in (green/yellow/red)

Files to modify:

  • Common/Utils/Dashboard/Components/DashboardChartComponent.ts (add thresholds config)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx (render threshold lines)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardValueComponent.tsx (color coding)

1.7 Legend Interaction (Show/Hide Series)

Status: Implemented. onValueChange callback enabled on Line, Area, and Bar chart components, activating Tremor's built-in activeLegend state management for click-to-toggle series visibility. Current: Legends are display-only. Target: Click legend items to toggle series visibility.

Implementation:

  • Add click handler on legend items to toggle series visibility
  • Clicked-off series should be visually dimmed in the legend and removed from the chart
  • Support "isolate" mode: Ctrl+Click shows only that series and hides all others
  • Persist toggled state during the session (reset on page reload)

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Metrics/MetricGraph.tsx (add legend click handlers)

1.8 Chart Zoom (Click-Drag Time Selection)

Status: Implemented via time range stack. Toolbar shows Reset Zoom button when zoomed in. Current range is pushed to stack before zoom, popped on reset. In-chart brush/drag selection not yet implemented (uses toolbar-driven zoom instead). Current: No zoom capability. Target: Click and drag on a time series chart to zoom into a time range.

Implementation:

  • Enable brush/selection mode on time series charts
  • When user drags to select a range, update the global time range to the selected range
  • Show a "Reset zoom" button to return to the previous time range
  • Maintain a zoom stack so users can zoom in multiple times and zoom back out

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Metrics/MetricGraph.tsx (add brush selection)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx (handle time range updates from zoom)

Phase 2: Observability Integration (P0-P1) — Leverage the Full Platform

This is where OneUptime can differentiate: metrics, logs, and traces in one platform. The QueryPlugin interface from Phase 1.9 makes this phase significantly easier — each new signal type is a new kind in the plugin system rather than a new query pipeline.

2.1 Log Stream Widget

Current: Dashboards can only show metrics. Target: Widget that displays a live log stream with filtering.

Implementation:

  • New DashboardComponentType.LogStream widget type
  • Uses QueryPlugin with kind: "LogQuery" — same interface as metric widgets
  • Configuration: log query filter, severity filter, service filter, max rows
  • Renders as a scrolling log list with severity color coding, timestamp, and body
  • Click a log entry to expand and see full details
  • Respects dashboard time range and template variables

Files to modify:

  • Common/Types/Dashboard/DashboardComponentType.ts (add LogStream)
  • Common/Utils/Dashboard/Components/DashboardLogStreamComponent.ts (new - config)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardLogStreamComponent.tsx (new - rendering)
  • Common/Server/Services/LogQueryResolver.ts (new - implements QueryPlugin for logs)

2.2 Trace List Widget

Current: No trace visualization in dashboards. Target: Widget showing a filtered trace list with duration and status.

Implementation:

  • New DashboardComponentType.TraceList widget type
  • Uses QueryPlugin with kind: "TraceQuery" — same interface as metric and log widgets
  • Configuration: service filter, operation filter, status filter, min duration
  • Renders as a table: trace ID, operation, service, duration, status, timestamp
  • Click a row to navigate to the full trace view
  • Respects dashboard time range and template variables

Files to modify:

  • Common/Types/Dashboard/DashboardComponentType.ts (add TraceList)
  • Common/Utils/Dashboard/Components/DashboardTraceListComponent.ts (new)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardTraceListComponent.tsx (new)
  • Common/Server/Services/TraceQueryResolver.ts (new - implements QueryPlugin for traces)

2.3 Click-to-Correlate Across Signals

Current: No cross-signal correlation in dashboards. Target: Click a point on a metric chart to instantly see related logs and traces from that timestamp.

Implementation:

  • When clicking a data point on a metric chart, open a correlation panel showing:
    • Logs from the same service and time window (+/- 5 minutes around the clicked point)
    • Traces from the same service and time window
    • Filtered by the same template variables
  • The correlation panel uses the QueryPlugin interface internally — it fires a LogQuery and TraceQuery scoped to the clicked timestamp and service context
  • The correlation panel appears as a slide-over or split view below the chart
  • This is a major differentiator vs Grafana (which requires separate datasources) and ties into OneUptime's all-in-one advantage
  • No competitor, including Perses, has this level of built-in cross-signal correlation

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx (add click handler)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/CorrelationPanel.tsx (new - shows correlated logs/traces)

2.4 Annotations / Event Overlays

Current: No event markers on charts. Target: Show deployment events, incidents, and alerts as vertical markers on time series charts.

Implementation:

  • Query OneUptime's own data for events in the chart's time range:
    • Incidents (from Incident model)
    • Deployments (can be sent as OTLP resource attributes or a custom event API)
    • Alert triggers (from monitor alert history)
  • Render as vertical dashed lines with icons on hover
  • Color-code by type: red for incidents, blue for deployments, yellow for alerts
  • Allow users to add manual annotations (text + timestamp)

Files to modify:

  • Common/Types/Dashboard/DashboardAnnotation.ts (new)
  • App/FeatureSet/Dashboard/src/Components/Metrics/MetricGraph.tsx (render annotation markers)
  • Common/Server/API/DashboardAnnotationAPI.ts (new - query events)

2.5 Alert Integration

Current: No connection between dashboards and alerting. Target: Create alerts from dashboard panels and display alert state on panels.

Implementation:

  • "Create Alert" button in chart settings that pre-fills a metric monitor with the chart's query
  • Show alert state indicator on chart headers (green/yellow/red dot) based on associated monitor status
  • Alert status widget: shows a summary of all active alerts with severity and duration

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Dashboard/Canvas/ComponentSettingsSideOver.tsx (add "Create Alert" button)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx (show alert state)
  • Common/Types/Dashboard/DashboardComponentType.ts (add AlertStatus widget type)

2.6 SLO/SLI Widget

Current: No SLO visualization. Target: Dedicated widget showing SLO status, error budget burn rate, and remaining budget.

Implementation (depends on Metrics roadmap Phase 3.2 - SLO/SLI Tracking):

  • New DashboardComponentType.SLO widget type
  • Configuration: select an SLO definition
  • Displays: current attainment (%), target (%), error budget remaining (%), burn rate chart
  • Color-coded: green (healthy), yellow (burning fast), red (budget exhausted)

Files to modify:

  • Common/Types/Dashboard/DashboardComponentType.ts (add SLO)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardSLOComponent.tsx (new)

Phase 3: Collaboration & Sharing (P1) — Production Workflows

3.1 Public/Shared Dashboards

Current: Dashboards require login. Target: Share dashboards with external stakeholders without requiring authentication.

Implementation:

  • Add isPublic flag and publicAccessToken to Dashboard model
  • Generate a shareable URL with token: /public/dashboard/{token}
  • Public view is read-only with no editing controls
  • Option to restrict public access to specific IP ranges
  • Render without the OneUptime navigation chrome

Files to modify:

  • Common/Models/DatabaseModels/Dashboard.ts (add isPublic, publicAccessToken)
  • App/FeatureSet/Dashboard/src/Pages/Public/Dashboard.tsx (new - public dashboard view)

3.2 JSON Import/Export with Perses & Grafana Compatibility

Current: No import/export capability. Target: Export dashboards as JSON and re-import for backup, migration, and dashboard-as-code. Support importing Perses and Grafana dashboard formats.

Implementation:

  • Native export: Serialize dashboardViewConfig + metadata (name, description, variables) as a JSON file download. Include a schema version for forward compatibility.
  • Perses-compatible export: Alongside native format, output a Perses-spec-compatible JSON. This gives users interoperability with the CNCF ecosystem without coupling our internals. Map our QueryPlugin kinds to Perses panel plugin types where possible.
  • Grafana import: Perses already has tooling to convert Grafana dashboards to Perses format. By supporting Perses import, we get Grafana migration for free: Grafana → Perses → OneUptime.
  • Import pipeline: Upload a JSON file → detect format (native, Perses, Grafana) → translate to DashboardViewConfig → validate → create dashboard.
  • Handle version compatibility (include a schema version in the export)

Files to modify:

  • App/FeatureSet/Dashboard/src/Pages/Dashboards/Dashboards.tsx (add import button)
  • App/FeatureSet/Dashboard/src/Pages/Dashboards/View/Settings.tsx (add export button)
  • Common/Server/API/DashboardImportExportAPI.ts (new)
  • Common/Server/Utils/Dashboard/PersesConverter.ts (new - bidirectional Perses format conversion)
  • Common/Server/Utils/Dashboard/GrafanaConverter.ts (new - Grafana JSON to native format)

3.3 Dashboard Versioning

Current: No change history. Target: Track changes to dashboards over time with the ability to view history and revert.

Implementation:

  • Create DashboardVersion model: dashboardId, version number, config snapshot, changedBy, timestamp
  • On each save, create a new version entry
  • UI: "Version History" in settings showing a list of versions with timestamps and authors
  • "Restore" button to revert to a previous version
  • Optional: diff view comparing two versions

Files to modify:

  • Common/Models/DatabaseModels/DashboardVersion.ts (new)
  • Common/Server/Services/DashboardService.ts (create version on save)
  • App/FeatureSet/Dashboard/src/Pages/Dashboards/View/VersionHistory.tsx (new)

3.4 Row/Section Grouping with Decoupled Layout

Current: Components placed freely with no grouping. Panel definitions and grid positions are mixed together in each component. Target: Collapsible rows/sections for organizing related panels, with layout decoupled from panel definitions.

Implementation:

  • Decouple layout from panels (pattern from Perses): Separate panel definitions (what to render) from layout definitions (where to render it). Panels are stored in a panels map keyed by ID. Layouts reference panels by $ref. This makes it easier to rearrange panels without modifying their query/display config.
  • Add a "Section" component type that acts as a collapsible container
  • Section has a title bar that can be clicked to collapse/expand
  • When collapsed, hides all components within the section's vertical range
  • Sections can be nested one level deep
  • Migration: existing DashboardViewConfig components are automatically split into panel + layout entries on first load

Files to modify:

  • Common/Types/Dashboard/DashboardViewConfig.ts (add panels map + layouts array, deprecate inline component positions)
  • Common/Types/Dashboard/DashboardComponentType.ts (add Section)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardSectionComponent.tsx (new)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/Canvas/Index.tsx (handle section collapse, resolve panel refs)

3.5 TV/Kiosk Mode

Current: Full-screen only. Target: Dedicated kiosk mode optimized for wall-mounted monitors with auto-cycling.

Implementation:

  • Kiosk mode: hides all chrome (toolbar, navigation, URL bar), shows only the dashboard grid
  • Auto-cycle: rotate through a list of dashboards at a configurable interval (30s, 1m, 5m)
  • Dashboard playlist: define an ordered list of dashboards to cycle through
  • Support per-dashboard display duration

Files to modify:

  • App/FeatureSet/Dashboard/src/Pages/Dashboards/Kiosk.tsx (new - kiosk view)
  • Common/Models/DatabaseModels/DashboardPlaylist.ts (new - playlist model)

3.6 CSV Export

Current: No data export. Target: Export chart/table data as CSV for offline analysis.

Implementation:

  • Add "Export CSV" option in chart/table context menu
  • Client-side: serialize the current rendered data to CSV format
  • Include column headers, timestamps, and values
  • Trigger browser download

Files to modify:

  • App/FeatureSet/Dashboard/src/Components/Dashboard/Components/DashboardChartComponent.tsx (add export option)
  • Common/Utils/Dashboard/CSVExport.ts (new - CSV serialization)

3.7 Custom Time Range per Widget

Current: All widgets share the global time range. Target: Individual widgets can override the global time range.

Implementation:

  • Add optional timeRangeOverride to each component's config
  • When set, the widget uses its own time range instead of the global one
  • Show a small clock icon on widgets with custom time ranges
  • Configuration in the component settings side panel

Files to modify:

  • Common/Utils/Dashboard/Components/DashboardBaseComponent.ts (add timeRangeOverride)
  • App/FeatureSet/Dashboard/src/Components/Dashboard/DashboardView.tsx (pass per-widget time ranges)

Phase 4: Differentiation (P2-P3) — Surpass Competition

4.1 AI-Powered Dashboard Creation

Current: Manual dashboard creation only. Target: Natural language dashboard creation - "Show me CPU usage by service for the last 24 hours" auto-creates the right widget.

Implementation:

  • Natural language input in the "Add Widget" dialog
  • AI translates to: metric name, aggregation, group by, chart type, time range
  • Uses available MetricType metadata to match metric names
  • Preview the generated widget before adding to dashboard
  • This is a feature NO competitor has done well yet - major differentiator

4.2 Pre-Built Dashboard Templates

Current: No templates. Target: One-click dashboard templates for common stacks.

Implementation:

  • Template library: Node.js, Python, Go, Java, Kubernetes, PostgreSQL, Redis, Nginx, MongoDB, etc.
  • Auto-detect relevant templates based on ingested telemetry data
  • "One-click create" instantiates a full dashboard from the template
  • Community template sharing (future)

4.3 Auto-Generated Dashboards

Current: Users must manually build dashboards. Target: When a service connects, auto-generate a relevant dashboard.

Implementation:

  • On first telemetry ingest from a new service, analyze the metric names and types
  • Auto-create a service dashboard with relevant charts based on detected metrics
  • Include golden signals (latency, traffic, errors, saturation) where applicable
  • Notify the user and link to the auto-generated dashboard

4.4 Customer-Facing Dashboards on Status Pages

Current: Status pages and dashboards are separate. Target: Embed dashboard widgets on status pages for real-time performance visibility.

Implementation:

  • Allow selecting specific dashboard widgets to embed on a status page
  • Render widgets in read-only mode without internal navigation
  • Respect public/private data boundaries (only show metrics the customer should see)
  • This is unique to OneUptime - no competitor has integrated observability dashboards with status pages

4.5 Dashboard-as-Code SDK (Perses-Compatible)

Current: No programmatic dashboard creation. Target: TypeScript SDK for defining dashboards as code, with optional Perses-compatible output.

Implementation:

const dashboard = new Dashboard("Service Health")
  .addVariable("service", { type: "query", query: "SELECT DISTINCT service FROM MetricItem" })
  .addRow("Latency")
    .addChart({ metric: "http.server.duration", aggregation: "p99", groupBy: ["$service"] })
    .addChart({ metric: "http.server.duration", aggregation: "p50", groupBy: ["$service"] })
  .addRow("Throughput")
    .addChart({ metric: "http.server.request.count", aggregation: "rate", groupBy: ["$service"] })

// Output native OneUptime format
dashboard.toJSON();

// Output Perses-compatible format for ecosystem interop
dashboard.toPerses();
  • SDK generates the JSON config and uses the Dashboard API to create/update
  • Git-based provisioning: store dashboard definitions in repo, CI/CD syncs to OneUptime
  • toPerses() output allows users to share dashboard definitions with teams using Perses or other CNCF-compatible tools
  • Perses's CUE SDK patterns can inform our builder API design

4.6 Anomaly Detection Overlays

Current: No anomaly visualization. Target: AI highlights anomalous data points on charts without manual threshold configuration.

Implementation (depends on Metrics roadmap Phase 3.1 - Anomaly Detection):

  • Automatically overlay expected range bands (baseline +/- N sigma) on metric charts
  • Highlight data points outside the expected range with color indicators
  • Click an anomaly to see correlated changes across metrics, logs, and traces

4.7 Terraform / OpenTofu Provider

Current: No infrastructure-as-code support for dashboards. Target: Manage dashboards via Terraform/OpenTofu for GitOps workflows.

Implementation:

  • Expose dashboard CRUD via a well-documented REST API (already exists)
  • Build a Terraform provider that maps dashboard resources to the API
  • Support oneuptime_dashboard, oneuptime_dashboard_variable, and oneuptime_dashboard_template resources
  • This complements the Dashboard-as-Code SDK (4.5) — SDK for developers, Terraform for ops teams

Quick Wins (Can Ship This Week) All Done

  1. Auto-refresh - Add a simple setInterval refresh with dropdown selector in toolbar
  2. Full markdown for text widget - Replace custom formatting with a markdown renderer
  3. Legend show/hide - Add click handler on legend items to toggle series
  4. Stacked area chart - Simple extension of existing line chart with fill
  5. Chart zoom - Enable brush selection on time series charts

Phase 0: Architecture Foundation

  1. Phase 1.9 - QueryPlugin interface (enables everything else; do this first)

Phase 1: Core Features (Complete — remaining items are partial refinements)

  1. Quick Wins - Auto-refresh, markdown, legend toggle, stacked area, chart zoom
  2. Phase 1.1 - More chart types (Area, Table, Gauge) (Pie, Heatmap, Histogram rendering deferred)
  3. Phase 1.2 - Template variables with scoping
  4. Phase 1.4 - Multiple queries per chart (Formulas deferred)
  5. Phase 1.6 - Threshold lines & color coding (Value widget done; chart reference lines deferred)

Phase 2: Platform Leverage (Differentiators)

  1. Phase 2.1 - Log stream widget (leverages all-in-one platform + QueryPlugin)
  2. Phase 2.2 - Trace list widget (leverages all-in-one platform + QueryPlugin)
  3. Phase 2.3 - Click-to-correlate (major differentiator — no competitor has this built-in)
  4. Phase 2.4 - Annotations / event overlays
  5. Phase 2.5 - Alert integration

Phase 3: Collaboration

  1. Phase 3.1 - Public/shared dashboards
  2. Phase 3.2 - JSON import/export with Perses & Grafana compatibility
  3. Phase 3.4 - Row/section grouping with decoupled layout
  4. Phase 3.5 - TV/Kiosk mode
  5. Phase 3.3 - Dashboard versioning
  6. Phase 2.6 - SLO widget (depends on SLO/SLI from Metrics roadmap)

Phase 4: Differentiation

  1. Phase 4.2 - Pre-built dashboard templates
  2. Phase 4.3 - Auto-generated dashboards
  3. Phase 4.1 - AI-powered dashboard creation
  4. Phase 4.4 - Customer-facing dashboards on status pages
  5. Phase 4.5 - Dashboard-as-code SDK (Perses-compatible)
  6. Phase 4.7 - Terraform / OpenTofu provider
  7. Phase 4.6 - Anomaly detection overlays

Verification

For each feature:

  1. Unit tests for new widget types, template variable resolution, CSV export logic, QueryPlugin dispatching
  2. Integration tests for new API endpoints (annotations, public dashboards, import/export, Perses/Grafana conversion)
  3. Manual verification via the dev server at https://oneuptimedev.genosyn.com/dashboard/{projectId}/dashboards
  4. Visual regression testing for new chart types (ensure correct rendering across browsers)
  5. Performance testing: verify dashboards with 20+ widgets and auto-refresh don't cause excessive API load
  6. Test template variables with edge cases: empty results, special characters, multi-value selections
  7. Verify public dashboards don't leak private data
  8. Test Perses/Grafana import with real-world dashboard exports to validate conversion fidelity
  9. Test QueryPlugin interface with mixed query types (metric + log + trace) on a single dashboard