Exception Product Improvements Plan
Overview
This document outlines the improvements to make OneUptime's exception/error tracking product more competitive with tools like Sentry, while working within the constraints of OpenTelemetry (no custom SDK development).
Phase 1: Quick Wins (Foundation)
1.1 Add Release & Environment Tracking
Goal: Track the release version and environment in which each exception occurs
Changes Required:
- ExceptionInstance Model (`Common/Models/AnalyticsModels/ExceptionInstance.ts`)
  - Add `release` column (string) - from `service.version` resource attribute
  - Add `environment` column (string) - from `deployment.environment` resource attribute
  - Add `serviceVersion` column (string) - alternative naming
- TelemetryException Model (`Common/Models/DatabaseModels/TelemetryException.ts`)
  - Add corresponding columns for aggregated data
- OtelTracesIngestService (`Telemetry/Services/OtelTracesIngestService.ts`)
  - Extract `service.version` from resource attributes during ingestion
  - Extract `deployment.environment` from resource attributes
  - Populate new columns
- UI Updates
  - Add release/environment filters to ExceptionsTable
  - Display release info in ExceptionDetail
  - Add release comparison view
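
The extraction step in the ingest service could look roughly like the sketch below. It is not the actual `OtelTracesIngestService` code: the `ResourceAttributes` shape and the `extractReleaseInfo` helper are hypothetical, and the fallback to `deployment.environment.name` is an assumption based on newer OpenTelemetry semantic conventions.

```typescript
// Hypothetical shape: resource attributes flattened into a key/value record.
type ResourceAttributes = Record<string, string | number | boolean | undefined>;

interface ReleaseInfo {
  release?: string;
  environment?: string;
}

// Returns the attribute as a trimmed string, or undefined when absent or blank.
function readString(attrs: ResourceAttributes, key: string): string | undefined {
  const value = attrs[key];
  if (value === undefined) {
    return undefined;
  }
  const text = String(value).trim();
  return text.length > 0 ? text : undefined;
}

function extractReleaseInfo(attrs: ResourceAttributes): ReleaseInfo {
  return {
    release: readString(attrs, "service.version"),
    // Assumption: fall back to the newer attribute name when the older one is missing.
    environment:
      readString(attrs, "deployment.environment") ??
      readString(attrs, "deployment.environment.name"),
  };
}
```

Leaving both fields optional means exceptions from services that do not set these attributes still ingest; the UI filters would then need an "unknown" bucket.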
1.2 Parse Stack Traces Into Structured Frames
Goal: Transform raw stack trace strings into structured, queryable frames
Changes Required:
- Create Stack Trace Parser (`Telemetry/Utils/StackTraceParser.ts`)

  ```typescript
  interface StackFrame {
    functionName: string;
    fileName: string;
    lineNumber: number;
    columnNumber?: number;
    inApp: boolean; // true if user code, false if library
    context?: {
      pre: string[]; // lines before
      line: string; // the line
      post: string[]; // lines after
    };
  }

  interface ParsedStackTrace {
    frames: StackFrame[];
    raw: string; // original string
  }
  ```

- Support Multiple Languages:
  - JavaScript/Node.js: `at functionName (file:line:col)`
  - Python: `File "path", line N, in function`
  - Java: `at package.Class.method(File.java:line)`
  - Go: `package/file.go:line +0xNN`
  - Ruby: `file:line:in 'method'`
- ExceptionInstance Model Updates:
  - Add `parsedFrames` column (JSON array)
  - Keep `stackTrace` for raw string
- UI Component (`Dashboard/src/Components/Exceptions/StackFrameViewer.tsx`)
  - Expandable frame cards
  - Highlight app frames vs library frames
  - Show code context when available
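
As a starting point, a parser for two of the formats listed above (JavaScript/Node.js and Python) might look like the following sketch. The regular expressions and the `isAppFile` heuristic are assumptions for illustration, not the planned `StackTraceParser.ts`; Java, Go, and Ruby would need their own patterns.

```typescript
interface StackFrame {
  functionName: string;
  fileName: string;
  lineNumber: number;
  columnNumber?: number;
  inApp: boolean;
}

interface ParsedStackTrace {
  frames: StackFrame[];
  raw: string;
}

// Node.js frames: "at fn (file:line:col)", or "at file:line:col" for anonymous calls.
const NODE_FRAME = /^\s*at\s+(?:(.+?)\s+\()?(.+?):(\d+):(\d+)\)?\s*$/;
// Python frames: File "path", line N, in function
const PYTHON_FRAME = /^\s*File\s+"(.+?)",\s+line\s+(\d+),\s+in\s+(.+?)\s*$/;

// Assumed heuristic: dependency directories and Node.js built-ins count as library code.
function isAppFile(fileName: string): boolean {
  return (
    fileName.indexOf("node_modules") === -1 &&
    fileName.indexOf("site-packages") === -1 &&
    fileName.indexOf("node:") !== 0
  );
}

function parseStackTrace(raw: string): ParsedStackTrace {
  const frames: StackFrame[] = [];
  raw.split(/\r?\n/).forEach((line: string) => {
    const node = NODE_FRAME.exec(line);
    if (node) {
      frames.push({
        functionName: node[1] || "<anonymous>",
        fileName: node[2],
        lineNumber: parseInt(node[3], 10),
        columnNumber: parseInt(node[4], 10),
        inApp: isAppFile(node[2]),
      });
      return;
    }
    const py = PYTHON_FRAME.exec(line);
    if (py) {
      frames.push({
        functionName: py[3],
        fileName: py[1],
        lineNumber: parseInt(py[2], 10),
        inApp: isAppFile(py[1]),
      });
    }
    // Lines matching neither pattern (such as the error message) are skipped.
  });
  return { frames, raw };
}
```

Keeping `raw` alongside `frames` means a trace in an unsupported format still renders as plain text instead of disappearing.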
1.3 Enhanced Breadcrumb/Events Timeline
Goal: Show events leading up to an exception
Changes Required:
- Extract Breadcrumbs from Span Events
  - Span events already captured, need better UI
  - Filter events to show last N events before exception
  - Categorize by type (http, db, console, user-action)
- UI Component (`Dashboard/src/Components/Exceptions/BreadcrumbTimeline.tsx`)
  - Timeline visualization
  - Color-coded by category
  - Relative timestamps ("2s before error")
  - Expandable details
- Integration Points:
  - Add to ExceptionExplorer
  - Add to SpanViewer exception tab
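
The filtering, categorization, and relative timestamps described above can be sketched as follows. The `SpanEvent` shape is a simplified, hypothetical stand-in for the real model, and the categorization rules (attribute prefixes `http.` and `db.`, event name prefixes `console.` and `ui.`) are assumptions that would need tuning against real instrumentation.

```typescript
type BreadcrumbCategory = "http" | "db" | "console" | "user-action" | "other";

// Hypothetical, simplified span event shape.
interface SpanEvent {
  name: string;
  timeUnixMs: number;
  attributes: Record<string, string | number | boolean>;
}

interface Breadcrumb {
  category: BreadcrumbCategory;
  name: string;
  relativeLabel: string;
}

// Assumed heuristics: attribute namespaces first, then event name prefixes.
function categorize(event: SpanEvent): BreadcrumbCategory {
  const keys = Object.keys(event.attributes);
  if (keys.some((k) => k.indexOf("http.") === 0)) return "http";
  if (keys.some((k) => k.indexOf("db.") === 0)) return "db";
  if (event.name.indexOf("console.") === 0) return "console";
  if (event.name.indexOf("ui.") === 0) return "user-action";
  return "other";
}

function formatRelative(deltaMs: number): string {
  if (deltaMs === 0) return "at error";
  if (deltaMs < 1000) return `${deltaMs}ms before error`;
  return `${Math.round(deltaMs / 1000)}s before error`;
}

// Returns the last `limit` events at or before the exception, oldest first.
function buildBreadcrumbs(
  events: SpanEvent[],
  exceptionTimeMs: number,
  limit: number
): Breadcrumb[] {
  if (limit <= 0) return [];
  return events
    .filter((e) => e.timeUnixMs <= exceptionTimeMs)
    .sort((a, b) => a.timeUnixMs - b.timeUnixMs)
    .slice(-limit)
    .map((e) => ({
      category: categorize(e),
      name: e.name,
      relativeLabel: formatRelative(exceptionTimeMs - e.timeUnixMs),
    }));
}
```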
Phase 2: Source Maps & Code Context
2.1 Source Map Upload Infrastructure
Goal: Allow users to upload source maps for JavaScript/TypeScript unmapping
Changes Required:
- New Database Model (`Common/Models/DatabaseModels/SourceMap.ts`)
  - projectId: ObjectID
  - serviceId: ObjectID
  - release: string
  - fileName: string (e.g., "main.js")
  - sourceMapContent: Text (the .map file contents)
  - uploadedAt: Date
  - uploadedByUserId: ObjectID
- API Endpoints (`App/FeatureSet/Telemetry/API/SourceMap.ts`)
  - POST `/source-maps/upload` - Upload source map files
  - GET `/source-maps/:serviceId/:release` - List source maps
  - DELETE `/source-maps/:id` - Delete source map
- UI for Upload (`Dashboard/src/Pages/Telemetry/Services/View/SourceMaps.tsx`)
  - Drag & drop upload
  - List uploaded maps by release
  - Delete old maps
- CLI Integration
  - Command to upload source maps during CI/CD

  ```bash
  oneuptime sourcemaps upload --release v1.2.3 --files ./dist/*.map
  ```
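
To make the model and the three endpoints concrete, here is an in-memory sketch of the operations they would back. `InMemorySourceMapStore` and `isSourceMapV3` are hypothetical names; IDs are plain strings here rather than `ObjectID`, and replacing an earlier upload of the same file for the same release is an assumed policy, not something the plan specifies.

```typescript
interface SourceMapRecord {
  id: string;
  projectId: string;
  serviceId: string;
  release: string;
  fileName: string;
  sourceMapContent: string;
  uploadedAt: Date;
  uploadedByUserId: string;
}

type NewSourceMap = Omit<SourceMapRecord, "id" | "uploadedAt">;

// Basic sanity check: valid JSON, version 3, with a string "mappings" field.
function isSourceMapV3(content: string): boolean {
  try {
    const parsed: unknown = JSON.parse(content);
    if (typeof parsed !== "object" || parsed === null) return false;
    const map = parsed as { version?: unknown; mappings?: unknown };
    return map.version === 3 && typeof map.mappings === "string";
  } catch {
    return false;
  }
}

class InMemorySourceMapStore {
  private records: SourceMapRecord[] = [];
  private nextId = 1;

  // Corresponds to POST /source-maps/upload
  upload(input: NewSourceMap): SourceMapRecord {
    if (!isSourceMapV3(input.sourceMapContent)) {
      throw new Error("Not a version 3 source map");
    }
    // Assumed policy: a re-upload replaces the earlier map for the same file and release.
    this.records = this.records.filter(
      (r) =>
        !(
          r.serviceId === input.serviceId &&
          r.release === input.release &&
          r.fileName === input.fileName
        )
    );
    const record: SourceMapRecord = {
      ...input,
      id: String(this.nextId++),
      uploadedAt: new Date(),
    };
    this.records.push(record);
    return record;
  }

  // Corresponds to GET /source-maps/:serviceId/:release
  list(serviceId: string, release: string): SourceMapRecord[] {
    return this.records.filter(
      (r) => r.serviceId === serviceId && r.release === release
    );
  }

  // Corresponds to DELETE /source-maps/:id; returns false when nothing matched.
  delete(id: string): boolean {
    const before = this.records.length;
    this.records = this.records.filter((r) => r.id !== id);
    return this.records.length < before;
  }
}
```

Validating on upload keeps malformed files out of storage, so failures surface in CI/CD rather than later when someone opens an exception.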
2.2 Stack Trace Unmapping
Goal: Resolve minified stack traces to original source
Changes Required:
- Source Map Resolver Service (`Telemetry/Services/SourceMapService.ts`)
  - Load source map for service + release
  - Use `source-map` npm package for resolution
  - Cache resolved mappings
- Integration with Stack Frame Display
  - On-demand resolution when viewing exception
  - Cache resolved frames
  - Show both original and mapped positions
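
The caching layer for on-demand resolution could be sketched as below. The actual lookup is injected as a `Resolver` function so the example stays self-contained; in the real service that function would presumably wrap the `source-map` package. `ResolvedFrameCache`, `Position`, and `Resolver` are hypothetical names.

```typescript
interface Position {
  fileName: string;
  line: number;
  column: number;
}

// Hypothetical resolver signature: maps a generated (minified) position to the
// original position, or null when no mapping exists.
type Resolver = (
  serviceId: string,
  release: string,
  generated: Position
) => Promise<Position | null>;

class ResolvedFrameCache {
  private cache = new Map<string, Promise<Position | null>>();
  private resolver: Resolver;

  constructor(resolver: Resolver) {
    this.resolver = resolver;
  }

  resolve(
    serviceId: string,
    release: string,
    generated: Position
  ): Promise<Position | null> {
    const key = [
      serviceId,
      release,
      generated.fileName,
      generated.line,
      generated.column,
    ].join("|");
    let hit = this.cache.get(key);
    if (hit === undefined) {
      // Cache the promise itself so concurrent lookups share one resolution.
      hit = this.resolver(serviceId, release, generated).catch((err: unknown) => {
        // Evict failures so a later request can retry.
        this.cache.delete(key);
        throw err;
      });
      this.cache.set(key, hit);
    }
    return hit;
  }
}
```

This sketch has no eviction beyond failures; a production cache would likely need a size or time bound, since each release adds new keys.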
2.3 Code Context Display
Goal: Show source code around each stack frame
Changes Required:
- Source Code Storage (Optional - requires repo integration)
  - Store code snippets with source maps
  - Or fetch from connected Git repository
- Fallback: Manual Context
  - Allow source maps to include `sourcesContent`
  - Display inline in stack frame viewer
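
When a source map embeds `sourcesContent`, the `pre`/`line`/`post` context for a frame can be read straight from it. The sketch below assumes the map has already been parsed into an object and that the frame has already been resolved to an original file and 1-based line number; `getCodeContext` and its `radius` parameter are hypothetical.

```typescript
// Only the source map fields needed for context lookup.
interface RawSourceMap {
  sources: string[];
  sourcesContent?: (string | null)[];
}

interface FrameContext {
  pre: string[]; // lines before
  line: string; // the line
  post: string[]; // lines after
}

function getCodeContext(
  map: RawSourceMap,
  sourceFile: string,
  lineNumber: number, // 1-based line in the original source
  radius: number = 2
): FrameContext | undefined {
  const index = map.sources.indexOf(sourceFile);
  const contents = map.sourcesContent;
  if (index < 0 || contents === undefined) {
    return undefined;
  }
  // Entries may be null when a source was not embedded.
  const content = contents[index];
  if (typeof content !== "string") {
    return undefined;
  }
  const lines = content.split(/\r?\n/);
  if (lineNumber < 1 || lineNumber > lines.length) {
    return undefined;
  }
  const i = lineNumber - 1;
  return {
    pre: lines.slice(Math.max(0, i - radius), i),
    line: lines[i],
    post: lines.slice(i + 1, i + 1 + radius),
  };
}
```

Returning `undefined` instead of throwing lets the stack frame viewer fall back to showing the frame without context.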
Phase 3: Analytics & Intelligence
3.1 Affected Users Tracking
Goal: Track how many unique users are affected by each exception
Changes Required:
- Extract User Info from Attributes
  - Look for `user.id`, `enduser.id`, `user_id` in span/exception attributes
  - Store unique user count per exception fingerprint
- Model Updates
  - Add `affectedUserIds` (array) to ExceptionInstance
  - Add `affectedUserCount` to TelemetryException
- UI Updates
  - Show "X users affected" in exception list
  - User impact chart over time
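
A minimal sketch of the user extraction and per-fingerprint counting follows. The attribute keys come from the list above and are checked in that order, which is an assumed priority; `SpanAttributes`, `ExceptionOccurrence`, and both function names are hypothetical.

```typescript
type SpanAttributes = Record<string, string | number | boolean | undefined>;

// Keys checked in order; the first non-empty value wins (assumed priority).
const USER_ID_KEYS: string[] = ["user.id", "enduser.id", "user_id"];

function extractUserId(attrs: SpanAttributes): string | undefined {
  for (let i = 0; i < USER_ID_KEYS.length; i++) {
    const value = attrs[USER_ID_KEYS[i]];
    if (value !== undefined && String(value).length > 0) {
      return String(value);
    }
  }
  return undefined;
}

// Hypothetical minimal view of one exception occurrence.
interface ExceptionOccurrence {
  fingerprint: string;
  attributes: SpanAttributes;
}

// Returns fingerprint -> number of distinct user IDs seen.
// Fingerprints with no identifiable user are absent from the result.
function countAffectedUsers(
  occurrences: ExceptionOccurrence[]
): Map<string, number> {
  const usersByFingerprint = new Map<string, Set<string>>();
  occurrences.forEach((occurrence) => {
    const userId = extractUserId(occurrence.attributes);
    if (userId === undefined) {
      return;
    }
    const users =
      usersByFingerprint.get(occurrence.fingerprint) || new Set<string>();
    users.add(userId);
    usersByFingerprint.set(occurrence.fingerprint, users);
  });
  const counts = new Map<string, number>();
  usersByFingerprint.forEach((users, fingerprint) => {
    counts.set(fingerprint, users.size);
  });
  return counts;
}
```

Holding full ID sets in memory works for a sketch; at ingestion scale the count would more likely come from a distinct-count query in the analytics store.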
3.2 Release Comparison
Goal: Compare exceptions across releases
Changes Required:
- Queries for Comparison
  - Exceptions in release A but not B
  - Exceptions fixed in release B
  - New exceptions introduced in release B
- UI Component (`Dashboard/src/Pages/Telemetry/Exceptions/ReleaseComparison.tsx`)
  - Side-by-side release comparison
  - Regression detection
  - Fixed exceptions list
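
Once the set of exception fingerprints seen in each release is known, the comparison queries reduce to set differences, as in this sketch. `compareReleases` and `ReleaseDiff` are hypothetical names. Note that "fixed" here only means "not observed in release B"; an exception can be absent simply because the code path was not exercised.

```typescript
interface ReleaseDiff {
  introduced: string[]; // in B but not in A
  fixed: string[]; // in A but not in B
  persisting: string[]; // in both
}

// Compares the exception fingerprints seen in release A (older) and B (newer).
function compareReleases(
  fingerprintsA: string[],
  fingerprintsB: string[]
): ReleaseDiff {
  const inA = new Set<string>(fingerprintsA);
  const inB = new Set<string>(fingerprintsB);
  return {
    introduced: Array.from(inB).filter((fp) => !inA.has(fp)),
    fixed: Array.from(inA).filter((fp) => !inB.has(fp)),
    persisting: Array.from(inA).filter((fp) => inB.has(fp)),
  };
}
```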
3.3 Error Spike Detection
Goal: Alert when exception rate increases significantly
Changes Required:
- Background Job - Similar pattern to existing cron jobs
  - Calculate baseline exception rate
  - Detect anomalies (>2 standard deviations)
  - Trigger alerts
- Integration with Alerts
  - New alert type: Exception Rate Alert
  - Configurable thresholds
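
The anomaly check above can be sketched as a comparison of the current interval's count against the baseline mean plus a configurable number of standard deviations. `detectSpike` and its parameters are hypothetical, and using the population standard deviation over equal-length intervals is an assumption.

```typescript
interface SpikeResult {
  mean: number;
  stdDev: number;
  threshold: number;
  isSpike: boolean;
}

// baselineCounts: exception counts for past intervals of equal length.
// currentCount: exception count for the interval being evaluated.
// sigmas: how many standard deviations above the mean counts as a spike.
function detectSpike(
  baselineCounts: number[],
  currentCount: number,
  sigmas: number = 2
): SpikeResult {
  if (baselineCounts.length === 0) {
    throw new Error("Baseline requires at least one interval");
  }
  const n = baselineCounts.length;
  const mean = baselineCounts.reduce((sum, c) => sum + c, 0) / n;
  // Population variance: average squared distance from the mean.
  const variance =
    baselineCounts.reduce((sum, c) => sum + (c - mean) * (c - mean), 0) / n;
  const stdDev = Math.sqrt(variance);
  const threshold = mean + sigmas * stdDev;
  return { mean, stdDev, threshold, isSpike: currentCount > threshold };
}
```

With a perfectly flat baseline the standard deviation is zero, so any increase counts as a spike; a minimum absolute threshold is one way to avoid alerting on a change from 5 to 6 exceptions.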
Phase 4: UI/UX Enhancements
4.1 Rich Stack Frame Viewer
Mockup:
┌─────────────────────────────────────────────────────────────┐
│ TypeError: Cannot read property 'id' of undefined │
├─────────────────────────────────────────────────────────────┤
│ ▼ getUser user.service.ts:42 [APP] │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ 40 │ async function getUser(id: string) { │ │
│ │ 41 │ const user = await db.findById(id); │ │
│ │▸ 42 │ return user.id; // Error here │ │
│ │ 43 │ } │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ▶ handleRequest api.controller.ts:128 [APP] │
│ ▶ processMiddleware middleware.ts:56 [APP] │
│ ▶ Layer.handle express/router.ts:174 [LIB] │
│ ▶ next express/router.ts:123 [LIB] │
└─────────────────────────────────────────────────────────────┘
4.2 Breadcrumb Timeline
Mockup:
┌─────────────────────────────────────────────────────────────┐
│ BREADCRUMBS Last 30s │
├─────────────────────────────────────────────────────────────┤
│ ● HTTP GET /api/users/123 200 -28s │
│ ● DB SELECT * FROM users OK -25s │
│ ● HTTP GET /api/orders?user=123 200 -20s │
│ ● DB SELECT * FROM orders OK -18s │
│ ● LOG Processing order #456 info -15s │
│ ○ WARN Rate limit approaching warn -10s │
│ ● HTTP POST /api/checkout 500 -5s │
│ ✖ ERROR TypeError: Cannot read... error 0s │
└─────────────────────────────────────────────────────────────┘
4.3 Exception List Enhancements
- Full-text search on stack traces
- Saved filters/views
- Bulk operations
- Quick preview on hover
Implementation Priority
| Priority | Feature | Effort | Impact |
|---|---|---|---|
| P0 | Release/Environment tracking | Low | High |
| P0 | Parse stack traces | Medium | High |
| P1 | Breadcrumb timeline UI | Low | Medium |
| P1 | Rich stack frame viewer | Medium | High |
| P2 | Source map upload | Medium | High |
| P2 | Stack trace unmapping | Medium | High |
| P2 | Affected users count | Low | Medium |
| P3 | Release comparison | Medium | Medium |
| P3 | Error spike detection | Medium | Medium |
| P3 | Code context display | High | Medium |
Files to Modify
Models
- `Common/Models/AnalyticsModels/ExceptionInstance.ts`
- `Common/Models/DatabaseModels/TelemetryException.ts`
- `Common/Models/DatabaseModels/SourceMap.ts` (new)
Services
- `Telemetry/Services/OtelTracesIngestService.ts`
- `Telemetry/Utils/StackTraceParser.ts` (new)
- `Telemetry/Services/SourceMapService.ts` (new)
- `Common/Server/Services/ExceptionInstanceService.ts`
UI Components
- `Dashboard/src/Components/Exceptions/ExceptionDetail.tsx`
- `Dashboard/src/Components/Exceptions/StackFrameViewer.tsx` (new)
- `Dashboard/src/Components/Exceptions/BreadcrumbTimeline.tsx` (new)
- `Dashboard/src/Components/Exceptions/ExceptionExplorer.tsx`
- `Dashboard/src/Components/Span/SpanViewer.tsx`
API
- `App/FeatureSet/Telemetry/API/SourceMap.ts` (new)
What This Plan Does NOT Include
These require custom SDK development (outside OpenTelemetry):
- Local variables capture - Requires language-specific debugger hooks
- Session replay - Requires browser SDK instrumentation
- Automatic breadcrumbs - Requires SDK-level instrumentation
Success Metrics
- Stack traces displayed as parsed frames (not raw text)
- Release/environment visible on all exceptions
- Source maps can be uploaded and used for unmapping
- Breadcrumb timeline shows events before exception
- Users can compare exceptions across releases