28 Commits

Author SHA1 Message Date
Nawaz Dhandala
f49b1995df feat(telemetry): add new Telemetry service (OTel, Syslog, Fluent, Metrics, Traces) and unified ingestion pipeline
- Add Telemetry service entrypoint
  - Telemetry/Index.ts: app bootstrap, routes mounting, infrastructure init and Telemetry SDK init.

- Unified queue + worker
  - Telemetry/Jobs/TelemetryIngest/ProcessTelemetry.ts: single worker that dispatches queued jobs to specific processors (logs, traces, metrics, syslog, fluent logs).
  - Telemetry/Services/Queue/TelemetryQueueService.ts: central queue API and job payload types.
  - Per-type Queue wrappers (LogsQueueService, MetricsQueueService, TracesQueueService, FluentLogsQueueService, SyslogQueueService).

- OpenTelemetry ingestion middleware and proto support
  - Telemetry/Middleware/OtelRequestMiddleware.ts: detect OTLP endpoint (logs/traces/metrics), decode protobuf bodies using protobufjs and set product type.
  - Telemetry/ProtoFiles/OTel/v1/*.proto: include common.proto, logs.proto, metrics.proto, resource.proto, traces.proto for OTLP v1 messages.

- Ingest services
  - Telemetry/Services/OtelLogsIngestService.ts: parse incoming OTLP logs, map attributes, convert timestamps, batch insert logs.
  - Telemetry/Services/OtelTracesIngestService.ts: parse OTLP traces, build span rows, extract exceptions, batch insert spans and exceptions, save telemetry exception summary.
  - Telemetry/Services/OtelMetricsIngestService.ts: parse OTLP metrics, normalize datapoints, batch insert metrics and index metric name -> service map.
  - Telemetry/Services/SyslogIngestService.ts: syslog ingestion endpoints, parser integration, map syslog fields to attributes and logs.
  - Telemetry/Services/FluentLogsIngestService.ts: ingest Fluentd style logs, normalize entries and insert into log backend.
  - Telemetry/Services/OtelIngestBaseService.ts: helpers to resolve service name from attributes/headers.

- Syslog parser and utilities
  - Telemetry/Utils/SyslogParser.ts: robust RFC5424 and RFC3164 parser, structured data extraction and sanitization.
  - Telemetry/Tests/Utils/SyslogParser.test.ts: unit tests for parser behavior.

- Telemetry exception utilities
  - Telemetry/Utils/Exception.ts: generate exception fingerprint and upsert telemetry exception status (saveOrUpdateTelemetryException).

- Queue & job integration
  - New integration with Common/Server/Infrastructure/Queue and QueueWorker, job id generation and telemetry job types.
  - Telemetry services add ingestion jobs instead of processing synchronously.

- Config, build and dev tooling
  - Add Telemetry/package.json, package-lock.json, tsconfig.json, nodemon.json, jest config.
  - New script configs and dependencies (protobufjs, ts-node, jest, nodemon, etc).

- Docker / environment updates
  - docker-compose.base.yml, docker-compose.dev.yml, docker-compose.yml: rename service from open-telemetry-ingest -> telemetry and wire TELEMETRY_* envs.
  - config.example.env: rename and consolidate environment variables (OPEN_TELEMETRY_* -> TELEMETRY_*, update hostnames and ports).
  - Tests/Scripts/status-check.sh: update ready-check target to telemetry/status/ready.

- Other
  - Telemetry/Services/Queue/*: export helpers and legacy-compatible job interface shims.
  - Memory cleanup and batching safeguards across ingest services.
  - Logging and capture spans added to key code paths.

BREAKING CHANGES / MIGRATION NOTES:
- Environment variables and docker service names changed:
  - Replace OPEN_TELEMETRY_... vars with TELEMETRY_... (PORT, HOSTNAME, CONCURRENCY, DISABLE_TELEMETRY, etc).
  - docker-compose entries moved from "open-telemetry-ingest" to "telemetry" and image name changed to oneuptime/telemetry.
  - Update any deployment automation and monitoring checks referencing the old service name or endpoints.
- Consumers: OTLP endpoints and behavior remain supported, but ingestion is now queued and processed asynchronously.

Testing / Running:
- Install deps in Telemetry/ (npm install) after syncing Common workspace.
- Run dev: npx nodemon (nodemon.json) or build & start using provided scripts.
- Run tests with jest (Telemetry test suite includes SyslogParser unit tests).

Files added/modified (high level):
- Added many files under Telemetry/: Index, Jobs, Middleware, ProtoFiles, Services, Utils, Tests, package and config artifacts.
- Modified docker-compose.* and config.example.env and status check script to use new TELEMETRY service/vars.
2025-11-07 21:36:47 +00:00
Simon Larsen
a80b7ba88c chore(fluent-ingest): migrate fluent log ingest into open-telemetry-ingest and remove legacy fluent-ingest service
- Move Fluent/Fluent Bit logs ingestion into open-telemetry-ingest:
  - Add OpenTelemetryIngest/API/Fluent.ts (routes for /fluentd and queue endpoints)
  - Add Queue service, job worker and processor:
    - OpenTelemetryIngest/Services/Queue/FluentLogsQueueService.ts
    - OpenTelemetryIngest/Jobs/TelemetryIngest/ProcessFluentLogs.ts
  - Register Fluent API and job processing in OpenTelemetryIngest/Index.ts
  - Introduce QueueName.FluentLogs and related queue usage

- Remove legacy FluentIngest service and configuration:
  - Delete fluent-ingest docker-compose/dev/base entries and docker-compose.yml service
  - Remove fluent-ingest related helm values, KEDA scaledobject, ingress host and schema entries
  - Remove FLUENTD_HOST env/values and replace FLUENT_INGEST_HOSTNAME -> FLUENT_LOGS_HOSTNAME (pointing to open-telemetry-ingest)
  - Update config.example.env keys (FLUENT_LOGS_CONCURRENCY, DISABLE_TELEMETRY_FOR_FLUENT_LOGS)
  - Remove FluentIngestRoute and FLUENT_INGEST_URL/hostname usages from UI config/templates
  - Remove VSCode launch debug config for Fluent Ingest
  - Remove Fluent ingest E2E status check entry in Tests/Scripts/status-check.sh
  - Update docs/architecture diagram and Helm templates to reflect "FluentLogs" / Fluent Bit flow

- Misc:
  - Remove FLUENTD_HOST environment injection from docker-compose.base.yml
  - Cleanup related values.schema.json and values.yaml entries

This consolidates log ingestion under the OpenTelemetry ingest service and removes the separate FluentIngest service and its configuration.
2025-11-07 19:37:31 +00:00
Simon Larsen
325fa0eb7a Add SERVER_OPEN_TELEMETRY_INGEST_HOSTNAME to Helm template and update tag replacement in change-release-to-test-tag script 2024-11-22 10:23:56 +00:00
Simon Larsen
815ae7161d Rename Ingestor to ProbeIngest; update configurations, routes, and Docker support; add new request types and workflows 2024-11-21 17:18:22 +00:00
Simon Larsen
3a1f5c7120 Refactor OpenTelemetry Ingest Dockerfile and configuration; update environment variables and docker-compose for new service integration 2024-11-21 17:08:35 +00:00
Simon Larsen
945cef653c Add Incoming Request Ingest service with configuration, Docker support, and tests 2024-11-21 14:41:37 +00:00
Jack Veney
17a0b65a4b Update endpoint-status.sh
Included the -L option in the curl command, ensuring that it will follow any 301 redirects until the final URL is reached.
2024-06-20 15:24:27 -04:00
Simon Larsen
a66a04456b refactor: Update endpoint URLs for status check script 2024-06-13 21:44:49 +01:00
Simon Larsen
f42291a428 refactor: Update status check URLs for Dashboard, Status Page, and Accounts
The status-check.sh script has been modified to update the endpoint URLs for checking the status of the Dashboard, Status Page, and Accounts services. The URLs now include "/status/ready" at the end, ensuring that the services are ready to handle requests. This change improves the accuracy and reliability of the status check script.
2024-06-13 21:05:37 +01:00
Simon Larsen
8325c06ca3 refactor: Update status check URLs for Dashboard and Status Page
The status-check.sh script has been modified to update the endpoint URLs for checking the status of the Dashboard and Status Page services. The URLs now include "/status/ready" at the end, ensuring that the services are ready to handle requests. This change improves the accuracy and reliability of the status check script.
2024-06-12 17:27:22 +01:00
Simon Larsen
b66f56bc12 refactor: Update endpoint status check URLs to include "/ready"
The status-check.sh script has been updated to modify the endpoint URLs for checking the status of various services. The URLs now include "/ready" at the end, indicating that the services are ready to handle requests. This change ensures that the status check accurately reflects the readiness of the services and improves the reliability of the script.
2024-06-12 17:25:08 +01:00
Simon Larsen
597aeb74f4 add e2e to test release 2024-06-09 19:29:23 +01:00
Simon Larsen
a24bf077ce refactor: Improve error message in endpoint-status.sh
This code change updates the endpoint-status.sh script to improve the error message when the endpoint returns an HTTP status other than 200. The previous message mentioned that it usually takes a few minutes for the app to boot, which is not accurate. The updated message removes this misleading information and provides a more accurate description of the retry behavior.
2024-06-09 14:49:48 +01:00
Simon Larsen
1be827741e Update E2E Config.ts to use E2E_TEST_STATUS_PAGE_URL instead of E2E_TEST_REGISTERED_USER_PASSWORD 2024-04-27 16:55:22 +01:00
Simon Larsen
580fd97030 Update retry interval for endpoint status script 2024-01-25 11:32:16 +00:00
Simon Larsen
224943206e Remove unnecessary endpoint status checks 2024-01-25 11:30:48 +00:00
Simon Larsen
04dcb80124 Refactor endpoint status checks in status-check.sh script 2023-12-29 14:06:46 +00:00
Simon Larsen
4d2afa7cf4 Add new endpoint status checks 2023-12-28 16:51:21 +00:00
Simon Larsen
284752631e Remove unnecessary endpoint status checks 2023-12-25 13:30:46 +00:00
Simon Larsen
526df139b1 Update Dockerfile and launch configurations 2023-12-11 19:37:18 +00:00
Simon Larsen
6804e94850 add ingestor status check 2023-10-16 12:55:54 +01:00
Jordan Jones
515b8ba94c chore(tests): sneak in the tiny misspelling 2023-10-01 08:29:47 -07:00
Simon Larsen
c06c0f8b38 fix helm test 2023-10-01 09:13:44 +00:00
Simon Larsen
0d09047454 add endpoint telemetry 2023-09-29 14:26:09 +01:00
Simon Larsen
8a3b893521 add status check script 2023-09-29 14:24:59 +01:00
Simon Larsen
36860e6ee9 fix status check scirpt 2023-09-29 14:16:09 +01:00
Simon Larsen
8863c6a209 add chmod to scripts 2023-09-29 13:52:56 +01:00
Simon Larsen
3714c2c91a add test container 2023-09-28 11:16:50 +00:00