17 Commits

Author SHA1 Message Date
Wei S.
d372ef7566 Feat: Tika Integration and Batch Indexing (#132)
* Feat/tika integration (#94)

* feat(Tika) Integration von Tika zur Textextraktion

* feat(Tika) Integration of Apache Tika for text extraction

* feat(Tika): Complete Tika integration with text extraction and docker-compose setup

- Add Tika service to docker-compose.yml
- Implement text sanitization and document validation
- Improve batch processing with concurrency control

* fix(comments) translated comments into english
fix(docker) removed ports (only used for testing)

* feat(indexing): Implement batch indexing for Meilisearch

This change introduces batch processing for indexing emails into Meilisearch to significantly improve performance and throughput during ingestion. This change is based on the batch processing method previously contributed by @axeldunkel.

Previously, each email was indexed individually, resulting in a high number of separate API calls. This approach was inefficient, especially for large mailboxes.

The `processMailbox` queue worker now accumulates emails into a batch before sending them to the `IndexingService`. The service then uses the `addDocuments` Meilisearch API endpoint to index the entire batch in a single request, reducing network overhead and improving indexing speed.

A new environment variable, `MEILI_INDEXING_BATCH`, has been added to make the batch size configurable, with a default of 500.

Additionally, this commit includes minor refactoring:
- The `TikaService` has been moved to its own dedicated file.
- The `PendingEmail` type has been moved to the shared `@open-archiver/types` package.

* chore(jobs): make continuous sync job scheduling idempotent

Adds a static `jobId` to the repeatable 'schedule-continuous-sync' job.

This prevents duplicate jobs from being scheduled if the server restarts. By providing a unique ID, the queue will update the existing repeatable job instead of creating a new one, ensuring the sync runs only at the configured frequency.

---------

Co-authored-by: axeldunkel <53174090+axeldunkel@users.noreply.github.com>
Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-26 11:34:32 +02:00
Wei S.
4b11cd931a Docs: update rate limiting docs (#91)
* Adding rate limiting docs

* update rate limiting docs

* Resolve conflict

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-06 17:56:34 +03:00
Wei S.
22b173cbe4 Feat: Implement API key authentication (#84)
* feat(auth): Implement API key authentication

This commit enables API access with an API key system. This change provides a better experience for programmatic access and third-party integrations.

Key changes include:
- **API Key Management:** Users can now generate, manage, and revoke persistent API keys through a new "API Keys" section in the settings UI.
- **Authentication Middleware:** API requests are now authenticated via an `X-API-KEY` header instead of the previous `Authorization: Bearer` token.
- **Backend Implementation:** Adds a new `api_keys` database table, along with corresponding services, controllers, and routes to manage the key lifecycle securely.
- **Rate Limiting:** The API rate limiter now uses the API key to identify and track requests.
- **Documentation:** The API authentication documentation has been updated to reflect the new method.

* Add configurable API rate limiting

Two new variables are added to `.env.example`:
- `RATE_LIMIT_WINDOW_MS`: The time window in milliseconds for which requests are checked (defaults to 15 minutes).
- `RATE_LIMIT_MAX_REQUESTS`: The maximum number of requests allowed from an IP within the window (defaults to 100).

The installation documentation has been updated to reflect these new configuration options.

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-04 15:07:53 +03:00
Wei S.
774b0d7a6b Bug fix: Status API response: needsSetup and Remove SUPER_API_KEY support (#83)
* Disable system settings for demo mode

* Status API response: needsSetup

* Remove SUPER_API_KEY support

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-03 16:30:06 +03:00
Wayne
82a83a71e4 BODY_SIZE_LIMIT fix, database url encode 2025-08-13 21:55:22 +03:00
Wayne
b03791d9a6 adding FRONTEND_BODY_SIZE_LIMIT to allow bigger file upload for the frontend. This is to fix the pst file upload error. 2025-08-13 19:20:19 +03:00
Wayne
4872ed597f PST ingestion 2025-08-07 17:03:08 +03:00
Wayne
3201fbfe0b Email thread improvement, user-defined sync frequency 2025-08-05 21:12:06 +03:00
Wayne
5217d24184 Docker Compose deployment 2025-07-25 16:29:09 +03:00
Wayne
8c12cda370 Docker Compose deployment 2025-07-25 15:50:25 +03:00
Wayne
946da7925b Docker deployment 2025-07-24 23:43:38 +03:00
Wayne
9b25c8b9d3 Rename: Open Archiver 2025-07-15 00:48:40 +03:00
Wayne
7ed8d78d73 Ingestion service credentials encryption. Ingestion auth handling 2025-07-11 17:39:21 +03:00
Wayne
59e5c5c69b Pluggable storage service (local + S3 compatible) 2025-07-11 16:21:40 +03:00
Wayne
3eb155ee16 Database migration. Adding ingestion service 2025-07-11 13:36:34 +03:00
Wayne
cc08f35ada Admin user login support 2025-07-10 22:32:12 +03:00
Wayne
f243775ae6 scaffolding 2025-07-10 13:32:54 +03:00