18 Commits

Author SHA1 Message Date
Wei S.
0c42b30c9e V0.5.1 dev (#341)
* OpenAPI root url fix

* Journaling OSS setup

* feat: add preserve-original-file mode for email ingestion for GoBD compliance

- Add `preserveOriginalFile` option to ingestion sources and connectors
- Stream original EML/MBOX/PST emails to temp files instead of holding
  full buffers in memory, reducing memory allocation during ingestion
- Skip attachment binary extraction and EML re-serialization when
  preserve mode is enabled; use raw file on disk as source of truth
- Update `EmailObject` to use `tempFilePath` instead of in-memory `eml`
  buffer across all connectors (EML, MBOX, PST)
- Add new database migration (0032) for `preserve_original_file` column
- Add frontend UI toggle with tooltip (tippy.js) for the new option
- Replace console.warn calls with structured pino logger in connectors

* add isjournaled property to archived_email

* feat(ingestion): add unmerge ingestion source functionality

Introduces the ability to detach a child ingestion source from its
merge group, making it a standalone root source. Changes include:

- Add `unmerge` controller method with auth and error handling
- Add POST `/v1/ingestion-sources/{id}/unmerge` route with OpenAPI docs
- Implement `IngestionService.unmerge` backend logic
- Add unmerge UI action and handler in the frontend ingestion view
- Fix bulk delete to also remove children of deleted root sources
- Update docs with new API operation and merging sources user guide

* code formatting

* Database migration file for enum `partially_active`

* Error handling improvement
2026-03-30 22:29:03 +02:00
Wei S.
e5e119528f V0.5.0 release (#335)
* adding exports to backend package, page icons update

* Integrity report PDF generation

* Fixed inline attachment images not displaying in the email preview by modifying `EmailPreview.svelte`.
The email HTML references embedded images via `cid:` URIs (e.g., `src="cid:ii_19c6d5f8d5eee7bd6d91"`), but the component never resolved those `cid:` references to actual image data, even though `postal-mime` already parses inline attachments with their `contentId` and binary `content`.
The `emailHtml` derived value now calls `resolveContentIdReferences()` before rendering, so inline/embedded images display correctly in the iframe preview.

* feat: strip non-inline attachments from EML before storage

Add nodemailer dependency and emlUtils helper to remove non-inline
attachments from .eml buffers during ingestion. This avoids
double-storing attachment data since attachments are already stored
separately.

* upload error handing for file based ingestion

* Use Postgres for sync session management

* Google workspace / MS 365 duplicate check, avoid extra API call when previous ingestion fails

* OpenAPI specs for API docs

* code formatting

* ran duplicate check for IMAP import, optimize message listing

* Version update
2026-03-20 13:14:41 +01:00
Wei S.
399059a773 V0.4 fix 2 (#207)
* formatting code

* Remove uninstalled packages

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-10-28 13:39:09 +01:00
Wei S.
42b0f6e5f1 V0.4.0 fix (#204)
* Jobs page responsive fix

* feat(ingestion): Refactor email indexing into a dedicated background job

This commit refactors the email indexing process to improve the performance and reliability of the ingestion pipeline.

Previously, email indexing was performed synchronously within the mailbox processing job. This could lead to timeouts and failed ingestion cycles if the indexing step was slow or encountered errors.

To address this, the indexing logic has been moved into a separate, dedicated background job queue (`indexingQueue`). Now, the mailbox processor simply adds a batch of emails to this queue. A separate worker then processes the indexing job asynchronously.

This decoupling makes the ingestion process more robust:
- It prevents slow indexing from blocking or failing the entire mailbox sync.
- It allows for better resource management and scalability by handling indexing in a dedicated process.
- It improves error handling, as a failed indexing job can be retried independently without affecting the main ingestion flow.

Additionally, this commit includes minor documentation updates and removes a premature timeout in the PDF text extraction helper that was causing issues.

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-10-28 13:14:43 +01:00
Wei S.
6e1ebbbfd7 v0.4 init: File encryption, integrity report, deletion protection, job monitoring (#187)
* open-core setup, adding enterprise package

* enterprise: Audit log API, UI

* Audit-log docs

* feat: Integrity report, allowing users to verify the integrity of archived emails and their attachments.

- When an email is archived, Open Archiver calculates a unique cryptographic signature (a SHA256 hash) for the email's raw `.eml` file and for each of its attachments. These signatures are stored in the database alongside the email's metadata.
- The integrity check feature recalculates these signatures for the stored files and compares them to the original signatures stored in the database. This process allows you to verify that the content of your archived emails has not been altered, corrupted, or tampered with since the moment they were archived.
- Add docs of Integrity report

* Update Docker-compose.yml to use bind mount for Open Archiver data.
Fix API rate-limiter warning about trust proxy

* File encryption support

* Scope attachment deduplication to ingestion source

Previously, attachment deduplication was handled globally by enforcing a unique constraint on the content hash (contentHashSha256) in the `attachments` table. This caused an issue where an attachment from one ingestion source would be incorrectly linked if the same attachment was processed by a different source.

This commit refactors the deduplication logic to be scoped on a per-ingestion-source basis.

Changes:
-   **Schema:** The `attachments` table schema has been updated to include a nullable `ingestionSourceId` column. A composite unique index has been added on `(ingestionSourceId, contentHashSha256)` to enforce per-source uniqueness. The `ingestionSourceId` is nullable to ensure backward compatibility with existing databases.
-   **Ingestion Logic:** The `IngestionService` has been updated to provide the `ingestionSourceId` when inserting attachment records. The `onConflictDoUpdate` clause now targets the new composite key, ensuring that attachments are only considered duplicates if they have the same hash and originate from the same ingestion source.

* Scope attachment deduplication to ingestion source

Previously, attachment deduplication was handled globally by enforcing a unique constraint on the content hash (contentHashSha256) in the `attachments` table. This caused an issue where an attachment from one ingestion source would be incorrectly linked if the same attachment was processed by a different source.

This commit refactors the deduplication logic to be scoped on a per-ingestion-source basis.

Changes:
-   **Schema:** The `attachments` table schema has been updated to include a nullable `ingestionSourceId` column. A composite unique index has been added on `(ingestionSourceId, contentHashSha256)` to enforce per-source uniqueness. The `ingestionSourceId` is nullable to ensure backward compatibility with existing databases.
-   **Ingestion Logic:** The `IngestionService` has been updated to provide the `ingestionSourceId` when inserting attachment records. The `onConflictDoUpdate` clause now targets the new composite key, ensuring that attachments are only considered duplicates if they have the same hash and originate from the same ingestion source.

* Add option to disable deletions

This commit introduces a new feature that allows admins to disable the deletion of emails and ingestion sources for the entire instance. This is a critical feature for compliance and data retention, as it prevents accidental or unauthorized deletions.

Changes:
-   **Configuration**: Added an `ENABLE_DELETION` environment variable. If this variable is not set to `true`, all deletion operations will be disabled.
-   **Deletion Guard**: A centralized `checkDeletionEnabled` guard has been implemented to enforce this setting at both the controller and service levels, ensuring a robust and secure implementation.
-   **Documentation**: The installation guide has been updated to include the new `ENABLE_DELETION` environment variable and its behavior.
-   **Refactor**: The `IngestionService`'s `create` method was refactored to remove unnecessary calls to the `delete` method, simplifying the code and improving its robustness.

* Adding position for menu items

* feat(docker): Fix CORS errors

This commit fixes CORS errors when running the app in Docker by introducing the `APP_URL` environment variable. A CORS policy is set up for the backend to only allow origin from the `APP_URL`.

Key changes include:
- New `APP_URL` and `ORIGIN` environment variables have been added to properly configure CORS and the SvelteKit adapter, making the application's public URL easily configurable.
- Dockerfiles are updated to copy the entrypoint script, Drizzle config, and migration files into the final image.
- Documentation and example files (`.env.example`, `docker-compose.yml`) have been updated to reflect these changes.

* feat(attachments): De-duplicate attachment content by content hash

This commit refactors attachment handling to allow multiple emails within the same ingestion source to reference attachments with identical content (same hash).

Changes:
- The unique index on the `attachments` table has been changed to a non-unique index to permit duplicate hash/source pairs.
- The ingestion logic is updated to first check for an existing attachment with the same hash and source. If found, it reuses the existing record; otherwise, it creates a new one. This maintains storage de-duplication.
- The email deletion logic is improved to be more robust. It now correctly removes the email-attachment link before checking if the attachment record and its corresponding file can be safely deleted.

* Not filtering our Trash folder

* feat(backend): Add BullMQ dashboard for job monitoring

This commit introduces a web-based UI for monitoring and managing background jobs using Bullmq.

Key changes:
- A new `/api/v1/jobs` endpoint is created, serving the Bull Board dashboard. Access is restricted to authenticated administrators.
- All BullMQ queue definitions (`ingestion`, `indexing`, `sync-scheduler`) have been centralized into a new `packages/backend/src/jobs/queues.ts` file.
- Workers and services now import queue instances from this central file, improving code organization and removing redundant queue instantiations.

* Add `ALL_INCLUSIVE_ARCHIVE` environment variable to disable jun filtering

* Using BSL license

* frontend: Responsive design for menu bar, pagination

* License service/module

* Remove demoMode logic

* Formatting code

* Remove enterprise packages

* Fix package.json in packages

* Search page responsive fix

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-10-24 17:11:05 +02:00
Wei S.
e9a65f9672 feat: Add Mbox ingestion (#117)
This commit introduces two major features:

1.  **Mbox File Ingestion:**
    Users can now ingest emails from Mbox files (`.mbox`). A new Mbox connector has been implemented on the backend, and the user interface has been updated to support creating Mbox ingestion sources. Documentation for this new provider has also been added.

Additionally, this commit includes new documentation for upgrading and migrating Open Archiver.

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-16 20:30:22 +03:00
Wei S.
37a778cb6d chore(deps): Update dependencies across packages (#105)
This commit updates several dependencies in the frontend and backend packages.

- **Backend:**
  - Upgrades `xlsx` to version `0.20.3` by pointing to the official CDN URL. This ensures usage of the community edition with a permissive license.
  - Removes the unused `bull-board` development dependency.

- **Frontend:**
  - Upgrades `@sveltejs/kit` from `^2.16.0` to `^2.38.1` to stay current with the latest features and fixes.

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-11 22:07:35 +03:00
Wei S.
4a23f8f29f feat: Add new version notification in footer (#99)
This commit implements a system to check for new application versions and notify the user.

On page load, the server-side code now fetches the latest release from the GitHub repository API. It uses `semver` to compare the current application version with the latest release tag.

If a newer version is available, an alert is displayed in the footer with a link to the release page. The current application version is also now displayed in the footer. The version check is cached for one hour to minimize API requests.

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-09 23:36:35 +03:00
Wei S.
22b173cbe4 Feat: Implement API key authentication (#84)
* feat(auth): Implement API key authentication

This commit enables API access with an API key system. This change provides a better experience for programmatic access and third-party integrations.

Key changes include:
- **API Key Management:** Users can now generate, manage, and revoke persistent API keys through a new "API Keys" section in the settings UI.
- **Authentication Middleware:** API requests are now authenticated via an `X-API-KEY` header instead of the previous `Authorization: Bearer` token.
- **Backend Implementation:** Adds a new `api_keys` database table, along with corresponding services, controllers, and routes to manage the key lifecycle securely.
- **Rate Limiting:** The API rate limiter now uses the API key to identify and track requests.
- **Documentation:** The API authentication documentation has been updated to reflect the new method.

* Add configurable API rate limiting

Two new variables are added to `.env.example`:
- `RATE_LIMIT_WINDOW_MS`: The time window in milliseconds for which requests are checked (defaults to 15 minutes).
- `RATE_LIMIT_MAX_REQUESTS`: The maximum number of requests allowed from an IP within the window (defaults to 100).

The installation documentation has been updated to reflect these new configuration options.

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-09-04 15:07:53 +03:00
Wei S.
392f51dabc System settings: adding multi-language support for frontend (#72)
* System settings setup

* Multi-language support

* feat: Add internationalization (i18n) support to frontend

This commit introduces internationalization (i18n) to the frontend using the `sveltekit-i18n` library, allowing the user interface to be translated into multiple languages.

Key changes:
- Added translation files for 10 languages (en, de, es, fr, etc.).
- Replaced hardcoded text strings throughout the frontend components and pages with translation keys.
- Added a language selector to the system settings page, allowing administrators to set the default application language.
- Updated the backend settings API to store and expose the new language configuration.

* Adding greek translation

* feat(backend): Implement i18n for API responses

This commit introduces internationalization (i18n) to the backend API using the `i18next` library.

Hardcoded error and response messages in the API controllers have been replaced with translation keys, which are processed by the new i18next middleware. This allows for API responses to be translated into different languages.

The following dependencies were added:
- `i18next`
- `i18next-fs-backend`
- `i18next-http-middleware`

* Formatting code

* Translation revamp for frontend and backend, adding systems docs

* Docs site title

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-08-31 13:44:28 +03:00
Wei S.
8c33b63bdf feat: Role based access control (#58)
* Format checked, contributing.md update

* Middleware setup

* IAP API, create user/roles in frontend

* RBAC using CASL library

* Switch to CASL, secure search, resource-level access control

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
2025-08-21 23:45:06 +03:00
Wayne
a2ca79d3eb Fix pnpm-lock unmatch error 2025-08-15 14:23:53 +03:00
Wayne
9873228d01 Before format 2025-08-15 14:14:01 +03:00
Wayne
832e29bd92 Project prettier setup 2025-08-15 13:45:58 +03:00
Wayne
f10bf93d1b eml import support 2025-08-11 10:55:50 +03:00
Wayne
4872ed597f PST ingestion 2025-08-07 17:03:08 +03:00
Wayne
4156abcdfa Error handling, force sync, UI improvement 2025-08-04 13:24:46 +03:00
Wayne
9c5922fd31 Docs site deploy 2025-07-27 21:28:51 +03:00