Migrate to DB users, implement IAM & add PST/EML importers #307

Closed
opened 2026-04-05 16:17:21 +02:00 by MrUnknownDE · 0 comments

Originally created by @wayneshn on 8/11/2025

This PR refactors the core authentication and authorization system, replacing the static .env admin user with a database-backed user model. It also introduces a granular, AWS-style IAM policy engine for permissions.

Additionally, it expands data ingestion capabilities by adding connectors for PST and EML files.

Technical Changes:

  • Authentication Overhaul

    • The ADMIN_EMAIL and ADMIN_PASSWORD variables have been deprecated and removed from the environment configuration.
    • A new setup flow is introduced. The backend status endpoint now checks if any users exist in the database. If not, the frontend redirects to a setup page where the initial admin user is created. This operation is restricted and can only run if the users table is empty.
    • The AdminUserService has been replaced with a persistent UserService that interacts with the PostgreSQL database via Drizzle ORM.
    • New tables for users, roles, and sessions have been added to the database schema to support multi-user authentication and role-based access control. #23
  • IAM Policy Engine

    • A new IAM service allows for the creation of roles with specific permissions defined in JSON policy documents.
    • A PolicyValidator has been implemented to ensure that all policies adhere to the defined iam-definitions.ts before being saved, preventing malformed policies.
    • The system supports wildcard permissions for both actions (e.g., archive:*) and resources (e.g., ingestion-source/*).
  • PST & EML Ingestion

    • New ingestion providers for pst_import and eml_import have been added. #24 #22
    • The frontend form for creating an ingestion source now includes file upload options for .pst files and .zip archives (containing EML files).
    • A generic /upload endpoint has been created to handle file streaming with busboy, temporarily storing the file and returning its path for the ingestion job.
    • New backend connectors, PSTConnector and EMLConnector, use pst-extractor and yauzl respectively to parse the uploaded files and extract email objects.
  • Core Improvements

    • Data Model: The archived_emails table now includes path and tags columns to preserve the original folder structure and labels from the source mailbox. The ingestion connectors for IMAP, Google Workspace, and Microsoft 365 have been updated to populate this metadata.
    • IMAP Syncing: The IMAP connector now processes mailboxes in batches and includes retry logic with exponential backoff to handle mail server rate limits more gracefully.
    • Configurability: The continuous email sync frequency is now configurable via the SYNC_FREQUENCY environment variable.
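
To illustrate the wildcard permissions described under IAM Policy Engine, here is a minimal sketch of how an action/resource check could work. It is not the code from this PR: the `PolicyStatement` shape, the `matchesPattern` and `isAllowed` helpers, and the action names `archive:read` and `archive:delete` are hypothetical, and the "explicit deny wins, default deny" rule is an assumption borrowed from AWS-style evaluation. The real schema is whatever `iam-definitions.ts` defines.

```typescript
// Hypothetical statement shape; the actual schema in iam-definitions.ts may differ.
interface PolicyStatement {
  effect: "allow" | "deny";
  actions: string[];   // e.g. "archive:*"
  resources: string[]; // e.g. "ingestion-source/*"
}

// Returns true if `value` matches `pattern`, where "*" stands for any run of characters.
function matchesPattern(pattern: string, value: string): boolean {
  const escaped = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(value);
}

// Assumption: an explicit deny overrides any allow, and no matching statement means deny.
function isAllowed(
  statements: PolicyStatement[],
  action: string,
  resource: string,
): boolean {
  const applicable = statements.filter(
    (s) =>
      s.actions.some((a) => matchesPattern(a, action)) &&
      s.resources.some((r) => matchesPattern(r, resource)),
  );
  if (applicable.some((s) => s.effect === "deny")) return false;
  return applicable.some((s) => s.effect === "allow");
}

// Usage: allow every archive action on every ingestion source, except deleting source 42.
const examplePolicy: PolicyStatement[] = [
  { effect: "allow", actions: ["archive:*"], resources: ["ingestion-source/*"] },
  { effect: "deny", actions: ["archive:delete"], resources: ["ingestion-source/42"] },
];

console.log(isAllowed(examplePolicy, "archive:read", "ingestion-source/42"));   // true
console.log(isAllowed(examplePolicy, "archive:delete", "ingestion-source/42")); // false
```

Compiling each pattern to an anchored regular expression keeps the matcher small. A validator along the lines of the `PolicyValidator` mentioned above would presumably reject unknown actions or resource types before a check like this ever runs.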
Reference: github/OpenArchiver#307