mirror of
https://github.com/LogicLabs-OU/OpenArchiver.git
synced 2026-04-06 00:31:57 +02:00
Previously, attachment deduplication was handled globally by enforcing a unique constraint on the content hash (`contentHashSha256`) in the `attachments` table. This caused an issue where an attachment from one ingestion source would be incorrectly linked if the same attachment was processed by a different source.

This commit refactors the deduplication logic to be scoped on a per-ingestion-source basis.

Changes:

- **Schema:** The `attachments` table schema has been updated to include a nullable `ingestionSourceId` column. A composite unique index has been added on `(ingestionSourceId, contentHashSha256)` to enforce per-source uniqueness. The `ingestionSourceId` is nullable to ensure backward compatibility with existing databases.
- **Ingestion Logic:** The `IngestionService` has been updated to provide the `ingestionSourceId` when inserting attachment records. The `onConflictDoUpdate` clause now targets the new composite key, ensuring that attachments are only considered duplicates if they have the same hash and originate from the same ingestion source.
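The per-source scoping described in this commit message can be illustrated with a small in-memory model. This is only a sketch, not the project's actual schema or `IngestionService` code: `AttachmentStore`, `AttachmentRow`, `upsert`, and the `filename` field are hypothetical names introduced for illustration, and a `Map` keyed on the pair `(ingestionSourceId, contentHashSha256)` stands in for the composite unique index and the upsert that targets it.

```typescript
// Hypothetical in-memory stand-in for the `attachments` table (assumption,
// not the real schema). Only the two key columns come from the commit message.
interface AttachmentRow {
  id: number;
  ingestionSourceId: string | null; // nullable, as described in the commit message
  contentHashSha256: string;
  filename: string; // illustrative extra column
}

class AttachmentStore {
  private rows = new Map<string, AttachmentRow>();
  private nextId = 1;

  // Builds a key that mirrors the composite unique index
  // on (ingestionSourceId, contentHashSha256).
  private static key(sourceId: string, hash: string): string {
    return JSON.stringify([sourceId, hash]);
  }

  // Insert-or-update: a row counts as a duplicate only when BOTH the source
  // and the hash match, which is the behaviour the commit describes.
  upsert(sourceId: string, hash: string, filename: string): AttachmentRow {
    const k = AttachmentStore.key(sourceId, hash);
    const existing = this.rows.get(k);
    if (existing) {
      existing.filename = filename; // conflict on the composite key: update in place
      return existing;
    }
    const row: AttachmentRow = {
      id: this.nextId++,
      ingestionSourceId: sourceId,
      contentHashSha256: hash,
      filename,
    };
    this.rows.set(k, row);
    return row;
  }

  get size(): number {
    return this.rows.size;
  }
}

const store = new AttachmentStore();
// Same content hash arriving from two different ingestion sources.
const fromA = store.upsert("source-a", "abc123", "report.pdf");
const fromB = store.upsert("source-b", "abc123", "report.pdf");
// Same hash again from the first source: treated as a duplicate.
const fromAAgain = store.upsert("source-a", "abc123", "report-v2.pdf");
```

With a global constraint on the hash alone, `fromB` would have resolved to the row owned by `source-a`. Keying on the pair keeps the two sources separate while still collapsing the repeated insert from `source-a` into a single row.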
31 lines
240 B
Plaintext
# Node
node_modules
dist
.env
.env.*
!/.env.example

# Meilisearch
**/meili_data/

# PNPM
pnpm-debug.log

# IDE
.vscode
.idea

# OS
.DS_Store

# Dev
.dev

# Vitepress
docs/.vitepress/dist
docs/.vitepress/cache

# TS
**/tsconfig.tsbuildinfo