Critical Issue: signal: killed during 60GB Database Restore on Databasus #162

Closed
opened 2026-04-05 16:15:52 +02:00 by MrUnknownDE · 0 comments
Owner

Originally created by @WendelSTi on 2/4/2026

I’m reaching out because we are facing a persistent signal: killed error when attempting to restore a large database (~60GB) using the Databasus container.

The Problem: During the restore process, the logs return: mysql failed: signal: killed – stderr:. The restore for the UUID 549956b1-40f3-44c6-bfae-a2b4d4b31ab6 enters a loop, constantly returning HTTP 400 after the failure.

Procedures already performed:

Memory Allocation: I have already raised the memory limits to over 20GB on both the host server (S2SVHL046) and the database container.

Resource Monitoring: docker inspect shows OOMKilled: false, meaning the Docker daemon itself isn't killing the container for exceeding its hard limit.

System logs (dmesg, syslog) do not show a global Out-of-Memory (OOM) event.

valkey-server and postgres are running stable within the same container, but the restore sub-process is being terminated.

Process Verification: I’ve checked the resources using docker top. The main Go process (./main) remains active, but the child process responsible for the restore is the one receiving the SIGKILL.
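One caveat on the checks above: `docker inspect` generally reports `OOMKilled: true` only when the container's main process was the victim, so a child process can still be reaped by the cgroup memory controller without that flag flipping. A sketch of how to check (assuming cgroup v2; the container name `databasus` and the captured counter values are placeholders):

```shell
# Sketch: surface cgroup-level OOM kills that docker inspect may not report.
# On a live system the counters would come from:
#   docker exec databasus cat /sys/fs/cgroup/memory.events
# ("databasus" is a placeholder container name.)

# parse_oom_kills reads memory.events-formatted text on stdin and prints
# the oom_kill counter (0 if the line is absent).
parse_oom_kills() {
    awk '$1 == "oom_kill" { n = $2 } END { print n + 0 }'
}

# Example with a captured memory.events snippet (illustrative values):
parse_oom_kills <<'EOF'
low 0
high 12
max 340
oom 3
oom_kill 3
EOF
# prints 3
```

A nonzero `oom_kill` here alongside `OOMKilled: false` would confirm the restore child was killed by the kernel inside the cgroup rather than by Docker.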

Observations: Since the database file is 60GB and the RAM is ~20GB, it seems the restore process is trying to allocate a large buffer or temporary memory table that exceeds the allowed per-process limit (ulimit) or causes a memory spike faster than the monitoring tools can capture.
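If the restore path does buffer the whole dump (or very large statements) in memory, one mitigation would be to split the dump on line boundaries and stream it piece by piece, so peak memory tracks the chunk size rather than the file size. A minimal sketch (the file names, chunk size, and restore command are all placeholders, not anything Databasus documents):

```shell
# Sketch: stream a large SQL dump in bounded chunks instead of one buffer.
# Assumes one statement per line (mysqldump's extended-insert format),
# since line-based splitting must not cut a statement in half.
set -eu

# stream_in_chunks FILE CMD...: split FILE on line boundaries and pipe each
# piece through CMD, so memory use is bounded by the chunk size.
stream_in_chunks() {
    file=$1; shift
    workdir=$(mktemp -d)
    split -l 100000 "$file" "$workdir/chunk_"   # tune lines-per-chunk to taste
    for chunk in "$workdir"/chunk_*; do
        "$@" < "$chunk"
    done
    rm -rf "$workdir"
}

# Real use might look like (the stock mysql client does accept this flag;
# whether the Databasus binary passes anything similar is an open question):
#   stream_in_chunks dump.sql mysql --max-allowed-packet=256M mydb
```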

Request: Could you check how the application handles large dump files? Specifically:

Does it try to load large chunks into memory?

Are there any work_mem or max_allowed_packet configurations inside the start.sh or the Go binary that we should tune?

Is there a way to prevent the 400 error loop that saturates the logs after the first failure?
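On the last point, until the application handles this itself, a client-side workaround for the 400 loop could be to wrap the restore trigger in exponential backoff so a hard failure does not turn into a tight retry loop. A sketch with a placeholder command:

```shell
# Sketch: exponential backoff around a restore attempt so a hard failure
# does not saturate the logs with back-to-back retries.
# backoff_retry MAX CMD...: run CMD until it succeeds or MAX attempts pass,
# doubling the sleep between attempts.
backoff_retry() {
    max=$1; shift
    delay=1
    attempt=1
    while ! "$@"; do
        [ "$attempt" -ge "$max" ] && return 1
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Usage (the endpoint is a placeholder, not a documented Databasus API):
#   backoff_retry 5 curl -fsS -X POST http://localhost/api/restore/...
```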

Looking forward to your feedback.

Best regards,

Reference: github/databasus#162