mirror of
https://github.com/cloudpanel-io/cloudpanel-ce.git
synced 2026-04-05 20:31:58 +02:00
MySQL Service Daily Crashes #349
Originally created by @JulianPrieber on 8/25/2023
CloudPanel version(s) affected
v2.3.1-latest
maybe v2.3.0
Description
For the past two weeks, our server has been plagued by a recurring issue where the MySQL service crashes on a daily basis, necessitating manual restarts and causing significant periods of downtime. Multiple users on the Discord server have reported experiencing the same problem, indicating that this is not an isolated incident. Despite exhaustive efforts to mitigate the issue, such as increasing RAM, adjusting swap settings, and modifying the vm.overcommit_memory parameter, the problem persists. Interestingly, the only temporary workaround has been to set vm.overcommit_memory to 2, which prevents MySQL crashes but introduces complications as other systems contend with RAM limitations.
Our server setup is characterized by two nearly identical systems. However, the problematic behavior is isolated to one system, while the other continues to operate without disruptions. Both servers were established within the last three weeks, initially on CloudPanel version 2.3.1 and subsequently upgraded to 2.3.2. The afflicted server utilizes ARMx64 architecture, boasts 8GB of RAM, 4 cores, and runs Ubuntu 22 alongside MariaDB 10.11.5. We've been long-time users of CloudPanel since its version 1 release without encountering such issues. Our server is home to various PHP applications, including WordPress, and the root cause appears to be intertwined with Redis failures during database dumping, ultimately leading to memory cache overflow.
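For anyone comparing the two servers, it may help to record how close each one runs to the kernel's commit limit, since with vm.overcommit_memory set to 2 the kernel refuses allocations once Committed_AS reaches CommitLimit. The following is only a rough sketch, not part of CloudPanel or MariaDB: the helper names parse_meminfo and commit_headroom_kb are made up for illustration, and it assumes a Linux host exposing /proc/meminfo and /proc/sys/vm/overcommit_memory.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: number} (most fields are in kB)."""
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(meminfo: dict[str, int]) -> int:
    """Return CommitLimit - Committed_AS in kB; a negative value means overcommitted."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


if __name__ == "__main__":
    mode_path = Path("/proc/sys/vm/overcommit_memory")
    meminfo_path = Path("/proc/meminfo")
    # Only attempt the live read on systems that expose these files (assumption: Linux).
    if mode_path.exists() and meminfo_path.exists():
        mode = mode_path.read_text().strip()
        info = parse_meminfo(meminfo_path.read_text())
        print(f"vm.overcommit_memory = {mode}")
        print(f"commit headroom: {commit_headroom_kb(info)} kB")
```

Running this on both the affected and the healthy server at the same time of day would show whether the affected one sits much closer to (or beyond) its commit limit.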
Attempts to reproduce the issue on a cloned system have proved unsuccessful, pointing to the possibility that traffic or specific database activity could be triggering the problem. Notably, during these crashes, no unusual spikes in incoming or outgoing traffic are detected. In fact, the servers tend to remain in an idle state with only sporadic requests.
Additional Notes:
Given the severe impact of this issue on system reliability and uptime, immediate and focused attention is imperative to identify the underlying cause and implement a sustainable solution. The sporadic nature of the crashes and their exclusive occurrence on one system, despite its similarity to another stable system, suggest that external factors or specific configurations could be influencing this behavior. Notably, the fact that another user has reported a similar issue on a system with 32GB of RAM indicates that the problem likely transcends a simple memory limitation.
How to reproduce
The method to reproduce this issue is currently unknown. Assistance with troubleshooting and investigation is greatly appreciated.
Possible Solution
No response
Additional Context
Log snippet:
Failure around 4:26 PM:

Note: The system experiences high CPU utilization and continuous disk writing until both memory and swap resources are fully exhausted.
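To capture how quickly memory and swap fill up before a crash, a small sampler can log both at a fixed interval. This is a minimal sketch under assumptions: the function names usage_percent and sample are hypothetical, the one-minute interval is arbitrary, and it relies on the MemTotal, MemAvailable, SwapTotal, and SwapFree fields of /proc/meminfo on Linux.

```python
import time
from pathlib import Path


def usage_percent(meminfo_text: str) -> tuple[float, float]:
    """Return (ram_used_pct, swap_used_pct) computed from /proc/meminfo-style text."""
    kb: dict[str, int] = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            kb[name.strip()] = int(parts[0])
    # RAM in use is approximated as total minus what the kernel reports as available.
    ram_used = 100.0 * (kb["MemTotal"] - kb["MemAvailable"]) / kb["MemTotal"]
    swap_total = kb.get("SwapTotal", 0)
    if swap_total:
        swap_used = 100.0 * (swap_total - kb.get("SwapFree", 0)) / swap_total
    else:
        swap_used = 0.0  # no swap configured
    return ram_used, swap_used


def sample(interval_s: float = 60, count: int = 3,
           source: Path = Path("/proc/meminfo")) -> None:
    """Print `count` timestamped samples, one every `interval_s` seconds."""
    for _ in range(count):
        ram, swap = usage_percent(source.read_text())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} ram={ram:.1f}% swap={swap:.1f}%")
        time.sleep(interval_s)
```

Calling sample() from a long-running shell session or a service unit and redirecting the output to a file would give a timeline to compare against the MariaDB error log.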
Updates:
Our troubleshooting efforts have been ongoing. In order to address the persisting issue, we took steps to investigate potential culprits based on the error logs. Specifically, we proceeded to disable two prominent processes: Redis and Varnish cache.
However, despite the deactivation of these processes, the problem persists, leading us to conclude that neither Redis nor Varnish cache is responsible for the issue at hand. Further investigation is required to identify the root cause of the problem.
We logged failures for the last few days and found no apparent pattern. The failures seem to be getting more frequent and more sporadic, whereas earlier failures seemed to occur only at night between 3 and 5 AM (with the daily backup scheduled at 2 AM).