mirror of
https://github.com/gyptazy/ProxLB.git
synced 2026-04-06 04:41:58 +02:00
Compare commits
28 Commits
feature/8-
...
v1.0.2
| Author | SHA1 | Date |
|---|---|---|
| | 143135f1d8 | |
| | c865829a2e | |
| | 101855b404 | |
| | 37e7a601be | |
| | 8791007e77 | |
| | 3a2c16b137 | |
| | adc476e848 | |
| | 28be8b8146 | |
| | cbaeba2046 | |
| | 61de9cb01d | |
| | 2e36d59f84 | |
| | 3f1444a19f | |
| | 86fe2487b5 | |
| | 46832ba6b2 | |
| | 4671b414b8 | |
| | 4efa9df965 | |
| | 5c6cf04ed2 | |
| | 62f82e389a | |
| | 1c7a630e39 | |
| | f2209ce1b0 | |
| | 1908c2e8d8 | |
| | 0e1774ee84 | |
| | 8bf4b1f107 | |
| | 86009ff3c2 | |
| | 3d634ef824 | |
| | e204bba54f | |
| | 0cd5bb0b3f | |
| | 2278a91cb9 | |
@@ -1,2 +1,2 @@
added:
- Add container (e.g., Docker, Podman) support. [#10 by @daanbosch]
- Add Docker/Podman support. [#10 by @daanbosch]

@@ -0,0 +1,2 @@
added:
- Add option to rebalance by assigned VM resources to avoid overprovisioning. [#16]
4
.changelogs/1.0.0/17-add-configurable-log-verbosity.yml
Normal file
@@ -0,0 +1,4 @@
added:
- Add feature to make log verbosity configurable. [#17]
changed:
- Adjusted general logging and log more details.

2
.changelogs/1.0.0/27_add_container_lxc_support.yml
Normal file
@@ -0,0 +1,2 @@
added:
- Add LXC/Container integration. [#27]

@@ -0,0 +1,2 @@
added:
- Add option_mode to rebalance by node's free resources in percent (instead of bytes). [#29]

@@ -1 +1 @@
date: TBD
date: 2024-08-01

@@ -0,0 +1,2 @@
added:
- Add option to run ProxLB only on the Proxmox master node in the cluster (reg. HA feature). [#40]

@@ -0,0 +1,2 @@
added:
- Add option to run migrations in parallel or sequentially. [#41]

2
.changelogs/1.0.2/45_fix_daemon_timer.yml
Normal file
@@ -0,0 +1,2 @@
changed:
- Fix daemon timer to use hours instead of minutes. [#45]

2
.changelogs/1.0.2/49_fix_cmake_debian_packaging.yml
Normal file

@@ -0,0 +1,2 @@
fixed:
- Fix CMake packaging for Debian package to avoid overwriting the config file. [#49]

1
.changelogs/1.0.2/release_meta.yml
Normal file

@@ -0,0 +1 @@
date: 2024-08-13

34
CHANGELOG.md
@@ -6,6 +6,40 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).


## [1.0.2] - 2024-08-13

### Added

- Add option to run migrations in parallel or sequentially. [#41]
- Add option to run ProxLB only on the Proxmox master node in the cluster (reg. HA feature). [#40]

### Changed

- Fix daemon timer to use hours instead of minutes. [#45]
- Fix CMake packaging for Debian package to avoid overwriting the config file. [#49]
- Fix wonky code style.


## [1.0.0] - 2024-08-01

### Added

- Add feature to prevent VMs from being relocated by defining a wildcard pattern. [#7]
- Add feature to make log verbosity configurable. [#17]
- Add option_mode to rebalance by node's free resources in percent (instead of bytes). [#29]
- Add option to rebalance by assigned VM resources to avoid overprovisioning. [#16]
- Add Docker/Podman support. [#10 by @daanbosch]
- Add exclude grouping feature to keep VMs separated on different nodes when rebalancing. [#4]
- Add feature to prevent VMs from being relocated by defining the 'plb_ignore_vm' tag. [#7]
- Add dry-run support to see what kind of rebalancing would be done. [#6]
- Add LXC/Container integration. [#27]
- Add include grouping feature to rebalance grouped VMs together to new nodes. [#3]

### Changed

- Adjusted general logging and log more details.


## [0.9.9] - 2024-07-06

### Added

121
CONTRIBUTING.md
Normal file

@@ -0,0 +1,121 @@
# Contributing to ProxLB (PLB)

Thank you for considering contributing to ProxLB! We appreciate your help in improving the efficiency and performance of Proxmox clusters. Below are guidelines for contributing to the project.

## Table of Contents

- [Contributing to ProxLB (PLB)](#contributing-to-proxlb-plb)
- [Table of Contents](#table-of-contents)
- [Creating an Issue](#creating-an-issue)
- [Running Linting](#running-linting)
- [Running Tests](#running-tests)
- [Add Changelogs](#add-changelogs)
- [Submitting a Pull Request](#submitting-a-pull-request)
- [Code of Conduct](#code-of-conduct)
- [Getting Help](#getting-help)

## Creating an Issue

If you encounter a bug, have a feature request, or have any suggestions, please create an issue in our GitHub repository. To create an issue:

1. **Go to the [Issues](https://github.com/gyptazy/proxlb/issues) section of the repository.**
2. **Click on the "New issue" button.**
3. **Select the appropriate issue template (Bug Report, Feature Request, or Custom Issue).**
4. **Provide a clear and descriptive title.**
5. **Fill out the necessary details in the issue template.** Provide as much detail as possible to help us understand and reproduce the issue or evaluate the feature request.
## Running Linting

Before submitting a pull request, ensure that your changes successfully pass linting. ProxLB uses [flake8](https://flake8.pycqa.org/) for linting. Follow these steps to run it locally:

1. **Install flake8 if you haven't already:**
   ```sh
   pip install flake8
   ```

2. **Run the linting:**
   ```sh
   python3 -m flake8 proxlb
   ```

Linting is also performed for each PR, so it makes sense to run it locally before pushing.
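flake8 reads its settings from a configuration file in the repository root. The snippet below is a hypothetical example of such a `.flake8` file, not the project's actual settings; check the repository for the configuration CI really uses:

```
[flake8]
max-line-length = 120
exclude = .git,__pycache__,build
```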
## Running Tests

Before submitting a pull request, ensure that your changes do not break existing functionality. ProxLB uses [pytest](https://docs.pytest.org/en/stable/) for running tests. Follow these steps to run tests locally:

1. **Install pytest if you haven't already:**
   ```sh
   pip install pytest
   ```

2. **Run the tests:**
   ```sh
   pytest
   ```

Ensure all tests pass before submitting your changes.

## Add Changelogs

ProxLB uses the [Changelog Fragments Creator](https://github.com/gyptazy/changelog-fragments-creator) to generate the overall `CHANGELOG.md` file. The changelog is generated from the files placed in the [.changelogs](https://github.com/gyptazy/ProxLB/tree/main/.changelogs/) directory. Each release is represented by a directory named after its version number, in which additional YAML files are placed and parsed by the CFC tool. Such a file looks like:

```
added:
- Add option to rebalance by assigned VM resources to avoid overprovisioning. [#16]
```

Every PR should contain such a file describing the change to ensure it is also stated in the changelog.
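Abridged to the fragments whose filenames appear in this change set, the directory layout looks like:

```
.changelogs/
├── 1.0.0/
│   ├── 17-add-configurable-log-verbosity.yml
│   ├── 27_add_container_lxc_support.yml
│   └── release_meta.yml
└── 1.0.2/
    ├── 45_fix_daemon_timer.yml
    ├── 49_fix_cmake_debian_packaging.yml
    └── release_meta.yml
```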
## Submitting a Pull Request

We welcome your contributions! Follow these steps to submit a pull request:

1. **Fork the repository to your GitHub account.**
2. **Clone your forked repository to your local machine:**
   ```sh
   git clone https://github.com/gyptazy/proxlb.git
   cd proxlb
   ```

Please prefix your PR according to its type. It might be:
* doc
* feature
* fix

It should also reference the ID of the issue to which it relates.

3. **Create a new branch for your changes:**
   ```sh
   git checkout -b feature/10-add-new-cool-stuff
   ```

4. **Make your changes and commit them with a descriptive commit message:**
   ```sh
   git add .
   git commit -m "feature: Adding new cool stuff"
   ```

5. **Push your changes to your forked repository:**
   ```sh
   git push origin feature/10-add-new-cool-stuff
   ```

6. **Create a pull request from your forked repository:**
   - Go to the original repository on GitHub.
   - Click on the "New pull request" button.
   - Select the branch you pushed your changes to and create the pull request.

Please ensure that your pull request:

- Follows the project's coding style and guidelines.
- Includes tests for any new functionality.
- Updates the documentation as necessary.

## Code of Conduct

By participating in this project, you agree to abide by our [Code of Conduct](CODE_OF_CONDUCT.md). Please read it to understand the expected behavior and responsibilities when interacting with the community.

## Getting Help

If you need help or have any questions, feel free to reach out by creating an issue or by joining our [discussion forum](https://github.com/gyptazy/proxlb/discussions). You can also refer to our [documentation](https://github.com/gyptazy/ProxLB/tree/main/docs) for more information about the project or join our [chat room](https://matrix.to/#/#proxlb:gyptazy.ch) in Matrix.

Thank you for contributing to ProxLB! Together, we can enhance the efficiency and performance of Proxmox clusters.

165
README.md
@@ -5,33 +5,49 @@
<p float="center"><img src="https://img.shields.io/github/license/gyptazy/ProxLB"/><img src="https://img.shields.io/github/contributors/gyptazy/ProxLB"/><img src="https://img.shields.io/github/last-commit/gyptazy/ProxLB/main"/><img src="https://img.shields.io/github/issues-raw/gyptazy/ProxLB"/><img src="https://img.shields.io/github/issues-pr/gyptazy/ProxLB"/></p>

## Table of Content
* Introduction
* Video of Migration
* Features
* Usage
* Dependencies
* Options
* Parameters
* Systemd
* Manuel
* Proxmox GUI Integration
* Quick Start
* Container Quick Start (Docker/Podman)
* Motivation
* References
* Packages / Container Images
* Misc
* Bugs
* Contributing
* Author(s)
## Table of Contents
- [ProxLB - (Re)Balance VM Workloads in Proxmox Clusters](#proxlb---rebalance-vm-workloads-in-proxmox-clusters)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Video of Migration](#video-of-migration)
- [Features](#features)
- [How does it work?](#how-does-it-work)
- [Usage](#usage)
- [Dependencies](#dependencies)
- [Options](#options)
- [Parameters](#parameters)
- [Balancing](#balancing)
- [General](#general)
- [By Used Memory of VMs/CTs](#by-used-memory-of-vmscts)
- [By Assigned Memory of VMs/CTs](#by-assigned-memory-of-vmscts)
- [Grouping](#grouping)
- [Include (Stay Together)](#include-stay-together)
- [Exclude (Stay Separate)](#exclude-stay-separate)
- [Ignore VMs (Tag Style)](#ignore-vms-tag-style)
- [Systemd](#systemd)
- [Manual](#manual)
- [Proxmox GUI Integration](#proxmox-gui-integration)
- [Quick Start](#quick-start)
- [Container Quick Start (Docker/Podman)](#container-quick-start-dockerpodman)
- [Logging](#logging)
- [Motivation](#motivation)
- [References](#references)
- [Downloads](#downloads)
- [Packages](#packages)
- [Repository](#repository)
- [Container Images (Docker/Podman)](#container-images-dockerpodman)
- [Misc](#misc)
- [Bugs](#bugs)
- [Contributing](#contributing)
- [Support](#support)
- [Author(s)](#authors)

## Introduction
`ProxLB` (PLB) is an advanced tool designed to enhance the efficiency and performance of Proxmox clusters by optimizing the distribution of virtual machines (VMs) across the cluster nodes by using the Proxmox API. ProxLB meticulously gathers and analyzes a comprehensive set of resource metrics from both the cluster nodes and the running VMs. These metrics include CPU usage, memory consumption, and disk utilization, specifically focusing on local disk resources.
`ProxLB` (PLB) is an advanced tool designed to enhance the efficiency and performance of Proxmox clusters by optimizing the distribution of virtual machines (VMs) or Containers (CTs) across the cluster nodes by using the Proxmox API. ProxLB meticulously gathers and analyzes a comprehensive set of resource metrics from both the cluster nodes and the running VMs. These metrics include CPU usage, memory consumption, and disk utilization, specifically focusing on local disk resources.

PLB collects resource usage data from each node in the Proxmox cluster, including CPU, (local) disk and memory utilization. Additionally, it gathers resource usage statistics from all running VMs, ensuring a granular understanding of the cluster's workload distribution.

Intelligent rebalancing is a key feature of ProxLB where it re-balances VMs based on their memory, disk or cpu usage, ensuring that no node is overburdened while others remain underutilized. The rebalancing capabilities of PLB significantly enhance cluster performance and reliability. By ensuring that resources are evenly distributed, PLB helps prevent any single node from becoming a performance bottleneck, improving the reliability and stability of the cluster. Efficient rebalancing leads to better utilization of available resources, potentially reducing the need for additional hardware investments and lowering operational costs.
Intelligent rebalancing is a key feature of ProxLB where it re-balances VMs based on their memory, disk or CPU usage, ensuring that no node is overburdened while others remain underutilized. The rebalancing capabilities of PLB significantly enhance cluster performance and reliability. By ensuring that resources are evenly distributed, PLB helps prevent any single node from becoming a performance bottleneck, improving the reliability and stability of the cluster. Efficient rebalancing leads to better utilization of available resources, potentially reducing the need for additional hardware investments and lowering operational costs.

Automated rebalancing reduces the need for manual actions, allowing operators to focus on other critical tasks, thereby increasing operational efficiency.

@@ -46,6 +62,10 @@ Automated rebalancing reduces the need for manual actions, allowing operators to
* Performing
  * Periodically
  * One-shot solution
* Types
  * Rebalance only VMs
  * Rebalance only CTs
  * Rebalance all (VMs and CTs)
* Filter
  * Exclude nodes
  * Exclude virtual machines
@@ -54,7 +74,7 @@ Automated rebalancing reduces the need for manual actions, allowing operators to
  * Exclude groups (VMs that must run on different nodes)
  * Ignore groups (VMs that should be untouched)
* Dry-run support
* Human readable output in cli
* Human readable output in CLI
* JSON output for further parsing
* Migrate VM workloads away (e.g. maintenance preparation)
* Fully based on Proxmox API
@@ -63,6 +83,11 @@ Automated rebalancing reduces the need for manual actions, allowing operators to
* Periodically (daemon)
* Proxmox Web GUI Integration (optional)

## How does it work?
ProxLB is a load-balancing system designed to optimize the distribution of virtual machines (VMs) and containers (CTs) across a cluster. It works by first gathering resource usage metrics from all nodes in the cluster through the Proxmox API. This includes detailed resource metrics for each VM and CT on every node. ProxLB then evaluates the difference between the maximum and minimum resource usage of the nodes, referred to as "Balanciness." If this difference exceeds a predefined threshold (which is configurable), the system initiates the rebalancing process.

Before starting any migrations, ProxLB validates that rebalancing actions are necessary and beneficial. Depending on the selected balancing mode — such as CPU, memory, or disk — it creates a balancing matrix. This matrix sorts the VMs by their maximum used or assigned resources, identifying the VM with the highest usage. ProxLB then places this VM on the node with the most free resources in the selected balancing type. This process runs recursively until the operator-defined Balanciness is achieved. Balancing can be defined for the used or max. assigned resources of VMs/CTs.
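The loop described above can be sketched in a few lines of Python (illustrative only; the function and variable names are invented for this sketch and are not ProxLB's actual internals, and real node metrics come from the Proxmox API):

```python
def balanciness(node_usage):
    """Spread between the busiest and the most idle node, in percentage points."""
    return max(node_usage.values()) - min(node_usage.values())

def rebalance(node_usage, vm_usage, vm_node, threshold=10):
    """Greedy sketch of the rebalancing loop: while the spread exceeds the
    configured threshold, move the largest VM from the busiest node to the
    node with the most free resources."""
    moves = []
    while balanciness(node_usage) > threshold:
        busiest = max(node_usage, key=node_usage.get)
        freest = min(node_usage, key=node_usage.get)
        candidates = [vm for vm, node in vm_node.items() if node == busiest]
        if not candidates:
            break
        vm = max(candidates, key=vm_usage.get)
        # stop if the move would not actually reduce the spread
        if node_usage[freest] + vm_usage[vm] >= node_usage[busiest]:
            break
        node_usage[busiest] -= vm_usage[vm]
        node_usage[freest] += vm_usage[vm]
        vm_node[vm] = freest
        moves.append((vm, busiest, freest))
    return moves
```

With `node01` at 70% and `node02` at 30%, moving a VM that accounts for 20 percentage points yields an even 50/50 split and a Balanciness of 0, well below the default threshold of 10.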

## Usage
Running PLB is easy and it runs almost everywhere since it just depends on `Python3` and the `proxmoxer` library. Therefore, it can run directly on a Proxmox node or on dedicated systems like Debian, RedHat, or even FreeBSD, as long as the API is reachable by the client running PLB.
@@ -80,11 +105,17 @@ The following options can be set in the `proxlb.conf` file:
| api_pass | FooBar | Password for the API. |
| verify_ssl | 1 | Validate SSL certificates (1) or ignore them (0). (default: 1) |
| method | memory | Defines the balancing method (default: memory) where you can use `memory`, `disk` or `cpu`. |
| mode | used | Rebalance by `used` resources (efficiency) or `assigned` (avoid overprovisioning) resources. (default: used) |
| mode_option | byte | Rebalance by node's resources in `bytes` or `percent`. (default: bytes) |
| type | vm | Rebalance only `vm` (virtual machines), `ct` (containers) or `all` (virtual machines & containers). (default: vm) |
| balanciness | 10 | Maximum percentage by which the lowest and highest resource consumption of nodes may differ before rebalancing starts. (default: 10) |
| parallel_migrations | 1 | Defines if migrations should be done in parallel or sequentially. (default: 1) |
| ignore_nodes | dummynode01,dummynode02,test* | Defines a comma-separated list of nodes to exclude. |
| ignore_vms | testvm01,testvm02 | Defines a comma-separated list of VMs to exclude. (`*` as suffix wildcard or tags are also supported) |
| master_only | 0 | Defines whether balancing should only be performed on the cluster master node (1) or not (0). (default: 0) |
| daemon | 1 | Run as a daemon (1) or one-shot (0). (default: 1) |
| schedule | 24 | Interval between rebalancing runs in hours. (default: 24) |
| log_verbosity | INFO | Defines the log level (default: CRITICAL) where you can use `INFO`, `WARN` or `CRITICAL`. |
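How the `daemon` and `schedule` options interact can be sketched as follows (a simplified illustration; the function and parameter names are invented for this example, and the `schedule` value is interpreted in hours, matching the fix in #45):

```python
import time

def run_proxlb(balance_once, daemon=1, schedule=24, max_runs=None, sleep=time.sleep):
    """Run the balancer once (daemon=0) or in a loop (daemon=1),
    sleeping `schedule` hours between runs."""
    runs = 0
    while True:
        balance_once()
        runs += 1
        if not daemon or (max_runs is not None and runs >= max_runs):
            return runs
        sleep(schedule * 60 * 60)  # schedule is in hours, not minutes
```

The `max_runs` parameter and the injectable `sleep` function exist only to make the sketch testable.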
An example of the configuration file looks like:
```
@@ -95,6 +126,8 @@ api_pass: FooBar
verify_ssl: 1
[balancing]
method: memory
mode: used
type: vm
# Balanciness defines how much difference may be
# between the lowest & highest resource consumption
# of nodes before rebalancing will be done.
@@ -102,9 +135,16 @@ method: memory
# Rebalancing: node01: 41% memory consumption :: node02: 52% consumption
# No rebalancing: node01: 43% memory consumption :: node02: 50% consumption
balanciness: 10
# Enable parallel migrations. If set to 0 it will wait for completed migrations
# before starting the next migration.
parallel_migrations: 1
ignore_nodes: dummynode01,dummynode02
ignore_vms: testvm01,testvm02
[service]
# The master_only option might be useful if running ProxLB on all nodes in a cluster
# but only a single one should do the balancing. The master node is obtained from the Proxmox
# HA status.
master_only: 0
daemon: 1
```

@@ -117,6 +157,25 @@ The following options and parameters are currently supported:
| -d | --dry-run | Perform a dry-run without doing any actions. | Unset |
| -j | --json | Return a JSON of the VM movement. | Unset |
### Balancing
#### General
In general, virtual machines and containers can be rebalanced and moved around nodes in the cluster. For virtual machines, this often works live without any downtime. However, this does **not** work for containers: LXC based containers will be shut down, copied and started on the new node. Also note that while live migrations usually work fluently, there are still several things to be considered. This is out of scope for ProxLB and applies in general to Proxmox and your cluster setup. You can find more details here: https://pve.proxmox.com/wiki/Migrate_to_Proxmox_VE.

#### By Used Memory of VMs/CTs
By continuously monitoring the current resource usage of VMs, ProxLB intelligently reallocates workloads to prevent any single node from becoming overloaded. This approach ensures that resources are balanced efficiently, providing consistent and optimal performance across the entire cluster at all times. To activate this balancing mode, simply set the following option in your ProxLB configuration:
```
mode: used
```

Afterwards, restart the service (if running in daemon mode) to activate this rebalancing mode.

#### By Assigned Memory of VMs/CTs
By ensuring that resources are always available for each VM, ProxLB prevents over-provisioning and maintains a balanced load across all nodes. This guarantees that users have consistent access to the resources they need. However, if the total assigned resources exceed the combined capacity of the cluster, ProxLB will issue a warning, indicating potential over-provisioning despite its best efforts to balance the load. To activate this balancing mode, simply set the following option in your ProxLB configuration:
```
mode: assigned
```

Afterwards, restart the service (if running in daemon mode) to activate this rebalancing mode.

### Grouping
#### Include (Stay Together)
@@ -125,7 +184,7 @@ The following options and parameters are currently supported:
#### Exclude (Stay Separate)
<img align="left" src="https://cdn.gyptazy.ch/images/plb-rebalancing-exclude-balance-group.jpg"/> Access the Proxmox Web UI by opening your web browser and navigating to your Proxmox VE web interface, then log in with your credentials. Navigate to the VM you want to tag by selecting it from the left-hand navigation panel. Click on the "Options" tab to view the VM's options, then select "Edit" or "Add" (depending on whether you are editing an existing tag or adding a new one). In the tag field, enter plb_exclude_ followed by your unique identifier, for example, plb_exclude_critical. Save the changes to apply the tag to the VM. Repeat these steps for each VM that should be excluded from being on the same node.

#### Ignore VMs (tag style)
#### Ignore VMs (Tag Style)
<img align="left" src="https://cdn.gyptazy.ch/images/plb-rebalancing-ignore-vm.jpg"/> In Proxmox, you can ensure that certain VMs are ignored during the rebalancing process by setting a specific tag within the Proxmox Web UI, rather than solely relying on configurations in the ProxLB config file. This can be achieved by adding the tag 'plb_ignore_vm' to the VM. Once this tag is applied, the VM will be excluded from any further rebalancing operations, simplifying the management process.

### Systemd
@@ -151,8 +210,8 @@ The executable must be able to read the config file, if no dedicated config file
The easiest way to get started is by using the ready-to-use packages that I provide on my CDN and to run it on a Linux Debian based system. This can also be one of the Proxmox nodes itself.

```
wget https://cdn.gyptazy.ch/files/amd64/debian/proxlb/proxlb_0.9.9_amd64.deb
dpkg -i proxlb_0.9.9_amd64.deb
wget https://cdn.gyptazy.ch/files/amd64/debian/proxlb/proxlb_1.0.2_amd64.deb
dpkg -i proxlb_1.0.2_amd64.deb
# Adjust your config
vi /etc/proxlb/proxlb.conf
systemctl restart proxlb
```
@@ -165,7 +224,7 @@ Creating a container image of ProxLB is straightforward using the provided Docke
```bash
git clone https://github.com/gyptazy/ProxLB.git
cd ProxLB
build -t proxlb .
docker build -t proxlb .
```

Afterwards simply adjust the config file to your needs:
@@ -179,7 +238,15 @@ docker run -it --rm -v $(pwd)/proxlb.conf:/etc/proxlb/proxlb.conf proxlb
```
### Logging
ProxLB uses the `SystemdHandler` for logging. You can find all your logs in your systemd unit log or in the journalctl.
ProxLB uses the `SystemdHandler` for logging. You can find all your logs in your systemd unit log or via `journalctl`. By default, ProxLB only logs critical events. However, to better understand the balancing decisions it might be useful to change this to `INFO` or `DEBUG`, which can simply be done in the [proxlb.conf](https://github.com/gyptazy/ProxLB/blob/main/proxlb.conf#L14) file by changing the `log_verbosity` parameter.

Available logging values:
| Verbosity | Description |
|------|:------:|
| DEBUG | This option logs everything and is needed for debugging the code. |
| INFO | This option provides insights behind the scenes: what has been done, why, and with which values. |
| WARNING | This option provides only warning messages, which might indicate a general problem but not one for the application itself. |
| CRITICAL | This option logs all critical events that prevent ProxLB from running. |
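To raise the verbosity, set the `log_verbosity` parameter in your `proxlb.conf` accordingly, for example (shown standalone here; keep the parameter in whatever section your `proxlb.conf` already defines it in):

```
log_verbosity: INFO
```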
## Motivation
As a developer managing a cluster of virtual machines for my projects, I often encountered the challenge of resource imbalance. Nodes within the cluster would become unevenly loaded, with some nodes being overburdened while others remained underutilized. This imbalance led to inefficiencies, performance bottlenecks, and increased operational costs. Frustrated by the lack of an adequate solution to address this issue, I decided to develop ProxLB (PLB) to ensure better resource distribution across my clusters.
@@ -200,25 +267,61 @@ Here you can find some overviews of references for and about the ProxLB (PLB):
| General introduction into ProxLB | https://gyptazy.ch/blog/proxlb-rebalancing-vm-workloads-across-nodes-in-proxmox-clusters/ |
| Howto install and use ProxLB on Debian to rebalance VM workloads in a Proxmox cluster | https://gyptazy.ch/howtos/howto-install-and-use-proxlb-to-rebalance-vm-workloads-across-nodes-in-proxmox-clusters/ |

## Packages / Container Images
## Downloads
ProxLB can be obtained in many different ways, depending on which use case you prefer. You can simply copy the code from GitHub, use the created packages for Debian or RedHat based systems, use a repository to keep ProxLB always up to date, or simply use a container image for Docker/Podman.

### Packages
Ready-to-use packages can be found at:
* https://cdn.gyptazy.ch/files/amd64/debian/proxlb/
* https://cdn.gyptazy.ch/files/amd64/ubuntu/proxlb/
* https://cdn.gyptazy.ch/files/amd64/redhat/proxlb/
* https://cdn.gyptazy.ch/files/amd64/freebsd/proxlb/

### Repository
Debian based systems can also use the repository by adding the following line to their apt sources:

```
deb https://repo.gyptazy.ch/ /
```

The repository's GPG key can be found at: `https://repo.gyptazy.ch/repo/KEY.gpg`

You can also simply import it by running:

```
# KeyID: DEB76ADF7A0BAADB51792782FD6A7A70C11226AA
# SHA256: 5e44fffa09c747886ee37cc6e9e7eaf37c6734443cc648eaf0a9241a89084383 KEY.gpg

wget -O /etc/apt/trusted.gpg.d/proxlb.asc https://repo.gyptazy.ch/repo/KEY.gpg
```

*Note: The defined repositories `repo.gyptazy.ch` and `repo.proxlb.de` are the same!*

### Container Images (Docker/Podman)
Container images for Podman, Docker, etc. can be found at:
| Version | Image |
|------|:------:|
| latest | cr.gyptazy.ch/proxlb/proxlb:latest |
| v0.0.9 | cr.gyptazy.ch/proxlb/proxlb:v0.0.9 |
| v1.0.2 | cr.gyptazy.ch/proxlb/proxlb:v1.0.2 |
| v1.0.0 | cr.gyptazy.ch/proxlb/proxlb:v1.0.0 |
| v0.9.9 | cr.gyptazy.ch/proxlb/proxlb:v0.9.9 |

## Misc
### Bugs
Bugs can be reported via the GitHub issue tracker [here](https://github.com/gyptazy/ProxLB/issues). You may also report bugs via email or deliver PRs to fix them on your own. For the latter, see the Contributing chapter.

### Contributing
Feel free to add further documentation, to adjust already existing one or to contribute with code. Please take care about the style guide and naming conventions.
Feel free to add further documentation, adjust existing documentation, or contribute code. Please follow the style guide and naming conventions. You can find more in our [CONTRIBUTING.md](https://github.com/gyptazy/ProxLB/blob/main/CONTRIBUTING.md) file.

### Support
If you need assistance or have any questions, we offer support through our dedicated [chat room](https://matrix.to/#/#proxlb:gyptazy.ch) in Matrix and on Reddit. Join our community for real-time help, advice, and discussions. Connect with us in our dedicated chat room for immediate support and live interaction with other users and developers. You can also visit our [Reddit community](https://www.reddit.com/r/Proxmox/comments/1e78ap3/introducing_proxlb_rebalance_your_vm_workloads/) to post your queries, share your experiences, and get support from fellow community members and moderators. You may also just open an issue directly [here](https://github.com/gyptazy/ProxLB/issues) on GitHub. We are here to help and ensure you have the best experience possible.

| Support Channel | Link |
|------|:------:|
| Matrix | [#proxlb:gyptazy.ch](https://matrix.to/#/#proxlb:gyptazy.ch) |
| Reddit | [Reddit community](https://www.reddit.com/r/Proxmox/comments/1e78ap3/introducing_proxlb_rebalance_your_vm_workloads/) |
| GitHub | [ProxLB GitHub](https://github.com/gyptazy/ProxLB/issues) |

### Author(s)
* Florian Paul Azim Hoberg @gyptazy (https://gyptazy.ch)
|
||||
@@ -1,4 +1,20 @@
# Configuration

## Balancing

### By Used Memory of VMs

By continuously monitoring the current resource usage of VMs, ProxLB intelligently reallocates workloads to prevent any single node from becoming overloaded. This approach ensures that resources are balanced efficiently, providing consistent and optimal performance across the entire cluster at all times. To activate this balancing mode, simply set the following option in your ProxLB configuration:

```
mode: used
```

Afterwards, restart the service (if running in daemon mode) to activate this rebalancing mode.

### By Assigned Memory of VMs

By ensuring that resources are always available for each VM, ProxLB prevents over-provisioning and maintains a balanced load across all nodes. This guarantees that users have consistent access to the resources they need. However, if the total assigned resources exceed the combined capacity of the cluster, ProxLB will issue a warning, indicating potential over-provisioning despite its best efforts to balance the load. To activate this balancing mode, simply set the following option in your ProxLB configuration:

```
mode: assigned
```

Afterwards, restart the service (if running in daemon mode) to activate this rebalancing mode.
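For orientation, a fuller configuration sketch is shown below. The section and key names follow ProxLB's configuration parser; the host name and credentials are placeholders:

```
[proxmox]
api_host: proxmox01.example.com
api_user: root@pam
api_pass: changeme
verify_ssl: 1

[balancing]
method: memory
mode: assigned
balanciness: 10

[service]
daemon: 1
schedule: 24
```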
## Grouping

### Include (Stay Together)

<img align="left" src="https://cdn.gyptazy.ch/images/plb-rebalancing-include-balance-group.jpg"/> Access the Proxmox Web UI by opening your web browser and navigating to your Proxmox VE web interface, then log in with your credentials. Navigate to the VM you want to tag by selecting it from the left-hand navigation panel. Click on the "Options" tab to view the VM's options, then select "Edit" or "Add" (depending on whether you are editing an existing tag or adding a new one). In the tag field, enter `plb_include_` followed by your unique identifier, for example, `plb_include_group1`. Save the changes to apply the tag to the VM. Repeat these steps for each VM that should be included in the group.
@@ -21,3 +21,57 @@ Jul 06 10:25:16 build01 proxlb[7285]: proxlb: Error: [python-imports]: Could no

Debian/Ubuntu: `apt-get install python3-proxmoxer`

If the package is not provided by your system's repository, you can also install it by running `pip3 install proxmoxer`.

### How does it work?

ProxLB is a load-balancing system designed to optimize the distribution of virtual machines (VMs) and containers (CTs) across a cluster. It works by first gathering resource usage metrics from all nodes in the cluster through the Proxmox API. This includes detailed resource metrics for each VM and CT on every node. ProxLB then evaluates the difference between the maximum and minimum resource usage of the nodes, referred to as "Balanciness." If this difference exceeds a predefined, configurable threshold, the system initiates the rebalancing process.

Before starting any migrations, ProxLB validates that rebalancing actions are necessary and beneficial. Depending on the selected balancing mode — such as CPU, memory, or disk — it creates a balancing matrix. This matrix sorts the VMs by their maximum used or assigned resources, identifying the VM with the highest usage. ProxLB then places this VM on the node with the most free resources of the selected balancing type. This process runs recursively until the operator-defined Balanciness is achieved. Balancing can be defined for the used or maximum assigned resources of VMs/CTs.
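The loop described above can be sketched roughly as follows. This is a simplified illustration with made-up node/VM dictionaries and function names, not the actual ProxLB code:

```python
def balanciness(nodes):
    """Spread (in percentage points) between the most and least used node."""
    usage = [100 * n["used"] / n["total"] for n in nodes.values()]
    return max(usage) - min(usage)

def rebalance(nodes, vms, threshold=10):
    """Greedy rebalancing sketch: returns a list of (vm, source, target) moves."""
    moves = []
    # Sort VMs by resource usage, largest first.
    for name, vm in sorted(vms.items(), key=lambda kv: kv[1]["used"], reverse=True):
        # Stop once the spread is within the configured threshold.
        if balanciness(nodes) <= threshold:
            break
        source = vm["node"]
        # The node with the most free resources becomes the target.
        target = max(nodes, key=lambda n: nodes[n]["total"] - nodes[n]["used"])
        if target == source:
            continue
        # Account for the move before evaluating the next VM.
        nodes[source]["used"] -= vm["used"]
        nodes[target]["used"] += vm["used"]
        vm["node"] = target
        moves.append((name, source, target))
    return moves
```

With two nodes at 80% and 20% usage, moving the largest VM first already closes most of the gap, so the loop usually terminates after a few moves.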
### Logging

ProxLB uses the `SystemdHandler` for logging. You can find all logs in your systemd unit log or via `journalctl`. By default, ProxLB only logs critical events. However, to better understand the balancing decisions it might be useful to change this to `INFO` or `DEBUG`, which can simply be done in the [proxlb.conf](https://github.com/gyptazy/ProxLB/blob/main/proxlb.conf#L14) file by changing the `log_verbosity` parameter.

Available logging values:

| Verbosity | Description |
|------|:------:|
| DEBUG | Logs everything; needed for debugging the code. |
| INFO | Provides insights into what was done, why, and with which values. |
| WARNING | Provides only warning messages, which may indicate a general problem but not one for the application itself. |
| CRITICAL | Logs all critical events that prevent ProxLB from running. |
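For example, to raise the verbosity to `INFO`, adjust the parameter in the `[service]` section of your `proxlb.conf` (the section name follows ProxLB's configuration parser) and restart the service:

```
[service]
log_verbosity: INFO
```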
### Motivation

As a developer managing a cluster of virtual machines for my projects, I often encountered the challenge of resource imbalance. Nodes within the cluster would become unevenly loaded, with some nodes overburdened while others remained underutilized. This imbalance led to inefficiencies, performance bottlenecks, and increased operational costs. Frustrated by the lack of an adequate solution to address this issue, I decided to develop ProxLB (PLB) to ensure better resource distribution across my clusters.

My primary motivation for creating PLB stemmed from my work on my BoxyBSD project, where I consistently faced the difficulty of maintaining balanced nodes while running various VM workloads, and also on my personal clusters. The absence of an efficient rebalancing mechanism made it challenging to achieve optimal performance and stability. Recognizing the necessity for a tool that could gather and analyze resource metrics from both the cluster nodes and the running VMs, I embarked on developing ProxLB.

PLB meticulously collects detailed resource usage data from each node in a Proxmox cluster, including CPU load, memory usage, and local disk space utilization. It also gathers comprehensive statistics from all running VMs, providing a granular understanding of the workload distribution. With this data, PLB intelligently redistributes VMs based on memory usage, local disk usage, and CPU usage. This ensures that no single node is overburdened, storage resources are evenly distributed, and the computational load is balanced, enhancing overall cluster performance.

As an advocate of the open-source philosophy, I believe in the power of community and collaboration. By sharing solutions like PLB, I aim to contribute to the collective knowledge and tools available to developers facing similar challenges. Open source fosters innovation, transparency, and mutual support, enabling developers to build on each other's work and create better solutions together.

Developing PLB was driven by a desire to solve a real problem I faced in my projects. However, the spirit behind this effort was to provide a valuable resource to the community. By open-sourcing PLB, I hope to help other developers manage their clusters more efficiently, optimize their resource usage, and reduce operational costs. Sharing this solution aligns with the core principles of open source, where the goal is not only to solve individual problems but also to contribute to the broader ecosystem.
### Packages / Container Images

Ready-to-use packages can be found at:

* https://cdn.gyptazy.ch/files/amd64/debian/proxlb/
* https://cdn.gyptazy.ch/files/amd64/ubuntu/proxlb/
* https://cdn.gyptazy.ch/files/amd64/redhat/proxlb/
* https://cdn.gyptazy.ch/files/amd64/freebsd/proxlb/

Container images for Podman, Docker, etc. can be found at:

| Version | Image |
|------|:------:|
| latest | cr.gyptazy.ch/proxlb/proxlb:latest |

### Bugs

Bugs can be reported via the GitHub issue tracker [here](https://github.com/gyptazy/ProxLB/issues). You may also report bugs via email or deliver PRs to fix them on your own. For details, see the contributing chapter.
### Contributing

Feel free to add further documentation, to adjust existing documentation, or to contribute code. Please follow the style guide and naming conventions. You can find more details in our [CONTRIBUTING.md](https://github.com/gyptazy/ProxLB/blob/main/CONTRIBUTING.md) file.

### Support

If you need assistance or have any questions, we offer support through our dedicated [chat room](https://matrix.to/#/#proxlb:gyptazy.ch) in Matrix and on Reddit. Join our community for real-time help, advice, and discussions with other users and developers. You can also visit our [Reddit community](https://www.reddit.com/r/Proxmox/comments/1e78ap3/introducing_proxlb_rebalance_your_vm_workloads/) to post your queries, share your experiences, and get support from fellow community members and moderators. You may also open an issue directly [here](https://github.com/gyptazy/ProxLB/issues) on GitHub. We are here to help and ensure you have the best experience possible.

| Support Channel | Link |
|------|:------:|
| Matrix | [#proxlb:gyptazy.ch](https://matrix.to/#/#proxlb:gyptazy.ch) |
| Reddit | [Reddit community](https://www.reddit.com/r/Proxmox/comments/1e78ap3/introducing_proxlb_rebalance_your_vm_workloads/) |
| GitHub | [ProxLB GitHub](https://github.com/gyptazy/ProxLB/issues) |
@@ -1,5 +1,5 @@
cmake_minimum_required(VERSION 3.16)
project(proxmox-rebalancing-service VERSION 0.9.9)
project(proxmox-rebalancing-service VERSION 1.0.2)

install(PROGRAMS ../proxlb DESTINATION /bin)
install(FILES ../proxlb.conf DESTINATION /etc/proxlb)
@@ -17,8 +17,8 @@ set(CPACK_PACKAGE_VENDOR "gyptazy")
set(CPACK_PACKAGE_VERSION ${CMAKE_PROJECT_VERSION})
set(CPACK_GENERATOR "RPM")
set(CPACK_RPM_PACKAGE_ARCHITECTURE "amd64")
set(CPACK_RPM_PACKAGE_SUMMARY "ProxLB Rebalancing VM workloads within Proxmox clusters.")
set(CPACK_RPM_PACKAGE_DESCRIPTION "ProxLB Rebalancing VM workloads within Proxmox clusters.")
set(CPACK_RPM_PACKAGE_SUMMARY "ProxLB - Rebalance VM workloads across nodes in Proxmox clusters.")
set(CPACK_RPM_PACKAGE_DESCRIPTION "ProxLB - Rebalance VM workloads across nodes in Proxmox clusters.")
set(CPACK_RPM_CHANGELOG_FILE "${CMAKE_CURRENT_SOURCE_DIR}/changelog_redhat")
set(CPACK_PACKAGE_RELEASE 1)
set(CPACK_RPM_PACKAGE_LICENSE "GPL 3.0")
@@ -27,15 +27,14 @@ set(CPACK_RPM_PACKAGE_REQUIRES "python >= 3.2.0")
# DEB packaging
set(CPACK_DEBIAN_FILE_NAME DEB-DEFAULT)
set(CPACK_DEBIAN_PACKAGE_ARCHITECTURE "amd64")
set(CPACK_DEBIAN_PACKAGE_SUMMARY "ProxLB Rebalancing VM workloads within Proxmox clusters.")
set(CPACK_DEBIAN_PACKAGE_DESCRIPTION "ProxLB Rebalancing VM workloads within Proxmox clusters.")
set(CPACK_DEBIAN_PACKAGE_SUMMARY "ProxLB - Rebalance VM workloads across nodes in Proxmox clusters.")
set(CPACK_DEBIAN_PACKAGE_DESCRIPTION "ProxLB - Rebalance VM workloads across nodes in Proxmox clusters.")
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${CMAKE_CURRENT_SOURCE_DIR}/changelog_debian")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "python3")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "python3, python3-proxmoxer")
set(CPACK_DEBIAN_PACKAGE_LICENSE "GPL 3.0")


# Install
set(CPACK_PACKAGING_INSTALL_PREFIX ${CMAKE_INSTALL_PREFIX})
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${CMAKE_CURRENT_SOURCE_DIR}/postinst")
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${CMAKE_CURRENT_SOURCE_DIR}/postinst;${CMAKE_CURRENT_SOURCE_DIR}/conffiles")
set(CPACK_RPM_POST_INSTALL_SCRIPT_FILE "${CMAKE_CURRENT_SOURCE_DIR}/postinst")
include(CPack)
@@ -1,5 +1,21 @@
proxlb (0.9.0) unstable; urgency=low
proxlb (1.0.2) unstable; urgency=low

  * Add option to run migration in parallel or sequentially.
  * Add option to run ProxLB only on a Proxmox cluster master (req. HA feature).
  * Fix daemon timer to use hours instead of minutes.
  * Fix CMake packaging for Debian package to avoid overwriting the config file.
  * Fix some wonkey code styles.

 -- Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>  Tue, 13 Aug 2024 17:28:14 +0200

proxlb (1.0.0) unstable; urgency=low

  * Initial release of ProxLB.

 -- Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>  Sun, 07 Jul 2024 05:38:41 -0200
 -- Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>  Thu, 01 Aug 2024 17:04:12 +0200

proxlb (0.9.0) unstable; urgency=low

  * Initial development release of ProxLB as a tech preview.

 -- Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>  Sun, 07 Jul 2024 05:38:41 +0200
@@ -1,2 +1,11 @@
* Sun Jul 07 2024 Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>
* Tue Aug 13 2024 Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>
- Add option to run migration in parallel or sequentially.
- Add option to run ProxLB only on a Proxmox cluster master (req. HA feature).
- Fixed daemon timer to use hours instead of minutes.
- Fixed some wonkey code styles.

* Thu Aug 01 2024 Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>
- Initial release of ProxLB.

* Sun Jul 07 2024 Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>
- Initial development release of ProxLB as a tech preview.
1 packaging/conffiles Normal file
@@ -0,0 +1 @@
/etc/proxlb/proxlb.conf

647 proxlb
@@ -33,6 +33,7 @@ except ImportError:
import random
import re
import requests
import socket
import sys
import time
import urllib3
@@ -40,7 +41,7 @@ import urllib3

# Constants
__appname__ = "ProxLB"
__version__ = "0.9.9"
__version__ = "1.0.2"
__author__ = "Florian Paul Azim Hoberg <gyptazy@gyptazy.ch> @gyptazy"
__errors__ = False
@@ -72,14 +73,18 @@ class SystemdHandler(logging.Handler):


# Functions
def initialize_logger(log_level, log_handler):
def initialize_logger(log_level, update_log_verbosity=False):
    """ Initialize ProxLB logging handler. """
    info_prefix = 'Info: [logger]:'

    root_logger = logging.getLogger()
    root_logger.setLevel(log_level)
    root_logger.addHandler(SystemdHandler())
    logging.info(f'{info_prefix} Logger got initialized.')

    if not update_log_verbosity:
        root_logger.addHandler(SystemdHandler())
        logging.info(f'{info_prefix} Logger got initialized.')
    else:
        logging.info(f'{info_prefix} Logger verbosity got updated to: {log_level}.')


def pre_validations(config_path):
@@ -108,7 +113,7 @@ def validate_daemon(daemon, schedule):

    if bool(int(daemon)):
        logging.info(f'{info_prefix} Running in daemon mode. Next run in {schedule} hours.')
        time.sleep(int(schedule) * 60)
        time.sleep(int(schedule) * 60 * 60)
    else:
        logging.info(f'{info_prefix} Not running in daemon mode. Quitting.')
        sys.exit(0)
@@ -141,7 +146,7 @@ def __validate_config_file(config_path):
def initialize_args():
    """ Initialize given arguments for ProxLB. """
    argparser = argparse.ArgumentParser(description='ProxLB')
    argparser.add_argument('-c', '--config', type=str, help='Path to config file.', required=True)
    argparser.add_argument('-c', '--config', type=str, help='Path to config file.', required=False)
    argparser.add_argument('-d', '--dry-run', help='Perform a dry-run without doing any actions.', action='store_true', required=False)
    argparser.add_argument('-j', '--json', help='Return a JSON of the VM movement.', action='store_true', required=False)
    return argparser.parse_args()
@@ -169,18 +174,24 @@ def initialize_config_options(config_path):
        config = configparser.ConfigParser()
        config.read(config_path)
        # Proxmox config
        proxmox_api_host = config['proxmox']['api_host']
        proxmox_api_user = config['proxmox']['api_user']
        proxmox_api_pass = config['proxmox']['api_pass']
        proxmox_api_ssl_v = config['proxmox']['verify_ssl']
        proxmox_api_host = config['proxmox']['api_host']
        proxmox_api_user = config['proxmox']['api_user']
        proxmox_api_pass = config['proxmox']['api_pass']
        proxmox_api_ssl_v = config['proxmox']['verify_ssl']
        # Balancing
        balancing_method = config['balancing'].get('method', 'memory')
        balanciness = config['balancing'].get('balanciness', 10)
        ignore_nodes = config['balancing'].get('ignore_nodes', None)
        ignore_vms = config['balancing'].get('ignore_vms', None)
        balancing_method = config['balancing'].get('method', 'memory')
        balancing_mode = config['balancing'].get('mode', 'used')
        balancing_mode_option = config['balancing'].get('mode_option', 'bytes')
        balancing_type = config['balancing'].get('type', 'vm')
        balanciness = config['balancing'].get('balanciness', 10)
        parallel_migrations = config['balancing'].get('parallel_migrations', 1)
        ignore_nodes = config['balancing'].get('ignore_nodes', None)
        ignore_vms = config['balancing'].get('ignore_vms', None)
        # Service
        daemon = config['service'].get('daemon', 1)
        schedule = config['service'].get('schedule', 24)
        master_only = config['service'].get('master_only', 0)
        daemon = config['service'].get('daemon', 1)
        schedule = config['service'].get('schedule', 24)
        log_verbosity = config['service'].get('log_verbosity', 'CRITICAL')
    except configparser.NoSectionError:
        logging.critical(f'{error_prefix} Could not find the required section.')
        sys.exit(2)
@@ -192,8 +203,8 @@ def initialize_config_options(config_path):
        sys.exit(2)

    logging.info(f'{info_prefix} Configuration file loaded.')
    return proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, \
        balanciness, ignore_nodes, ignore_vms, daemon, schedule
    return proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, balancing_mode, balancing_mode_option, \
        balancing_type, balanciness, parallel_migrations, ignore_nodes, ignore_vms, master_only, daemon, schedule, log_verbosity


def api_connect(proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v):
@@ -223,36 +234,105 @@ def api_connect(proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_ap
    return api_object


def execute_rebalancing_only_by_master(api_object, master_only):
    """ Validate if balancing should only be done by the cluster master. Afterwards, validate if this node is the cluster master. """
    info_prefix = 'Info: [only-on-master-executor]:'
    master_only = bool(int(master_only))

    if bool(int(master_only)):
        logging.info(f'{info_prefix} Master only rebalancing is defined. Starting validation.')
        cluster_master_node = get_cluster_master(api_object)
        cluster_master = validate_cluster_master(cluster_master_node)
        return cluster_master, master_only
    else:
        logging.info(f'{info_prefix} No master only rebalancing is defined. Skipping validation.')
        return False, master_only


def get_cluster_master(api_object):
    """ Get the current master of the Proxmox cluster. """
    error_prefix = 'Error: [cluster-master-getter]:'
    info_prefix = 'Info: [cluster-master-getter]:'

    try:
        ha_status_object = api_object.cluster().ha().status().manager_status().get()
        logging.info(f'{info_prefix} Master node: {ha_status_object.get("manager_status", None).get("master_node", None)}')
    except urllib3.exceptions.NameResolutionError:
        logging.critical(f'{error_prefix} Could not resolve the API.')
        sys.exit(2)
    except requests.exceptions.ConnectTimeout:
        logging.critical(f'{error_prefix} Connection time out to API.')
        sys.exit(2)
    except requests.exceptions.SSLError:
        logging.critical(f'{error_prefix} SSL certificate verification failed for API.')
        sys.exit(2)

    cluster_master = ha_status_object.get("manager_status", None).get("master_node", None)

    if cluster_master:
        return cluster_master
    else:
        logging.critical(f'{error_prefix} Could not obtain cluster master. Please check your configuration - stopping.')
        sys.exit(2)


def validate_cluster_master(cluster_master):
    """ Validate if the current execution node is the cluster master. """
    info_prefix = 'Info: [cluster-master-validator]:'

    node_executor_hostname = socket.gethostname()
    logging.info(f'{info_prefix} Node executor hostname is: {node_executor_hostname}')

    if node_executor_hostname != cluster_master:
        logging.info(f'{info_prefix} {node_executor_hostname} is not the cluster master ({cluster_master}).')
        return False
    else:
        return True


def get_node_statistics(api_object, ignore_nodes):
    """ Get statistics of cpu, memory and disk for each node in the cluster. """
    info_prefix = 'Info: [node-statistics]:'
    node_statistics = {}
    info_prefix = 'Info: [node-statistics]:'
    node_statistics = {}
    ignore_nodes_list = ignore_nodes.split(',')

    for node in api_object.nodes.get():
        if node['status'] == 'online' and node['node'] not in ignore_nodes_list:
            node_statistics[node['node']] = {}
            node_statistics[node['node']]['cpu_total'] = node['maxcpu']
            node_statistics[node['node']]['cpu_used'] = node['cpu']
            node_statistics[node['node']]['cpu_free'] = int(node['maxcpu']) - int(node['cpu'])
            node_statistics[node['node']]['cpu_free_percent'] = int((node_statistics[node['node']]['cpu_free']) / int(node['maxcpu']) * 100)
            node_statistics[node['node']]['memory_total'] = node['maxmem']
            node_statistics[node['node']]['memory_used'] = node['mem']
            node_statistics[node['node']]['memory_free'] = int(node['maxmem']) - int(node['mem'])
            node_statistics[node['node']]['memory_free_percent'] = int((node_statistics[node['node']]['memory_free']) / int(node['maxmem']) * 100)
            node_statistics[node['node']]['disk_total'] = node['maxdisk']
            node_statistics[node['node']]['disk_used'] = node['disk']
            node_statistics[node['node']]['disk_free'] = int(node['maxdisk']) - int(node['disk'])
            node_statistics[node['node']]['disk_free_percent'] = int((node_statistics[node['node']]['disk_free']) / int(node['maxdisk']) * 100)
            node_statistics[node['node']]['cpu_total'] = node['maxcpu']
            node_statistics[node['node']]['cpu_assigned'] = node['cpu']
            node_statistics[node['node']]['cpu_assigned_percent'] = int((node_statistics[node['node']]['cpu_assigned']) / int(node_statistics[node['node']]['cpu_total']) * 100)
            node_statistics[node['node']]['cpu_assigned_percent_last_run'] = 0
            node_statistics[node['node']]['cpu_used'] = 0
            node_statistics[node['node']]['cpu_free'] = int(node['maxcpu']) - int(node['cpu'])
            node_statistics[node['node']]['cpu_free_percent'] = int((node_statistics[node['node']]['cpu_free']) / int(node['maxcpu']) * 100)
            node_statistics[node['node']]['cpu_free_percent_last_run'] = 0
            node_statistics[node['node']]['memory_total'] = node['maxmem']
            node_statistics[node['node']]['memory_assigned'] = 0
            node_statistics[node['node']]['memory_assigned_percent'] = int((node_statistics[node['node']]['memory_assigned']) / int(node_statistics[node['node']]['memory_total']) * 100)
            node_statistics[node['node']]['memory_assigned_percent_last_run'] = 0
            node_statistics[node['node']]['memory_used'] = node['mem']
            node_statistics[node['node']]['memory_free'] = int(node['maxmem']) - int(node['mem'])
            node_statistics[node['node']]['memory_free_percent'] = int((node_statistics[node['node']]['memory_free']) / int(node['maxmem']) * 100)
            node_statistics[node['node']]['memory_free_percent_last_run'] = 0
            node_statistics[node['node']]['disk_total'] = node['maxdisk']
            node_statistics[node['node']]['disk_assigned'] = 0
            node_statistics[node['node']]['disk_assigned_percent'] = int((node_statistics[node['node']]['disk_assigned']) / int(node_statistics[node['node']]['disk_total']) * 100)
            node_statistics[node['node']]['disk_assigned_percent_last_run'] = 0
            node_statistics[node['node']]['disk_used'] = node['disk']
            node_statistics[node['node']]['disk_free'] = int(node['maxdisk']) - int(node['disk'])
            node_statistics[node['node']]['disk_free_percent'] = int((node_statistics[node['node']]['disk_free']) / int(node['maxdisk']) * 100)
            node_statistics[node['node']]['disk_free_percent_last_run'] = 0
            logging.info(f'{info_prefix} Added node {node["node"]}.')

    logging.info(f'{info_prefix} Created node statistics.')
    return node_statistics


def get_vm_statistics(api_object, ignore_vms):
def get_vm_statistics(api_object, ignore_vms, balancing_type):
    """ Get statistics of cpu, memory and disk for each vm in the cluster. """
    info_prefix = 'Info: [vm-statistics]:'
    warn_prefix = 'Warn: [vm-statistics]:'
    vm_statistics = {}
    ignore_vms_list = ignore_vms.split(',')
    group_include = None
@@ -265,43 +345,112 @@ def get_vm_statistics(api_object, ignore_vms):
|
||||
vm_ignore_wildcard = __validate_ignore_vm_wildcard(ignore_vms)
|
||||
|
||||
for node in api_object.nodes.get():
|
||||
for vm in api_object.nodes(node['node']).qemu.get():
|
||||
|
||||
# Get the VM tags from API.
|
||||
vm_tags = __get_vm_tags(api_object, node, vm['vmid'])
|
||||
if vm_tags is not None:
|
||||
group_include, group_exclude, vm_ignore = __get_proxlb_groups(vm_tags)
|
||||
# Add all virtual machines if type is vm or all.
|
||||
if balancing_type == 'vm' or balancing_type == 'all':
|
||||
for vm in api_object.nodes(node['node']).qemu.get():
|
||||
|
||||
# Get wildcard match for VMs to ignore if a wildcard pattern was
|
||||
# previously found. Wildcards may slow down the task when using
|
||||
# many patterns in the ignore list. Therefore, run this only if
|
||||
# a wildcard pattern was found. We also do not need to validate
|
||||
# this if the VM is already being ignored by a defined tag.
|
||||
if vm_ignore_wildcard and not vm_ignore:
|
||||
vm_ignore = __check_vm_name_wildcard_pattern(vm['name'], ignore_vms_list)
|
||||
# Get the VM tags from API.
|
||||
vm_tags = __get_vm_tags(api_object, node, vm['vmid'], 'vm')
|
||||
if vm_tags is not None:
|
||||
group_include, group_exclude, vm_ignore = __get_proxlb_groups(vm_tags)
|
||||
|
||||
if vm['status'] == 'running' and vm['name'] not in ignore_vms_list and not vm_ignore:
|
||||
vm_statistics[vm['name']] = {}
|
||||
vm_statistics[vm['name']]['group_include'] = group_include
|
||||
vm_statistics[vm['name']]['group_exclude'] = group_exclude
|
||||
vm_statistics[vm['name']]['cpu_total'] = vm['cpus']
|
||||
vm_statistics[vm['name']]['cpu_used'] = vm['cpu']
|
||||
vm_statistics[vm['name']]['memory_total'] = vm['maxmem']
|
||||
vm_statistics[vm['name']]['memory_used'] = vm['mem']
|
||||
vm_statistics[vm['name']]['disk_total'] = vm['maxdisk']
|
||||
vm_statistics[vm['name']]['disk_used'] = vm['disk']
|
||||
vm_statistics[vm['name']]['vmid'] = vm['vmid']
|
||||
vm_statistics[vm['name']]['node_parent'] = node['node']
|
||||
# Rebalancing node will be overwritten after calculations.
|
||||
# If the vm stays on the node, it will be removed at a
|
||||
# later time.
|
||||
vm_statistics[vm['name']]['node_rebalance'] = node['node']
|
||||
logging.info(f'{info_prefix} Added vm {vm["name"]}.')
|
||||
# Get wildcard match for VMs to ignore if a wildcard pattern was
|
||||
# previously found. Wildcards may slow down the task when using
|
||||
# many patterns in the ignore list. Therefore, run this only if
|
||||
# a wildcard pattern was found. We also do not need to validate
|
||||
# this if the VM is already being ignored by a defined tag.
|
||||
if vm_ignore_wildcard and not vm_ignore:
|
||||
vm_ignore = __check_vm_name_wildcard_pattern(vm['name'], ignore_vms_list)
|
||||
|
||||
if vm['status'] == 'running' and vm['name'] not in ignore_vms_list and not vm_ignore:
|
||||
vm_statistics[vm['name']] = {}
|
||||
vm_statistics[vm['name']]['group_include'] = group_include
|
||||
vm_statistics[vm['name']]['group_exclude'] = group_exclude
|
||||
vm_statistics[vm['name']]['cpu_total'] = vm['cpus']
|
||||
vm_statistics[vm['name']]['cpu_used'] = vm['cpu']
|
||||
vm_statistics[vm['name']]['memory_total'] = vm['maxmem']
|
||||
vm_statistics[vm['name']]['memory_used'] = vm['mem']
|
||||
vm_statistics[vm['name']]['disk_total'] = vm['maxdisk']
|
||||
vm_statistics[vm['name']]['disk_used'] = vm['disk']
|
||||
vm_statistics[vm['name']]['vmid'] = vm['vmid']
|
||||
vm_statistics[vm['name']]['node_parent'] = node['node']
|
||||
vm_statistics[vm['name']]['type'] = 'vm'
|
||||
# Rebalancing node will be overwritten after calculations.
|
||||
# If the vm stays on the node, it will be removed at a
|
||||
# later time.
|
||||
vm_statistics[vm['name']]['node_rebalance'] = node['node']
|
||||
logging.info(f'{info_prefix} Added vm {vm["name"]}.')
|
||||
|
||||
# Add all containers if type is ct or all.
|
||||
if balancing_type == 'ct' or balancing_type == 'all':
|
||||
for vm in api_object.nodes(node['node']).lxc.get():
|
||||
|
||||
logging.warning(f'{warn_prefix} Rebalancing on LXC containers (CT) always requires them to shut down.')
|
||||
logging.warning(f'{warn_prefix} {vm["name"]} is from type CT and cannot be live migrated!')
|
||||
# Get the VM tags from API.
|
||||
vm_tags = __get_vm_tags(api_object, node, vm['vmid'], 'ct')
|
||||
if vm_tags is not None:
|
||||
group_include, group_exclude, vm_ignore = __get_proxlb_groups(vm_tags)
|
||||
|
||||
# Get wildcard match for VMs to ignore if a wildcard pattern was
|
||||
# previously found. Wildcards may slow down the task when using
|
||||
# many patterns in the ignore list. Therefore, run this only if
|
||||
# a wildcard pattern was found. We also do not need to validate
|
||||
# this if the VM is already being ignored by a defined tag.
|
||||
if vm_ignore_wildcard and not vm_ignore:
|
||||
vm_ignore = __check_vm_name_wildcard_pattern(vm['name'], ignore_vms_list)
|
||||
|
||||
if vm['status'] == 'running' and vm['name'] not in ignore_vms_list and not vm_ignore:
|
||||
vm_statistics[vm['name']] = {}
|
||||
vm_statistics[vm['name']]['group_include'] = group_include
|
||||
vm_statistics[vm['name']]['group_exclude'] = group_exclude
|
||||
vm_statistics[vm['name']]['cpu_total'] = vm['cpus']
|
||||
vm_statistics[vm['name']]['cpu_used'] = vm['cpu']
|
||||
vm_statistics[vm['name']]['memory_total'] = vm['maxmem']
|
||||
vm_statistics[vm['name']]['memory_used'] = vm['mem']
|
||||
vm_statistics[vm['name']]['disk_total'] = vm['maxdisk']
|
||||
vm_statistics[vm['name']]['disk_used'] = vm['disk']
|
||||
vm_statistics[vm['name']]['vmid'] = vm['vmid']
|
||||
vm_statistics[vm['name']]['node_parent'] = node['node']
|
||||
vm_statistics[vm['name']]['type'] = 'ct'
|
||||
# Rebalancing node will be overwritten after calculations.
|
||||
# If the vm stays on the node, it will be removed at a
|
||||
# later time.
|
||||
vm_statistics[vm['name']]['node_rebalance'] = node['node']
|
||||
logging.info(f'{info_prefix} Added vm {vm["name"]}.')
|
||||
|
||||
logging.info(f'{info_prefix} Created VM statistics.')
|
||||
return vm_statistics
|
||||
|
||||
|
||||
+def update_node_statistics(node_statistics, vm_statistics):
+    """ Update node statistics by VMs statistics. """
+    info_prefix = 'Info: [node-update-statistics]:'
+    warn_prefix = 'Warning: [node-update-statistics]:'
+
+    for vm, vm_value in vm_statistics.items():
+        node_statistics[vm_value['node_parent']]['cpu_assigned'] = node_statistics[vm_value['node_parent']]['cpu_assigned'] + int(vm_value['cpu_total'])
+        node_statistics[vm_value['node_parent']]['cpu_assigned_percent'] = (node_statistics[vm_value['node_parent']]['cpu_assigned'] / node_statistics[vm_value['node_parent']]['cpu_total']) * 100
+        node_statistics[vm_value['node_parent']]['memory_assigned'] = node_statistics[vm_value['node_parent']]['memory_assigned'] + int(vm_value['memory_total'])
+        node_statistics[vm_value['node_parent']]['memory_assigned_percent'] = (node_statistics[vm_value['node_parent']]['memory_assigned'] / node_statistics[vm_value['node_parent']]['memory_total']) * 100
+        node_statistics[vm_value['node_parent']]['disk_assigned'] = node_statistics[vm_value['node_parent']]['disk_assigned'] + int(vm_value['disk_total'])
+        node_statistics[vm_value['node_parent']]['disk_assigned_percent'] = (node_statistics[vm_value['node_parent']]['disk_assigned'] / node_statistics[vm_value['node_parent']]['disk_total']) * 100
+
+        if node_statistics[vm_value['node_parent']]['cpu_assigned_percent'] > 99:
+            logging.warning(f'{warn_prefix} Node {vm_value["node_parent"]} is overprovisioned for CPU by {int(node_statistics[vm_value["node_parent"]]["cpu_assigned_percent"])}%.')
+
+        if node_statistics[vm_value['node_parent']]['memory_assigned_percent'] > 99:
+            logging.warning(f'{warn_prefix} Node {vm_value["node_parent"]} is overprovisioned for memory by {int(node_statistics[vm_value["node_parent"]]["memory_assigned_percent"])}%.')
+
+        if node_statistics[vm_value['node_parent']]['disk_assigned_percent'] > 99:
+            logging.warning(f'{warn_prefix} Node {vm_value["node_parent"]} is overprovisioned for disk by {int(node_statistics[vm_value["node_parent"]]["disk_assigned_percent"])}%.')
+
+    logging.info(f'{info_prefix} Updated node resource assignments by all VMs.')
+    logging.debug(f'{info_prefix} {node_statistics}')
+    return node_statistics


 def __validate_ignore_vm_wildcard(ignore_vms):
     """ Validate if a wildcard is used for ignored VMs. """
     if '*' in ignore_vms:
@@ -316,18 +465,23 @@ def __check_vm_name_wildcard_pattern(vm_name, ignore_vms_list):
         return True

-def __get_vm_tags(api_object, node, vmid):
-    """ Get a comment for a VM from a given VMID. """
-    info_prefix = 'Info: [api-get-vm-tags]:'
+def __get_vm_tags(api_object, node, vmid, balancing_type):
+    """ Get tags for a VM/CT for a given VMID. """
+    info_prefix = 'Info: [api-get-vm-tags]:'

-    vm_config = api_object.nodes(node['node']).qemu(vmid).config.get()
-    logging.info(f'{info_prefix} Got VM comment from API.')
+    if balancing_type == 'vm':
+        vm_config = api_object.nodes(node['node']).qemu(vmid).config.get()
+
+    if balancing_type == 'ct':
+        vm_config = api_object.nodes(node['node']).lxc(vmid).config.get()
+
+    logging.info(f'{info_prefix} Got VM/CT tag from API.')
     return vm_config.get('tags', None)


 def __get_proxlb_groups(vm_tags):
     """ Get ProxLB related include and exclude groups. """
     info_prefix = 'Info: [api-get-vm-include-exclude-tags]:'
     group_include = None
     group_exclude = None
     vm_ignore = None
@@ -350,33 +504,31 @@ def __get_proxlb_groups(vm_tags):
     return group_include, group_exclude, vm_ignore


-def balancing_calculations(balancing_method, node_statistics, vm_statistics, balanciness):
+def balancing_calculations(balancing_method, balancing_mode, balancing_mode_option, node_statistics, vm_statistics, balanciness, rebalance, processed_vms):
     """ Calculate re-balancing of VMs on present nodes across the cluster. """
-    info_prefix = 'Info: [rebalancing-calculator]:'
-    balanciness = int(balanciness)
-    rebalance = False
-    processed_vms = []
-    rebalance = True
-    emergency_counter = 0
+    info_prefix = 'Info: [rebalancing-calculator]:'

-    # Validate for a supported balancing method.
+    # Validate for a supported balancing method, mode and if rebalancing is required.
     __validate_balancing_method(balancing_method)
+    __validate_balancing_mode(balancing_mode)
+    __validate_vm_statistics(vm_statistics)
+    rebalance = __validate_balanciness(balanciness, balancing_method, balancing_mode, node_statistics)

-    # Rebalance VMs with the highest resource usage to a new
-    # node until reaching the desired balanciness.
-    while rebalance and emergency_counter < 10000:
-        emergency_counter = emergency_counter + 1
-        rebalance = __validate_balanciness(balanciness, balancing_method, node_statistics)
-        if rebalance:
-            resource_highest_used_resources_vm, processed_vms = __get_most_used_resources_vm(balancing_method, vm_statistics, processed_vms)
-            resource_highest_free_resources_node = __get_most_free_resources_node(balancing_method, node_statistics)
-            node_statistics, vm_statistics = __update_resource_statistics(resource_highest_used_resources_vm, resource_highest_free_resources_node,
-                                                                          vm_statistics, node_statistics, balancing_method)
+    if rebalance:
+        # Get most used/assigned resources of the VM and the most free or less allocated node.
+        resources_vm_most_used, processed_vms = __get_most_used_resources_vm(balancing_method, balancing_mode, vm_statistics, processed_vms)
+        resources_node_most_free = __get_most_free_resources_node(balancing_method, balancing_mode, balancing_mode_option, node_statistics)
+
+        # Update resource statistics for VMs and nodes.
+        node_statistics, vm_statistics = __update_resource_statistics(resources_vm_most_used, resources_node_most_free,
+                                                                      vm_statistics, node_statistics, balancing_method, balancing_mode)
+
+        # Start recursion until we do not have any needs to rebalance anymore.
+        balancing_calculations(balancing_method, balancing_mode, balancing_mode_option, node_statistics, vm_statistics, balanciness, rebalance, processed_vms)

     # Honour groupings for include and exclude groups for rebalancing VMs.
-    node_statistics, vm_statistics = __get_vm_tags_include_groups(vm_statistics, node_statistics, balancing_method)
-    node_statistics, vm_statistics = __get_vm_tags_exclude_groups(vm_statistics, node_statistics, balancing_method)
+    node_statistics, vm_statistics = __get_vm_tags_include_groups(vm_statistics, node_statistics, balancing_method, balancing_mode)
+    node_statistics, vm_statistics = __get_vm_tags_exclude_groups(vm_statistics, node_statistics, balancing_method, balancing_mode)

     # Remove VMs that are not being relocated.
     vms_to_remove = [vm_name for vm_name, vm_info in vm_statistics.items() if 'node_rebalance' in vm_info and vm_info['node_rebalance'] == vm_info.get('node_parent')]
@@ -390,7 +542,7 @@ def balancing_calculations(balancing_method, node_statistics, vm_statistics, bal

 def __validate_balancing_method(balancing_method):
     """ Validate for valid and supported balancing method. """
     error_prefix = 'Error: [balancing-method-validation]:'
-    info_prefix = 'Info: [balancing-method-validation]]:'
+    info_prefix = 'Info: [balancing-method-validation]:'

     if balancing_method not in ['memory', 'disk', 'cpu']:
         logging.error(f'{error_prefix} Invalid balancing method: {balancing_method}')
@@ -399,84 +551,146 @@ def __validate_balancing_method(balancing_method):
         logging.info(f'{info_prefix} Valid balancing method: {balancing_method}')


-def __validate_balanciness(balanciness, balancing_method, node_statistics):
+def __validate_balancing_mode(balancing_mode):
+    """ Validate for valid and supported balancing mode. """
+    error_prefix = 'Error: [balancing-mode-validation]:'
+    info_prefix = 'Info: [balancing-mode-validation]:'
+
+    if balancing_mode not in ['used', 'assigned']:
+        logging.error(f'{error_prefix} Invalid balancing mode: {balancing_mode}')
+        sys.exit(2)
+    else:
+        logging.info(f'{info_prefix} Valid balancing mode: {balancing_mode}')
+
+
+def __validate_vm_statistics(vm_statistics):
+    """ Validate for at least a single object of type CT/VM to rebalance. """
+    error_prefix = 'Error: [balancing-vm-stats-validation]:'
+
+    if len(vm_statistics) == 0:
+        logging.error(f'{error_prefix} Not a single CT/VM found in cluster.')
+        sys.exit(1)


+def __validate_balanciness(balanciness, balancing_method, balancing_mode, node_statistics):
     """ Validate for balanciness to ensure further rebalancing is needed. """
-    info_prefix = 'Info: [balanciness-validation]]:'
-    node_memory_free_percent_list = []
+    info_prefix = 'Info: [balanciness-validation]:'
+    node_resource_percent_list = []
+    node_assigned_percent_match = []
+
+    # Remap balancing mode to get the related values from nodes dict.
+    if balancing_mode == 'used':
+        node_resource_selector = 'free'
+    if balancing_mode == 'assigned':
+        node_resource_selector = 'assigned'

     for node_name, node_info in node_statistics.items():
-        node_memory_free_percent_list.append(node_info[f'{balancing_method}_free_percent'])
-
-    node_memory_free_percent_list_sorted = sorted(node_memory_free_percent_list)
-    node_lowest_percent = node_memory_free_percent_list_sorted[0]
-    node_highest_percent = node_memory_free_percent_list_sorted[-1]
+        # Save information of nodes from current run to compare them in the next recursion.
+        if node_statistics[node_name][f'{balancing_method}_{node_resource_selector}_percent_last_run'] == node_statistics[node_name][f'{balancing_method}_{node_resource_selector}_percent']:
+            node_statistics[node_name][f'{balancing_method}_{node_resource_selector}_percent_match'] = True
+        else:
+            node_statistics[node_name][f'{balancing_method}_{node_resource_selector}_percent_match'] = False
+
+        # Update value to the current value of the recursion run.
+        node_statistics[node_name][f'{balancing_method}_{node_resource_selector}_percent_last_run'] = node_statistics[node_name][f'{balancing_method}_{node_resource_selector}_percent']
+
+        # If all node resources are unchanged, the recursion can be left.
+        for key, value in node_statistics.items():
+            node_assigned_percent_match.append(value.get(f'{balancing_method}_{node_resource_selector}_percent_match', False))
+
+        if False not in node_assigned_percent_match:
+            return False
+
+        # Add node information to resource list.
+        node_resource_percent_list.append(int(node_info[f'{balancing_method}_{node_resource_selector}_percent']))
+        logging.debug(f'{info_prefix} Node: {node_name} with values: {node_info}')
+
+    # Create a sorted list of the delta + balanciness between the node resources.
+    node_resource_percent_list_sorted = sorted(node_resource_percent_list)
+    node_lowest_percent = node_resource_percent_list_sorted[0]
+    node_highest_percent = node_resource_percent_list_sorted[-1]

-    if (node_lowest_percent + balanciness) < node_highest_percent:
-        logging.info(f'{info_prefix} Rebalancing for {balancing_method} is needed.')
+    # Validate if the recursion should be proceeded for further rebalancing.
+    if (int(node_lowest_percent) + int(balanciness)) < int(node_highest_percent):
+        logging.info(f'{info_prefix} Rebalancing for {balancing_method} is needed. Highest usage: {int(node_highest_percent)}% | Lowest usage: {int(node_lowest_percent)}%.')
         return True
     else:
-        logging.info(f'{info_prefix} Rebalancing for {balancing_method} is not needed.')
+        logging.info(f'{info_prefix} Rebalancing for {balancing_method} is not needed. Highest usage: {int(node_highest_percent)}% | Lowest usage: {int(node_lowest_percent)}%.')
         return False

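The balanciness check above boils down to one comparison: is the spread between the most and least loaded node larger than the configured threshold? A minimal, self-contained sketch of that core comparison (the function and variable names here are illustrative, not ProxLB's API):

```python
def needs_rebalancing(free_percent_by_node: dict, balanciness: int) -> bool:
    """Return True when the spread between the most and least free node
    exceeds the configured balanciness threshold."""
    values = sorted(free_percent_by_node.values())
    lowest, highest = values[0], values[-1]
    return (lowest + balanciness) < highest

# Example: a 30% spread against a 10% threshold -> rebalancing needed.
print(needs_rebalancing({'node01': 20, 'node02': 50, 'node03': 35}, 10))  # True
```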
-def __get_most_used_resources_vm(balancing_method, vm_statistics, processed_vms):
+def __get_most_used_resources_vm(balancing_method, balancing_mode, vm_statistics, processed_vms):
     """ Get and return the most used resources of a VM by the defined balancing method. """
-    if balancing_method == 'memory':
-        vm = max(vm_statistics.items(), key=lambda item: item[1]['memory_used'] if item[0] not in processed_vms else -float('inf'))
-        processed_vms.append(vm[0])
-        return vm, processed_vms
-    if balancing_method == 'disk':
-        vm = max(vm_statistics.items(), key=lambda item: item[1]['disk_used'] if item[0] not in processed_vms else -float('inf'))
-        processed_vms.append(vm[0])
-        return vm, processed_vms
-    if balancing_method == 'cpu':
-        vm = max(vm_statistics.items(), key=lambda item: item[1]['cpu_used'] if item[0] not in processed_vms else -float('inf'))
-        processed_vms.append(vm[0])
-        return vm, processed_vms
+    info_prefix = 'Info: [get-most-used-resources-vm]:'
+
+    # Remap balancing mode to get the related values from the VMs dict.
+    if balancing_mode == 'used':
+        vm_resource_selector = 'used'
+    if balancing_mode == 'assigned':
+        vm_resource_selector = 'total'
+
+    vm = max(vm_statistics.items(), key=lambda item: item[1][f'{balancing_method}_{vm_resource_selector}'] if item[0] not in processed_vms else -float('inf'))
+    processed_vms.append(vm[0])
+
+    logging.info(f'{info_prefix} {vm}')
+    return vm, processed_vms

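The selection pattern used in both the old and new versions is `max()` with a sentinel key: items already processed score `-inf`, so each call yields the next-heaviest unprocessed candidate. A standalone sketch of that pattern (names are illustrative):

```python
def pick_next_candidate(stats: dict, metric: str, processed: list) -> str:
    """Pick the item with the highest metric value that has not been
    processed yet, mirroring the max()-with-sentinel pattern above."""
    name, _ = max(
        stats.items(),
        key=lambda item: item[1][metric] if item[0] not in processed else -float('inf'),
    )
    processed.append(name)
    return name

stats = {'vm1': {'memory_used': 4096}, 'vm2': {'memory_used': 8192}}
processed = []
print(pick_next_candidate(stats, 'memory_used', processed))  # vm2
print(pick_next_candidate(stats, 'memory_used', processed))  # vm1
```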
-def __get_most_free_resources_node(balancing_method, node_statistics):
+def __get_most_free_resources_node(balancing_method, balancing_mode, balancing_mode_option, node_statistics):
     """ Get and return the most free resources of a node by the defined balancing method. """
-    if balancing_method == 'memory':
-        return max(node_statistics.items(), key=lambda item: item[1]['memory_free'])
-    if balancing_method == 'disk':
-        return max(node_statistics.items(), key=lambda item: item[1]['disk_free'])
-    if balancing_method == 'cpu':
-        return max(node_statistics.items(), key=lambda item: item[1]['cpu_free'])
+    info_prefix = 'Info: [get-most-free-resources-nodes]:'
+
+    # Return the node information based on the balancing mode.
+    if balancing_mode == 'used' and balancing_mode_option == 'bytes':
+        node = max(node_statistics.items(), key=lambda item: item[1][f'{balancing_method}_free'])
+    if balancing_mode == 'used' and balancing_mode_option == 'percent':
+        node = max(node_statistics.items(), key=lambda item: item[1][f'{balancing_method}_free_percent'])
+    if balancing_mode == 'assigned':
+        node = min(node_statistics.items(), key=lambda item: item[1][f'{balancing_method}_assigned'] if item[1][f'{balancing_method}_assigned_percent'] > 0 and item[1][f'{balancing_method}_assigned_percent'] < 100 else float('inf'))
+
+    logging.info(f'{info_prefix} {node}')
+    return node

-def __update_resource_statistics(resource_highest_used_resources_vm, resource_highest_free_resources_node, vm_statistics, node_statistics, balancing_method):
+def __update_resource_statistics(resource_highest_used_resources_vm, resource_highest_free_resources_node, vm_statistics, node_statistics, balancing_method, balancing_mode):
     """ Update VM and node resource statistics. """
     info_prefix = 'Info: [rebalancing-resource-statistics-update]:'

     if resource_highest_used_resources_vm[1]['node_parent'] != resource_highest_free_resources_node[0]:
         vm_name = resource_highest_used_resources_vm[0]
         vm_node_parent = resource_highest_used_resources_vm[1]['node_parent']
         vm_node_rebalance = resource_highest_free_resources_node[0]
         vm_resource_used = vm_statistics[resource_highest_used_resources_vm[0]][f'{balancing_method}_used']
+        vm_resource_total = vm_statistics[resource_highest_used_resources_vm[0]][f'{balancing_method}_total']

-        # Update dictionaries for new values
+        # Assign new rebalance node to vm
         vm_statistics[vm_name]['node_rebalance'] = vm_node_rebalance

         logging.info(f'Moving {vm_name} from {vm_node_parent} to {vm_node_rebalance}')

         # Recalculate values for nodes
         ## Add freed resources to old parent node
         node_statistics[vm_node_parent][f'{balancing_method}_used'] = int(node_statistics[vm_node_parent][f'{balancing_method}_used']) - int(vm_resource_used)
         node_statistics[vm_node_parent][f'{balancing_method}_free'] = int(node_statistics[vm_node_parent][f'{balancing_method}_free']) + int(vm_resource_used)
         node_statistics[vm_node_parent][f'{balancing_method}_free_percent'] = int(int(node_statistics[vm_node_parent][f'{balancing_method}_free']) / int(node_statistics[vm_node_parent][f'{balancing_method}_total']) * 100)
+        node_statistics[vm_node_parent][f'{balancing_method}_assigned'] = int(node_statistics[vm_node_parent][f'{balancing_method}_assigned']) - int(vm_resource_total)
+        node_statistics[vm_node_parent][f'{balancing_method}_assigned_percent'] = int(int(node_statistics[vm_node_parent][f'{balancing_method}_assigned']) / int(node_statistics[vm_node_parent][f'{balancing_method}_total']) * 100)

-        ## Removed newly allocated resources to new rebalanced node
+        ## Add newly allocated resources to the new rebalanced node
         node_statistics[vm_node_rebalance][f'{balancing_method}_used'] = int(node_statistics[vm_node_rebalance][f'{balancing_method}_used']) + int(vm_resource_used)
         node_statistics[vm_node_rebalance][f'{balancing_method}_free'] = int(node_statistics[vm_node_rebalance][f'{balancing_method}_free']) - int(vm_resource_used)
         node_statistics[vm_node_rebalance][f'{balancing_method}_free_percent'] = int(int(node_statistics[vm_node_rebalance][f'{balancing_method}_free']) / int(node_statistics[vm_node_rebalance][f'{balancing_method}_total']) * 100)
+        node_statistics[vm_node_rebalance][f'{balancing_method}_assigned'] = int(node_statistics[vm_node_rebalance][f'{balancing_method}_assigned']) + int(vm_resource_total)
+        node_statistics[vm_node_rebalance][f'{balancing_method}_assigned_percent'] = int(int(node_statistics[vm_node_rebalance][f'{balancing_method}_assigned']) / int(node_statistics[vm_node_rebalance][f'{balancing_method}_total']) * 100)

     logging.info(f'{info_prefix} Updated VM and node statistics.')
     return node_statistics, vm_statistics


-def __get_vm_tags_include_groups(vm_statistics, node_statistics, balancing_method):
+def __get_vm_tags_include_groups(vm_statistics, node_statistics, balancing_method, balancing_mode):
     """ Get VMs tags for include groups. """
     info_prefix = 'Info: [rebalancing-tags-group-include]:'
     tags_include_vms = {}
     processed_vm = []

@@ -503,16 +717,15 @@ def __get_vm_tags_include_groups(vm_statistics, node_statistics, balancing_metho
             vm_node_rebalance = vm_statistics[vm_name]['node_rebalance']
         else:
             _mocked_vm_object = (vm_name, vm_statistics[vm_name])
-            node_statistics, vm_statistics = __update_resource_statistics(_mocked_vm_object, [vm_node_rebalance],
-                                                                          vm_statistics, node_statistics, balancing_method)
+            node_statistics, vm_statistics = __update_resource_statistics(_mocked_vm_object, [vm_node_rebalance], vm_statistics, node_statistics, balancing_method, balancing_mode)
             processed_vm.append(vm_name)

     return node_statistics, vm_statistics


-def __get_vm_tags_exclude_groups(vm_statistics, node_statistics, balancing_method):
+def __get_vm_tags_exclude_groups(vm_statistics, node_statistics, balancing_method, balancing_mode):
     """ Get VMs tags for exclude groups. """
     info_prefix = 'Info: [rebalancing-tags-group-exclude]:'
     tags_exclude_vms = {}
     processed_vm = []

@@ -543,92 +756,176 @@ def __get_vm_tags_exclude_groups(vm_statistics, node_statistics, balancing_metho
             random_node = random.choice(list(node_statistics.keys()))
         else:
             _mocked_vm_object = (vm_name, vm_statistics[vm_name])
-            node_statistics, vm_statistics = __update_resource_statistics(_mocked_vm_object, [random_node],
-                                                                          vm_statistics, node_statistics, balancing_method)
+            node_statistics, vm_statistics = __update_resource_statistics(_mocked_vm_object, [random_node], vm_statistics, node_statistics, balancing_method, balancing_mode)
             processed_vm.append(vm_name)

     return node_statistics, vm_statistics


-def run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args):
-    """ Run rebalancing of vms to new nodes in cluster. """
+def __wait_job_finalized(api_object, node_name, job_id, counter):
+    """ Wait for a job to be finalized. """
+    error_prefix = 'Error: [job-status-getter]:'
+    info_prefix = 'Info: [job-status-getter]:'
+
+    logging.info(f'{info_prefix} Getting job status for job {job_id}.')
+    task = api_object.nodes(node_name).tasks(job_id).status().get()
+    logging.info(f'{info_prefix} {task}')
+
+    if task['status'] == 'running':
+        logging.info(f'{info_prefix} Validating job {job_id} for the {counter} run.')
+
+        # Do not run this recursion for infinity and fail when reaching the limit.
+        if counter == 300:
+            logging.critical(f'{error_prefix} The job {job_id} on node {node_name} did not finish in time for migration.')
+
+        time.sleep(5)
+        counter = counter + 1
+        logging.info(f'{info_prefix} Revalidating job {job_id} in a next run.')
+        __wait_job_finalized(api_object, node_name, job_id, counter)
+
+    logging.info(f'{info_prefix} Job {job_id} for migration from {node_name} terminated successfully.')


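The recursive wait above can equally be written as a bounded polling loop; a small illustrative sketch, where the `get_status` callable stands in for the Proxmox task-status API call (not ProxLB's actual helper):

```python
import time

def wait_until_done(get_status, attempts: int = 300, delay: float = 5.0) -> bool:
    """Poll a status callable until it no longer reports 'running';
    give up after a fixed number of attempts (bounded, unlike a bare loop)."""
    for _ in range(attempts):
        if get_status() != 'running':
            return True
        time.sleep(delay)
    return False

# Example with a fake status source that finishes after two polls.
states = iter(['running', 'running', 'stopped'])
print(wait_until_done(lambda: next(states), delay=0))  # True
```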
+def __run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations):
+    """ Run & execute the VM rebalancing via API. """
     error_prefix = 'Error: [rebalancing-executor]:'
     info_prefix = 'Info: [rebalancing-executor]:'

-    if not app_args.dry_run:
-        logging.info(f'{info_prefix} Starting to rebalance vms to their new nodes.')
+    if len(vm_statistics_rebalanced) > 0 and not app_args.dry_run:
         for vm, value in vm_statistics_rebalanced.items():

             try:
-                logging.info(f'{info_prefix} Rebalancing vm {vm} from node {value["node_parent"]} to node {value["node_rebalance"]}.')
-                api_object.nodes(value['node_parent']).qemu(value['vmid']).migrate().post(target=value['node_rebalance'],online=1)
+                # Migrate type VM (live migration).
+                if value['type'] == 'vm':
+                    logging.info(f'{info_prefix} Rebalancing VM {vm} from node {value["node_parent"]} to node {value["node_rebalance"]}.')
+                    job_id = api_object.nodes(value['node_parent']).qemu(value['vmid']).migrate().post(target=value['node_rebalance'],online=1)
+
+                # Migrate type CT (requires restart of container).
+                if value['type'] == 'ct':
+                    logging.info(f'{info_prefix} Rebalancing CT {vm} from node {value["node_parent"]} to node {value["node_rebalance"]}.')
+                    job_id = api_object.nodes(value['node_parent']).lxc(value['vmid']).migrate().post(target=value['node_rebalance'],restart=1)

             except proxmoxer.core.ResourceException as error_resource:
                 logging.critical(f'{error_prefix} {error_resource}')
-        if app_args.json:
-            logging.info(f'{info_prefix} Printing json output of VM statistics.')
-            json.dumps(vm_statistics_rebalanced)
-    else:
-        logging.info(f'{info_prefix} Starting dry-run to rebalance vms to their new nodes.')
-        _vm_to_node_list = []
-        _vm_to_node_list.append(['VM', 'Current Node', 'Rebalanced Node'])
-
-        for vm_name, vm_values in vm_statistics_rebalanced.items():
-            _vm_to_node_list.append([vm_name, vm_values['node_parent'], vm_values['node_rebalance']])
-
-        if app_args.json:
-            logging.info(f'{info_prefix} Printing json output of VM statistics.')
-            json.dumps(vm_statistics_rebalanced)
-        else:
-            if len(vm_statistics_rebalanced) > 0:
-                logging.info(f'{info_prefix} Printing cli output of VM rebalancing.')
-                print_table_cli(_vm_to_node_list)
+
+            # Wait for migration to be finished unless running parallel migrations.
+            if not bool(int(parallel_migrations)):
+                logging.info(f'{info_prefix} Rebalancing will be performed sequentially.')
+                __wait_job_finalized(api_object, value['node_parent'], job_id, counter=1)
             else:
-                logging.info(f'{info_prefix} No rebalancing needed according to the defined balanciness.')
-                print('No rebalancing needed according to the defined balanciness.')
+                logging.info(f'{info_prefix} Rebalancing will be performed in parallel.')
+
+    else:
+        logging.info(f'{info_prefix} No rebalancing needed.')


-def print_table_cli(table):
+def __create_json_output(vm_statistics_rebalanced, app_args):
+    """ Create a machine parsable json output of VM rebalance statistics. """
+    info_prefix = 'Info: [json-output-generator]:'
+
+    if app_args.json:
+        logging.info(f'{info_prefix} Printing json output of VM statistics.')
+        print(json.dumps(vm_statistics_rebalanced))


+def __create_cli_output(vm_statistics_rebalanced, app_args):
+    """ Create output for CLI when running in dry-run mode. """
+    info_prefix_dry_run = 'Info: [cli-output-generator-dry-run]:'
+    info_prefix_run = 'Info: [cli-output-generator]:'
+    vm_to_node_list = []
+
+    if app_args.dry_run:
+        info_prefix = info_prefix_dry_run
+        logging.info(f'{info_prefix} Starting dry-run to rebalance vms to their new nodes.')
+    else:
+        info_prefix = info_prefix_run
+        logging.info(f'{info_prefix} Start rebalancing vms to their new nodes.')

+    vm_to_node_list.append(['VM', 'Current Node', 'Rebalanced Node', 'VM Type'])
+    for vm_name, vm_values in vm_statistics_rebalanced.items():
+        vm_to_node_list.append([vm_name, vm_values['node_parent'], vm_values['node_rebalance'], vm_values['type']])

+    if len(vm_statistics_rebalanced) > 0:
+        logging.info(f'{info_prefix} Printing cli output of VM rebalancing.')
+        __print_table_cli(vm_to_node_list, app_args.dry_run)
+    else:
+        logging.info(f'{info_prefix} No rebalancing needed.')


+def __print_table_cli(table, dry_run=False):
     """ Pretty print a given table to the cli. """
+    info_prefix_dry_run = 'Info: [cli-output-generator-table-dry-run]:'
+    info_prefix_run = 'Info: [cli-output-generator-table]:'
+    info_prefix = info_prefix_run
+
     longest_cols = [
         (max([len(str(row[i])) for row in table]) + 3)
         for i in range(len(table[0]))
     ]

     row_format = "".join(["{:>" + str(longest_col) + "}" for longest_col in longest_cols])

     for row in table:
-        print(row_format.format(*row))
+        # Print CLI output when running in dry-run mode to make the user's life easier.
+        if dry_run:
+            info_prefix = info_prefix_dry_run
+            print(row_format.format(*row))
+
+        # Log all items in info mode.
+        logging.info(f'{info_prefix} {row_format.format(*row)}')

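The column sizing used by the table printer above (widest cell per column plus three characters of padding, joined into a right-aligned format string) can be exercised in isolation; a minimal sketch with illustrative names:

```python
def format_table(table: list) -> list:
    """Right-align each column to its widest cell plus three spaces of
    padding, the same technique used by the CLI table printer above."""
    widths = [max(len(str(row[i])) for row in table) + 3 for i in range(len(table[0]))]
    row_format = "".join("{:>" + str(w) + "}" for w in widths)
    return [row_format.format(*row) for row in table]

for line in format_table([['VM', 'Node'], ['testvm01', 'dummynode01']]):
    print(line)
```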
+def run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations):
+    """ Run rebalancing of vms to new nodes in cluster. """
+    __run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations)
+    __create_json_output(vm_statistics_rebalanced, app_args)
+    __create_cli_output(vm_statistics_rebalanced, app_args)


 def main():
     """ Run ProxLB for balancing VM workloads across a Proxmox cluster. """
-    # Initialize PAS.
-    initialize_logger('CRITICAL', 'SystemdHandler()')
+    # Initialize ProxLB.
+    initialize_logger('CRITICAL')
     app_args = initialize_args()
     config_path = initialize_config_path(app_args)
     pre_validations(config_path)

-    # Parse global config
-    proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, \
-        balanciness, ignore_nodes, ignore_vms, daemon, schedule = initialize_config_options(config_path)
+    # Parse global config.
+    proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, balancing_mode, balancing_mode_option, balancing_type, \
+        balanciness, parallel_migrations, ignore_nodes, ignore_vms, master_only, daemon, schedule, log_verbosity = initialize_config_options(config_path)
+
+    # Overwrite logging handler with user defined log verbosity.
+    initialize_logger(log_verbosity, update_log_verbosity=True)

     while True:
         # API Authentication.
         api_object = api_connect(proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v)

+        # Get master node of cluster and ensure that ProxLB is only performed on the
+        # cluster master node to avoid ongoing rebalancing.
+        cluster_master, master_only = execute_rebalancing_only_by_master(api_object, master_only)
+
+        # Validate daemon service and skip following tasks when not being the cluster master.
+        if not cluster_master and master_only:
+            validate_daemon(daemon, schedule)
+            continue
+
         # Get metric & statistics for vms and nodes.
         node_statistics = get_node_statistics(api_object, ignore_nodes)
-        vm_statistics = get_vm_statistics(api_object, ignore_vms)
+        vm_statistics = get_vm_statistics(api_object, ignore_vms, balancing_type)
         node_statistics = update_node_statistics(node_statistics, vm_statistics)

         # Calculate rebalancing of vms.
-        node_statistics_rebalanced, vm_statistics_rebalanced = balancing_calculations(balancing_method, node_statistics, vm_statistics, balanciness)
+        node_statistics_rebalanced, vm_statistics_rebalanced = balancing_calculations(balancing_method, balancing_mode, balancing_mode_option,
+                                                                                      node_statistics, vm_statistics, balanciness, rebalance=False, processed_vms=[])

         # Rebalance vms to new nodes within the cluster.
-        run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args)
+        run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations)

-        # Validate for any errors
+        # Validate for any errors.
         post_validations()

-        # Validate daemon service
+        # Validate daemon service.
         validate_daemon(daemon, schedule)

@@ -5,8 +5,10 @@ api_pass: FooBar
 verify_ssl: 1
 [balancing]
 method: memory
+mode: used
 ignore_nodes: dummynode01,dummynode02
 ignore_vms: testvm01,testvm02
 [service]
 daemon: 1
 schedule: 24
+log_verbosity: CRITICAL
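The `proxlb.conf` fragment above is plain INI, so the new `mode` and `log_verbosity` options can be read back with Python's stdlib parser; a minimal sketch (the section and option names mirror the snippet, everything else is illustrative, not ProxLB's loader):

```python
import configparser

config_text = """
[balancing]
method: memory
mode: used

[service]
daemon: 1
schedule: 24
log_verbosity: CRITICAL
"""

# configparser accepts both '=' and ':' as key/value delimiters by default.
parser = configparser.ConfigParser()
parser.read_string(config_text)

balancing_method = parser.get('balancing', 'method', fallback='memory')
balancing_mode = parser.get('balancing', 'mode', fallback='used')
log_verbosity = parser.get('service', 'log_verbosity', fallback='CRITICAL')
print(balancing_method, balancing_mode, log_verbosity)  # memory used CRITICAL
```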