Compare commits

54 Commits

Author SHA1 Message Date
gyptazy
cc663c0518 docs: * Fix the rendering of the possible values of the ProxLB options in the README file
* Mention the privilege separation part on the token generation chapter

Fixes: #209
2025-04-19 06:49:04 +02:00
Florian
40de31bc3b Merge pull request #208 from gyptazy/techdebt/fix-code-style
tecdebt: Adjust code style.
2025-04-18 17:07:01 +02:00
gyptazy
5884d76ff4 tecdebt: Adjust code style. 2025-04-18 16:52:59 +02:00
Florian
7cc59eb6fc Merge pull request #202 from glitchvern/fix/200-requery-zero-guest-cpu-used2
fix: Requery a guest if that running guest reports 0 cpu usage
2025-04-18 16:38:17 +02:00
gyptazy
24b3b35640 fix: Fix the guest type relationship in the logs when a migration job failed (by @gyptazy) [#204]
feature: Providing the API upstream error message when migration fails in debug mode (by @gyptazy) [#205]

Fixes: #204
Fixes: #205
2025-04-18 16:35:02 +02:00
Florian
f2b8829299 Merge pull request #204 from sid3windr/patch-1
Fix default configuration file path in README.md
2025-04-18 12:41:22 +02:00
Tom Laermans
4b64a041cc Fix default configuration file path in README.md
With 1.1.0, the default configuration file changed from proxlb.conf to proxlb.yaml but the README was not fully updated.
2025-04-18 11:04:51 +02:00
glitchvern
bd1157127a fix: limit to 10 requerys per a guest 2025-04-17 16:13:28 +00:00
glitchvern
be6e4bbfa0 fix: Requery a guest if that running guest reports 0 cpu usage 2025-04-16 18:42:27 +00:00
Florian
25b631099c Merge pull request #199 from gyptazy/docs/193-add-chapter-ignore-vm
docs: Add documentation about ignore guests such like VMs or CTs.
2025-04-15 19:23:27 +02:00
gyptazy
1d698c5688 docs: Add documentation about ignore guests such like VMs or CTs.
Fixes: #193
2025-04-15 19:22:10 +02:00
Florian
40f848ad7f Merge pull request #198 from glitchvern/fix/197-remove-hard-coded-memory-usage-from-lowest-usage-node
fix: Use method/mode in configuration to calculate lowest_usage_node
2025-04-15 19:08:52 +02:00
Florian
fd2725c878 Merge pull request #196 from glitchvern/fix/195-cpu-used-times-cpu-cores
fix: set cpu_used to be cpu usage times number of cpu cores
2025-04-15 18:36:25 +02:00
glitchvern
34b1d72e40 fix: Use method and mode specified in configuration to calculate lowest_usage_node 2025-04-15 16:27:08 +00:00
glitchvern
ca7db26976 fix: set cpu_used to be cpu usage times number of cpu cores 2025-04-14 21:23:05 +00:00
Florian
94552f9c9e Merge pull request #194 from crandler/main
Main
2025-04-14 12:44:50 +02:00
Sven Eulberg
32c67b9c96 fix: typos 2025-04-14 12:36:28 +02:00
Florian
89f337d8c3 Merge pull request #192 from gyptazy/tecdebt/185-improve-logging-code
tecdebt: Improve logging handler creation
2025-04-14 06:55:51 +02:00
Florian Paul Azim Hoberg (@gyptazy)
8a724400b8 tecdebt: Improve logging handler creation
Fixes: #185
2025-04-14 06:52:04 +02:00
Florian
f96f1d0f64 Merge pull request #186 from glitchvern/fix/185-logging-handler-for-no-systemd-integration
fix: logging handler for no systemd integration
2025-04-14 06:46:58 +02:00
Florian
15398712ee Merge pull request #190 from mika/mika/docs
docs: Fix minor typos
2025-04-13 11:19:18 +02:00
Florian
ddb9963062 Merge pull request #191 from gyptazy/feature/184-validate-user-permissions
Feature: Add validation for the minimum required permissions of a user in Proxmox.
2025-04-13 11:16:09 +02:00
Florian Paul Azim Hoberg (@gyptazy)
f18a9f3d4c Feature: Add validation for the minimum required permissions of a user in Proxmox.
Fixes: #184
2025-04-13 11:12:30 +02:00
Michael Prokop
1402ba9732 Minor typo fixes
s/connectoing/connecting/
s/furhter/further/
s/interating/iterating/
s/ist/is/
s/maintence/maintenance/
s/performt/performed/
s/ressources/resources/
s/sucessfully/successfully/
s/the the/the/
s/timout/timeout/
s/wether/whether/
2025-04-13 10:48:23 +02:00
Florian
af51f53221 Merge pull request #188 from glitchvern/fix/187-allow-use-of-minutes-instead-of-hours
fix: allow use of minutes instead of hours
2025-04-13 08:49:17 +02:00
glitchvern
bce2d640ef fix: allow use of minutes instead of hours 2025-04-11 23:09:00 +00:00
glitchvern
1bb1847e45 fix: logging handler for no systemd integration 2025-04-11 21:55:09 +00:00
Florian
e9543db138 Merge pull request #182 from gyptazy/change/180-switch-default-balancing-to-used-instead-assigned
change: Change the default banalcing mode to used instead of assigned.
2025-04-10 09:34:19 +02:00
gyptazy
a8e8229787 change: Change the default banalcing mode to used instead of assigned.
Fixes: #180
2025-04-10 09:33:17 +02:00
Florian
d1c91c6f2a Merge pull request #179 from gyptazy/docs/164-adjust-api-token-usage
docs: Adjust docs regarding API Token and privilege separation.
2025-04-07 16:14:40 +02:00
gyptazy
843691f8b4 docs: Adjust docs regarding API Token and priviledge separation.
Fixes: #164
2025-04-07 15:51:44 +02:00
Florian
c9f14946d1 Merge pull request #178 from gyptazy/fix/174-honor-balancing-activation-value
fix: Honor the value when balancing should not be performed and stop balancing.
2025-04-07 15:41:02 +02:00
gyptazy
77cd7b5388 fix: Honor the value when balancing should not be performed and stop balancing.
Fixes: #174
2025-04-07 15:38:32 +02:00
Florian
55502f9bed Merge pull request #177 from gyptazy/change/176-change-turn-daemon-mode-on-default
change: Change the default behaviour of the daemon mode to active.
2025-04-07 15:28:12 +02:00
gyptazy
f08b823cc4 change: Change the default behaviour of the daemon mode to active.
Fixes: #176
2025-04-07 15:25:10 +02:00
Florian
f831d4044f Merge pull request #175 from gyptazy/feature/168-add-more-flexible-schedule-timers
feature: Add a more flexible way to define schedules directly in minutes or hours
2025-04-07 15:20:22 +02:00
gyptazy
e8d8d160a7 feature: Add a more flexible way to define schedules directly in minutes or hours. [#168]
Sponsored-by: @gyptazy
Fixes: #168
2025-04-07 15:16:55 +02:00
Florian
dbbd4c0ec8 Merge pull request #172 from gyptazy/changelog/171-set-correct-python-path-docker-image
changelog: Add changelog for: Fix Python 3 path for Docker entrypoint
2025-04-02 07:24:01 +02:00
Florian
fc9a0e2858 Merge pull request #171 from crandler/main
fix: path correction for docker entrypoint
2025-04-02 07:23:48 +02:00
gyptazy
17eb43db94 changelog: Add changelog for: Fix Python 3 path for Docker entrypoint
Sponsored-by: @crandler
Fixes: #170
Fixes: #171
2025-04-02 07:20:15 +02:00
Sven Eulberg
06610e9b9d Path correction 2025-04-01 18:38:58 +02:00
Florian
889b88fd6c Merge pull request #167 from gyptazy/prep/1.1.1
release: Prepare development branch for release 1.1.1
2025-04-01 08:03:36 +02:00
gyptazy
c5ca3e13e0 release: Prepare development branch for release 1.1.1 2025-04-01 08:02:40 +02:00
Florian
c1c524f092 Merge pull request #166 from gyptazy/fix/163-ignore-vm-tag
fix: Fix tag evluation for VMs for being ignored for further balancing
2025-04-01 07:01:14 +02:00
gyptazy
7ea7defa1f fix: Fix tag evluation for VMs for being ignored for further balancing
Fixes: #163
Fixes: #165
2025-04-01 06:51:42 +02:00
Florian
6147c0085b Merge pull request #161 from gyptazy/fix/spell-docs
fix: Adjust spelling in the docs
2025-03-31 07:39:40 +02:00
gyptazy
0b70a9c767 fix: Adjust spelling in the docs 2025-03-31 07:38:04 +02:00
Florian
d6d22c4096 Merge pull request #160 from gyptazy/fix/142-mutal-exclusive-on-pass
fix: Fix mutal exclusive authentication based on secrets.
2025-03-31 06:50:26 +02:00
gyptazy
6da54c1255 fix: Fix mutal exclusive authentication based on secrets.
Fixes: #142
2025-03-31 06:46:31 +02:00
Florian
b55b4ea7a0 Merge pull request #153 from gyptazy/docs/installation
release: Prepare release 1.1.0
2025-03-31 05:15:05 +02:00
Florian
51625fe09e Merge pull request #159 from gyptazy/feature/json-output
fix: Add JSON output again
2025-03-25 09:34:10 +01:00
Florian Paul Azim Hoberg (@gyptazy)
f3b9d33c87 fix: Add JSON output again
Fixes: #158
2025-03-25 09:28:33 +01:00
Florian
8e4326f77a Merge pull request #156 from gyptazy/fix/137-fix-systemd-unit
fix: Fix the systemd unit file to start after the pveproxy daemon
2025-03-24 18:25:10 +01:00
gyptazy
3d642a7404 fix: Fix the systemd unit file to start after the pveproxy daemon
Fixes: #137
2025-03-24 18:15:11 +01:00
35 changed files with 285 additions and 104 deletions

@@ -0,0 +1,2 @@
fixed:
- Fix the systemd unit file to start ProxLB after pveproxy (by @robertdahlem). [#137]

@@ -0,0 +1,2 @@
fixed:
- Fix tag evaluation for VMs for being ignored for further balancing [#163]

@@ -0,0 +1,2 @@
fixed:
- Improve logging verbosity of messages that had a wrong severity [#165]

@@ -0,0 +1,2 @@
feature:
- Add a more flexible way to define schedules in minutes or hours (by @gyptazy) [#168]

@@ -0,0 +1,2 @@
fixed:
- Fix Python path for Docker entrypoint (by @crandler) [#170]

@@ -0,0 +1,2 @@
fixed:
- Honor the value when balancing should not be performed and stop balancing [#174]

@@ -0,0 +1,2 @@
changed:
- Change the default behaviour of the daemon mode to active [#176]

@@ -0,0 +1,2 @@
changed:
- Change the default balancing mode to used instead of assigned [#180]

@@ -0,0 +1,2 @@
feature:
- Add validation for the minimum required permissions of a user in Proxmox [#184]

@@ -0,0 +1,2 @@
fix:
- add handler to log messages with severity less than info to the screen when there is no systemd integration, for instance, inside a docker container (by @glitchvern) [#185]
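
The entry above can be sketched roughly as follows. This is a minimal illustration, not ProxLB's actual logger: the function name `build_logger` and the `use_systemd` flag are hypothetical, and the only library calls used are from Python's standard `logging` module.

```python
import logging
import sys


def build_logger(name: str, use_systemd: bool = False) -> logging.Logger:
    """Return a logger that prints to stdout when no systemd handler is used."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Hypothetical fallback: without systemd integration (for instance in a
    # container), attach a stream handler so that messages of every level,
    # including those below INFO, reach the screen.
    if not use_systemd and not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setLevel(logging.DEBUG)
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


container_logger = build_logger("proxlb-sketch")
container_logger.debug("visible on stdout even without systemd")
```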

@@ -0,0 +1,2 @@
fixed:
- allow the use of minutes instead of hours and only accept hours or minutes in the format (by @glitchvern) [#187]

@@ -0,0 +1,2 @@
fixed:
- Set cpu_used to the cpu usage, which is a percent, times the total number of cores to get a number where guest cpu_used can be added to nodes cpu_used and be meaningful (by @glitchvern) [#195]
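
A small worked example of the arithmetic described above. It assumes the reported CPU usage is a ratio between 0 and 1, which is an assumption for illustration; the function name and the sample guests are hypothetical.

```python
def cpu_used_in_cores(cpu_usage: float, cpu_cores: int) -> float:
    """Scale a utilisation ratio by the core count to get 'cores in use'."""
    return cpu_usage * cpu_cores


# Hypothetical guests: usage ratio and number of assigned cores.
guests = [
    {"name": "vm01", "cpu_usage": 0.50, "cpu_cores": 4},
    {"name": "ct01", "cpu_usage": 0.25, "cpu_cores": 2},
]

# Once expressed in cores, guest values can be summed meaningfully.
node_cpu_used = sum(cpu_used_in_cores(g["cpu_usage"], g["cpu_cores"]) for g in guests)
```

With these sample values, `vm01` contributes 2.0 cores and `ct01` contributes 0.5 cores, so the values add up on the same scale.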

@@ -0,0 +1,2 @@
fixed:
- Remove hard coded memory usage from lowest usage node and use method and mode specified in configuration instead (by @glitchvern) [#197]
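
A self-contained sketch of the selection described above. It mirrors the `{method}_{mode}_percent` key pattern visible in the `Calculations` hunk of this comparison; the sample node data and the function name are made up for illustration.

```python
from typing import Any, Dict


def pick_lowest_usage_node(proxlb_data: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the node with the lowest usage for the configured method and mode."""
    balancing = proxlb_data["meta"]["balancing"]
    method = balancing.get("method", "memory")
    mode = balancing.get("mode", "used")
    # Nodes in maintenance are not valid targets.
    candidates = [n for n in proxlb_data["nodes"].values() if not n["maintenance"]]
    return min(candidates, key=lambda n: n[f"{method}_{mode}_percent"])


# Hypothetical cluster state: virt01 has less memory in use, virt02 less CPU.
sample = {
    "meta": {"balancing": {"method": "cpu", "mode": "used"}},
    "nodes": {
        "virt01": {"name": "virt01", "maintenance": False,
                   "memory_used_percent": 20.0, "cpu_used_percent": 70.0},
        "virt02": {"name": "virt02", "maintenance": False,
                   "memory_used_percent": 60.0, "cpu_used_percent": 10.0},
    },
}
best = pick_lowest_usage_node(sample)
```

With `method: cpu` the sketch selects `virt02`, whereas a hard-coded memory key would have selected `virt01`.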

@@ -0,0 +1,2 @@
fixed:
- Requery a guest if that running guest reports 0 cpu usage (by @glitchvern) [#200]
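
The requery behaviour, together with the limit of 10 requeries mentioned in the commit list, could look roughly like the sketch below. The helper name, the `fetch_cpu` callable, and the absence of a pause between attempts are all assumptions; the real implementation may differ.

```python
from typing import Callable

MAX_REQUERIES = 10  # limit named in the commit message; constant name is hypothetical


def query_cpu_with_requery(fetch_cpu: Callable[[], float],
                           max_requeries: int = MAX_REQUERIES) -> float:
    """Query CPU usage and query again while a running guest reports 0."""
    cpu = fetch_cpu()
    attempts = 0
    while cpu == 0 and attempts < max_requeries:
        cpu = fetch_cpu()
        attempts += 1
    return cpu


# Simulated API readings: two empty samples before a usable value arrives.
readings = iter([0.0, 0.0, 0.12])
result = query_cpu_with_requery(lambda: next(readings))
```

If every reading stays at 0, the loop stops after the initial query plus ten requeries and returns 0 instead of blocking forever.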

@@ -0,0 +1,2 @@
fixed:
- Fix the guest type relationship in the logs when a migration job failed (by @gyptazy) [#204]

@@ -0,0 +1,2 @@
added:
- Providing the API upstream error message when migration fails in debug mode (by @gyptazy) [#205]

@@ -0,0 +1 @@
date: TBD

@@ -25,4 +25,4 @@ COPY requirements.txt /app/requirements.txt
RUN pip install --break-system-packages -r /app/requirements.txt
# Set the entry point to use the virtual environment's python
ENTRYPOINT ["/bin/python3", "/app/proxlb/main.py"]
ENTRYPOINT ["/usr/bin/python3", "/app/proxlb/main.py"]

@@ -20,6 +20,7 @@
6. [Affinity & Anti-Affinity Rules](#affinity--anti-affinity-rules)
1. [Affinity Rules](#affinity-rules)
2. [Anti-Affinity Rules](#anti-affinity-rules)
3. [Ignore VMs](#ignore-vms)
7. [Maintenance](#maintenance)
8. [Misc](#misc)
1. [Bugs](#bugs)
@@ -153,7 +154,7 @@ vi proxlb.yaml
docker run -it --rm -v $(pwd)/proxlb.yaml:/etc/proxlb/proxlb.yaml proxlb
```
*Note: ProxLB container images are officially only available at cr.proxlb.de and cr.gyptazy.com.*
*Note: ProxLB container images are officially only available at cr.proxlb.de and cr.gyptazy.com.*
#### Overview of Images
| Version | Image |
@@ -231,35 +232,38 @@ See also: [#65: Host groups: Honour HA groups](https://github.com/gyptazy/ProxLB
### Options
The following options can be set in the configuration file `proxlb.yaml`:
| Section | Option | Example | Type | Description |
|------|:------:|:------:|:------:|:------:|
| `proxmox_api` | | | | |
| | hosts | ['virt01.example.com', '10.10.10.10', 'fe01::bad:code::cafe'] | `List` | List of Proxmox nodes. Can be IPv4, IPv6 or mixed. |
| | user | root@pam | `Str` | Username for the API. |
| | pass | FooBar | `Str` | Password for the API. (Recommended: Use API token authorization!) |
| | token_id | proxlb | `Str` | Token ID of the user for the API. |
| | token_secret | 430e308f-1337-1337-beef-1337beefcafe | `Str` | Secret of the token ID for the API. |
| | ssl_verification | True | `Bool` | Validate SSL certificates (1) or ignore (0). (default: 1, type: bool) |
| | timeout | 10 | `Int` | Timeout for the Proxmox API in sec. (default: 10) |
| `proxmox_cluster` | | | | |
| | maintenance_nodes | ['virt66.example.com'] | `List` | A list of Proxmox nodes that are defined to be in a maintenance. (default: []) |
| | ignore_nodes | [] | `List` | A list of Proxmox nodes that are defined to be ignored. (default: []) |
| | overprovisioning | False | `Bool` | Avoids balancing when nodes would become overprovisioned. |
| `balancing` | | | | |
| | enable | True | `Bool` | Enables the guest balancing. (default: True)|
| | enforce_affinity | True | `Bool` | Enforcing affinity/anti-affinity rules but balancing might become worse. (default: False) |
| | parallel | False | `Bool` | If guests should be moved in parallel or sequentially. (default: False)|
| | live | True | `Bool` | If guests should be moved live or shutdown. (default: True)|
| | with_local_disks | True | `Bool` | If balancing of guests should include local disks (default: True)|
| | balance_types | ['vm', 'ct'] | `List` | Defined the types of guests that should be honored. (default: ['vm', 'ct']) |
| | max_job_validation | 1800 | `Int` | How long a job validation may take in seconds. (default: 1800) |
| | balanciness | 10 | `Int` | The maximum delta of resource usage between node with highest and lowest usage. (default: 10) |
| | method | memory | `Str` | The balancing method that should be used. (default: memory | choices: memory, cpu, disk)|
| | mode | used | `Str` | The balancing mode that should be used. (default: used | choices: used, assigned)|
| `service` | | | | |
| | daemon | False | `Bool` | If daemon mode should be activated (default: False)|
| | schedule | 12 | `Int` | How often rebalancing should occur in hours in daemon mode (default: 12)|
| | log_level | INFO | `Str` | Defines the default log level that should be logged. (default: INFO) |
| Section | Option | Sub Option | Example | Type | Description |
|---------|:------:|:----------:|:-------:|:----:|:-----------:|
| `proxmox_api` | | | | | |
| | hosts | | ['virt01.example.com', '10.10.10.10', 'fe01::bad:code::cafe'] | `List` | List of Proxmox nodes. Can be IPv4, IPv6 or mixed. |
| | user | | root@pam | `Str` | Username for the API. |
| | pass | | FooBar | `Str` | Password for the API. (Recommended: Use API token authorization!) |
| | token_id | | proxlb | `Str` | Token ID of the user for the API. |
| | token_secret | | 430e308f-1337-1337-beef-1337beefcafe | `Str` | Secret of the token ID for the API. |
| | ssl_verification | | True | `Bool` | Validate SSL certificates (1) or ignore (0). [values: `1` (default), `0`] |
| | timeout | | 10 | `Int` | Timeout for the Proxmox API in sec. |
| `proxmox_cluster` | | | | | |
| | maintenance_nodes | | ['virt66.example.com'] | `List` | A list of Proxmox nodes that are defined to be in a maintenance. |
| | ignore_nodes | | [] | `List` | A list of Proxmox nodes that are defined to be ignored. |
| | overprovisioning | | False | `Bool` | Avoids balancing when nodes would become overprovisioned. |
| `balancing` | | | | | |
| | enable | | True | `Bool` | Enables the guest balancing.|
| | enforce_affinity | | True | `Bool` | Enforcing affinity/anti-affinity rules but balancing might become worse. |
| | parallel | | False | `Bool` | If guests should be moved in parallel or sequentially.|
| | live | | True | `Bool` | If guests should be moved live or shutdown.|
| | with_local_disks | | True | `Bool` | If balancing of guests should include local disks.|
| | balance_types | | ['vm', 'ct'] | `List` | Defined the types of guests that should be honored. [values: `vm`, `ct`]|
| | max_job_validation | | 1800 | `Int` | How long a job validation may take in seconds. (default: 1800) |
| | balanciness | | 10 | `Int` | The maximum delta of resource usage between node with highest and lowest usage. |
| | method | | memory | `Str` | The balancing method that should be used. [values: `memory` (default), `cpu`, `disk`]|
| | mode | | used | `Str` | The balancing mode that should be used. [values: `used` (default), `assigned`] |
| `service` | | | | | |
| | daemon | | True | `Bool` | If daemon mode should be activated. |
| | `schedule` | | | `Dict` | Schedule config block for rebalancing. |
| | | interval | 12 | `Int` | How often rebalancing should occur in daemon mode.|
| | | format | hours | `Str` | Sets the time format. [values: `hours` (default), `minutes`]|
| | log_level | | INFO | `Str` | Defines the default log level that should be logged. [values: `INFO` (default), `WARNING`, `CRITICAL`, `DEBUG`] |
An example of the configuration file looks like:
```
@@ -287,11 +291,13 @@ balancing:
max_job_validation: 1800
balanciness: 5
method: memory
mode: assigned
mode: used
service:
daemon: True
schedule: 12
schedule:
interval: 12
format: hours
log_level: INFO
```
@@ -300,7 +306,7 @@ The following options and parameters are currently supported:
| Option | Long Option | Description | Default |
|------|:------:|------:|------:|
| -c | --config | Path to a config file. | /etc/proxlb/proxlb.conf (default) |
| -c | --config | Path to a config file. | /etc/proxlb/proxlb.yaml (default) |
| -d | --dry-run | Performs a dry-run without doing any actions. | False |
| -j | --json | Returns a JSON of the VM movement. | False |
| -b | --best-node | Returns the best next node for a VM/CT placement (useful for further usage with Terraform/Ansible). | False |
@@ -337,6 +343,20 @@ As a result, ProxLB will try to place the VMs with the `plb_anti_affinity_ntp` t
**Note:** While this ensures that ProxLB tries to distribute these VMs across different physical hosts within the Proxmox cluster, this may not always work. If you have more guests attached to the group than nodes in the cluster, we still need to run them anywhere. If this case occurs, the next one with the most free resources will be selected.
### Ignore VMs / CTs
<img align="left" src="https://cdn.gyptazy.com/images/proxlb-ignore-vm-movement.jpg"/> Guests, such as VMs or CTs, can also be completely ignored. This means they won't be affected by any migration (even when (anti-)affinity rules are enforced). To ensure a proper resource evaluation, these guests are still collected and evaluated but simply skipped for balancing actions. Another aspect is the implementation: the ProxLB configuration file may have very restrictive file permissions, so that it is only readable and writable by the Proxmox administrators. However, there may be users and groups who want to define on their own that their systems shouldn't be moved. Therefore, these users can simply set a specific tag on the guest object, just like with the (anti-)affinity rules.
To define a guest to be ignored from the balancing, users assign a tag with the prefix `plb_ignore_$TAG`:
#### Example for Screenshot
```
plb_ignore_dev
```
As a result, ProxLB will not migrate this guest with the `plb_ignore_dev` tag to any other node.
**Note:** Ignored guests are really ignored. Even by enforcing affinity rules this guest will be ignored.
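
The tag check described above can be sketched as follows. This is an illustration only: it assumes the guest's tags are already available as a list of strings, and the function name is hypothetical.

```python
from typing import Iterable

IGNORE_PREFIX = "plb_ignore"  # prefix taken from the documentation above


def is_ignored(tags: Iterable[str]) -> bool:
    """Return True if any tag marks the guest as excluded from balancing."""
    return any(tag.startswith(IGNORE_PREFIX) for tag in tags)


# Example guests: only the first one carries an ignore tag.
dev_guest_ignored = is_ignored(["plb_ignore_dev", "backup"])
ntp_guest_ignored = is_ignored(["plb_anti_affinity_ntp"])
```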
## Maintenance
<img src="https://cdn.gyptazy.com/images/proxlb-rebalancing-demo.gif"/>
@@ -375,4 +395,4 @@ Connect with us in our dedicated chat room for immediate support and live intera
**Note:** Please always keep in mind that this is a one-man show project without any further help. This includes coding, testing, packaging and all the infrastructure around it to keep this project up and running.
### Author(s)
* Florian Paul Azim Hoberg @gyptazy (https://gyptazy.com)
* Florian Paul Azim Hoberg @gyptazy (https://gyptazy.com)

@@ -23,9 +23,11 @@ balancing:
max_job_validation: 1800
balanciness: 5
method: memory
mode: assigned
mode: used
service:
daemon: False
schedule: 12
daemon: True
schedule:
interval: 12
format: hours
log_level: INFO

debian/changelog vendored

@@ -1,3 +1,10 @@
proxlb (1.1.1) stable; urgency=medium
* Fix tag evaluation for VMs for being ignored for further balancing. (Closes: #163)
* Improve logging verbosity of messages that had a wrong severity. (Closes: #165)
-- Florian Paul Azim Hoberg <gyptazy@gyptazy.com> Tue, 1 Apr 2025 18:55:02 +0000
proxlb (1.1.0) stable; urgency=medium
* Refactored code base of ProxLB. (Closes: #114)

@@ -10,6 +10,7 @@
1. [Affinity Rules](#affinity-rules)
2. [Anti-Affinity Rules](#anti-affinity-rules)
3. [Affinity / Anti-Affinity Enforcing](#affinity--anti-affinity-enforcing)
4. [Ignore VMs](#ignore-vms)
2. [API Loadbalancing](#api-loadbalancing)
3. [Ignore Host-Nodes or Guests](#ignore-host-nodes-or-guests)
4. [IPv6 Support](#ipv6-support)
@@ -38,18 +39,21 @@ pveum acl modify / --roles proxlb --users proxlb@pve
*Note: The user management can also be done on the WebUI without invoking the CLI.*
### Creating an API Token for a User
Create an API token for user proxlb@pve with token ID proxlb and deactivated privilege separation:
```
# Create an API token for user proxlb@pve with token ID proxlb
pveum user token add proxlb@pve proxlb
pveum user token add proxlb@pve proxlb --privsep 0
```
Afterwards, you get the token secret returned. You can now add those entries to your ProxLB config.
Afterwards, you get the token secret returned. You can now add those entries to your ProxLB config. Make sure that you also keep the `user` parameter next to the new token parameters.
> [!IMPORTANT]
> The parameter `pass` then needs to be **absent**! You should also take care about the privilege and authentication mechanism behind Proxmox. You might want or even might not want to use privilege separation and this is up to your personal needs and use case.
| Proxmox API | ProxLB Config | Example |
|---|---|---|
| User | [user](https://github.com/gyptazy/ProxLB/blob/main/config/proxlb_example.yaml#L3) | proxlb@pve |
| Token ID | [token_id](https://github.com/gyptazy/ProxLB/blob/main/config/proxlb_example.yaml#L6) | proxlb |
| Secret | [token_secret](https://github.com/gyptazy/ProxLB/blob/main/config/proxlb_example.yaml#L7) | 430e308f-1337-1337-beef-1337beefcafe |
| Token Secret | [token_secret](https://github.com/gyptazy/ProxLB/blob/main/config/proxlb_example.yaml#L7) | 430e308f-1337-1337-beef-1337beefcafe |
*Note: The API token configuration can also be done on the WebUI without invoking the CLI.*
@@ -106,8 +110,22 @@ balancing:
*Note: This may have impacts to the cluster. Depending on the created group matrix, the result may also be an unbalanced cluster.*
### Ignore VMs / CTs
<img align="left" src="https://cdn.gyptazy.com/images/proxlb-ignore-vm-movement.jpg"/> Guests, such as VMs or CTs, can also be completely ignored. This means they won't be affected by any migration (even when (anti-)affinity rules are enforced). To ensure a proper resource evaluation, these guests are still collected and evaluated but simply skipped for balancing actions. Another aspect is the implementation: the ProxLB configuration file may have very restrictive file permissions, so that it is only readable and writable by the Proxmox administrators. However, there may be users and groups who want to define on their own that their systems shouldn't be moved. Therefore, these users can simply set a specific tag on the guest object, just like with the (anti-)affinity rules.
To define a guest to be ignored from the balancing, users assign a tag with the prefix `plb_ignore_$TAG`:
#### Example for Screenshot
```
plb_ignore_dev
```
As a result, ProxLB will not migrate this guest with the `plb_ignore_dev` tag to any other node.
**Note:** Ignored guests are really ignored. Even by enforcing affinity rules this guest will be ignored.
### API Loadbalancing
ProxLB supports API loadbalancing, where one or more host objects can be defined as a list. This ensures, that you can even operator ProxLB without furhter changes when one or more nodes are offline or in a maintence. When defining multiple hosts, the first reachable one will be picked.
ProxLB supports API loadbalancing, where one or more host objects can be defined as a list. This ensures that you can even operate ProxLB without further changes when one or more nodes are offline or in maintenance. When defining multiple hosts, the first reachable one will be picked.
```
proxmox_api:
@@ -155,13 +173,15 @@ The proxlb systemd unit orchestrates the ProxLB application. ProxLB can be used
```
service:
daemon: False
schedule: 12
schedule:
interval: 12
format: hours
```
In this configuration:
* `daemon`: False indicates that the ProxLB application is not running as a daemon and will execute as a one-shot solution.
* `schedule`: 12 defines the schedule in hours, specifying how often rebalancing should be done if running as a daemon.
* `interval`: 12 defines the interval for the schedule, specifying how often rebalancing should be done if running as a daemon.
* `format`: Defines the given format of schedule where you can choose between `hours` or `minutes`.
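
To make the relation between `interval` and `format` concrete, here is a minimal sketch of how such a block could be turned into a sleep duration. The function name and the error handling are assumptions; only the keys `interval` and `format` and the values `hours` and `minutes` come from the documentation above.

```python
from typing import Any, Dict

# Seconds per unit for the two documented formats.
SECONDS_PER_UNIT = {"hours": 3600, "minutes": 60}


def schedule_to_seconds(schedule: Dict[str, Any]) -> int:
    """Convert a schedule block into the number of seconds between runs."""
    interval = int(schedule.get("interval", 12))
    time_format = schedule.get("format", "hours")
    if time_format not in SECONDS_PER_UNIT:
        raise ValueError(f"Unsupported schedule format: {time_format!r}")
    return interval * SECONDS_PER_UNIT[time_format]


every_twelve_hours = schedule_to_seconds({"interval": 12, "format": "hours"})
every_half_hour = schedule_to_seconds({"interval": 30, "format": "minutes"})
```

Rejecting unknown formats matches the changelog note that only hours or minutes are accepted.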
### SSL Self-Signed Certificates
If you are using SSL self-signed certificates or non-valid certificated in general and do not want to deal with additional trust levels, you may also disable the SSL validation. This may mostly be helpful for dev- & test labs.

@@ -1,5 +1,5 @@
#!/usr/bin/env bash
VERSION="1.1.0"
VERSION="1.1.1"
sed -i "s/^__version__ = .*/__version__ = \"$VERSION\"/" "proxlb/utils/version.py"
sed -i "s/version=\"[0-9]*\.[0-9]*\.[0-9]*\"/version=\"$VERSION\"/" setup.py

@@ -54,7 +54,7 @@ def main():
# Get all required objects from the Proxmox cluster
meta = {"meta": proxlb_config}
nodes = Nodes.get_nodes(proxmox_api, proxlb_config)
guests = Guests.get_guests(proxmox_api, nodes)
guests = Guests.get_guests(proxmox_api, nodes, meta)
groups = Groups.get_groups(guests, nodes)
# Merge obtained objects from the Proxmox cluster for further usage
@@ -71,9 +71,12 @@ def main():
Helper.log_node_metrics(proxlb_data, init=False)
# Perform balancing actions via Proxmox API
if not cli_args.dry_run:
if not cli_args.dry_run or not proxlb_data["meta"]["balancing"].get("enable", False):
Balancing(proxmox_api, proxlb_data)
# Validate if the JSON output should be
# printed to stdout
Helper.print_json(proxlb_data, cli_args.json)
# Validate daemon mode
Helper.get_daemon_mode(proxlb_config)
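
For readers checking the balancing guard in this hunk against the changelog entry for #174, the sketch below contrasts the condition as it reads in the hunk with a guard that matches the entry's wording ("stop balancing" when it should not be performed). Both function names are hypothetical and the second function is an assumption about the intent, not the project's code.

```python
def condition_as_in_hunk(dry_run: bool, enabled: bool) -> bool:
    """The boolean expression as it appears in the hunk above."""
    return not dry_run or not enabled


def should_balance(dry_run: bool, enabled: bool) -> bool:
    """Hypothetical guard: act only on a real run with balancing enabled."""
    return not dry_run and enabled


# Truth table over (dry_run, enabled) for both variants.
table = {
    (dry_run, enabled): (condition_as_in_hunk(dry_run, enabled),
                         should_balance(dry_run, enabled))
    for dry_run in (False, True)
    for enabled in (False, True)
}
```

The two variants agree only when dry-run is off and balancing is enabled. With balancing disabled, the `or not` form still evaluates to true.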

@@ -52,22 +52,30 @@ class Balancing:
"""
for guest_name, guest_meta in proxlb_data["guests"].items():
# Check if the guest's target is not the same as the current node
if guest_meta["node_current"] != guest_meta["node_target"]:
guest_id = guest_meta["id"]
guest_node_current = guest_meta["node_current"]
guest_node_target = guest_meta["node_target"]
# Check if the guest is not ignored and perform the balancing
# operation based on the guest type
if not guest_meta["ignore"]:
guest_id = guest_meta["id"]
guest_node_current = guest_meta["node_current"]
guest_node_target = guest_meta["node_target"]
# VM Balancing
if guest_meta["type"] == "vm":
self.exec_rebalancing_vm(proxmox_api, proxlb_data, guest_name)
# VM Balancing
if guest_meta["type"] == "vm":
self.exec_rebalancing_vm(proxmox_api, proxlb_data, guest_name)
# CT Balancing
elif guest_meta["type"] == "ct":
self.exec_rebalancing_ct(proxmox_api, proxlb_data, guest_name)
# CT Balancing
elif guest_meta["type"] == "ct":
self.exec_rebalancing_ct(proxmox_api, proxlb_data, guest_name)
# Hopefully never reaching, but should be catched
# Just in case we get a new type of guest in the future
else:
logger.critical(f"Balancing: Got unexpected guest type: {guest_meta['type']}. Cannot proceed guest: {guest_meta['name']}.")
else:
logger.critical(f"Balancing: Got unexpected guest type: {guest_meta['type']}. Cannot proceed guest: {guest_meta['name']}.")
logger.debug(f"Balancing: Guest {guest_name} is ignored and will not be rebalanced.")
else:
logger.debug(f"Balancing: Guest {guest_name} is already on the target node {guest_meta['node_target']} and will not be rebalanced.")
def exec_rebalancing_vm(self, proxmox_api: any, proxlb_data: Dict[str, Any], guest_name: str) -> None:
"""
@@ -108,10 +116,10 @@ class Balancing:
try:
logger.debug(f"Balancing: Starting to migrate guest {guest_name} of type VM.")
job_id = proxmox_api.nodes(guest_node_current).qemu(guest_id).migrate().post(**migration_options)
job = self.get_rebalancing_job_status(proxmox_api, proxlb_data, guest_name, guest_node_current, job_id)
self.get_rebalancing_job_status(proxmox_api, proxlb_data, guest_name, guest_node_current, job_id)
except proxmoxer.core.ResourceException as proxmox_api_error:
logger.critical(f"Balancing: Failed to migrate guest {guest_name} of type CT due to some Proxmox errors. Please check if resource is locked or similar.")
logger.critical(f"Balancing: Failed to migrate guest {guest_name} of type VM due to some Proxmox errors. Please check if resource is locked or similar.")
logger.debug(f"Balancing: Failed to migrate guest {guest_name} of type VM due to some Proxmox errors: {proxmox_api_error}")
logger.debug("Finished: exec_rebalancing_vm.")
def exec_rebalancing_ct(self, proxmox_api: any, proxlb_data: Dict[str, Any], guest_name: str) -> None:
@@ -137,10 +145,10 @@ class Balancing:
try:
logger.debug(f"Balancing: Starting to migrate guest {guest_name} of type CT.")
job_id = proxmox_api.nodes(guest_node_current).lxc(guest_id).migrate().post(target=guest_node_target, restart=1)
job = self.get_rebalancing_job_status(proxmox_api, proxlb_data, guest_name, guest_node_current, job_id)
self.get_rebalancing_job_status(proxmox_api, proxlb_data, guest_name, guest_node_current, job_id)
except proxmoxer.core.ResourceException as proxmox_api_error:
logger.critical(f"Balancing: Failed to migrate guest {guest_name} of type CT due to some Proxmox errors. Please check if resource is locked or similar.")
logger.debug(f"Balancing: Failed to migrate guest {guest_name} of type CT due to some Proxmox errors: {proxmox_api_error}")
logger.debug("Finished: exec_rebalancing_ct.")
def get_rebalancing_job_status(self, proxmox_api: any, proxlb_data: Dict[str, Any], guest_name: str, guest_current_node: str, job_id: int, retry_counter: int = 1) -> bool:
@@ -184,7 +192,7 @@ class Balancing:
if job["status"] == "stopped":
if job["exitstatus"] == "OK":
logger.debug(f"Balancing: Job ID {job_id} (guest: {guest_name}) was sucessfully.")
logger.debug(f"Balancing: Job ID {job_id} (guest: {guest_name}) was successfully.")
logger.debug("Finished: get_rebalancing_job_status.")
return True
else:

@@ -66,7 +66,7 @@ class Calculations:
@staticmethod
def set_node_assignments(proxlb_data: Dict[str, Any]) -> Dict[str, Any]:
"""
Set the assigned ressources of the nodes based on the current assigned
Set the assigned resources of the nodes based on the current assigned
guest resources by their created groups as an initial base.
Args:
@@ -119,10 +119,8 @@ class Calculations:
if method_value_highest - method_value_lowest > balanciness:
proxlb_data["meta"]["balancing"]["balance"] = True
logger.debug(f"Guest balancing is required. Highest value: {method_value_highest}, lowest value: {method_value_lowest} balanced by {method} and {mode}.")
logger.critical(f"Guest balancing is required. Highest value: {method_value_highest}, lowest value: {method_value_lowest} balanced by {method} and {mode}.")
else:
logger.debug(f"Guest balancing is ok. Highest value: {method_value_highest}, lowest value: {method_value_lowest} balanced by {method} and {mode}.")
logger.critical(f"Guest balancing is ok. Highest value: {method_value_highest}, lowest value: {method_value_lowest} balanced by {method} and {mode}.")
else:
logger.warning("No guests for balancing found.")
@@ -149,7 +147,9 @@ class Calculations:
# Do not include nodes that are marked in 'maintenance'
filtered_nodes = [node for node in proxlb_data["nodes"].values() if not node["maintenance"]]
lowest_usage_node = min(filtered_nodes, key=lambda x: x["memory_used_percent"])
method = proxlb_data["meta"]["balancing"].get("method", "memory")
mode = proxlb_data["meta"]["balancing"].get("mode", "used")
lowest_usage_node = min(filtered_nodes, key=lambda x: x[f"{method}_{mode}_percent"])
proxlb_data["meta"]["balancing"]["balance_reason"] = 'resources'
proxlb_data["meta"]["balancing"]["balance_next_node"] = lowest_usage_node["name"]
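Review note: the change above replaces the hard-coded `memory_used_percent` key with one built from the configured method and mode. A minimal sketch of that selection follows, assuming each node dictionary carries a `name`, a `maintenance` flag and a `<method>_<mode>_percent` value. `pick_lowest_usage_node` is a hypothetical helper name, and the explicit error for an empty candidate list is an addition that the hunk itself does not contain.

```python
from typing import Any, Dict


def pick_lowest_usage_node(nodes: Dict[str, Dict[str, Any]],
                           method: str = "memory",
                           mode: str = "used") -> str:
    """Return the name of the node with the lowest usage for the given
    balancing method and mode, skipping nodes in maintenance."""
    key = f"{method}_{mode}_percent"
    candidates = [node for node in nodes.values() if not node["maintenance"]]
    if not candidates:
        # min() would raise on an empty list; fail with a clearer message.
        raise ValueError("No node outside of maintenance is available.")
    return min(candidates, key=lambda node: node[key])["name"]


if __name__ == "__main__":
    example_nodes = {
        "node01": {"name": "node01", "maintenance": False, "cpu_used_percent": 70.0, "memory_used_percent": 20.0},
        "node02": {"name": "node02", "maintenance": False, "cpu_used_percent": 30.0, "memory_used_percent": 60.0},
    }
    print(pick_lowest_usage_node(example_nodes, method="cpu"))
```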
@@ -207,13 +207,13 @@ class Calculations:
None
"""
logger.debug("Starting: relocate_guests.")
if proxlb_data["meta"]["balancing"]["balance"] or proxlb_data["meta"]["balancing"]["enforce_affinity"]:
if proxlb_data["meta"]["balancing"]["balance"] or proxlb_data["meta"]["balancing"].get("enforce_affinity", False):
if proxlb_data["meta"]["balancing"].get("balance", False):
logger.debug("Balancing of guests will be performt. Reason: balanciness")
logger.debug("Balancing of guests will be performed. Reason: balanciness")
if proxlb_data["meta"]["balancing"].get("enforce_affinity", False):
logger.debug("Balancing of guests will be performt. Reason: enforce affinity balancing")
logger.debug("Balancing of guests will be performed. Reason: enforce affinity balancing")
for group_name in proxlb_data["groups"]["affinity"]:
@@ -248,10 +248,10 @@ class Calculations:
None
"""
logger.debug("Starting: val_anti_affinity.")
# Start by interating over all defined anti-affinity groups
# Start by iterating over all defined anti-affinity groups
for group_name in proxlb_data["groups"]["anti_affinity"].keys():
# Validate if the provided guest ist included in the anti-affinity group
# Validate if the provided guest is included in the anti-affinity group
if guest_name in proxlb_data["groups"]["anti_affinity"][group_name]['guests'] and not proxlb_data["guests"][guest_name]["processed"]:
logger.debug(f"Anti-Affinity: Guest: {guest_name} is included in anti-affinity group: {group_name}.")


@@ -11,6 +11,7 @@ __license__ = "GPL-3.0"
from typing import Dict, Any
from utils.logger import SystemdLogger
from models.tags import Tags
import time
logger = SystemdLogger()
@@ -34,7 +35,7 @@ class Guests:
"""
@staticmethod
def get_guests(proxmox_api: any, nodes: Dict[str, Any]) -> Dict[str, Any]:
def get_guests(proxmox_api: any, nodes: Dict[str, Any], meta: Dict[str, Any]) -> Dict[str, Any]:
"""
Get metrics of all guests in a Proxmox cluster.
@@ -61,10 +62,22 @@ class Guests:
# resource metrics for rebalancing to ensure that we do not overprovision the node.
for guest in proxmox_api.nodes(node).qemu.get():
if guest['status'] == 'running':
# If the balancing method is set to cpu, we need to wait for the guest to report
# cpu usage. The number of requeries is capped so that we never wait for a
# single guest indefinitely.
if meta["meta"]["balancing"]["method"] == "cpu":
retry_counter = 0
while guest['cpu'] == 0 and retry_counter < 10:
guest = proxmox_api.nodes(node).qemu(guest['vmid']).status.current.get()
logger.debug(f"Guest {guest['name']} (type VM) is reporting {guest['cpu']} cpu usage on retry {retry_counter}.")
retry_counter += 1
time.sleep(1)
guests['guests'][guest['name']] = {}
guests['guests'][guest['name']]['name'] = guest['name']
guests['guests'][guest['name']]['cpu_total'] = guest['cpus']
guests['guests'][guest['name']]['cpu_used'] = guest['cpu']
guests['guests'][guest['name']]['cpu_used'] = guest['cpu'] * guest['cpus']
guests['guests'][guest['name']]['memory_total'] = guest['maxmem']
guests['guests'][guest['name']]['memory_used'] = guest['mem']
guests['guests'][guest['name']]['disk_total'] = guest['maxdisk']
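Review note: the bounded requery added in this hunk can be expressed without any API dependency. This is a sketch under assumptions, not the project's implementation: `requery_until_cpu_reported` and the `fetch` callable are hypothetical, with `fetch` standing in for the status call against the Proxmox API.

```python
import time
from typing import Any, Callable, Dict


def requery_until_cpu_reported(fetch: Callable[[], Dict[str, Any]],
                               max_retries: int = 10,
                               delay: float = 1.0) -> Dict[str, Any]:
    """Fetch a guest and requery it while it reports 0 cpu usage.

    The loop gives up after max_retries requeries, so a guest that is
    truly idle cannot block the run forever."""
    guest = fetch()
    retries = 0
    while guest["cpu"] == 0 and retries < max_retries:
        time.sleep(delay)
        guest = fetch()
        retries += 1
    return guest


if __name__ == "__main__":
    samples = iter([{"cpu": 0}, {"cpu": 0}, {"cpu": 0.25}])
    print(requery_until_cpu_reported(lambda: next(samples), delay=0))
```

Passing the fetch as a callable keeps the retry policy testable without a cluster; the hunk itself inlines the API call.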


@@ -63,7 +63,7 @@ class Nodes:
nodes["nodes"][node["node"]]["maintenance"] = False
nodes["nodes"][node["node"]]["cpu_total"] = node["maxcpu"]
nodes["nodes"][node["node"]]["cpu_assigned"] = 0
nodes["nodes"][node["node"]]["cpu_used"] = node["cpu"]
nodes["nodes"][node["node"]]["cpu_used"] = node["cpu"] * node["maxcpu"]
nodes["nodes"][node["node"]]["cpu_free"] = (node["maxcpu"]) - (node["cpu"] * node["maxcpu"])
nodes["nodes"][node["node"]]["cpu_assigned_percent"] = nodes["nodes"][node["node"]]["cpu_assigned"] / nodes["nodes"][node["node"]]["cpu_total"] * 100
nodes["nodes"][node["node"]]["cpu_free_percent"] = nodes["nodes"][node["node"]]["cpu_free"] / node["maxcpu"] * 100
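Review note: the `cpu_used` change converts a utilization fraction into a core count so that it uses the same unit as `cpu_total` and `cpu_free`. A small sketch of the arithmetic, assuming the API reports `cpu` as a fraction between 0 and 1 (which is what the multiplication in this hunk implies); the function names are hypothetical.

```python
def cpu_cores_used(cpu_fraction: float, cpu_total: int) -> float:
    """Convert a utilization fraction (0..1) into the number of cores in use."""
    return cpu_fraction * cpu_total


def cpu_cores_free(cpu_fraction: float, cpu_total: int) -> float:
    """Cores that remain unused, in the same unit as cpu_cores_used."""
    return cpu_total - cpu_cores_used(cpu_fraction, cpu_total)


if __name__ == "__main__":
    # A node with 8 cores at 25 % utilization.
    print(cpu_cores_used(0.25, 8), cpu_cores_free(0.25, 8))
```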


@@ -139,7 +139,7 @@ class Tags:
tags (List): A list holding all defined tags for a given guest.
Returns:
Bool: Returns a bool that indicates wether to ignore a guest or not.
Bool: Returns a bool that indicates whether to ignore a guest or not.
"""
logger.debug("Starting: get_ignore.")
ignore_tag = False


@@ -8,6 +8,7 @@ __copyright__ = "Copyright (C) 2025 Florian Paul Azim Hoberg (@gyptazy)"
__license__ = "GPL-3.0"
import json
import uuid
import sys
import time
@@ -115,12 +116,49 @@ class Helper:
None
"""
logger.debug("Starting: get_daemon_mode.")
if proxlb_config.get("service", {}).get("daemon", False):
sleep_seconds = proxlb_config.get("service", {}).get("schedule", 12) * 3600
logger.info(f"Daemon mode active: Next run in: {proxlb_config.get('service', {}).get('schedule', 12)} hours.")
if proxlb_config.get("service", {}).get("daemon", True):
# Validate schedule format which changed in v1.1.1
if not isinstance(proxlb_config.get("service", {}).get("schedule"), dict):
logger.error("Invalid format for schedule. Please use 'hours' or 'minutes'.")
sys.exit(1)
# Convert hours to seconds
if proxlb_config["service"]["schedule"].get("format", "hours") == "hours":
sleep_seconds = proxlb_config.get("service", {}).get("schedule", {}).get("interval", 12) * 3600
# Convert minutes to seconds
elif proxlb_config["service"]["schedule"].get("format", "hours") == "minutes":
sleep_seconds = proxlb_config.get("service", {}).get("schedule", {}).get("interval", 720) * 60
else:
logger.error("Invalid format for schedule. Please use 'hours' or 'minutes'.")
sys.exit(1)
logger.info(f"Daemon mode active: Next run in: {proxlb_config.get('service', {}).get('schedule', {}).get('interval', 12)} {proxlb_config['service']['schedule'].get('format', 'hours')}.")
time.sleep(sleep_seconds)
else:
logger.debug("Daemon mode is not active.")
logger.debug("Successfully executed ProxLB. Daemon mode not active - stopping.")
print("Daemon mode not active - stopping.")
sys.exit(0)
logger.debug("Finished: get_daemon_mode.")
@staticmethod
def print_json(proxlb_config: Dict[str, Any], print_json: bool = False) -> None:
"""
Prints the calculated balancing matrix as a JSON output to stdout.
Parameters:
proxlb_config (Dict[str, Any]): A dictionary containing the ProxLB configuration.
Returns:
None
"""
logger.debug("Starting: print_json.")
if print_json:
# Create a filtered list by stripping the 'meta' key from the proxlb_config dictionary
# to make sure that no credentials are leaked.
filtered_data = {k: v for k, v in proxlb_config.items() if k != "meta"}
print(json.dumps(filtered_data, indent=4))
logger.debug("Finished: print_json.")
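Review note: the schedule handling in `get_daemon_mode` above boils down to a small conversion from a `{format, interval}` mapping to seconds. The sketch below is a hedged illustration, not the project's code. `schedule_to_seconds` is a hypothetical name, and it raises `ValueError` where the hunk logs an error and exits.

```python
from typing import Any


def schedule_to_seconds(schedule: Any) -> int:
    """Convert a schedule mapping with 'format' and 'interval' keys
    into a sleep duration in seconds."""
    if not isinstance(schedule, dict):
        # The pre-1.1.1 format was a bare number of hours; reject it explicitly.
        raise ValueError("Schedule must be a mapping with 'format' and 'interval'.")
    schedule_format = schedule.get("format", "hours")
    if schedule_format == "hours":
        return schedule.get("interval", 12) * 3600
    if schedule_format == "minutes":
        return schedule.get("interval", 720) * 60
    raise ValueError("Invalid format for schedule. Please use 'hours' or 'minutes'.")


if __name__ == "__main__":
    print(schedule_to_seconds({"format": "minutes", "interval": 30}))
```

Both defaults (12 hours, 720 minutes) resolve to the same 43200 seconds, matching the defaults used in the hunk.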


@@ -9,6 +9,7 @@ __license__ = "GPL-3.0"
import logging
import sys
try:
from systemd.journal import JournalHandler
SYSTEMD_PRESENT = True
@@ -82,17 +83,22 @@ class SystemdLogger:
self.logger = logging.getLogger(name)
self.logger.setLevel(level)
# Create a JournalHandler for systemd integration if this
# is supported on the underlying OS.
# Create a logging handler depending on the
# capabilities of the underlying OS where systemd
# logging is preferred.
if SYSTEMD_PRESENT:
# Add a JournalHandler for systemd integration
journal_handler = JournalHandler()
journal_handler.setLevel(level)
# Set a formatter to include the logger's name and log message
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
journal_handler.setFormatter(formatter)
# Add handler to logger
self.logger.addHandler(journal_handler)
handler = JournalHandler()
else:
# Add a stdout handler as a fallback
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(level)
# Set a formatter to include the logger's name and log message
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
# Add handler to logger
self.logger.addHandler(handler)
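Review note: the refactoring above picks one handler and then configures it once, instead of configuring only the journal handler. A self-contained sketch of the same pattern follows. `build_handler` is a hypothetical helper, and the `except ImportError` branch is an assumption, since the matching `except` clause is not visible in this diff.

```python
import logging
import sys

try:
    from systemd.journal import JournalHandler
    SYSTEMD_PRESENT = True
except ImportError:
    # Assumed fallback: the systemd bindings are optional.
    SYSTEMD_PRESENT = False


def build_handler(level: int = logging.INFO) -> logging.Handler:
    """Return a journal handler when systemd bindings are importable,
    otherwise a stdout stream handler, configured identically."""
    if SYSTEMD_PRESENT:
        handler = JournalHandler()
    else:
        handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
    return handler


if __name__ == "__main__":
    demo_logger = logging.getLogger("proxlb-demo")
    demo_logger.setLevel(logging.DEBUG)
    demo_logger.addHandler(build_handler(logging.DEBUG))
    demo_logger.info("Handler in use: %s", type(demo_logger.handlers[0]).__name__)
```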
def set_log_level(self, level: str) -> None:
"""


@@ -94,6 +94,7 @@ class ProxmoxApi:
"""
logger.debug("Starting: ProxmoxApi initialization.")
self.proxmox_api = self.api_connect(proxlb_config)
self.test_api_user_permissions(self.proxmox_api)
logger.debug("Finished: ProxmoxApi initialization.")
def __getattr__(self, name):
@@ -115,7 +116,7 @@ class ProxmoxApi:
"token_id" and "token_secret" keys for API token authentication.
Raises:
SystemExit: If both username/password and API token authentication methods are
SystemExit: If both a password ("pass") and an API token secret ("token_secret") are
provided, the function will log a critical error message and terminate
the program.
@@ -130,10 +131,10 @@ class ProxmoxApi:
sys.exit(1)
proxlb_credentials = proxlb_config["proxmox_api"]
present_auth_user = "user" in proxlb_credentials
present_auth_token = "token_id" in proxlb_credentials
present_auth_pass = "pass" in proxlb_credentials
present_auth_secret = "token_secret" in proxlb_credentials
if present_auth_user and present_auth_token:
if present_auth_pass and present_auth_secret:
logger.critical("Username/password and API token authentication are mutually exclusive. Please use only one!")
print("Username/password and API token authentication are mutually exclusive. Please use only one!")
sys.exit(1)
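Review note: the check now keys on the secrets (`pass` and `token_secret`) rather than on `user` and `token_id`. A minimal sketch of that decision, assuming a plain credentials dictionary. `select_auth_method` is a hypothetical name, it raises `ValueError` where the hunk exits, and the final "no credentials" branch is an addition that the diff does not show.

```python
from typing import Any, Dict


def select_auth_method(credentials: Dict[str, Any]) -> str:
    """Return 'password' or 'token' depending on which secret is configured.

    Configuring both secrets at once is rejected as ambiguous."""
    has_password = "pass" in credentials
    has_token_secret = "token_secret" in credentials
    if has_password and has_token_secret:
        raise ValueError("Password and API token authentication are mutually exclusive.")
    if has_token_secret:
        return "token"
    if has_password:
        return "password"
    # Assumption for this sketch: missing credentials are also an error.
    raise ValueError("Neither 'pass' nor 'token_secret' is configured.")


if __name__ == "__main__":
    print(select_auth_method({"user": "root@pam", "pass": "example"}))
```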
@@ -262,7 +263,7 @@ class ProxmoxApi:
logger.debug("Starting: test_api_proxmox_host_ipv4.")
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(timeout)
logger.warning(f"Warning: Host {host} ran into a timout when connectoing on IPv4 for tcp/{port}.")
logger.warning(f"Warning: Host {host} ran into a timeout when connecting on IPv4 for tcp/{port}.")
result = sock.connect_ex((host, port))
if result == 0:
@@ -295,7 +296,7 @@ class ProxmoxApi:
logger.debug("Starting: test_api_proxmox_host_ipv6.")
sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
sock.settimeout(timeout)
logger.warning(f"Host {host} ran into a timout when connectoing on IPv6 for tcp/{port}.")
logger.warning(f"Host {host} ran into a timeout when connecting via IPv6 for tcp/{port}.")
result = sock.connect_ex((host, port))
if result == 0:
@@ -309,6 +310,36 @@ class ProxmoxApi:
logger.debug("Finished: test_api_proxmox_host_ipv6.")
return False
def test_api_user_permissions(self, proxmox_api: any):
"""
Test the permissions of the current user/token used for the Proxmox API.
This method gets all permissions assigned on all API paths for the currently
used user/token and validates them against the minimum required permissions.
Args:
proxmox_api (any): The Proxmox API client instance.
"""
logger.debug("Starting: test_api_user_permissions.")
permissions_required = ["Datastore.Audit", "Sys.Audit", "VM.Audit", "VM.Migrate"]
permissions_available = []
# Get the permissions for the current user/token from API
permissions = proxmox_api.access.permissions.get()
# Get all available permissions of the current user/token
for path_permissions in permissions.values():
for permission in path_permissions:
permissions_available.append(permission)
# Validate if all required permissions are included within the available permissions
for required_permission in permissions_required:
if required_permission not in permissions_available:
logger.critical(f"Permission '{required_permission}' is missing. Please adjust the permissions for your user/token. See also: https://github.com/gyptazy/ProxLB/blob/main/docs/03_configuration.md#required-permissions-for-a-user")
sys.exit(1)
logger.debug("Finished: test_api_user_permissions.")
def api_connect(self, proxlb_config: Dict[str, Any]) -> proxmoxer.ProxmoxAPI:
"""
Establishes a connection to the Proxmox API using the provided configuration.
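Review note: the permission validation in `test_api_user_permissions` can be reduced to a set lookup. The sketch below is not the project's code. `missing_permissions` is a hypothetical helper, and it assumes the permissions response maps each API path to a mapping of privilege names, which is the shape the loop in the hunk iterates over. Like the hunk, it treats any listed privilege as available and does not inspect the value.

```python
from typing import Dict, Iterable, List


def missing_permissions(permissions: Dict[str, Dict[str, int]],
                        required: Iterable[str]) -> List[str]:
    """Return the required privileges that appear on no API path."""
    available = {privilege
                 for path_privileges in permissions.values()
                 for privilege in path_privileges}
    return [privilege for privilege in required if privilege not in available]


if __name__ == "__main__":
    example = {
        "/": {"Sys.Audit": 1, "VM.Audit": 1},
        "/vms": {"VM.Migrate": 1},
    }
    required = ["Datastore.Audit", "Sys.Audit", "VM.Audit", "VM.Migrate"]
    print(missing_permissions(example, required))
```

Returning the full list of missing privileges, rather than exiting on the first one, would let a caller report everything that needs fixing in one run.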


@@ -3,5 +3,5 @@ __app_desc__ = "A DRS alike loadbalancer for Proxmox clusters."
__author__ = "Florian Paul Azim Hoberg <gyptazy>"
__copyright__ = "Copyright (C) 2025 Florian Paul Azim Hoberg (@gyptazy)"
__license__ = "GPL-3.0"
__version__ = "1.1.0"
__version__ = "1.1.1"
__url__ = "https://github.com/gyptazy/ProxLB"


@@ -1,11 +1,11 @@
[Unit]
Description=ProxLB - A loadbalancer for Proxmox clusters
After=network-online.target
Wants=network-online.target
After=pveproxy.service
Wants=pveproxy.service
[Service]
ExecStart=python3 /usr/lib/python3/dist-packages/proxlb/main.py -c /etc/proxlb/proxlb.yaml
User=plb
[Install]
WantedBy=multi-user.target
WantedBy=multi-user.target


@@ -2,7 +2,7 @@ from setuptools import setup
setup(
name="proxlb",
version="1.1.0",
version="1.1.1",
description="A DRS alike loadbalancer for Proxmox clusters.",
long_description="An advanced DRS alike loadbalancer for Proxmox clusters that also supports maintenance modes and affinity/anti-affinity rules.",
author="Florian Paul Azim Hoberg",