Compare commits

8 Commits

Author SHA1 Message Date
Florian
143135f1d8 Merge pull request #50 from gyptazy/release/v1.0.2
release: Prepare release v1.0.2
2024-08-13 17:10:37 +02:00
Florian Paul Azim Hoberg
c865829a2e release: Prepare release v1.0.2
2024-08-13 16:37:30 +02:00
Florian
101855b404 Merge pull request #46 from gyptazy/fix/45-adjust-daemon-time-mix-min-hrs
fix: Fix daemon timer to use hours instead of minutes.
2024-08-06 21:29:34 +02:00
Florian Paul Azim Hoberg
37e7a601be fix: Fix daemon timer to use hours instead of minutes.
Reported by: @mater-345
Fixes: #45
2024-08-06 18:06:05 +02:00
Florian
8791007e77 Merge pull request #43 from gyptazy/feature/40-option-run-only-on-master-node
feature: Add option to run ProxLB only on the Proxmox's master node in the cluster.
2024-08-06 18:00:26 +02:00
Florian Paul Azim Hoberg
3a2c16b137 feature: Add option to run ProxLB only on the Proxmox's master node in the cluster.
Fixes: #40
2024-08-06 17:58:34 +02:00
Florian
adc476e848 Merge pull request #42 from gyptazy/feature/41-add-option-run-migration-parallel-or-serial
feature: Add option to run migrations in parallel or sequentially
2024-08-04 08:27:04 +02:00
Florian Paul Azim Hoberg
28be8b8146 feature: Add option to run migrations in parallel or sequentially
Fixes: #41
2024-08-04 08:25:03 +02:00
12 changed files with 170 additions and 21 deletions

@@ -0,0 +1,2 @@
added:
- Add option to run ProxLB only on the Proxmox's master node in the cluster (req. HA feature). [#40]

@@ -0,0 +1,2 @@
added:
- Add option to run migrations in parallel or sequentially. [#41]

@@ -0,0 +1,2 @@
changed:
- Fix daemon timer to use hours instead of minutes. [#45]

@@ -0,0 +1,2 @@
fixed:
- Fix CMake packaging for Debian package to avoid overwriting the config file. [#49]

@@ -0,0 +1 @@
date: 2024-08-13

@@ -6,6 +6,20 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [1.0.2] - 2024-08-13
### Added
- Add option to run migrations in parallel or sequentially. [#41]
- Add option to run ProxLB only on the Proxmox's master node in the cluster (req. HA feature). [#40]
### Changed
- Fix daemon timer to use hours instead of minutes. [#45]
- Fix CMake packaging for Debian package to avoid overwriting the config file. [#49]
- Fix wonky code style.
## [1.0.0] - 2024-08-01
### Added
@@ -37,4 +51,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- Development release of ProxLB.
- Development release of ProxLB.

@@ -109,8 +109,10 @@ The following options can be set in the `proxlb.conf` file:
| mode_option | byte | Rebalance by node's resources in `bytes` or `percent`. (default: bytes) |
| type | vm | Rebalance only `vm` (virtual machines), `ct` (containers) or `all` (virtual machines & containers). (default: vm)|
| balanciness | 10 | Maximum difference, in percent, between the lowest and highest resource consumption on nodes before rebalancing is triggered. (default: 10) |
| parallel_migrations | 1 | Defines whether migrations run in parallel (1) or sequentially (0). (default: 1) |
| ignore_nodes | dummynode01,dummynode02,test* | Defines a comma separated list of nodes to exclude. |
| ignore_vms | testvm01,testvm02 | Defines a comma separated list of VMs to exclude. (`*` as suffix wildcard or tags are also supported) |
| master_only | 0 | Defines whether rebalancing runs only on the cluster master node (1) or on any node (0). (default: 0) |
| daemon | 1 | Run as a daemon (1) or one-shot (0). (default: 1) |
| schedule | 24 | Interval between rebalancing runs, in hours. (default: 24) |
| log_verbosity | INFO | Defines the log level (default: CRITICAL) where you can use `INFO`, `WARN` or `CRITICAL` |
@@ -133,9 +135,16 @@ type: vm
# Rebalancing: node01: 41% memory consumption :: node02: 52% consumption
# No rebalancing: node01: 43% memory consumption :: node02: 50% consumption
balanciness: 10
# Enable parallel migrations. If set to 0 it will wait for completed migrations
# before starting next migration.
parallel_migrations: 1
ignore_nodes: dummynode01,dummynode02
ignore_vms: testvm01,testvm02
[service]
# The master_only option might be useful when ProxLB runs on all nodes in a cluster
# but only a single one should do the balancing. The master node is obtained from the Proxmox
# HA status.
master_only: 0
daemon: 1
```
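A minimal sketch of how the two new options could be read, assuming the file is parsed with Python's standard `configparser`. The `.get(..., default)` calls in the `proxlb` hunk further down suggest this, but the parser setup itself is not part of this diff. The helper name `load_new_options` and the sample text are hypothetical.

```python
import configparser


def load_new_options(config_text):
    """Return (parallel_migrations, master_only) as booleans.

    Hypothetical helper for illustration; it is not part of ProxLB.
    """
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # getboolean() accepts 1/0, yes/no, true/false and on/off, and returns
    # the fallback when the section or the option is missing.
    parallel = config.getboolean('balancing', 'parallel_migrations', fallback=True)
    master_only = config.getboolean('service', 'master_only', fallback=False)
    return parallel, master_only


example = """
[balancing]
balanciness: 10
parallel_migrations: 0

[service]
daemon: 1
"""

print(load_new_options(example))
```

The fallbacks mirror the documented defaults (`parallel_migrations: 1`, `master_only: 0`), so a config file written for v1.0.0 would keep its previous behaviour under this sketch.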
@@ -201,8 +210,8 @@ The executable must be able to read the config file, if no dedicated config file
The easiest way to get started is to use the ready-to-use packages that I provide on my CDN and to run them on a Debian-based Linux system. This can also be one of the Proxmox nodes itself.
```
wget https://cdn.gyptazy.ch/files/amd64/debian/proxlb/proxlb_1.0.0_amd64.deb
dpkg -i proxlb_1.0.0_amd64.deb
wget https://cdn.gyptazy.ch/files/amd64/debian/proxlb/proxlb_1.0.2_amd64.deb
dpkg -i proxlb_1.0.2_amd64.deb
# Adjust your config
vi /etc/proxlb/proxlb.conf
systemctl restart proxlb
@@ -294,6 +303,7 @@ Container Images for Podman, Docker etc., can be found at:
| Version | Image |
|------|:------:|
| latest | cr.gyptazy.ch/proxlb/proxlb:latest |
| v1.0.2 | cr.gyptazy.ch/proxlb/proxlb:v1.0.2 |
| v1.0.0 | cr.gyptazy.ch/proxlb/proxlb:v1.0.0 |
| v0.9.9 | cr.gyptazy.ch/proxlb/proxlb:v0.9.9 |

@@ -1,5 +1,5 @@
cmake_minimum_required(VERSION 3.16)
project(proxmox-rebalancing-service VERSION 1.0.0)
project(proxmox-rebalancing-service VERSION 1.0.2)
install(PROGRAMS ../proxlb DESTINATION /bin)
install(FILES ../proxlb.conf DESTINATION /etc/proxlb)
@@ -30,12 +30,11 @@ set(CPACK_DEBIAN_PACKAGE_ARCHITECTURE "amd64")
set(CPACK_DEBIAN_PACKAGE_SUMMARY "ProxLB - Rebalance VM workloads across nodes in Proxmox clusters.")
set(CPACK_DEBIAN_PACKAGE_DESCRIPTION "ProxLB - Rebalance VM workloads across nodes in Proxmox clusters.")
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${CMAKE_CURRENT_SOURCE_DIR}/changelog_debian")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "python3")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "python3, python3-proxmoxer")
set(CPACK_DEBIAN_PACKAGE_LICENSE "GPL 3.0")
# Install
set(CPACK_PACKAGING_INSTALL_PREFIX ${CMAKE_INSTALL_PREFIX})
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${CMAKE_CURRENT_SOURCE_DIR}/postinst")
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${CMAKE_CURRENT_SOURCE_DIR}/postinst;${CMAKE_CURRENT_SOURCE_DIR}/conffiles")
set(CPACK_RPM_POST_INSTALL_SCRIPT_FILE "${CMAKE_CURRENT_SOURCE_DIR}/postinst")
include(CPack)

@@ -1,3 +1,13 @@
proxlb (1.0.2) unstable; urgency=low
* Add option to run migration in parallel or sequentially.
* Add option to run ProxLB only on a Proxmox cluster master (req. HA feature).
* Fix daemon timer to use hours instead of minutes.
* Fix CMake packaging for Debian package to avoid overwriting the config file.
* Fix some wonky code styles.
-- Florian Paul Azim Hoberg <gyptazy@gyptazy.ch> Tue, 13 Aug 2024 17:28:14 +0200
proxlb (1.0.0) unstable; urgency=low
* Initial release of ProxLB.

@@ -1,3 +1,9 @@
* Tue Aug 13 2024 Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>
- Add option to run migration in parallel or sequentially.
- Add option to run ProxLB only on a Proxmox cluster master (req. HA feature).
- Fixed daemon timer to use hours instead of minutes.
- Fixed some wonky code styles.
* Thu Aug 01 2024 Florian Paul Azim Hoberg <gyptazy@gyptazy.ch>
- Initial release of ProxLB.

1
packaging/conffiles Normal file
@@ -0,0 +1 @@
/etc/proxlb/proxlb.conf

128
proxlb
@@ -33,6 +33,7 @@ except ImportError:
import random
import re
import requests
import socket
import sys
import time
import urllib3
@@ -40,7 +41,7 @@ import urllib3
# Constants
__appname__ = "ProxLB"
__version__ = "1.0.0"
__version__ = "1.0.2"
__author__ = "Florian Paul Azim Hoberg <gyptazy@gyptazy.ch> @gyptazy"
__errors__ = False
@@ -112,7 +113,7 @@ def validate_daemon(daemon, schedule):
if bool(int(daemon)):
logging.info(f'{info_prefix} Running in daemon mode. Next run in {schedule} hours.')
time.sleep(int(schedule) * 60)
time.sleep(int(schedule) * 60 * 60)
else:
logging.info(f'{info_prefix} Not running in daemon mode. Quitting.')
sys.exit(0)
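The unit fix in this hunk comes down to simple arithmetic: with `schedule: 24`, the old expression slept for 24 * 60 = 1440 seconds (24 minutes), while the new one sleeps for 24 * 60 * 60 = 86400 seconds (24 hours). A small sketch that names the conversion, assuming `schedule` may arrive as a string from the config file; `SECONDS_PER_HOUR` and `schedule_to_seconds` are hypothetical names, not ProxLB code.

```python
SECONDS_PER_HOUR = 60 * 60


def schedule_to_seconds(schedule):
    """Convert the `schedule` option (hours, possibly a string) to seconds."""
    return int(schedule) * SECONDS_PER_HOUR


print(schedule_to_seconds("24"))
```

Naming the constant makes the intended unit visible at the call site, for example `time.sleep(schedule_to_seconds(schedule))`.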
@@ -145,7 +146,7 @@ def __validate_config_file(config_path):
def initialize_args():
""" Initialize given arguments for ProxLB. """
argparser = argparse.ArgumentParser(description='ProxLB')
argparser.add_argument('-c', '--config', type=str, help='Path to config file.', required=True)
argparser.add_argument('-c', '--config', type=str, help='Path to config file.', required=False)
argparser.add_argument('-d', '--dry-run', help='Perform a dry-run without doing any actions.', action='store_true', required=False)
argparser.add_argument('-j', '--json', help='Return a JSON of the VM movement.', action='store_true', required=False)
return argparser.parse_args()
@@ -183,9 +184,11 @@ def initialize_config_options(config_path):
balancing_mode_option = config['balancing'].get('mode_option', 'bytes')
balancing_type = config['balancing'].get('type', 'vm')
balanciness = config['balancing'].get('balanciness', 10)
parallel_migrations = config['balancing'].get('parallel_migrations', 1)
ignore_nodes = config['balancing'].get('ignore_nodes', None)
ignore_vms = config['balancing'].get('ignore_vms', None)
# Service
master_only = config['service'].get('master_only', 0)
daemon = config['service'].get('daemon', 1)
schedule = config['service'].get('schedule', 24)
log_verbosity = config['service'].get('log_verbosity', 'CRITICAL')
@@ -200,8 +203,8 @@ def initialize_config_options(config_path):
sys.exit(2)
logging.info(f'{info_prefix} Configuration file loaded.')
return proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, balancing_mode, \
balancing_mode_option, balancing_type, balanciness, ignore_nodes, ignore_vms, daemon, schedule, log_verbosity
return proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, balancing_mode, balancing_mode_option, \
balancing_type, balanciness, parallel_migrations, ignore_nodes, ignore_vms, master_only, daemon, schedule, log_verbosity
def api_connect(proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v):
@@ -231,6 +234,62 @@ def api_connect(proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_ap
return api_object
def execute_rebalancing_only_by_master(api_object, master_only):
    """ Validate if balancing should only be done by the cluster master. Afterwards, validate if this node is the cluster master. """
    info_prefix = 'Info: [only-on-master-executor]:'
    master_only = bool(int(master_only))

    if bool(int(master_only)):
        logging.info(f'{info_prefix} Master only rebalancing is defined. Starting validation.')
        cluster_master_node = get_cluster_master(api_object)
        cluster_master = validate_cluster_master(cluster_master_node)
        return cluster_master, master_only
    else:
        logging.info(f'{info_prefix} No master only rebalancing is defined. Skipping validation.')
        return False, master_only


def get_cluster_master(api_object):
    """ Get the current master of the Proxmox cluster. """
    error_prefix = 'Error: [cluster-master-getter]:'
    info_prefix = 'Info: [cluster-master-getter]:'

    try:
        ha_status_object = api_object.cluster().ha().status().manager_status().get()
        logging.info(f'{info_prefix} Master node: {ha_status_object.get("manager_status", None).get("master_node", None)}')
    except urllib3.exceptions.NameResolutionError:
        logging.critical(f'{error_prefix} Could not resolve the API.')
        sys.exit(2)
    except requests.exceptions.ConnectTimeout:
        logging.critical(f'{error_prefix} Connection time out to API.')
        sys.exit(2)
    except requests.exceptions.SSLError:
        logging.critical(f'{error_prefix} SSL certificate verification failed for API.')
        sys.exit(2)

    cluster_master = ha_status_object.get("manager_status", None).get("master_node", None)

    if cluster_master:
        return cluster_master
    else:
        logging.critical(f'{error_prefix} Could not obtain cluster master. Please check your configuration - stopping.')
        sys.exit(2)


def validate_cluster_master(cluster_master):
    """ Validate if the current execution node is the cluster master. """
    info_prefix = 'Info: [cluster-master-validator]:'

    node_executor_hostname = socket.gethostname()
    logging.info(f'{info_prefix} Node executor hostname is: {node_executor_hostname}')

    if node_executor_hostname != cluster_master:
        logging.info(f'{info_prefix} {node_executor_hostname} is not the cluster master ({cluster_master}).')
        return False
    else:
        return True


def get_node_statistics(api_object, ignore_nodes):
""" Get statistics of cpu, memory and disk for each node in the cluster. """
info_prefix = 'Info: [node-statistics]:'
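The master check added in this hunk chains `.get("manager_status", None).get("master_node", None)`, which would raise an `AttributeError` if the HA status payload carries no `manager_status` entry. A minimal sketch of a more defensive variant, assuming the payload shape used in the hunk (`{"manager_status": {"master_node": ...}}`). The function names are hypothetical, and comparing only the short hostname is an assumption about how nodes are named, not something this diff states.

```python
import socket


def extract_master_node(ha_status):
    """Return the master node name from an HA status payload, or None."""
    manager_status = (ha_status or {}).get('manager_status') or {}
    return manager_status.get('master_node')


def is_cluster_master(ha_status, hostname=None):
    """Return True when `hostname` matches the reported HA master node."""
    master_node = extract_master_node(ha_status)
    if not master_node:
        return False
    if hostname is None:
        hostname = socket.gethostname()
    # Assumption: node names are short hostnames, so the domain part of a
    # fully qualified hostname is ignored for the comparison.
    return hostname.split('.')[0] == master_node.split('.')[0]


print(is_cluster_master({'manager_status': {'master_node': 'node01'}}, 'node01'))
```

Unlike the code in the hunk, this sketch returns `False` when no master can be determined rather than exiting. Which of the two is preferable depends on whether a missing HA status should stop the daemon.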
@@ -483,7 +542,7 @@ def balancing_calculations(balancing_method, balancing_mode, balancing_mode_opti
def __validate_balancing_method(balancing_method):
""" Validate for valid and supported balancing method. """
error_prefix = 'Error: [balancing-method-validation]:'
info_prefix = 'Info: [balancing-method-validation]]:'
info_prefix = 'Info: [balancing-method-validation]:'
if balancing_method not in ['memory', 'disk', 'cpu']:
logging.error(f'{error_prefix} Invalid balancing method: {balancing_method}')
@@ -495,7 +554,7 @@ def __validate_balancing_method(balancing_method):
def __validate_balancing_mode(balancing_mode):
""" Validate for valid and supported balancing mode. """
error_prefix = 'Error: [balancing-mode-validation]:'
info_prefix = 'Info: [balancing-mode-validation]]:'
info_prefix = 'Info: [balancing-mode-validation]:'
if balancing_mode not in ['used', 'assigned']:
logging.error(f'{error_prefix} Invalid balancing mode: {balancing_mode}')
@@ -703,7 +762,31 @@ def __get_vm_tags_exclude_groups(vm_statistics, node_statistics, balancing_metho
return node_statistics, vm_statistics
def __run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args):
def __wait_job_finalized(api_object, node_name, job_id, counter):
    """ Wait for a job to be finalized. """
    error_prefix = 'Error: [job-status-getter]:'
    info_prefix = 'Info: [job-status-getter]:'

    logging.info(f'{info_prefix} Getting job status for job {job_id}.')
    task = api_object.nodes(node_name).tasks(job_id).status().get()
    logging.info(f'{info_prefix} {task}')

    if task['status'] == 'running':
        logging.info(f'{info_prefix} Validating job {job_id} for the {counter} run.')

        # Do not let this recursion run forever; fail when reaching the limit.
        if counter == 300:
            logging.critical(f'{error_prefix} The job {job_id} on node {node_name} did not finish in time for migration.')

        time.sleep(5)
        counter = counter + 1
        logging.info(f'{info_prefix} Revalidating job {job_id} in a next run.')
        __wait_job_finalized(api_object, node_name, job_id, counter)

    logging.info(f'{info_prefix} Job {job_id} for migration from {node_name} terminated successfully.')


def __run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations):
""" Run & execute the VM rebalancing via API. """
error_prefix = 'Error: [rebalancing-executor]:'
info_prefix = 'Info: [rebalancing-executor]:'
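The recursive wait added in this hunk logs a critical message on the 300th check but then keeps recursing, and every stack frame logs the success message on the way out. A minimal iterative sketch of the same polling idea, which stops at the limit and reports the outcome to the caller. `wait_for_task` and its parameters are hypothetical. The only thing taken from the diff is that a task dictionary has a `status` key whose value is `running` while the job is in progress, plus the limits of 300 checks at 5-second intervals.

```python
import time


def wait_for_task(get_status, max_checks=300, interval=5, sleep=time.sleep):
    """Poll `get_status()` until the task stops running or the limit is hit.

    Returns True when the task left the 'running' state, False on timeout.
    """
    for _ in range(max_checks):
        task = get_status()
        if task.get('status') != 'running':
            return True
        sleep(interval)
    return False
```

With the API call from the hunk, usage could look like `wait_for_task(lambda: api_object.nodes(node_name).tasks(job_id).status().get())`. Passing `sleep` as a parameter keeps the function testable without real delays.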
@@ -715,15 +798,23 @@ def __run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args):
# Migrate type VM (live migration).
if value['type'] == 'vm':
logging.info(f'{info_prefix} Rebalancing VM {vm} from node {value["node_parent"]} to node {value["node_rebalance"]}.')
api_object.nodes(value['node_parent']).qemu(value['vmid']).migrate().post(target=value['node_rebalance'],online=1)
job_id = api_object.nodes(value['node_parent']).qemu(value['vmid']).migrate().post(target=value['node_rebalance'],online=1)
# Migrate type CT (requires restart of container).
if value['type'] == 'ct':
logging.info(f'{info_prefix} Rebalancing CT {vm} from node {value["node_parent"]} to node {value["node_rebalance"]}.')
api_object.nodes(value['node_parent']).lxc(value['vmid']).migrate().post(target=value['node_rebalance'],restart=1)
job_id = api_object.nodes(value['node_parent']).lxc(value['vmid']).migrate().post(target=value['node_rebalance'],restart=1)
except proxmoxer.core.ResourceException as error_resource:
logging.critical(f'{error_prefix} {error_resource}')
# Wait for migration to be finished unless running parallel migrations.
if not bool(int(parallel_migrations)):
logging.info(f'{info_prefix} Rebalancing will be performed sequentially.')
__wait_job_finalized(api_object, value['node_parent'], job_id, counter=1)
else:
logging.info(f'{info_prefix} Rebalancing will be performed in parallel.')
else:
logging.info(f'{info_prefix} No rebalancing needed.')
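In this hunk `job_id` is assigned only inside the `try` block. If the wait runs after the `try`/`except`, as the line order suggests, a failed migrate call would leave `job_id` unset, or still holding the id from a previous loop iteration, when sequential mode tries to wait on it. A minimal sketch of one way to guard against that; `migrate_and_maybe_wait` and both callables are hypothetical stand-ins for the API calls, not ProxLB code.

```python
def migrate_and_maybe_wait(start_migration, wait_for_job, parallel_migrations):
    """Start one migration and wait for it only in sequential mode.

    `start_migration` returns a job id or raises; `wait_for_job` takes that id.
    Returns the job id, or None when the migration could not be started.
    """
    try:
        job_id = start_migration()
    except Exception as error:  # the diff catches proxmoxer.core.ResourceException
        print(f'migration failed to start: {error}')
        return None

    # Same truthiness rule as the diff: 0 means sequential, 1 means parallel.
    if not bool(int(parallel_migrations)):
        wait_for_job(job_id)
    return job_id
```

Returning early on failure means the wait is never reached without a valid job id, regardless of the `parallel_migrations` setting.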
@@ -784,9 +875,9 @@ def __print_table_cli(table, dry_run=False):
logging.info(f'{info_prefix} {row_format.format(*row)}')
def run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args):
def run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations):
""" Run rebalancing of vms to new nodes in cluster. """
__run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args)
__run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations)
__create_json_output(vm_statistics_rebalanced, app_args)
__create_cli_output(vm_statistics_rebalanced, app_args)
@@ -801,7 +892,7 @@ def main():
# Parse global config.
proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v, balancing_method, balancing_mode, balancing_mode_option, balancing_type, \
balanciness, ignore_nodes, ignore_vms, daemon, schedule, log_verbosity = initialize_config_options(config_path)
balanciness, parallel_migrations, ignore_nodes, ignore_vms, master_only, daemon, schedule, log_verbosity = initialize_config_options(config_path)
# Overwrite logging handler with user defined log verbosity.
initialize_logger(log_verbosity, update_log_verbosity=True)
@@ -810,6 +901,15 @@ def main():
# API Authentication.
api_object = api_connect(proxmox_api_host, proxmox_api_user, proxmox_api_pass, proxmox_api_ssl_v)
# Get master node of cluster and ensure that ProxLB is only performed on the
# cluster master node to avoid ongoing rebalancing.
cluster_master, master_only = execute_rebalancing_only_by_master(api_object, master_only)
# Validate daemon service and skip following tasks when not being the cluster master.
if not cluster_master and master_only:
validate_daemon(daemon, schedule)
continue
# Get metric & statistics for vms and nodes.
node_statistics = get_node_statistics(api_object, ignore_nodes)
vm_statistics = get_vm_statistics(api_object, ignore_vms, balancing_type)
@@ -820,7 +920,7 @@ def main():
node_statistics, vm_statistics, balanciness, rebalance=False, processed_vms=[])
# Rebalance vms to new nodes within the cluster.
run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args)
run_vm_rebalancing(api_object, vm_statistics_rebalanced, app_args, parallel_migrations)
# Validate for any errors.
post_validations()