Task #202: Update PVE from 7.4.17 to 8.0.4 - 15. CMC-Q9-HCM - Long Vân Redmine

Task #202

Updated by Thanh Tâm Nguyễn almost 2 years ago

Objective: update the CMC-Q9 Proxmox cluster to the latest version 
 Requirements: draw up the procedure and the order in which nodes are updated; the update must not affect running services 
 === 
 Update guide: https://projects.longvan.net/projects/lvss/wiki/11-quy-trinh-update-proxmox-cluster-1-so-loi-khi-upgrade-va-cach-troubleshoot 
 Update order: carry out phase 2 https://docs.google.com/spreadsheets/d/1cszHUbOfqyg-AFlZrpvYnCswYI8kZIwaXx7P8PQf8oQ/edit#gid=0 
 === 
 Note: the update must be run from the node's console 
 _**Update procedure**_ 


     Migrate all VMs off the node before running the update 

     Back up the configuration 

         Back up the files: 
             cp /etc/hosts /root/ 
             cp /etc/network/interfaces /root/ 
         Copy these config files off the node with SCP 
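
The backup step above can be sketched as a small helper. This is a minimal sketch, not part of the original procedure: the function name `backup_configs`, the directory `/root/pve7-backup`, and the SCP target `backup-host:/srv/pve-backups/` are all hypothetical placeholders.

```shell
# Copy each given config file into a backup directory,
# preserving mode and timestamps (cp -p).
# Usage: backup_configs DEST_DIR FILE...
backup_configs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        cp -p "$f" "$dest/" || return 1
    done
}

# Example, on the node being updated (paths and host are placeholders):
#   backup_configs /root/pve7-backup /etc/hosts /etc/network/interfaces
#   scp -r /root/pve7-backup backup-host:/srv/pve-backups/
```

Keeping the copies in their own directory makes the later SCP a single command and avoids mixing them with other files in /root.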

     Stop the HA services 

         Stop the pve-ha-lrm service on each node in the cluster, one node at a time. Then stop the pve-ha-crm service on each node, one node at a time. Check the state of both services with: 
             systemctl status pve-ha-lrm 
             systemctl status pve-ha-crm 
         Remove the VMs from HA 
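
The stop order described above (pve-ha-lrm on every node first, then pve-ha-crm on every node) can be sketched as a dry-run helper that only prints the commands. The function name `ha_stop_plan`, the use of `ssh root@NODE`, and the node names are assumptions for illustration; the task itself asks for the update to be run from the console.

```shell
# Print the stop commands in the required order:
# pve-ha-lrm on all nodes first, then pve-ha-crm on all nodes.
# Nothing is executed; the output is meant to be reviewed first.
# Usage: ha_stop_plan NODE...
ha_stop_plan() {
    for svc in pve-ha-lrm pve-ha-crm; do
        for node in "$@"; do
            printf 'ssh root@%s systemctl stop %s\n' "$node" "$svc"
        done
    done
}

# Example (node names are placeholders):
#   ha_stop_plan node1 node2 node3
# A VM can be taken out of HA with ha-manager, for example
# (the VM ID 100 is a placeholder):
#   ha-manager remove vm:100
```

Printing the plan rather than running it keeps the ordering visible and lets the operator paste each line on the console of the relevant node.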

     Put the Ceph storage into maintenance 
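
The task does not say what the Ceph maintenance step consists of. One common approach, assumed here and not confirmed by the source, is to set the `noout` flag so that OSDs on the node being updated are not marked out while they restart. The helper name `ceph_noout_cmd` is hypothetical, and it only prints the command to run.

```shell
# Print the Ceph command for entering or leaving maintenance.
# Assumption: "maintenance" means setting the noout flag.
# Usage: ceph_noout_cmd enter|leave
ceph_noout_cmd() {
    case "$1" in
        enter) echo "ceph osd set noout" ;;
        leave) echo "ceph osd unset noout" ;;
        *) echo "usage: ceph_noout_cmd enter|leave" >&2; return 2 ;;
    esac
}

# Example:
#   ceph_noout_cmd enter   # before updating the node
#   ceph_noout_cmd leave   # after recovery has completed
```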

     Upgrade the OS (switch the APT repositories to bookworm, then upgrade): 

         echo "deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise" > /etc/apt/sources.list.d/pve-enterprise.list 
         sed -i -e 's/bullseye/bookworm/g' /etc/apt/sources.list.d/pve-install-repo.list 
         Ceph repository: run only one of the next two lines (enterprise or no-subscription). Both write to the same file, so the second overwrites the first. 
         echo "deb https://enterprise.proxmox.com/debian/ceph-quincy bookworm enterprise" > /etc/apt/sources.list.d/ceph.list 
         echo "deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription" > /etc/apt/sources.list.d/ceph.list 
         sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list 
         cat /etc/apt/sources.list 
         cat /etc/apt/sources.list.d/pve-enterprise.list 
         apt update 
         apt dist-upgrade 
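
The suite rename in the commands above (bullseye to bookworm) can be wrapped so that the original file is kept and the result can be verified before running `apt update`. This is a sketch under assumptions: the function names `switch_suite` and `has_old_suite` are hypothetical, and the `.bak` copy is an addition not present in the original procedure.

```shell
# Replace the Debian suite name in an APT sources file,
# keeping the original next to it as FILE.bak.
# Usage: switch_suite FILE
switch_suite() {
    file=$1
    cp -p "$file" "$file.bak" || return 1
    sed 's/bullseye/bookworm/g' "$file.bak" > "$file"
}

# Succeed (exit 0) if the file still mentions the old suite.
# Usage: has_old_suite FILE
has_old_suite() {
    grep -q 'bullseye' "$1"
}

# Example:
#   switch_suite /etc/apt/sources.list
#   has_old_suite /etc/apt/sources.list && echo "still on bullseye"
```

Keeping the copy outside `/etc/apt/sources.list.d/`, or using a suffix other than `.list` as done here, ensures APT does not read the backup as a repository file.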

     Check the cluster 

         After the upgrade finishes, check the services again. pve-ha-lrm and pve-ha-crm may start automatically, so stop them immediately 
         The Ceph services will restart, so make sure the ceph-osd service on that node has started and recovery has completed before continuing 
         Check that the corosync service has started and that all nodes have joined the cluster 
         Check that the web interface shows all nodes 
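
The checks above can be partly scripted. The sketch below filters the output of `pvecm status` and `ceph health` read from standard input. The function names are hypothetical, and the matched strings (`Quorate: Yes`, `HEALTH_OK`) are assumptions about the output format that should be confirmed on the cluster before relying on them.

```shell
# Read `pvecm status` output on stdin; succeed only if a line
# of the form "Quorate: Yes" is present (assumed format).
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Read `ceph health` output on stdin; succeed only if it
# reports HEALTH_OK (assumed format).
ceph_is_healthy() {
    grep -q '^HEALTH_OK'
}

# Example, on the updated node:
#   pvecm status | is_quorate && echo "cluster is quorate"
#   ceph health | ceph_is_healthy || echo "wait for recovery"
#   systemctl is-active pve-ha-lrm pve-ha-crm corosync
```

A quorate cluster does not by itself prove that every node has joined, so the node list in `pvecm status` and in the web interface still needs to be checked by eye.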

     Some notes 

 * If Ceph drops out, restart the corosync service on all nodes: systemctl restart corosync.service 
 * Network loss (the node cannot ping out). Fix: log in to the server and run these two commands in order: /etc/init.d/openvswitch-switch restart, then ifreload -a 
 * Errors while updating packages. Symptoms: the node being updated cannot be pinged and the /network/interface file is not picked up. Fix: dpkg --configure -a
