Deploying a MinIO High-Availability Cluster
1. Building the MinIO Cluster
$ mkdir -p /data/minio-cloud && cd /data/minio-cloud
$ vim docker-compose.yml
services:
  minio1:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m1_data:/data
    networks:
      - pub-network
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
  minio2:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m2_data:/data
    networks:
      - pub-network
    ports:
      - "19000:9000"
      - "19001:9001"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
  minio3:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m3_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
  minio4:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m4_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
volumes:
  m1_data:
  m2_data:
  m3_data:
  m4_data:
networks:
  pub-network:
    driver: bridge
$ docker-compose up -d
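Before creating the bucket, it is worth checking that the cluster actually came up with quorum, using docker-compose ps and MinIO's standard health endpoint:
$ docker-compose ps                                   # all four minio services should be Up
$ curl -sI http://127.0.0.1:9000/minio/health/ready   # HTTP 200 once the cluster has quorum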
Log in to the web console (port 9001) and create a bucket named disk.
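If you prefer the command line to the web console, the same bucket can be created with the MinIO client mc (a sketch, assuming mc is installed on the host; the alias name minio-cloud is our own choice):
$ mc alias set minio-cloud http://192.168.253.146:9000 admin 12345678
$ mc mb minio-cloud/disk
$ mc ls minio-cloud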
2. Local Mount Test
$ yum install -y s3fs-fuse
$ echo 'admin:12345678' > $HOME/.passwd-s3fs && chmod 600 $HOME/.passwd-s3fs
$ vim /etc/hosts
192.168.253.146 minio-cloud
$ mkdir /mnt/minio
$ s3fs -o passwd_file=$HOME/.passwd-s3fs -o url=http://minio-cloud:9000 -o allow_other -o nonempty -o no_check_certificate -o use_path_request_style -o umask=000 disk /mnt/minio
$ df -h
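The s3fs mount above does not survive a reboot. A hedged way to make it persistent is an /etc/fstab entry using s3fs's documented fstab syntax (the options mirror the command line above; the passwd file path assumes root's home directory):
s3fs#disk /mnt/minio fuse _netdev,passwd_file=/root/.passwd-s3fs,url=http://minio-cloud:9000,use_path_request_style,allow_other,umask=000 0 0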
3. Write Test
[root@master-01 minio-cloud]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   37G   21G   17G  55% /
[root@master-01 minio-cloud]# cd /mnt/minio/
[root@master-01 minio]# ls
[root@master-01 minio]# touch test.1
[root@master-01 minio]# dd if=/dev/zero of=test.txt count=3 bs=1024M
3+0 records in
3+0 records out
3221225472 bytes (3.2 GB) copied, 57.5726 s, 56.0 MB/s
[root@master-01 minio]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   37G   27G   11G  72% /
From the jump in disk usage (21G to 27G for a 3 GB file) you can see the erasure-coding overhead: with 4 drives, MinIO defaults to N/2 parity, splitting each object into 2 data and 2 parity shards, so an object consumes roughly twice its logical size on disk.
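If the mc alias from section 1 is configured, the erasure-coding layout can be confirmed from the command line (a sketch; mc admin info is a standard mc subcommand):
$ mc admin info minio-cloud
# Expect 4 online servers. With the default EC:2 parity on 4 drives, each
# object is split into 2 data + 2 parity shards, so a 3 GiB file occupies
# about 3 GiB x (2+2)/2 = 6 GiB of raw space, matching df above (21G -> 27G).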
4. Failure Testing
4.1 Stop node 4, then test writes
$ docker stop minio-cloud-minio4-1
$ cd /mnt/minio
$ echo "1" > test.1
- Read test
$ cat test.1
Conclusion: as long as fewer than half of the cluster's nodes are down, both writes and reads work normally.
4.2 Stop nodes 4 and 3, then test writes
$ docker stop minio-cloud-minio3-1
$ cd /mnt/minio
$ echo "2" > test.2
- Read test
$ cat test.1
Conclusion: when exactly half of the nodes are down, the cluster can still be read but not written (with N/2 parity, the write quorum is N/2 + 1 drives, while the read quorum is N/2).
4.3 Stop nodes 4, 3, and 2
$ docker stop minio-cloud-minio2-1
$ cd /mnt/minio
At this point the mount directory can no longer even be entered.
Conclusion: when more than half of the nodes are down, neither reads nor writes work, since even the read quorum of N/2 drives is lost.
5. Node Failure and Replacement Test
5.1 Start nodes 2 and 3
$ docker start minio-cloud-minio2-1
$ docker start minio-cloud-minio3-1
5.2 Delete node 4's container and volume to simulate an unrecoverable node failure
$ docker rm minio-cloud-minio4-1
$ docker volume rm minio-cloud_m4_data
At this point the cluster is still running with one node offline.
5.3 Recreate the node 4 container
Note: containers on the same host that join the same network resolve each other by service name, so no extra hostname mapping is needed. For a non-container or multi-host deployment, remember to update the hosts resolution.
$ cd /data/minio-cloud
$ docker-compose up -d
In the console you can see the new node rejoin the cluster and heal back to a normal state.
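Recent MinIO releases heal the replacement drive automatically in the background; as a hedged extra check you can inspect or trigger healing with mc (reusing the minio-cloud alias assumed earlier; newer builds may report a manual heal as unnecessary):
$ mc admin info minio-cloud      # all four servers should report online again
$ mc admin heal -r minio-cloud   # recursively scan and repair objects onto the new drive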
6. Cluster Expansion --- Peer Expansion
Common approaches to scaling a cluster fall into two categories: horizontal and vertical. Horizontal scaling generally means adding nodes to grow system capacity, while vertical scaling means upgrading the individual nodes, for example by enlarging each node's disks. Directly growing the disks of existing MinIO nodes causes a number of operational problems and is not recommended officially, so this article focuses on MinIO's two horizontal approaches: peer expansion and federation expansion.
6.1 Peer expansion
First, MinIO's minimalist design means a distributed cluster does not support adding a single node and rebalancing automatically: the data rebalancing and erasure-set re-partitioning that a lone new node would trigger impose complex scheduling on the whole cluster and are hard to maintain. Instead, MinIO offers peer expansion, which requires the number of added nodes and disks to match the layout of the original cluster.
Peer expansion requires restarting the cluster; federation expansion does not.
6.2 Make sure every node in the cluster is healthy, then add nodes in a multiple of the original cluster's size
docker-compose.yml
services:
  minio1:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m1_data:/data
    networks:
      - pub-network
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio2:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m2_data:/data
    networks:
      - pub-network
    ports:
      - "19000:9000"
      - "19001:9001"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio3:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m3_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio4:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m4_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio5:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m5_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio6:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m6_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio7:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m7_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
  minio8:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m8_data:/data
    networks:
      - pub-network
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data http://minio{5...8}/data --console-address ":9001"
volumes:
  m1_data:
  m2_data:
  m3_data:
  m4_data:
  m5_data:
  m6_data:
  m7_data:
  m8_data:
networks:
  pub-network:
    driver: bridge
$ docker-compose up -d
The cluster now has 8 nodes, organized as two server pools of four.
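A quick way to confirm the new topology, assuming the mc alias from earlier:
$ mc admin info minio-cloud
# Should now list 8 online servers: the original pool (minio1-4) plus a
# second server pool (minio5-8) added by the peer expansion.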
6.3 Write test
$ cd /mnt/minio
$ echo "3" > test.3
6.4 Read test
$ cat test.3
7. Cluster Contraction --- Peer Contraction
7.1 Stop the whole cluster
$ docker-compose down
7.2 Restore the four-node docker-compose configuration (this assumes the original four-node file was saved as docker-compose.yml_bak4node before expanding)
$ mv docker-compose.yml docker-compose.yml_bak8node
$ mv docker-compose.yml_bak4node docker-compose.yml
$ docker-compose up -d
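Note that dropping the second pool this way discards whatever objects happened to land on minio5-8 (possibly test.3). For a clean shrink of a live cluster, MinIO also supports pool decommissioning, which drains a pool into the remaining pools before removal; a sketch with mc, spelling the pool exactly as in the compose command line:
$ mc admin decommission start minio-cloud/ http://minio{5...8}/data
$ mc admin decommission status minio-cloud/
# Once the pool shows 'Complete', remove it from every node's command line
# and restart the cluster.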
8. High-Availability Cluster
Although the cluster has multiple nodes and losing one does not affect cluster state, a client configured to talk to the failed node still cannot connect.
For example, if clients point at node 1 and node 1 happens to fail, nodes 2, 3, and 4 are healthy, yet the endpoint the client was given is dead. Here we put nginx in front of all nodes as a reverse proxy to improve availability.
- docker-compose.yml
services:
  minio1:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m1_data:/data
    networks:
      - pub-network
    ports:
      - "19000:9000"
      - "9001:9001"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
  minio2:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m2_data:/data
    networks:
      - pub-network
    ports:
      - "29000:9000"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
  minio3:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m3_data:/data
    networks:
      - pub-network
    ports:
      - "39000:9000"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
  minio4:
    image: registry.cn-guangzhou.aliyuncs.com/hzbb/minio:RELEASE.2024-05-28T17-19-04Z
    restart: always
    volumes:
      - m4_data:/data
    networks:
      - pub-network
    ports:
      - "49000:9000"
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=12345678
    command: server http://minio{1...4}/data --console-address ":9001"
volumes:
  m1_data:
  m2_data:
  m3_data:
  m4_data:
networks:
  pub-network:
    driver: bridge
We first map every node's port 9000 to the host, as 19000, 29000, 39000, and 49000 respectively.
8.1 Install nginx
$ yum install -y zlib pcre2 wget
$ wget https://git.hzbb.top/Share/apps/raw/branch/main/linux/x64/nginx/nginx-1.24.0.tar.gz
$ tar -zxvf nginx-1.24.0.tar.gz
$ mkdir -p /data
$ mv nginx-1.24.0 /data
8.2 Configure and start nginx
$ vim /data/nginx-1.24.0/conf/nginx.conf
worker_processes auto;
events {
    worker_connections 10240;
}
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    upstream minio {
        least_conn;
        server 192.168.253.146:19000;
        server 192.168.253.146:29000;
        server 192.168.253.146:39000;
        server 192.168.253.146:49000;
    }
    server {
        listen 9000;
        server_name 192.168.253.14 minio-cloud;
        ignore_invalid_headers off;
        client_max_body_size 0;
        proxy_buffering off;
        location / {
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_connect_timeout 300;
            proxy_http_version 1.1;
            chunked_transfer_encoding off;
            proxy_ignore_client_abort on;
            proxy_pass http://minio;
        }
    }
}
$ cd /data/nginx-1.24.0 && ./nginx
Load-balancing strategy: least_conn
least_conn forwards each request to the backend with the fewest active connections. Round-robin distributes requests evenly so backend loads stay roughly equal, but long-running requests can pile up load on whichever backend they land on; in that situation least_conn achieves better balance.
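With nginx running, clients connect to port 9000 on the proxy host rather than to any single node. A hedged smoke test using MinIO's standard health endpoint:
$ curl -sI http://192.168.253.146:9000/minio/health/live   # expect HTTP 200
# Stop any one MinIO container and repeat: nginx routes around the dead
# backend, so the request should still succeed.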
9. High-Availability Cluster --- keepalived
Cluster node failures no longer cut off clients, but the nginx doing the reverse proxying is itself a single point of failure: if nginx goes down, the MinIO cluster becomes unreachable. We therefore introduce keepalived to make nginx highly available.
9.1 Deploy and start nginx on a second node (192.168.253.145)
Repeat the installation from section 8.1 (yum dependencies, download, and move to /data).
9.2 Configure and start nginx
The nginx.conf is identical to the one in section 8.2 (the upstream block still points at the MinIO host, 192.168.253.146), so copy it over unchanged and start nginx:
$ cd /data/nginx-1.24.0 && ./nginx
9.3 Install keepalived (both nodes)
$ yum install -y keepalived
$ cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf_bak
$ vim /etc/keepalived/keepalived.conf
- master node
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
vrrp_script nginx_check {
    script "/data/keepalived/tools/nginx_check.sh"
    interval 1
}
vrrp_instance VI_1 {
    state BACKUP        # non-preemptive mode
    nopreempt           # non-preemptive mode
    #state MASTER       # use instead for preemptive mode
    interface ens33
    virtual_router_id 53
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass test
    }
    virtual_ipaddress {
        192.168.253.144
    }
    track_script {
        nginx_check
    }
    notify_master /data/keepalived/tools/master.sh
    notify_backup /data/keepalived/tools/backup.sh
    notify_fault /data/keepalived/tools/fault.sh
    notify_stop /data/keepalived/tools/stop.sh
}
- backup node
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
vrrp_script nginx_check {
    script "/data/keepalived/tools/nginx_check.sh"
    interval 1
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens33
    virtual_router_id 53
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass test
    }
    virtual_ipaddress {
        192.168.253.144
    }
    track_script {
        nginx_check
    }
    notify_master /data/keepalived/tools/master.sh
    notify_backup /data/keepalived/tools/backup.sh
    notify_fault /data/keepalived/tools/fault.sh
    notify_stop /data/keepalived/tools/stop.sh
}
9.4 Create the check and notify scripts (both nodes)
$ mkdir -p /data/keepalived/tools/
$ cd /data/keepalived/tools/
- nginx_check.sh
$ vim nginx_check.sh
#!/bin/bash
# Exit 0 if an nginx process is running, 1 otherwise.
result=$(pidof nginx)
if [ -n "${result}" ]; then
    exit 0
else
    exit 1
fi
Note: this check only looks for any running nginx process, so make sure the host runs exactly one nginx, the one we are making highly available; otherwise check by port instead, as sketched below.
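A hedged alternative that tests the listening port instead of the process name (port 9000 is the nginx listener configured in section 8.2):
#!/bin/bash
# Exit 0 if something is listening on nginx's port, 1 otherwise.
ss -lnt | grep -q ':9000 ' && exit 0 || exit 1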
- master.sh
$ vim master.sh
#!/bin/bash
# Log the transition to MASTER with this host's IP and a timestamp.
ip=$(hostname -I | awk '{print $1}')
dt=$(date +'%Y%m%d %H:%M:%S')
echo "$0--${ip}--${dt}" >> /tmp/kp.log
- backup.sh
$ vim backup.sh
#!/bin/bash
# Log the transition to BACKUP with this host's IP and a timestamp.
ip=$(hostname -I | awk '{print $1}')
dt=$(date +'%Y%m%d %H:%M:%S')
echo "$0--${ip}--${dt}" >> /tmp/kp.log
- fault.sh
$ vim fault.sh
#!/bin/bash
# Log the FAULT transition; grab the NIC address matching the subnet keyword.
ip=$(ip addr | grep inet | grep 192.168.253 | awk '{print $2}')
dt=$(date +'%Y%m%d %H:%M:%S')
echo "$0--${ip}--${dt}" >> /tmp/kp.log
- stop.sh
$ vim stop.sh
#!/bin/bash
# Log the STOP transition; grab the NIC address matching the subnet keyword.
ip=$(ip addr | grep inet | grep 192.168.253 | awk '{print $2}')
dt=$(date +'%Y%m%d %H:%M:%S')
echo "$0--${ip}--${dt}" >> /tmp/kp.log
Note: 192.168.253 is the keyword used to match this machine's NIC address; adjust it to your subnet.
9.5 Make the scripts executable
$ chmod 755 *
9.6 Restart keepalived (both nodes)
$ systemctl restart keepalived.service
$ systemctl enable keepalived.service
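To verify failover end to end (a sketch; the nginx binary path follows the bundle layout used above):
# On whichever node currently holds the VIP:
$ ip addr show ens33 | grep 192.168.253.144
# Simulate an nginx failure; nginx_check.sh starts failing and the VIP moves:
$ /data/nginx-1.24.0/nginx -s stop
# On the other node the VIP should now appear, and the notify scripts
# record the transition:
$ tail /tmp/kp.log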
10. Summary
For ease of demonstration, docker-compose was used here to run multiple MinIO nodes on one machine; in production, a binary deployment or a multi-host Docker deployment is recommended. When nginx fails over there is a brief stall, but that is trivial compared with losing connectivity entirely. Keepalived supports preemptive and non-preemptive modes, and non-preemptive mode is used here: in preemptive mode, traffic switches back to the master when it comes online, causing another brief stall, so the non-preemptive approach is preferable.