Environment
vip      192.168.1.101
slave    192.168.1.16   MySQL 5.7.17  port 3306
master   192.168.1.135  MySQL 5.7.17  port 3306
proxysql 192.168.1.16   (ProxySQL runs on the .16 node for convenience)
I. Setting up MHA
1. Install the MHA software. First install the EPEL repository (on both machines).
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
2. Install the Perl dependencies (on both machines)
yum install perl-DBD-MySQL
yum install perl-Config-Tiny
yum install perl-Log-Dispatch
yum install perl-Parallel-ForkManager
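3. Install the MHA packages (on both machines; installing both the node and the manager package on each is recommended because it makes switching easier)
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm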
4. Set up SSH trust between the two machines
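The post gives no commands for this step. Below is a minimal sketch, assuming the root account (matching ssh_user=root in masterha_default.conf), the default key path ~/.ssh/id_rsa, and that password login is still possible for the first copy. The names setup_ssh_trust, run and DRY_RUN are illustrative and not part of MHA. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only: set up passwordless SSH from this host to the MHA nodes.
# HOSTS, SSH_USER, KEY and DRY_RUN are assumptions for illustration.
HOSTS="192.168.1.16 192.168.1.135"
SSH_USER="root"
KEY="$HOME/.ssh/id_rsa"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_ssh_trust() {
    local host
    # Generate a key pair once, without a passphrase, if none exists yet.
    if [ ! -f "$KEY" ]; then
        run ssh-keygen -t rsa -N "" -f "$KEY"
    fi
    # Append the public key to authorized_keys on every node.
    for host in $HOSTS; do
        run ssh-copy-id -i "$KEY.pub" "$SSH_USER@$host"
    done
}

setup_ssh_trust
```

Run it with DRY_RUN=0 on each of the two machines so that each node can log in to the other without a password; step 8 then verifies the result with masterha_check_ssh.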
5. Grant privileges
GRANT ALL PRIVILEGES ON *.* TO 'zhuch'@'%' IDENTIFIED BY "zhuch";
GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY "oracle";
6. Create the application directory
mkdir /etc/masterha
Copy the following files into /etc/masterha
[root@mysql3 masterha]# ls -l
total 32
-rw-r--r--. 1 root root   509 Feb 10 02:29 app1.conf
-rw-r--r--. 1 root root    55 Feb 10 03:15 drop_vip.sh
-rw-r--r--. 1 root root    57 Feb 10 03:15 init_vip.sh
-rw-r--r--. 1 root root   354 Feb 10 02:25 masterha_default.conf
-rwxr-xr-x. 1 root root  3978 Feb 10 03:16 master_ip_failover
-rwxr-xr-x. 1 root root 10390 Feb 10 03:17 master_ip_online_change
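app1.conf: the MHA application configuration file (the unpacked software package contains sample configuration files, but here we simply create a new one and edit it)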
[root@mysql3 masterha]# cat app1.conf
[server default]
#mha manager working directory
manager_workdir = /var/log/masterha/app1
manager_log = /var/log/masterha/app1/app1.log
remote_workdir = /var/log/masterha/app1

[server1]
hostname=192.168.1.16
master_binlog_dir = /data/mysql/mysql3306/logs
candidate_master = 1
#ignore replication delay when choosing the new master, so that failover does not stall when the slave is lagging at the moment the master fails
check_repl_delay = 0

[server3]
hostname=192.168.1.135
master_binlog_dir=/data/mysql/mysql3306/logs
candidate_master = 1
check_repl_delay = 0

drop_vip.sh: removes the VIP
[root@mysql3 masterha]# cat drop_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr del $vip dev eth0
init_vip.sh: binds the VIP
[root@mysql3 masterha]# cat init_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr add $vip dev eth0
masterha_default.conf: the global configuration file
[root@mysql3 masterha]# cat masterha_default.conf
[server default]
#MySQL user and password
user=zhuch
password=zhuch
#system ssh user
ssh_user=root
#replication user
repl_user=slave
repl_password=oracle
#monitoring
ping_interval=1
#shutdown_script=""
#scripts called on switchover
master_ip_failover_script= /etc/masterha/master_ip_failover
master_ip_online_change_script= /etc/masterha/master_ip_online_change

master_ip_failover: the automatic failover script
[root@mysql3 masterha]# cat master_ip_failover
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

# VIP for this group of machines
my $vip = "192.168.1.101";
my $if  = "eth0";

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);
GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

# Remove the VIP from the old master, then add it on the new master
sub add_vip {
  my $output1 =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;
  my $output2 =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print "Creating app user on the new master..\n";
      #FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      &add_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
master_ip_online_change: the manual (online) switchover script
[root@mysql3 masterha]# cat master_ip_online_change
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;

# VIP definition added for this setup
my $vip = "192.168.1.101";
my $if  = "eth0";

my (
  $command,              $orig_master_is_new_slave,
  $orig_master_host,     $orig_master_ip,
  $orig_master_port,     $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,
  $new_master_host,      $new_master_ip,
  $new_master_port,      $new_master_user,
  $new_master_password,  $new_master_ssh_user,
);
GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

exit &main();

sub drop_vip {
  my $output =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;

  # kill all connections in mysql
  #FIXME
}

sub add_vip {
  my $output =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();

      # print current_time_us() . " Dropping app user on the orig master..\n";
      print current_time_us() . " drop vip $vip..\n";

      #drop_app_user($orig_master_handler);
      &drop_vip();

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      print current_time_us() . "Add vip $vip on $if..\n";

      # create_app_user($new_master_handler);
      &add_vip();
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
7. Bind the VIP on the master (run the script)
sh init_vip.sh
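After binding, it is worth confirming that the address is really present on the interface. The sketch below is one way to do that; has_vip is a hypothetical helper (not part of MHA) that reads the output of `ip -o -4 addr show` on standard input, so it can also be fed saved output.

```shell
#!/bin/bash
# Sketch only: check whether the VIP is bound, reading `ip -o -4 addr show`
# output on stdin. VIP and the function name are illustrative assumptions.
VIP="192.168.1.101"

has_vip() {
    # In `ip -o` output each address appears as "inet <addr>/<prefix>".
    grep -qF "inet $VIP/"
}

# Typical use on the master:
#   ip -o -4 addr show dev eth0 | has_vip && echo "VIP is bound"
```

The function returns success only when an `inet 192.168.1.101/...` entry is present, so it can be reused after a failover to see which node currently holds the VIP.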
8. Check that SSH works
[root@mysql2 opt]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:00:34 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Starting SSH connection tests..
Sat Feb 10 22:00:36 2018 - [debug]
Sat Feb 10 22:00:35 2018 - [debug]  Connecting via SSH from root@192.168.1.135(192.168.1.135:22) to root@192.168.1.16(192.168.1.16:22)..
Sat Feb 10 22:00:36 2018 - [debug]   ok.
Sat Feb 10 22:00:41 2018 - [debug]
Sat Feb 10 22:00:34 2018 - [debug]  Connecting via SSH from root@192.168.1.16(192.168.1.16:22) to root@192.168.1.135(192.168.1.135:22)..
Sat Feb 10 22:00:41 2018 - [debug]   ok.
Sat Feb 10 22:00:41 2018 - [info] All SSH connection tests passed successfully.
9. Check that master-slave replication is healthy
[root@mysql2 opt]# masterha_check_repl --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:26:50 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] MHA::MasterMonitor version 0.56.
Sat Feb 10 22:26:50 2018 - [info] GTID failover mode = 1
Sat Feb 10 22:26:50 2018 - [info] Dead Servers:
Sat Feb 10 22:26:50 2018 - [info] Alive Servers:
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Alive Slaves:
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.16(192.168.1.16:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Sat Feb 10 22:26:50 2018 - [info]     GTID ON
Sat Feb 10 22:26:50 2018 - [info]     Replicating from 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Feb 10 22:26:50 2018 - [info] Current Alive Master: 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Checking slave configurations..
Sat Feb 10 22:26:50 2018 - [info]  read_only=1 is not set on slave 192.168.1.16(192.168.1.16:3306).
Sat Feb 10 22:26:50 2018 - [info] Checking replication filtering settings..
Sat Feb 10 22:26:50 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Sat Feb 10 22:26:50 2018 - [info]  Replication filtering check ok.
Sat Feb 10 22:26:50 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Feb 10 22:26:50 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sat Feb 10 22:26:51 2018 - [info] HealthCheck: SSH to 192.168.1.135 is reachable.
Sat Feb 10 22:26:51 2018 - [info]
192.168.1.135(192.168.1.135:3306) (current master)
 +--192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:51 2018 - [info] Checking replication health on 192.168.1.16..
Sat Feb 10 22:26:51 2018 - [info]  ok.
Sat Feb 10 22:26:51 2018 - [info] Checking master_ip_failover_script status:
Sat Feb 10 22:26:51 2018 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.135 --orig_master_ip=192.168.1.135 --orig_master_port=3306
Sat Feb 10 22:26:51 2018 - [info]  OK.
Sat Feb 10 22:26:51 2018 - [warning] shutdown_script is not defined.
Sat Feb 10 22:26:51 2018 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
10. On the slave, set relay_log_purge=0 and read_only=1 (read-only)
set global relay_log_purge=0;
set global read_only=1;
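The retained relay logs may be needed when differential relay log events have to be applied to other slaves. With one master and one slave, as here, the setting is not strictly necessary. If relay_log_purge=0 is set, the worry is that relay logs pile up on the slave; in that case they can be deleted on a schedule with the purge_relay_logs command, which ships with MHA.
It can be wrapped in a script and run periodically, as follows:
#!/bin/bash
user=zhuch
passwd=zhuch
port=3306
log_dir='/etc/masterha/log'
work_dir='/etc/masterha/relay_log_node'
purge='/usr/bin/purge_relay_logs'
if [ ! -d $log_dir ]
then
   mkdir $log_dir -p
fi
if [ ! -d $work_dir ]
then
   mkdir $work_dir -p
fi
$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1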
At this point MHA is essentially set up. When the master dies, the service switches to the slave and the VIP moves to the slave as well.
II. Installing and configuring ProxySQL
1. Installation
Download: https://www.percona.com/downloads/proxysql/
rpm -ivh proxysql-1.4.3-1-centos67.x86_64.rpm
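2. Configuration. Log in to the ProxySQL admin interface and add the MySQL master and slave. The master goes into the writer hostgroup, that is hostgroup_id 100; the slave serves reads and goes into hostgroup 1000.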
mysql -uadmin -padmin -P6032 -h127.0.0.1
Note: here I set the writer node directly to the VIP, 192.168.1.101.
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(100,'192.168.1.101',3306,1,1000,10,'vip');
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.16',3306,1,1000,10,'slave');
admin@ 23:16: [(none)]> select * from mysql_servers;
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| hostgroup_id | hostname      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment       |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      | 0           | 1000            | 10                  | 0       | 0              | test proxysql |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      | 0           | 1000            | 10                  | 0       | 0              | test proxysql |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
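3. Configure the MySQL users used for the backends. They must already exist on the backend MySQL servers (135 and 16). One is the monitoring account, the other the application account:
GRANT ALL PRIVILEGES ON *.* TO 'proxysql'@'192.168.1.16' identified by 'proxysql';
GRANT ALL PRIVILEGES ON *.* TO 'sbuser'@'%' identified by 'sbuser';
After adding them on the backend MySQL servers, configure ProxySQL. Note that default_hostgroup must match the hostgroup defined above.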
insert into mysql_users(username,password,active,default_hostgroup,transaction_persistent) values('sbuser','sbuser',1,100,1);
admin@ 23:37: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser   | sbuser                                    | 1      | 0       | 100               |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
4. Set the health-check (monitor) account
admin@ 23:37: [(none)]>set mysql-monitor_username='proxysql';
admin@ 23:37: [(none)]>set mysql-monitor_password='proxysql';
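-- apply to runtime
load mysql servers to runtime;
load mysql users to runtime;
load mysql variables to runtime;
-- persist to disk
save mysql servers to disk;
save mysql users to disk;
save mysql variables to disk;
If mysql_users was populated with plaintext passwords, the save command can be used at this point to turn them into hashed passwords (it copies the users back from runtime, where ProxySQL keeps them hashed):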
save mysql users to mem;
admin@ 23:39: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser   | *CA96E56547F43610DDE9EB7B12B4EF4C51CDDFFC | 1      | 0       | 100               |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
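ProxySQL keeps configuration in three layers: the in-memory tables edited through the admin interface, the runtime copy that actually serves traffic, and the on-disk copy that survives a restart. Every change in this post therefore needs a LOAD ... TO RUNTIME and a SAVE ... TO DISK. The sketch below only prints those statement pairs; apply_and_persist is a hypothetical helper, not a ProxySQL command.

```shell
#!/bin/bash
# Sketch only: print the LOAD ... TO RUNTIME / SAVE ... TO DISK pair for
# each ProxySQL config module. apply_and_persist is a hypothetical helper.
apply_and_persist() {
    local module
    for module in "$@"; do
        echo "LOAD $module TO RUNTIME;"
        echo "SAVE $module TO DISK;"
    done
}

apply_and_persist "MYSQL SERVERS" "MYSQL USERS" "MYSQL VARIABLES"

# To run the statements, pipe them into the admin interface, e.g.:
#   apply_and_persist "MYSQL SERVERS" | mysql -uadmin -padmin -P6032 -h127.0.0.1
```

Printing rather than executing keeps the helper safe to try out; the admin credentials in the comment are the ones used earlier in this post.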
5. Configure routing
-- send to the master (M)
admin@127.0.0.1 : (none) 04:58:11>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT.*FOR UPDATE$',100,1);
Query OK, 1 row affected (0.00 sec)
-- send to the slave (S)
admin@127.0.0.1 : (none) 05:08:17>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT',1000,1);
Query OK, 1 row affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:37>load mysql query rules to runtime;
Query OK, 0 rows affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:57>save mysql query rules to disk;
Query OK, 0 rows affected (0.00 sec)
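To make the effect of the two rules concrete, the sketch below mimics the decision in plain bash. It is not ProxySQL code: route_query is a hypothetical helper, and it assumes the rules are checked in insertion order, that matching is case-insensitive (consistent with the lowercase select being routed to hostgroup 1000 in the digest statistics of step 6), and that unmatched statements fall back to the default_hostgroup 100 of user sbuser.

```shell
#!/bin/bash
# Sketch only: mimic how the two rules above pick a hostgroup.
# This is not ProxySQL code; route_query is a hypothetical helper.
DEFAULT_HOSTGROUP=100     # default_hostgroup of user sbuser

route_query() {
    local q="$1"
    local for_update='^SELECT.*FOR UPDATE$'
    local any_select='^SELECT'
    local hg="$DEFAULT_HOSTGROUP"
    shopt -s nocasematch          # assumed: rules match case-insensitively
    if [[ $q =~ $for_update ]]; then
        hg=100                    # rule 1: locking reads go to the writer
    elif [[ $q =~ $any_select ]]; then
        hg=1000                   # rule 2: all other SELECTs go to the reader
    fi
    shopt -u nocasematch
    echo "$hg"
}

route_query "select * from a1"                          # prints 1000
route_query "SELECT * FROM a1 WHERE id = 1 FOR UPDATE"  # prints 100
route_query "insert into a1 values(134)"                # prints 100
```

The order matters: if the plain ^SELECT rule were checked first, a SELECT ... FOR UPDATE would be sent to the read-only slave.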
6. Connect to the database through port 6033 and test read/write splitting
[root@mysql2 sysbench]# mysql -usbuser -psbuser -P6033 -h192.168.1.16
sbuser@ 23:59: [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| z1_email           |
| z1_exchange        |
| z1_relation        |
+--------------------+
7 rows in set (0.03 sec)
sbuser@ 00:02: [(none)]> use z1_email;
Database changed, 2 warnings
sbuser@ 00:02: [z1_email]> insert into a1 values(134);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(146);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(157);
Query OK, 1 row affected (0.02 sec)
sbuser@ 00:03: [z1_email]> selet * from a1;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'selet * from a1' at line 1
sbuser@ 00:03: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
+------+
36 rows in set (0.00 sec)
Checking through the admin interface on port 6032 confirms that read/write splitting is indeed working:
admin@ 00:10: [(none)]> select * from stats_mysql_query_digest;
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| hostgroup | schemaname         | username | digest             | digest_text              | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| 1000      | z1_email           | sbuser   | 0xB17CC7AAA7E39A4A | select * from a1         | 1          | 1518278606 | 1518278606 | 2123     | 2123     | 2123     |
| 100       | z1_email           | sbuser   | 0x496C8B86BBC0D398 | insert into a1 values(?) | 3          | 1518278580 | 1518278588 | 30478    | 6373     | 16671    |
| 1000      | information_schema | sbuser   | 0x620B328FE9D6D71A | SELECT DATABASE()        | 1          | 1518278568 | 1518278568 | 508      | 508      | 508      |
| 100       | information_schema | sbuser   | 0x02033E45904D3DF0 | show databases           | 1          | 1518278563 | 1518278563 | 30233    | 30233    | 30233    |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
4 rows in set (0.00 sec)
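III. Testing
1. Simulate a master crash
Goal: see how writes through ProxySQL behave after the master dies
When the master fails, a manual MHA failover moves the VIP to the slave 192.168.1.16, so 192.168.1.16 then holds the VIP 192.168.1.101.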
admin@ 00:17: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
As shown above, in mysql_servers the writer hostname is 192.168.1.101 and the reader is 192.168.1.16. Does that mean writes work straight away once the master has died and the manual switch has been done? Let's test.
Simulate a master crash on the master node:
[root@mysql3 masterha]# ps -ef |grep mysql
mysql      2020  65360  0 Feb10 pts/1    00:00:58 mysqld --defaults-file=/etc/my.cnf
root       5356  65360  0 00:43 pts/1    00:00:00 grep mysql
[root@mysql3 masterha]# kill -9 2020
Then check on the application port 6033 whether writes work. They fail with a timeout:
sbuser@ 00:57: [z1_email]> insert into a1 values(158);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
Then check on the application port 6033 whether reads work. They also fail with a timeout (this is odd; in principle reads should still work):
sbuser@ 18:59: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
Now perform the manual switch
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.135 --master_state=dead --new_master_host=192.168.1.16 --ignore_last_failover
The switch is now complete, and the VIP has moved to 192.168.1.16
[root@mysql2 masterha]# ip addr show
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:92:bf:e3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.16/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.101/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fe92:bfe3/64 scope link
       valid_lft forever preferred_lft forever
Now insert and read again through the application port 6033: both reads and writes work
sbuser@ 19:08: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
+------+
36 rows in set (0.00 sec)
sbuser@ 19:08: [z1_email]> insert into a1 values(1590);
Query OK, 1 row affected (0.00 sec)
Once the old master is back up, point it at the new master with CHANGE MASTER
root@ 19:21: [(none)]> change master to master_host='192.168.1.16',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
The replication status is OK
root@ 19:51: [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.16
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000018
          Read_Master_Log_Pos: 2231
               Relay_Log_File: mysql3-relay-bin.000002
                Relay_Log_Pos: 675
        Relay_Master_Log_File: mysql-bin.000018
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2231
              Relay_Log_Space: 883
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 330616
                  Master_UUID: 25aa2017-083b-11e8-b78a-000c2992bfe3
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:59554
            Executed_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:1-59554,7af79590-0840-11e8-ac17-000c29459399:1-10
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)
Now check the admin port again. The admin interface lists only 192.168.1.16 and the VIP 192.168.1.101, and the VIP has already moved to the 192.168.1.16 machine
[root@mysql2 opt]# mysql -uadmin -padmin -P6032 -h127.0.0.1
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 5.5.30 (ProxySQL Admin Module)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
admin@ 19:53: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
然后我们加入192.168.1.135 并且我这里分配的权重是9
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.135',3306,9,1000,10,'test proxysql');
admin@ 19:59: [(none)]> load mysql servers to runtime;
Query OK, 0 rows affected (0.01 sec)
admin@ 19:59: [(none)]> save mysql servers to disk;
Query OK, 0 rows affected (0.05 sec)
Check runtime_mysql_servers. With these weights, roughly nine out of ten reads should go to 192.168.1.135 and one out of ten to 192.168.1.16, while all writes go to 192.168.1.16 (because the VIP 192.168.1.101 is on .16).
admin@ 19:59: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
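The nine-to-one split follows directly from the weights. Here is a small sketch of the arithmetic, assuming ProxySQL picks a backend within a hostgroup in proportion to its weight; `read_share` is a made-up helper for illustration, not a ProxySQL function.

```python
from fractions import Fraction


def read_share(weights):
    """Expected share of traffic per backend, proportional to its weight."""
    total = sum(weights.values())
    return {host: Fraction(weight, total) for host, weight in weights.items()}


# Weights of hostgroup 1000 as shown in runtime_mysql_servers above.
shares = read_share({"192.168.1.135": 9, "192.168.1.16": 1})
for host, share in shares.items():
    print(f"{host}: {share} ({float(share):.0%})")
```

These are long-run averages; a handful of test queries will not necessarily split exactly 9:1.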
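The master is now 192.168.1.16. What happens if it goes down at this point? Will reads and writes through ProxySQL still be affected?
Let's simulate a master failure again; this time the master is 192.168.1.16.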
[root@mysql2 opt]# ps -ef |grep mysql
root      2976 21612  0 19:50 pts/4    00:00:00 mysql -uroot -px xxxx
root      2983  6583  0 19:53 pts/3    00:00:00 mysql -uadmin -px xxx -P6032 -h127.0.0.1
root      3369 15620  0 22:09 pts/1    00:00:00 grep mysql
mysql    28714 15620  0 Feb10 pts/1    00:01:51 mysqld --defaults-file=/etc/my.cnf
root     31851 21524  0 Feb10 pts/0    00:00:00 mysql -usbuser -px xxxx -P6033 -h192.168.1.16
[root@mysql2 opt]# 
[root@mysql2 opt]# 
[root@mysql2 opt]# kill -9 28714
Reads through ProxySQL's client port 6033 now time out:
sbuser@ 20:35: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
Writes through port 6033 time out as well:
sbuser@ 22:17: [z1_email]> insert into a1 values(1591);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
Now perform a manual failover with MHA:
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.16 --master_state=dead --new_master_host=192.168.1.135 --ignore_last_failover
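If you switch back and forth as often as in this walkthrough, it can help to generate that command line instead of retyping it. A minimal sketch: the flags are exactly the ones used above, while `failover_command` and its default paths are just an illustrative wrapper of my own. It only builds and prints the command; it does not run anything.

```python
import shlex


def failover_command(dead_host, new_host,
                     global_conf="/etc/masterha/masterha_default.conf",
                     conf="/etc/masterha/app1.conf"):
    """Build the argument list for a manual MHA failover of a dead master."""
    return [
        "masterha_master_switch",
        f"--global_conf={global_conf}",
        f"--conf={conf}",
        f"--dead_master_host={dead_host}",
        "--master_state=dead",
        f"--new_master_host={new_host}",
        "--ignore_last_failover",
    ]


cmd = failover_command("192.168.1.16", "192.168.1.135")
print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```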
The VIP has now moved to 192.168.1.135. Let's look at the ProxySQL admin port 6032:
admin@ 20:35: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | SHUNNED | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
Now connect to ProxySQL's port 6033 and see whether reads work, since 192.168.1.135 is still ONLINE:
sbuser@ 22:22: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
+------+
37 rows in set (0.00 sec)
Reads work. The VIP has already moved to 192.168.1.135, so can we write as well?
sbuser@ 22:23: [z1_email]> insert into a1 values(1591);
Query OK, 1 row affected (0.14 sec)
Writes work too. Going back to the admin port 6032, the VIP 192.168.1.101 turns out to be ONLINE again (hmm...).
admin@ 22:21: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
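So my guess is that ProxySQL did not pick up the VIP move right away and kept showing SHUNNED until traffic went through it again. This does not affect usage; only the displayed status lags behind.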
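Since the displayed status can lag, it is handy to pull out just the rows that are not ONLINE. A rough sketch, assuming the rows of the `runtime_mysql_servers` query above have already been fetched as tuples by whatever MySQL client you use against the admin port; `not_online` is a made-up helper name.

```python
def not_online(rows):
    """Return (hostgroup_id, hostname, status) for every backend not ONLINE.

    rows: tuples of (hostgroup_id, hostname, port, status, weight), in the
    same column order as the select used in this post.
    """
    return [
        (hostgroup_id, hostname, status)
        for hostgroup_id, hostname, _port, status, _weight in rows
        if status != "ONLINE"
    ]


# The state right after the failover, copied from the admin port output.
rows_after_failover = [
    (100, "192.168.1.101", 3306, "SHUNNED", 1),
    (1000, "192.168.1.135", 3306, "ONLINE", 9),
    (1000, "192.168.1.16", 3306, "SHUNNED", 1),
]

for hostgroup_id, hostname, status in not_online(rows_after_failover):
    print(f"hostgroup {hostgroup_id}: {hostname} is {status}")
```

As seen above, a SHUNNED row is not proof that the backend is unusable, so treat the result as a hint to run a test query rather than as an alarm.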
Finally, bring 192.168.1.16 back up and point it at the new master 192.168.1.135 with CHANGE MASTER:
[root@mysql2 masterha]# mysqld --defaults-file=/etc/my.cnf &
[2] 3490
[root@mysql2 masterha]# mysql -uroot -poracle
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.17-log MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
root@ 22:28: [(none)]> change master to master_host='192.168.1.135',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)
root@ 22:35: [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)
root@ 22:36: [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.135
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000016
          Read_Master_Log_Pos: 743
               Relay_Log_File: mysql2-relay-bin.000002
                Relay_Log_Pos: 675
        Relay_Master_Log_File: mysql-bin.000016
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
........
Check the ProxySQL admin port 6032 again: 192.168.1.16 is still shown as SHUNNED.
admin@ 22:24: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
Run one query through ProxySQL's client port 6033:
sbuser@ 22:23: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
| 1591 |
+------+
38 rows in set (0.00 sec)
Look at the ProxySQL admin port 6032 once more: everything now shows ONLINE.
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)
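To sum up:
MHA + ProxySQL gives you both high availability and read/write splitting. When the master goes down, MHA switches over to the slave. Because the VIP follows the master, the write node in ProxySQL is configured as the VIP, so writes always land on the master: whichever machine holds the VIP is the master.
And with a ProxySQL layout like the one below, it does not matter which machine goes down; once the switchover is done, reads and writes are not affected.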
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)
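To make that conclusion concrete, here is a toy model of the layout. It is only a sketch of the idea, not of how ProxySQL works internally. It assumes, as the timeout errors earlier suggest, that writes are routed to hostgroup 100 and reads to hostgroup 1000; `backends_for` and the constants are names invented for this example.

```python
WRITE_HG, READ_HG = 100, 1000  # hostgroup ids taken from the table above
VIP = "192.168.1.101"


def backends_for(kind, servers, vip_holder):
    """Return the physical hosts able to serve a 'read' or a 'write'.

    servers: tuples of (hostgroup_id, hostname, status).
    vip_holder: the machine that currently holds the VIP, i.e. the master.
    """
    hostgroup = WRITE_HG if kind == "write" else READ_HG
    hosts = set()
    for hostgroup_id, hostname, status in servers:
        if hostgroup_id != hostgroup or status != "ONLINE":
            continue
        # The VIP is only an address; the machine holding it does the work.
        hosts.add(vip_holder if hostname == VIP else hostname)
    return sorted(hosts)


# Old master 192.168.1.16 is down and MHA has moved the VIP to 192.168.1.135.
servers = [
    (100, "192.168.1.101", "ONLINE"),
    (1000, "192.168.1.135", "ONLINE"),
    (1000, "192.168.1.16", "SHUNNED"),
]
print(backends_for("write", servers, vip_holder="192.168.1.135"))
print(backends_for("read", servers, vip_holder="192.168.1.135"))
```

Both calls come back with a single usable host, 192.168.1.135, which is the point of the setup: after the switchover the surviving machine takes the writes through the VIP and the reads through hostgroup 1000.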