Glusterfs – Advanced Troubleshooting Tips & Tricks Part-2


Friends, this article continues our look at advanced know-how and troubleshooting for GlusterFS. Here we have a 3-node cluster running GlusterFS 3.4. For installation steps, please refer to GlusterFS On Centos 6 / RHEL 6. Below are the steps used for GlusterFS troubleshooting.

Step:1 Check the Gluster volume status and information


[root@gluster1 ~]# gluster volume info
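
As a quick check on the output of the command above, the volume state can be pulled out with a small filter. This is a minimal sketch: the function name volume_states is hypothetical, and it assumes the output contains "Volume Name:" and "Status:" lines, as the gluster CLI normally prints.

```shell
# Hypothetical helper: print "<volume> <status>" for each volume found in
# `gluster volume info` output supplied on stdin.
volume_states() {
    awk -F': ' '
        /^Volume Name: / { name = $2 }        # remember the current volume
        /^Status: /      { print name, $2 }   # e.g. "gluster Started"
    '
}

# Usage on a cluster node (needs the gluster CLI):
#   gluster volume info | volume_states
```

A volume that is not in the Started state cannot be mounted by clients, so this is worth confirming before going further.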


Step:2 Verify the replication details of the bricks

The command below shows detailed statistics for every brick. Comparing the free and total disk space reported for each brick indicates how much data has been replicated and how much is still to be replicated.

Note: the free space shown in the stats can differ by a few MB between bricks. This happens because applications may hold files open, which causes a variance between the df and du values.

[root@gluster1 ~]# gluster volume status all detail
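
To compare bricks side by side, the detail output can be reduced to one line per brick. This is only a sketch: brick_space is a hypothetical name, and the field labels "Brick", "Disk Space Free" and "Total Disk Space" are assumed to match what your GlusterFS version prints.

```shell
# Hypothetical helper: print "<brick> <free> <total>" for each brick found in
# `gluster volume status all detail` output supplied on stdin.
brick_space() {
    awk -F'[ \t]+: ' '
        $1 == "Brick"            { brick = $2; sub(/^Brick /, "", brick) }
        $1 == "Disk Space Free"  { free = $2 }
        $1 == "Total Disk Space" { print brick, free, $2 }
    '
}

# Usage on a cluster node (needs the gluster CLI):
#   gluster volume status all detail | brick_space
```

If one brick reports far more free space than its replicas, replication to that brick is probably still in progress.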


Step:3 Set configuration options to improve the performance and healing characteristics of GlusterFS

# gluster volume set gluster cluster.min-free-disk 5%
# gluster volume set gluster cluster.rebalance-stats on
# gluster volume set gluster cluster.readdir-optimize on
# gluster volume set gluster cluster.background-self-heal-count 20
# gluster volume set gluster cluster.metadata-self-heal on
# gluster volume set  on
# gluster volume set gluster cluster.entry-self-heal on
# gluster volume set gluster cluster.self-heal-daemon on
# gluster volume set gluster cluster.heal-timeout 500
# gluster volume set gluster cluster.self-heal-window-size 2
# gluster volume set  diff
# gluster volume set gluster cluster.eager-lock on
# gluster volume set gluster cluster.quorum-type auto
# gluster volume set gluster cluster.self-heal-readdir-size 2KB
# gluster volume set  5
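
Typing each option by hand is error-prone, so the settings could instead be kept in a list and applied in a loop. The sketch below is an illustration, not part of the original procedure: apply_volume_options and the DRY_RUN variable are hypothetical names. By default it only prints the commands it would run.

```shell
# Hypothetical helper: read "option value" pairs from stdin and apply each one
# to the given volume with `gluster volume set <VOLNAME> <KEY> <VALUE>`.
# DRY_RUN defaults to 1, in which case the commands are printed, not executed.
apply_volume_options() {
    volume=$1
    while read -r key value; do
        # Skip blank lines and comment lines.
        case $key in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "gluster volume set $volume $key $value"
        else
            # </dev/null keeps the CLI from consuming the option list on stdin.
            gluster volume set "$volume" "$key" "$value" </dev/null || return 1
        fi
    done
}

# Usage (options.txt is an assumed file with one "option value" pair per line):
#   apply_volume_options gluster < options.txt             # preview only
#   DRY_RUN=0 apply_volume_options gluster < options.txt   # really apply
```

Previewing first makes it easy to spot a line with a missing option name or value before it reaches the cluster.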


Then restart the glusterd service:

# service glusterd restart

After we have set the cluster properties we can check the volume information as shown below.

[root@gluster1 ~]# gluster volume info







[root@gluster1 ~]# gluster volume status


Please note that the Self-heal Daemon should be running on every node in the cluster, since it is responsible for healing data when a node has been down for some time.
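
The Self-heal Daemon check can be scripted against the gluster volume status output. This is a rough sketch: shd_offline is a hypothetical name, and it assumes each daemon appears on a line of the form "Self-heal Daemon on <host> <port> <online> <pid>", with Y in the Online column when it is running.

```shell
# Hypothetical helper: print every host whose Self-heal Daemon is NOT online,
# reading `gluster volume status` output from stdin.
# The Online flag is taken as the second-to-last column of the line.
shd_offline() {
    awk '$1 == "Self-heal" && $2 == "Daemon" && $(NF-1) != "Y" { print $4 }'
}

# Usage on a cluster node (needs the gluster CLI):
#   gluster volume status | shd_offline
```

Empty output means every listed daemon reported itself online.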

Step:4 Remove the machine gluster0 from the cluster

Unmount the volume mounted on the gluster0 machine, then, from gluster1, remove its brick and reduce the replica count to 2:

[root@gluster0 ~]# umount /mnt 
[root@gluster1 ~]# gluster volume remove-brick gluster replica 2 gluster0:/gluster0 commit

Run gluster volume info to verify that the brick is gone:

[root@gluster1 ~]# gluster volume info
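
The verification can also be done without reading the output by eye. This is a minimal sketch, assuming the bricks are listed as "Brick1: host:/path", "Brick2: host:/path" and so on; list_bricks is a hypothetical name.

```shell
# Hypothetical helper: print one "host:/path" per brick found in
# `gluster volume info` output supplied on stdin.
list_bricks() {
    sed -n 's/^Brick[0-9][0-9]*: //p'
}

# Usage on gluster1 (needs the gluster CLI):
#   if gluster volume info gluster | list_bricks | grep -q '^gluster0:'; then
#       echo "gluster0 brick is still part of the volume"
#   fi
```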


On gluster1, run the following command to detach gluster0 from the trusted pool:

# gluster peer detach gluster0

The gluster0 server and its brick are now removed from the cluster.
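
To confirm the detach, the peer list can be checked for the old hostname. This sketch assumes gluster peer status prints one "Hostname: <name>" line per peer; peer_listed is a hypothetical name.

```shell
# Hypothetical helper: succeed (exit 0) if the given host appears as a peer in
# `gluster peer status` output supplied on stdin, fail otherwise.
peer_listed() {
    # -x matches the whole line, -F treats the hostname as a literal string.
    grep -qxF "Hostname: $1"
}

# Usage on gluster1 (needs the gluster CLI):
#   if gluster peer status | peer_listed gluster0; then
#       echo "gluster0 is still a peer"
#   else
#       echo "gluster0 has been detached"
#   fi
```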