Proxmox clustering and multicast (also, DNS)

After much problems with getting a new Proxmox cluster up and running two things have helped:

Putting the SAN IP addresses in the hosts file, avoiding DNS dependancies (especially when one of the systems isn’t in there yet…). This put me on track: http://blog.rhavenindustrys.com/2013/04/curious-proxmox-clustering-fix.html

Important bit:

127.0.0.1 localhost.localdomain localhost
169.254.0.1 proxmox1.local proxmox1 pvelocalhost
169.254.0.2 proxmox2.local proxmox2

10.10.5.101 proxmox1.example.com
10.10.5.102 proxmox2.example.com

This associates the short aliases with the network used by corosync, while leaving the long addresses to the outside world.

Then tried tests detailed at: https://pve.proxmox.com/wiki/Troubleshooting_multicast,_quorum_and_cluster_issues

The multicast test failed – i.e. running

omping -c 10000 -i 0.001 -F -q <list of all nodes>

on both nodes at the same time resulted in 100% loss. Fixed this by disabling IGMP snooping on the SAN VLAN. ExtremeOS command is:

disable igmp snooping <vlanname>

Hey presto, after getting the second node to join properly:

pvecm add 192.168.xxx.xxx -force

it gets quorum immediately. I suspect this issue was causing a lot of the historical issues with getting quorum to work on this switch.

Published by

Jamie Scott

IT Administrator at the Institute for Gravitational Research, University of Glasgow