Using Ip Address On Bonded Channels In A Cluster


I’m creating a firewall HA cluster. The proof of concept for the basic firewall cluster is OK. I can bring up the cluster, start the iptables firewall, and move all of this with no problem. I’m using Conga to do all of this configuration on CentOS 6.3 servers.

To extend the “HA” part of this, I’d like to use bonded channels instead of plain old NICs. The firewall uses the “IP address” service for the outside firewall IP addresses. Each server behind the firewall is NATted to one of these external IPs on the firewall’s external interface.

I’m not seeing how I can use bonded channels anywhere for these “IP address” services. Part of the problem is that Conga “guesses” which interface to place the IP address service on. In the case of bonded channels, I don’t think Conga is even aware of the “bondX” interface; it only seems to use interfaces like eth0, eth1, etc.

I realize that the sysconfig network scripts will come into play here as well, but that’s another problem for me to tackle.

Does anyone have any experience with bonded channels and Conga? I could sure use some help with this.


steve campbell


  • I use bonding extensively, but I always edit cluster.conf directly. If Conga doesn’t support “bond*” device names, please file a bug in Red Hat’s Bugzilla.

    Once the bondX device is up, it will have the IP and the “ethX” devices can be totally ignored from the cluster’s perspective. Use the bondX device just as you would have used simple ethX devices.

    In case it helps, here is how I set up bonded interfaces on Red Hat clusters for complete HA:

  • Digimer,

    Thanks very much for the reply. I believe you had pointed out the link to me before on a more basic query. It was very helpful in giving me a real nice introduction to all the new stuff in CentOS 6 for clustering.

    After reading this page once again, I think my question is not being understood. I have a habit of not stating my questions plainly.

    In your example, you use a VM to move the entire server from one VM host to another (or however you have that configured). That VM is a “service” defined under the cluster and it carries the IPs along with the VM.

    In my situation, my cluster consists of non-VM servers. The servers are real, with an inside and outside interface and IPs. They become firewalls by moving the external IPs and iptables rules as services. So in my situation, I use “ip address” and “script” only to move the IP addresses and to start and stop iptables. The IP addresses would sit on bonded channels, much like you do in your VMs.

    If I’m not mistaken, the parameters for “ip address” do not offer anything like device or interface, so I’m failing to see how I can move the IPs between nodes as bonded channels. Individual IP addresses are not a problem. It works as expected.

    My network experience is not strong enough to know why I’d need a bridge in my situation as well.

    Perhaps I should back up and consider VMs. The main problem I see there is the time it might take to shut down one VM and start another VM as opposed to just moving IPs and starting iptables.

    I haven’t tackled conntrack yet either, so there’s plenty more for me to do.

    Thanks again for your very helpful reply.


  • Ah, ok, I think I get it.

    The ip resource agent looks for the interface that matches the managed IP’s subnet, and uses it. So if your bondX interface has an IP on the same subnet as your virtual IP, it will be used.
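
The subnet matching described above can be illustrated with a small sketch. This is not the resource agent's actual code, only the idea behind it; the function name `pick_interface`, the device names, and the documentation-range addresses are all made up for the example.

```python
import ipaddress


def pick_interface(virtual_ip, interfaces):
    """Return the first device whose subnet contains virtual_ip.

    `interfaces` maps a device name to its address in CIDR form,
    e.g. {"bond0": "192.0.2.2/24"}. Returns None if nothing matches.
    """
    vip = ipaddress.ip_address(virtual_ip)
    for name, cidr in interfaces.items():
        # ip_interface() accepts a host address with a prefix length
        # and exposes the network that address belongs to.
        if vip in ipaddress.ip_interface(cidr).network:
            return name
    return None


# Hypothetical node: bond0 faces outside, bond1 faces inside.
node = {"bond0": "192.0.2.2/24", "bond1": "10.0.0.2/24"}

print(pick_interface("192.0.2.10", node))   # falls in bond0's subnet
print(pick_interface("203.0.113.5", node))  # matches no subnet
```

The point is that the device is never named for the managed IP: as long as bond0 already holds an address in the same subnet, the virtual IP follows it.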

    Think of a bonded network device like you would a traditional mdadm-based RAID array. Say you have /dev/sda5 + /dev/sdb5 and they create /dev/md0. Once created, you only look at and use /dev/md0, and you can effectively pretend that the two backing devices no longer exist. The software RAID stack handles and hides failure management.

    In your case, you would, for example, take eth0 + eth1 and create bond0. Once done, eth{0,1} no longer have an IP address, only the bondX device does. The failure of a slaved interface is totally handled behind the scenes by the bond driver. So your application (cluster, iptables) will not know or care that the link changed behind the scenes.
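
As a rough sketch of what that looks like in the sysconfig network scripts, the helper below renders the contents of the `ifcfg-*` files for one bond and its slaves. The helper name `render_bond`, the addresses, and the `miimon=100` polling interval are assumptions for illustration; only the general layout (IP on the bond, `MASTER`/`SLAVE` on the NICs) follows from the description above.

```python
def render_bond(bond, ipaddr, netmask, slaves, opts="mode=1 miimon=100"):
    """Return {filename: contents} for a bond device and its slave NICs."""
    files = {
        f"ifcfg-{bond}": "\n".join([
            f"DEVICE={bond}",
            "BOOTPROTO=none",
            "ONBOOT=yes",
            f"IPADDR={ipaddr}",       # only the bond carries an address
            f"NETMASK={netmask}",
            f'BONDING_OPTS="{opts}"',
        ]) + "\n",
    }
    for nic in slaves:
        files[f"ifcfg-{nic}"] = "\n".join([
            f"DEVICE={nic}",
            "BOOTPROTO=none",
            "ONBOOT=yes",
            f"MASTER={bond}",         # enslave the NIC to the bond
            "SLAVE=yes",
        ]) + "\n"
    return files


# Hypothetical outside interface built from eth0 + eth1.
for name, body in render_bond(
    "bond0", "192.0.2.2", "255.255.255.0", ["eth0", "eth1"]
).items():
    print(f"# /etc/sysconfig/network-scripts/{name}")
    print(body)
```

Note that the slave files carry no IPADDR at all, which matches the point that eth{0,1} stop having addresses of their own.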

    As for the VMs:

    In the tutorial, the VMs are indeed the HA service, but you can imagine your firewall in place of the VM, so far as the cluster is concerned. It’s just another resource. Also, if you do decide to go to a VM, you can live-migrate a VM between nodes, so there is no interruption. Of course, if the node backing the VM dies dramatically, the VM will need to reboot on the remaining good node, causing an outage of (in my experience) roughly 30 seconds. Again though, the VM approach is just one of many… Making a firewall the HA service directly is just fine.

    Of course, one benefit of VMs (and the reason I prefer them) is that the configuration of the software in the VM is trivial… No special consideration is needed on an app-by-app basis. Once you have your first VM cluster running, you can make anything (on any supported OS) HA.


  • I’m not sure the gratuitous ARP thing would work as effectively when moving a VM as it does when moving an IP address. In the firewall scenario, with conntrack running and gratuitous ARP, there should be little if any delay and little to no loss of connections, so the move should be transparent.

    I’ll try the bonded channel once I get some real servers running. For now, I’ve just used VMs to ensure the IP and iptables move as expected, which they appear to do. It’ll also give me a chance to try some real fencing, which I also don’t use on the VMs.

    Again, thanks for your documentation on how you did this all. You don’t realize how helpful it was in understanding the newer clustering software. For the most part, all the examples I could find used the older heartbeat.


  • (Almost) all hypervisors have a fence agent now, so you can in fact use “real” fencing with VM’ed nodes.

    Given HA is your priority, be sure to use mode=1 bonding (aka Active/Passive). It has the fastest/smoothest failover. You don’t get any aggregation of the bandwidth, but I suspect that’s not a concern for you.
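
One way to confirm that a bond really is running in mode 1, and to see which slave is currently active, is to read the status file the bonding driver exposes under /proc/net/bonding/. The parser below is a minimal sketch; `parse_bond_status` is a made-up name, and the sample text is a hand-written, abbreviated stand-in for that file rather than output captured from a real system.

```python
def parse_bond_status(text):
    """Extract the bonding mode, active slave, and per-slave MII status."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # slave section we are currently inside, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or no "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only record link state inside a slave section; the
            # bond-level "MII Status" line comes before any slave.
            status["slaves"][current] = value
    return status


# Abbreviated, hand-written sample of /proc/net/bonding/bond0.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""

print(parse_bond_status(SAMPLE))
```

On a live node you would pass in the contents of the real file, for example `open("/proc/net/bonding/bond0").read()`, assuming the bond is named bond0.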

    As for migrating VMs, it works smoothly with no network interruption. There is a very brief interruption when the processing actually kicks over to the new host, but it’s <1s. Again though, the real question is tolerable downtime. If you lose the node hosting a VM, you’re down until the VM reboots (say one minute, to guess high). If you make the iptables firewall the service, then even a total node failure recovers in seconds. The trade-off is complexity in the configuration.

    I learned a lot writing those docs, and it was a great way to convince people to help me learn the internals of clustering. So it was as much a selfish endeavour of collecting what others know as anything. I love hearing that it has helped others though, so thanks! :)


    PS – #linux-cluster on freenode is a good place to ask questions and learn about clustering, too. Of course, questions here get archived.