
LVS: LVS-Tun
8.1. LVS-Tun Intro

LVS-Tun is an LVS original. It is based on LVS-DR. The LVS code encapsulates the original packet (CIP->VIP) inside an ipip packet of DIP->RIP, which is then put into the OUTPUT chain, where it is routed to the realserver. (There is no tunl0 device on the director; ip_vs() does its own encapsulation and doesn't use the standard kernel ipip code. This is possibly the reason why PMTU on the director does not work for LVS-Tun - see MTU.) The realserver receives the packet on a tunl0 device (see need tunl0 device) and decapsulates the ipip packet, revealing the original CIP->VIP packet.
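One way to watch the DIP->RIP encapsulation on the wire is to capture IP protocol 4 (ipip) on the realserver's physical interface. The sketch below only prints a tcpdump command rather than running it; the helper name ipip_capture_cmd and the choice of eth0 are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that shows ipip packets
# (IP protocol number 4) sent by the director to this realserver.
# $1 = interface to capture on (assumed eth0), $2 = DIP of the director.
ipip_capture_cmd() {
    # -n disables name resolution; the filter matches the outer DIP->RIP header.
    printf "tcpdump -n -i %s 'ip proto 4 and src host %s'\n" "$1" "$2"
}

ipip_capture_cmd eth0 192.168.1.1
```

Run the printed command as root on a realserver while a client connects to the VIP; each captured packet should show the outer DIP->RIP header wrapping the inner CIP->VIP packet.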

Initially only Linux could decapsulate IPIP packets, but FreeBSD and w2k later gained the ability too (hmm 2005, Microsoft has dropped support for IPIP).

If you want to try a test LVS-Tun setup on the bench, take a standard LVS-DR setup (see LVS-DR example), change lo on the realservers to tunl0 (and handle the ARP problem on tunl0) and change the ipvsadm switch from -g to -i. If your clients are going to be sending large packets, you need to set the MTU (see MTU for the ipip packet DIP->RIP). This can be done on the realserver with iptables (see tunl MTU solved) or iproute2 (see setting the MTU by route).
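The arithmetic behind the MTU adjustment can be sketched as follows, assuming a 1500-byte path MTU between director and realserver and IPv4/TCP headers without options. The two printed commands are assumptions about how the iptables and iproute2 approaches might look (the text above names the tools but not the commands); they are printed, not executed.

```shell
#!/bin/sh
# Assumed values: 1500-byte link MTU, no IP or TCP options.
LINK_MTU=1500
IPIP_OVERHEAD=20   # outer IPv4 header (DIP->RIP) added by the director
IP_HDR=20          # inner IPv4 header (CIP->VIP)
TCP_HDR=20         # TCP header

# Largest client packet that still fits after encapsulation.
INNER_MTU=$((LINK_MTU - IPIP_OVERHEAD))
# TCP payload size that keeps each client packet within INNER_MTU.
TCP_MSS=$((INNER_MTU - IP_HDR - TCP_HDR))

echo "largest CIP->VIP packet that fits: $INNER_MTU bytes"
echo "matching TCP MSS: $TCP_MSS bytes"

# Possible realserver commands (need root). Either one makes the realserver
# advertise the smaller MSS to the client during the TCP handshake.
echo "iptables -t mangle -A OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $TCP_MSS"
echo "ip route replace 192.168.1.254/32 dev eth0 advmss $TCP_MSS"
```

The route in the last line targets the test client from the example setup; on a production realserver the equivalent would be the route that reply packets actually take.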

As with LVS-DR, the director doesn't know about the VIP on the realserver (it only knows about the RIP). Health checking of a service listening on the VIP on the realserver must then use a connection between the DIP and the RIP (if the daemon is listening on both the RIP and the VIP, the service listening on the RIP can be a proxy for the service listening on the VIP).
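A DIP->RIP health check of this kind can be as simple as a TCP connect test run from the director. The following is a minimal sketch, assuming bash and the coreutils timeout command are installed; the function name check_rip and the 2-second timeout are illustrative choices, not part of any LVS tool.

```shell
#!/bin/sh
# Hypothetical check: succeed if a TCP connection to RIP:port opens
# within 2 seconds. Relies on bash's /dev/tcp pseudo-device.
# $1 = RIP of the realserver, $2 = TCP port of the service.
check_rip() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example: probe telnet (port 23) on realserver-1 from the director.
if check_rip 192.168.1.2 23; then
    echo "realserver 192.168.1.2 answers on the RIP"
else
    echo "realserver 192.168.1.2 does not answer on the RIP"
fi
```

This only shows that something accepts connections on the RIP. It says nothing about the service bound to the VIP unless the same daemon serves both addresses.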

LVS-Tun allows the realservers to be geographically remote from the director (this is the main point of LVS-Tun). If your realservers cannot do ipip decapsulation, you can still have geographically remote realservers using other techniques (see non tunnelling realservers).

(see also Julian's LVS-Tun write up and postings to the mailing list).

8.2. LVS-Tun example setup

Here's an example set of IPs for an LVS-Tun setup. For (my) convenience the servers are on the same network as the client. The only restrictions for LVS-Tun with remote hosts are that the client must be able to route to the director and that the realservers must be able to route to the client (the return packets to the client come directly from the realservers and do not go back through the director).

Normally for LVS-Tun, the client is on a different network to the director/server(s), and each server has its own route to the outside world. In the simple test case below, where all machines are on the 192.168.1.0 network, there would be no default route for the servers, and routing for packets from the servers to the client would use the device on the 192.168.1.0 network (presumably eth0). In real life, the realservers would have their own router/connection to the internet and packets returning to the client would go through this router. In any case reply packets do not go back through the director.

Machine          IP
client           CIP=192.168.1.254
director         DIP=192.168.1.1
                 VIP=192.168.1.110 (arps, IP clients connect to)
realserver-1     RIP1=192.168.1.2, VIP (tunl0, non-arping, 192.168.1.110)
realserver-2     RIP2=192.168.1.3, VIP (tunl0, non-arping, 192.168.1.110)
realserver-3     RIP3=192.168.1.4, VIP (tunl0, non-arping, 192.168.1.110)
.
.
realserver-n     RIPn=192.168.1.n+1, VIP (tunl0, non-arping, 192.168.1.110)

#lvs_tun.conf
LVS_TYPE=VS_TUN
INITIAL_STATE=on
VIP=eth0:110 192.168.1.110 255.255.255.255 192.168.1.110
DIP=eth0 192.168.1.9 192.168.1.0 255.255.255.0 192.168.1.255
DIRECTOR_DEFAULT_GW=client
SERVICE=t telnet rr realserver1 realserver2
SERVER_VIP_DEVICE=tunl0
SERVER_NET_DEVICE=eth0
SERVER_DEFAULT_GW=client
#----------end lvs_tun.conf------------------------------------

                        ________
                       |        |
                       | client |
                       |________|
                     CIP=192.168.1.254
                           |
              CIP->VIP |   |   ^
                       v   |   | VIP->CIP
                           |
                     VIP=192.168.1.110
                      (eth0:1, arps)
                        __________
                       |          |
                       | director |-------
                       |__________|
                     DIP=192.168.1.1
                         (eth0)
                           |
        DIP->RIP(CIP->VIP) |
                           v
          -------------------------------------
          |                 |                 |
          |                 |                 |
   RIP1=192.168.1.2  RIP2=192.168.1.3  RIP3=192.168.1.4 (eth0)
   VIP=192.168.1.110 VIP=192.168.1.110 VIP=192.168.1.110 (all tunl0,non-arping)
    ____________      ____________      ____________
   |            |    |            |    |            |
   | realserver |    | realserver |    | realserver |
   |____________|    |____________|    |____________|
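On the director, the SERVICE line of the example (telnet, round robin, two realservers) corresponds roughly to the ipvsadm commands sketched below. This is a hand-written approximation that prints the commands rather than running them, not the output of the configure script; the function name lvs_tun_cmds is hypothetical.

```shell
#!/bin/sh
# Print (not run) ipvsadm commands for a round-robin telnet service
# forwarded by ipip tunnelling. Running them needs root and ip_vs loaded.
VIP=192.168.1.110
PORT=23   # telnet

# Arguments: the RIPs of the realservers to add.
lvs_tun_cmds() {
    # -A adds the virtual service, -t marks it TCP, -s rr picks round robin.
    echo "ipvsadm -A -t $VIP:$PORT -s rr"
    # -a adds a realserver; -i selects tunnelling (LVS-DR would use -g).
    for rip in "$@"; do
        echo "ipvsadm -a -t $VIP:$PORT -r $rip -i"
    done
}

lvs_tun_cmds 192.168.1.2 192.168.1.3
```

The only difference from an LVS-DR director is the forwarding switch on each realserver entry.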

Here's a likely production setup (I haven't done this one myself). It assumes the realservers are on a different network to the DIP. Here x.x.x.? and y.y.y.? are public IPs. The 176 and 10 addresses are for communication between the different locations and will be assigned by the ISP.

                   ________
                  |        |
                  | client |
                  |________|
                  CIP=x.x.x.1
                      |
        CIP->VIP |    |---------------------------------
                 v    |                                 |
                  __________                            | 
                 |          |                           |
                 | D-router |                           |
                 |__________|                           |
                      |                                 |
        CIP->VIP |    |                                 |
                 v    |                                 |
                      |                                 |
            VIP=y.y.y.110(eth0, arps)                   |
                  __________                            |
                 |          |                           |
                 | director |                           |
                 |__________|                           |
            DIP=176.0.0.1 (eth1)                        |
                      |                              ^  |




RIP1=10.0.0.1(eth0)      RIP2=10.0.0.2(eth0)
VIP=y.y.y.110(tunl0)     VIP=y.y.y.110(tunl0)
 _________________        _________________
|                 |      |                 |
| realserver      |      | realserver      |
| tunl0: CIP->VIP |      |                 |
| eth0:  VIP->CIP |      |                 |
|_________________|      |_________________|

8.3. You need a tunl0 device

Note: tunl0 is a networking device like eth0, lo, and dummy0. In LVS-Tun, the tunl0 device holds the VIP, just as the lo device holds the VIP for LVS-DR. You need to build the tunl0 device into the Linux kernel (in networking options - IP:tunneling) - it is turned off by default. The tunnelling (ipip) can be built as a module, in which case you'll have to insmod ipip before you can use it, or you can build ipip directly into the kernel. With a kernel enabled for ipip, you should be able to see the unconfigured tunl0 device with ifconfig or with ip addr show (Feb 2004 - my ifconfig used to see the unconfigured tunl0, but it doesn't anymore).

Then you configure the tunl0 device (even if ifconfig can't see it).

    ifconfig tunl0 192.168.1.110 netmask 255.255.255.255 broadcast 192.168.1.110

after which the tunl0 device becomes visible to ifconfig. The iproute2 equivalent is

    ip addr add dev tunl0 192.168.1.110/32 brd 192.168.1.110

Note the VIP is a /32 addr, so the brd addr is the VIP, not x.x.x.255.

8.4. The ARP problem with LVS-Tun

If the realservers and director are on different networks (e.g. the realservers are geographically remote), then the router in front of the realservers will not be advertising routes to the VIP and you won't need to handle the ARP problem on the realservers. In effect you are using Lars' method without having to do anything special.

If the realservers are using the same router as the director, you need to handle the ARP problem for the realservers (set tunl0 to not reply to arp queries). This networking is the same as for LVS-DR and you'd only do this to test LVS-Tun (there's no other reason to use LVS-Tun with the LVS-DR network). However, all my LVS-Tun test cases used the same networking as for LVS-DR, i.e. the DIP and RIPs were on the same network with only one router (actually none: the client, with 1 or 2 NICs, faced directly onto the director and realservers). In this case I had to handle the ARP problem for the realservers.
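For the same-network test case, the realserver steps from 8.3 and 8.4 can be collected into one sequence. The sketch below prints the commands instead of running them (they need root). It assumes a 2.6 or later kernel, where the arp_ignore and arp_announce sysctls are one way to keep the VIP from answering ARP. The text above does not say which ARP method was used, and the rp_filter line is a further assumption that may or may not be needed on a given kernel.

```shell
#!/bin/sh
# Print (not run) realserver commands for an LVS-Tun bench test.
VIP=192.168.1.110

realserver_tun_cmds() {
    # Load the ipip module; this creates the tunl0 device.
    echo "modprobe ipip"
    # Put the VIP on tunl0 as a /32, so the broadcast address is the VIP.
    echo "ip addr add dev tunl0 $VIP/32 brd $VIP"
    echo "ip link set tunl0 up"
    # Assumed ARP handling: answer ARP only for addresses configured on
    # the interface the request arrived on, and never announce the VIP
    # as a source address in ARP requests sent from eth0.
    echo "sysctl -w net.ipv4.conf.all.arp_ignore=1"
    echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
    # Assumption: reverse-path filtering can drop the decapsulated
    # CIP->VIP packets arriving on tunl0, so relax it there.
    echo "sysctl -w net.ipv4.conf.tunl0.rp_filter=0"
}

realserver_tun_cmds
```

With geographically remote realservers the two ARP lines are unnecessary, since no router in front of them advertises the VIP.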
