KVM / QEMU Easy Routed Networking

Spurred on by a question in the #kvm channel on freenode IRC, I decided to figure out how to implement kernel-level network routing for virtual machines.

This avoids bridging, sockets, 'user-mode' networking, VDE (Virtual Distributed Ethernet) switching, and other fabulously confusing methods.

The basic principle is to use the existing Linux kernel mechanisms to route traffic to the VM guests' network interfaces, rather than using netfilter rules (set by iptables), NAT, or various user-mode mechanisms like vde_switch.

I'll cover the settings and configuration required first, then I'll show how to use automatic scripts to handle it all.

Manual Host Configuration

Add an alias to the physical network interface that connects to upstream (whether that be the wider LAN, or WAN/Internet). Use some environment variables to make this process easily scriptable:

export WAN_IF=eth0
export VMNET_GATEWAY_IF=eth0:0
export VMNET_GATEWAY_IP=10.254.254.1
export VMNET_NETMASK=255.255.255.0
export VMNET_BROADCAST=10.254.254.255
export VMNET_FIRST_GUEST_IP=10.254.254.2
# set to 0 to disable; 1 to enable
export PROXY_ARP=1

Create a New Interface Alias

sudo /sbin/ifconfig $VMNET_GATEWAY_IF $VMNET_GATEWAY_IP netmask $VMNET_NETMASK broadcast $VMNET_BROADCAST

This creates the interface eth0:0.
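
On systems where ifconfig is not installed, the same alias can probably be created with iproute2. The following is only a sketch: mask_to_prefix and alias_cmd are hypothetical helper names of my own, and the script prints the command instead of running it, so you can inspect it before adding sudo.

```shell
#!/bin/sh
# Sketch only: mask_to_prefix and alias_cmd are illustrative helpers,
# not part of any package.

# Convert a dotted-quad netmask (255.255.255.0) to a prefix length (24).
# Assumes the mask is contiguous; it does not check octet order.
mask_to_prefix() {
    prefix=0
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the mask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$prefix"
}

# Build the iproute2 command line.
# $1=gateway IP  $2=netmask  $3=broadcast  $4=physical interface  $5=alias label
alias_cmd() {
    p=$(mask_to_prefix "$2") || return 1
    echo "ip addr add $1/$p broadcast $3 dev $4 label $5"
}

# Print (do not run) the command for the values used on this page.
alias_cmd 10.254.254.1 255.255.255.0 10.254.254.255 eth0 eth0:0
```

With the values above this prints: ip addr add 10.254.254.1/24 broadcast 10.254.254.255 dev eth0 label eth0:0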

Enable IP Forwarding

Ensure IP forwarding is enabled on the interface:

sudo sh -c "echo 1 > /proc/sys/net/ipv4/conf/${WAN_IF}/forwarding"

Enable ARP Proxy

Optionally, if you need to 'hide' the MAC (Media Access Control) addresses of the VM guests from the rest of the network, enable the ARP (Address Resolution Protocol) proxy on the host's upstream interface now and, later, on each tap interface:

sudo sh -c "echo $PROXY_ARP > /proc/sys/net/ipv4/conf/${WAN_IF}/proxy_arp"
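
To confirm that the forwarding and proxy_arp settings took effect, the same /proc files can simply be read back (no root needed). This is a minimal sketch: if_flag and report_flags are hypothetical helper names, and PROC_ROOT is an override I invented so the functions can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: helper names and PROC_ROOT are illustrative assumptions.

# Print the value of one per-interface IPv4 flag.
# $1 = interface name, $2 = flag name (forwarding, proxy_arp)
if_flag() {
    cat "${PROC_ROOT:-/proc}/sys/net/ipv4/conf/$1/$2"
}

# Print both flags used on this page for one interface.
report_flags() {
    for flag in forwarding proxy_arp; do
        value=$(if_flag "$1" "$flag" 2>/dev/null) || value="unreadable"
        echo "$1 $flag=$value"
    done
}

# Example usage (uncomment on the host):
#   report_flags "$WAN_IF"
```

A value of 1 means the flag is enabled; "unreadable" usually means the interface does not exist.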

Create tap Interfaces

Each simultaneous VM guest requires its own tap interface:

# interface numbers start at 0. GUEST_MAX=1 will create 1 interface: tap0
GUEST_MAX=1
IP=$VMNET_FIRST_GUEST_IP

# Double parentheses, and "GUEST_MAX" with no "$"
for ((IF=0; IF < GUEST_MAX ; IF++)); do
 # create the persistent tap interface
 sudo /usr/sbin/tunctl -t tap${IF} -u "$(id -un)" -g "$(id -gn)"
 # start the interface
 sudo /sbin/ip link set tap${IF} up
 # configure proxy_arp according to the environment variable setting
 sudo sh -c "echo $PROXY_ARP > /proc/sys/net/ipv4/conf/tap${IF}/proxy_arp"
 # route packets destined for the VM guest's IP address to this interface
 sudo /sbin/ip route add unicast $IP dev tap${IF}
 # figure out the IP address the next VM guest will use (increments the least-significant byte)
 IP=$(echo $IP | awk '{split($0 ,parts, "."); for(i=1;i<=3;i++) { printf("%d.", parts[i]);} print ++parts[4];}')
done

Note: the loop gives ownership of each tap interface to the current user and group; change the tunctl -u and -g values to suit your own security requirements.
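
The awk one-liner in the loop increments the last byte without any bounds check, so with enough guests it would happily produce 10.254.254.256. Here is a sketch of an alternative using only shell parameter expansion. next_ip is a hypothetical name, and it assumes a /24 network like the one used on this page.

```shell
#!/bin/sh
# Sketch only: next_ip is an illustrative helper that assumes a /24 network.

# Print the address that follows $1, or fail once the last usable
# host address (.254) has been handed out.
next_ip() {
    last=${1##*.}      # text after the final dot
    base=${1%.*}       # text before the final dot
    if [ "$last" -ge 254 ]; then
        echo "no host addresses left after $1" >&2
        return 1
    fi
    echo "$base.$((last + 1))"
}

# Show which address each of the first three tap interfaces would get.
ip=10.254.254.2
i=0
while [ "$i" -lt 3 ]; do
    echo "tap$i $ip"
    ip=$(next_ip "$ip") || break
    i=$((i + 1))
done
```

This prints tap0 10.254.254.2, tap1 10.254.254.3 and tap2 10.254.254.4, one per line.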

Host Review

Check the interfaces are configured:

ifconfig -s | grep "^${WAN_IF}\|^tap"

eth0       1500 0     54543      0      0 0         56841      0      0      0 BMRU
eth0:0     1500 0       - no statistics available -                        BMRU
tap0       1500 0       355      0      0 0            25      0     17      0 BMRU

Check the route(s):

ip route | grep tap

10.254.254.2 dev tap0  scope link
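
With several guests it can be handy to see the guest-to-interface mapping at a glance. The following sketch filters route output of the form shown above; tap_routes is a hypothetical name and it only understands simple "ADDRESS dev tapN ..." lines.

```shell
#!/bin/sh
# Sketch only: tap_routes is an illustrative helper.

# Read `ip route` output on stdin and print "tapN ADDRESS" for every
# host route that points directly at a tap interface.
tap_routes() {
    awk '$2 == "dev" && $3 ~ /^tap[0-9]+$/ { print $3, $1 }'
}

# Example usage (uncomment on the host):
#   ip route | tap_routes
```

For the single route shown above this would print: tap0 10.254.254.2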

Virtual Machine Network Options

KVM/QEMU should attach to a TAP interface (a virtual Ethernet interface) as part of its start-up. Here's an example of the network options required:

-net nic,model=rtl8139,macaddr=56:44:45:30:31:32,vlan=0 -net tap,script=no,ifname=tap0,vlan=0

-net nic defines which network hardware to emulate - this is the network adaptor the VM guest 'sees'.

-net tap defines how to connect to the host's network stack.

Notice that in this example the tap interface name is given explicitly (ifname=tap0). In scripts this would usually be omitted so that the next available tap interface is assigned (tap0, tap1, tap2, tap3...). Also, if this were automated, a network ifup script would be specified, or the script option would be omitted so that the KVM/QEMU default script is executed (usually /etc/qemu-ifup). We'll deal with that later.
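
When starting several guests by hand, each one needs its own tap interface and a unique MAC address. Here is a rough sketch that generates the option string per guest. guest_net_opts is a hypothetical name, and deriving the MAC by adding the guest index to the last byte of the example address is purely my assumption, not a KVM/QEMU convention.

```shell
#!/bin/sh
# Sketch only: guest_net_opts and its MAC numbering scheme are assumptions.

# Print the -net options for guest number $1 (0-based).
# The last MAC byte starts at 0x32 (decimal 50), matching the example
# above, and increases by one per guest.
guest_net_opts() {
    mac=$(printf '56:44:45:30:31:%02x' $((50 + $1)))
    echo "-net nic,model=rtl8139,macaddr=$mac,vlan=0 -net tap,script=no,ifname=tap$1,vlan=0"
}

guest_net_opts 0
guest_net_opts 1
```

The first line of output matches the options shown above; the second uses tap1 and a MAC ending in 33.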

Once the VM guest has started, configure it with a static IP address (10.254.254.2, mask 255.255.255.0), gateway (10.254.254.1 - the IP of eth0:0), and preferred DNS server (preferably one on the LAN). Later, we'll do all this automatically using DHCP.

Now test network connectivity from host to VM guest, from guest to host, DNS name resolution in the guest, and from guest to Internet.
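
These checks could be wrapped in a small helper so each one reports a clear result. This is only a sketch: check is a hypothetical function, the guest-side lines assume a Linux guest with ping and getent available, and example.org stands in for whatever Internet host you prefer.

```shell
#!/bin/sh
# Sketch only: check is an illustrative helper.

# Run a command quietly and report whether it succeeded.
# $1 = description, remaining arguments = the command to run
check() {
    desc=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $desc"
    else
        echo "FAIL: $desc"
        return 1
    fi
}

# On the host (uncomment to run):
#   check "host to guest" ping -c 3 10.254.254.2
# Inside a Linux guest (uncomment to run):
#   check "guest to host" ping -c 3 10.254.254.1
#   check "guest DNS" getent hosts example.org
#   check "guest to Internet" ping -c 3 example.org
```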

DHCP Service

dnsmasq won't attach to alias interfaces (such as eth0:0), so it must be told explicitly where to listen. Here it is bound to tap0:

sudo /usr/sbin/dnsmasq --interface=tap0  --except-interface=lo --bind-interfaces --user=nobody \
 --dhcp-range=vmnet,10.254.254.2,10.254.254.253,255.255.255.0,10.254.254.255,8h \
 --domain=lan.tjworld.net --pid-file=/var/run/vmnet_dnsmasq.pid --conf-file