KVM / QEMU Easy Routed Networking
Spurred on by a question in the #kvm channel on Freenode IRC, I decided to figure out how to implement kernel-level network routing for virtual machines.
This avoids using bridging, sockets, the 'user-mode' networking, VDE (Virtual Distributed Ethernet) switching, and other fabulously confusing methods.
The basic principle is to use the existing Linux kernel mechanisms to route traffic to the VM guest(s) network interfaces, rather than using netfilter rules (set by iptables), NAT, or various user-mode mechanisms like vde_switch.
I'll cover the settings and configuration required first, then I'll show how to use automatic scripts to handle it all.
Manual Host Configuration
Add an alias to the physical network interface that connects to upstream (whether that be the wider LAN, or WAN/Internet). Use some environment variables to make this process easily scriptable:
export WAN_IF=eth0
export VMNET_GATEWAY_IF=eth0:0
export VMNET_GATEWAY_IP=10.254.254.1
export VMNET_NETMASK=255.255.255.0
export VMNET_BROADCAST=10.254.254.255
export VMNET_FIRST_GUEST_IP=10.254.254.2
# set to 0 to disable; 1 to enable
export PROXY_ARP=1
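Because every later step depends on these variables agreeing with each other, it can help to check that the gateway and the first guest address sit in the same network under the chosen netmask. The following is a minimal sketch, not part of the original procedure; the helper names ip_to_int and same_subnet are made up for illustration, and it assumes plain dotted-quad addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell, so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if both addresses fall in the same network under the given mask.
# Arguments: address address netmask
same_subnet() (
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
)
```

With the variables above exported, a check could look like: same_subnet "$VMNET_GATEWAY_IP" "$VMNET_FIRST_GUEST_IP" "$VMNET_NETMASK" || echo "gateway and guest are in different networks" >&2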
Create a New Interface Alias
sudo /sbin/ifconfig $VMNET_GATEWAY_IF $VMNET_GATEWAY_IP netmask $VMNET_NETMASK broadcast $VMNET_BROADCAST
This creates the interface eth0:0.
Enable IP Forwarding
Ensure IP forwarding is enabled on the interface:
sudo sh -c "echo 1 > /proc/sys/net/ipv4/conf/${WAN_IF}/forwarding"
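To confirm the setting took effect, the same file can simply be read back. The sketch below wraps that in a small helper; forwarding_enabled and the SYSCTL_ROOT override are hypothetical names introduced here (the override exists only so the function can be exercised without touching /proc). Note that the kernel applies this flag per receiving interface, so it may be worth running the same check against the tap interfaces once they exist. The value can also be set with sysctl -w net.ipv4.conf.eth0.forwarding=1, and either way it does not survive a reboot.

```shell
# Succeed if IPv4 forwarding is enabled for the named interface.
# SYSCTL_ROOT is a hypothetical override for testing; it defaults to /proc/sys.
forwarding_enabled() (
    file="${SYSCTL_ROOT:-/proc/sys}/net/ipv4/conf/$1/forwarding"
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
)
```

For example: forwarding_enabled "$WAN_IF" || echo "forwarding is off on $WAN_IF" >&2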
Enable ARP Proxy
Optionally, if you need to 'hide' the MAC (Media Access Control) address(es) of the VM guest(s) from the rest of the network, enable the ARP (Address Resolution Protocol) proxy on the host and, later, on each tap interface:
sudo sh -c "echo $PROXY_ARP > /proc/sys/net/ipv4/conf/${WAN_IF}/proxy_arp"
Create tap Interfaces
Each simultaneous VM guest requires its own tap interface:
# interface numbers start at 0. GUEST_MAX=1 will create 1 interface: tap0
GUEST_MAX=1
IP=$VMNET_FIRST_GUEST_IP
# Double parentheses, and "GUEST_MAX" with no "$"
for ((IF=0; IF < GUEST_MAX ; IF++)); do
  # create the persistent tap interface
  sudo /usr/sbin/tunctl -t tap${IF} -u `id -un` -g `id -gn`
  # start the interface
  sudo /sbin/ip link set tap${IF} up
  # configure proxy_arp according to the environment variable setting
  sudo sh -c "echo $PROXY_ARP > /proc/sys/net/ipv4/conf/tap${IF}/proxy_arp"
  # route packets destined for the VM guest's IP address to this interface
  sudo /sbin/ip route add unicast $IP dev tap${IF}
  # figure out the IP address the next VM guest will use (increments the least-significant byte)
  IP=$(echo $IP | awk '{split($0 ,parts, "."); for(i=1;i<=3;i++) { printf("%d.", parts[i]);} print ++parts[4];}')
done
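The awk one-liner in the loop increments the last byte without checking whether the result is still a usable host address. As a sketch of one alternative, the helper below (next_ip is a name invented here) does the same increment in plain shell arithmetic and refuses to go past .254. The limit is an assumption that matches the 255.255.255.0 netmask used in this article; other netmasks would need a different bound.

```shell
# Print the address that follows the given one. Fail when the last octet
# would pass 254, since .255 is the broadcast address in a /24 network.
# The function body runs in a subshell, so IFS and exit stay local to it.
next_ip() (
    IFS=.
    set -- $1
    [ "$4" -lt 254 ] || { echo "no addresses left after $1.$2.$3.$4" >&2; exit 1; }
    echo "$1.$2.$3.$(( $4 + 1 ))"
)
```

Inside the loop, the awk line could then be replaced with something like: IP=$(next_ip "$IP") || break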
Note: set the tunctl user to sane settings based on your own security requirements.
Host Review
Check the interfaces are configured:
ifconfig -s | grep "^${WAN_IF}\|^tap"
eth0 1500 0 54543 0 0 0 56841 0 0 0 BMRU
eth0:0 1500 0 - no statistics available - BMRU
tap0 1500 0 355 0 0 0 25 0 17 0 BMRU
Check the route(s):
ip route | grep tap
10.254.254.2 dev tap0 scope link
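When more than one guest is configured, checking the routes by eye gets tedious. The following sketch reads ip route output on standard input and reports which device carries the host route for a given guest address. The function name route_dev_for is made up for this example, and it assumes the output format shown in this article (address, then "dev", then the interface name).

```shell
# Read `ip route` output on stdin and print the device that carries the
# host route for the given guest address. Exit non-zero if none matches.
route_dev_for() {
    awk -v ip="$1" '$1 == ip && $2 == "dev" { print $3; found = 1 }
                    END { exit !found }'
}
```

For example: ip route | route_dev_for "$VMNET_FIRST_GUEST_IP"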
Virtual Machine Network Options
KVM/QEMU should attach to a TAP interface (a virtual Ethernet interface) as part of its start-up. Here's an example of the network options required:
-net nic,model=rtl8139,macaddr=56:44:45:30:31:32,vlan=0 -net tap,script=no,ifname=tap0,vlan=0
-net nic defines which network hardware to emulate - this is the network adaptor the VM guest 'sees'.
-net tap defines how to connect to the host's network stack.
Notice that in this example the tap interface name is defined (ifname=tap0). In scripts this would usually be omitted so that the next available tap interface is assigned (tap0, tap1, tap2, tap3...). Also, if this were automated, either a network ifup script would be given, or the script option would be omitted so that the KVM/QEMU default script is executed (usually /etc/qemu-ifup). We'll deal with that later.
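When several guests run at once, each needs its own tap interface and a distinct MAC address. As an illustration only, the sketch below generates the option string for guest number N. The function name net_opts and the MAC numbering scheme (the example address with N added to its last byte) are assumptions made here, not something KVM/QEMU requires; any scheme that yields unique addresses would do.

```shell
# Print the -net options for guest number N (0-based), attached to tapN.
# The MAC scheme is an assumption for illustration: the example address
# 56:44:45:30:31:32 with N added to its last byte. That keeps N below 206
# so the byte does not overflow.
net_opts() (
    n=$1
    last=$(printf '%02x' $(( 0x32 + n )))
    echo "-net nic,model=rtl8139,macaddr=56:44:45:30:31:${last},vlan=0 -net tap,script=no,ifname=tap${n},vlan=0"
)
```

Calling net_opts 0 reproduces the options shown above; net_opts 1 gives the equivalent for tap1.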
Once the VM guest has started, set the static IP address (10.254.254.2, mask 255.255.255.0), gateway (10.254.254.1 - the IP of eth0:0), and preferred DNS server (preferably one on the LAN). Later, we'll do all this automatically using DHCP.
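How the static settings are applied depends on the guest operating system. For a Linux guest with the iproute2 tools, the steps might look like the output of the sketch below, which only prints the commands rather than running them. Everything here is an assumption for illustration: the helper names, the guest interface name eth0, and the DNS server address, which is passed in as an argument. The netmask conversion is needed because ip addr expects a prefix length; it does not check that the mask bits are contiguous.

```shell
# Convert a dotted-quad netmask to a prefix length (255.255.255.0 -> 24).
# Each octet is mapped to its number of set bits; contiguity across
# octets is not verified.
mask_to_prefix() (
    IFS=.
    bits=0
    for octet in $1; do
        case $octet in
            255) bits=$(( bits + 8 )) ;;
            254) bits=$(( bits + 7 )) ;;
            252) bits=$(( bits + 6 )) ;;
            248) bits=$(( bits + 5 )) ;;
            240) bits=$(( bits + 4 )) ;;
            224) bits=$(( bits + 3 )) ;;
            192) bits=$(( bits + 2 )) ;;
            128) bits=$(( bits + 1 )) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; exit 1 ;;
        esac
    done
    echo "$bits"
)

# Print (not run) the commands for a static guest configuration.
# Arguments: ip netmask gateway dns [interface, default eth0]
guest_static_cmds() {
    echo "ip addr add $1/$(mask_to_prefix "$2") dev ${5:-eth0}"
    echo "ip link set ${5:-eth0} up"
    echo "ip route add default via $3"
    echo "echo 'nameserver $4' > /etc/resolv.conf"
}
```

The printed commands would be run as root inside the guest, after replacing the DNS argument with a server that is reachable from the guest.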
Now test network connectivity from host to VM guest, guest to host, guest DNS name resolving, and guest to Internet.
DHCP Service
dnsmasq's --interface option does not accept alias interfaces (eth0:0), so it must be pointed at a real interface or told to listen on a specific address (--listen-address). The example here binds it to tap0:
sudo /usr/sbin/dnsmasq --interface=tap0 --except-interface=lo --bind-interfaces --user=nobody \
  --dhcp-range=vmnet,10.254.254.2,10.254.254.253,255.255.255.0,10.254.254.255,8h \
  --domain=lan.tjworld.net --pid-file=/var/run/vmnet_dnsmasq.pid --conf-file
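The addresses in the --dhcp-range value repeat what the environment variables from the host configuration already hold, so the value could be derived from them instead of typed by hand. The sketch below is one way to do that. The function name dhcp_range is invented here, and the rule for the last address (broadcast minus 2 in the final octet) simply mirrors the .253 used in the command above rather than anything dnsmasq mandates.

```shell
# Print a value for --dhcp-range in the same field order as the command
# above: tag, first address, last address, netmask, broadcast, lease time.
# Arguments: first-address netmask broadcast [lease, default 8h]
dhcp_range() (
    first=$1 mask=$2 bcast=$3 lease=${4:-8h}
    # Last address: the broadcast address with its final octet reduced
    # by 2, matching the example (.255 -> .253).
    prefix=${bcast%.*}
    last_octet=${bcast##*.}
    echo "vmnet,${first},${prefix}.$(( last_octet - 2 )),${mask},${bcast},${lease}"
)
```

It could then be used as: --dhcp-range="$(dhcp_range "$VMNET_FIRST_GUEST_IP" "$VMNET_NETMASK" "$VMNET_BROADCAST")"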