- VXLAN is essentially a tunnelling technology that establishes a logical tunnel over an IP network between the source and destination networks.
- VXLAN is a Layer 2 overlay network over a Layer 3 underlay network.
- An overlay network is a virtual network that runs on top of a physical underlay network (the transport network).
- VXLAN defines a MAC-in-UDP encapsulation scheme: a VXLAN header is added to the original Layer 2 frame, and the result is placed in a UDP/IP packet.
VXLAN Tunnel Technology
Benefits of VXLAN:
- Live migration of VMs across servers.
- Higher scalability.
- Smaller MAC tables.
- Multitenant environments.
- Improved network utilization.
Live Migration of VMs:
- Live migration is the movement of a virtual machine from one physical server to another while the services deployed on the VM keep running.
- End users are unaware of the process, so administrators can flexibly allocate server resources or maintain and upgrade physical servers without affecting normal use.
- To ensure service continuity during migration, the IP address of the VM must remain unchanged.
- Therefore, a VM can be migrated dynamically only within a Layer 2 domain, not across Layer 2 domains.
- VXLAN removes this restriction by establishing a Layer 2 virtual network over Layer 3 networks.
- VXLAN encapsulates the original packets sent by VMs and carries them over a VXLAN tunnel.
- In this way, VMs using IP addresses in the same network segment are logically in one Layer 2 domain, even if they are on different physical Layer 2 networks.
- VXLAN virtualizes the entire data center network into one large Layer 2 virtual switch. When a VM is migrated from one port of this virtual switch to another, its IP address does not need to change.
- This makes live migration of VMs from one server to another possible.
Smooth VM migration using VXLAN
Higher Scalability:
- VXLAN has a 24-bit identifier space, which allows up to about 16 million virtual networks. This is far larger than the 12-bit identifier space of VLAN.
- A 12-bit VLAN ID is used in data frames to divide a large Layer 2 network into multiple broadcast domains. This works well for data centers that need fewer than 4096 VLANs.
- VXLAN is typically deployed in data centers that spread over multiple racks, where each server can host multiple VMs with different IP and MAC addresses.
- The individual racks may be parts of different Layer 3 networks, or they could all be in a single Layer 2 network.
- VXLAN can therefore support 16 million network segments for large networks such as clouds, which typically include many virtual machines.
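The sizes quoted above follow directly from the widths of the two identifier fields. A minimal sketch of the arithmetic (the variable names are only illustrative):

```python
VLAN_ID_BITS = 12  # width of the VLAN ID field
VNI_BITS = 24      # width of the VXLAN Network Identifier field

vlan_ids = 2 ** VLAN_ID_BITS
vni_ids = 2 ** VNI_BITS

print(f"VLAN IDs: {vlan_ids}")          # 4096
print(f"VNIs:     {vni_ids}")           # 16777216
print(f"Ratio:    {vni_ids // vlan_ids}x")
```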
Smaller MAC table:
- Since 16 million segments can be created with VXLAN, the number of hosts per segment can be kept very small.
- As a result, the MAC address table for each segment remains small, which keeps MAC table lookups fast.
Improved Network Utilization:
- VXLAN overcomes a drawback of the Spanning Tree Protocol (STP), which VLAN-based networks use to avoid loops in a Layer 2 network.
- To avoid loops, STP blocks redundant links, so the available paths and bandwidth are not fully utilized.
- VXLAN packets are forwarded through the underlay network based on their Layer 3 header, so they can take full advantage of Layer 3 routing, equal-cost multipath (ECMP) routing, and link aggregation protocols to use all available paths.
- Network utilization is therefore much higher.
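ECMP in the underlay spreads traffic per flow, usually by hashing the outer headers. RFC 7348 recommends deriving the outer UDP source port from a hash of inner-frame fields so that different inner flows between the same pair of VTEPs can take different paths. The sketch below illustrates the idea only: the function name `vxlan_source_port`, the use of `zlib.crc32` as the hash, and the choice to hash just the inner MAC addresses are assumptions, not what any particular VTEP implements.

```python
import zlib

PORT_MIN, PORT_MAX = 49152, 65535  # dynamic/private port range

def vxlan_source_port(inner_frame: bytes) -> int:
    """Derive the outer UDP source port from the inner MAC addresses (illustrative)."""
    flow_key = inner_frame[:12]  # inner destination MAC + inner source MAC
    return PORT_MIN + zlib.crc32(flow_key) % (PORT_MAX - PORT_MIN + 1)

# Two frames of the same inner flow (same MAC pair, different payloads)
macs = bytes.fromhex("00020406080d" "00020406080c")
port_1 = vxlan_source_port(macs + b"\x08\x00first packet")
port_2 = vxlan_source_port(macs + b"\x08\x00second packet")
print(port_1 == port_2)  # True: one flow always maps to one port, so it stays on one path
```

Keeping the port stable per flow avoids packet reordering, while different flows hash to different ports and can be spread across the equal-cost paths.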
Multitenant Environments:
- In Layer 2 networks, VLANs are often used to segregate traffic, so that each tenant can be identified by its own VLAN ID.
- The limit of 4096 VLANs becomes inadequate when service must be provided to a large number of tenants. The problem gets worse when each tenant needs multiple VLANs.
- For example, with 500 customers, each customer can have at most 8 VLANs (4096 / 500, rounded down).
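The per-tenant budget in this example is a simple integer division. The sketch below repeats it for VNIs to show the difference in headroom; the helper name `segments_per_tenant` is hypothetical.

```python
def segments_per_tenant(id_bits: int, tenants: int) -> int:
    """Segments each tenant can get if the ID space is split evenly."""
    return (2 ** id_bits) // tenants

tenants = 500
print(segments_per_tenant(12, tenants))  # VLAN: 8 per tenant
print(segments_per_tenant(24, tenants))  # VXLAN: 33554 per tenant
```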
VXLAN Header Format:
- The VXLAN header is 8 bytes long.
- It consists of four fields:
- VXLAN Flags: 1 byte. If the I bit is 1, the VXLAN ID is valid. If the I bit is 0, the VXLAN ID is invalid. All other bits are reserved and set to 0.
- Reserved: 3 bytes.
- VNI: 3 bytes, used to identify the VXLAN segment.
- Reserved: 1 byte.
VXLAN Header:

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|R|R|R|I|R|R|R|            Reserved                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                VXLAN Network Identifier (VNI) |   Reserved    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
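To make the layout concrete, here is a minimal sketch that builds and parses the 8-byte header with Python's `struct` module. The function names are illustrative and not part of any library.

```python
import struct

VXLAN_FLAG_I = 0x08  # I bit: set when the VNI field is valid

def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)

def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI, or raise if the I flag is not set."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set: VNI is invalid")
    return word2 >> 8

header = pack_vxlan_header(5000)
print(header.hex())                 # 0800000000138800
print(unpack_vxlan_header(header))  # 5000
```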
VXLAN Encapsulation and Packet Format:
- VXLAN uses MAC Address-in-User Datagram Protocol (MAC-in-UDP) encapsulation to extend a Layer 2 network over a Layer 3 network: a VXLAN header is added to the original Layer 2 frame, and the result is placed in a UDP/IP packet.
- VXLAN encapsulation adds 50 bytes to the original Ethernet frame: 14 bytes of outer Ethernet header, 20 bytes of outer IP header, 8 bytes of UDP header, and 8 bytes of VXLAN header.
- The destination port number in the UDP header for VXLAN is fixed at 4789.
VXLAN Packet Format
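The 50-byte overhead matters when sizing the underlay MTU. The following sketch adds up the headers and estimates the IP MTU the underlay needs so that a full-size inner frame is not fragmented. It assumes an outer IPv4 header without options and untagged Ethernet headers; the helper name `underlay_ip_mtu` is hypothetical.

```python
OUTER_ETHERNET = 14  # outer MAC header, no 802.1Q tag
OUTER_IPV4 = 20      # outer IPv4 header without options
OUTER_UDP = 8
VXLAN_HEADER = 8

overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
print(overhead)  # 50

def underlay_ip_mtu(inner_ip_mtu: int, inner_ethernet: int = 14) -> int:
    """IP MTU the underlay needs to carry the inner frame unfragmented."""
    inner_frame = inner_ip_mtu + inner_ethernet
    # The outer Ethernet header is not counted against the IP MTU
    return inner_frame + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4

print(underlay_ip_mtu(1500))  # 1550
```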
VXLAN Tunnel Establishment:
There are three entities which need to be known before moving ahead:
- VXLAN Tunnel Endpoint (VTEP)
- VXLAN Network Identifier (VNI)
- Bridge Domain (BD)
VXLAN Tunnel Endpoint (VTEP):
- It is an edge device on a VXLAN network.
- It is the start and end point of a VXLAN tunnel.
- It is responsible for encapsulating the original data frames sent by the source server into VXLAN packets and transmitting them to the destination VTEP over the IP network.
- The destination VTEP then decapsulates the VXLAN packets into the original data frames and forwards the frames to the destination server.
VXLAN Network Identifier (VNI):
- It is 24 bits long and identifies the VXLAN segment.
- It is the VXLAN network ID.
- It plays the same role as a VLAN ID.
Bridge Domain (BD):
- Each bridge domain represents a Layer 2 broadcast domain.
- It is a set of logical ports that belong to the same broadcast domain.
- It can span one or more ports on multiple devices.
- Each BD is identified by a VNI.
- There is a 1:1 mapping between BDs and VNIs.
There are a few questions to consider:
- Which packets enter the VXLAN tunnel?
- Which BD does a packet belong to?
- How is a VXLAN tunnel established?
- Which VXLAN tunnel should a packet enter?
- How are packets forwarded on a VXLAN network?
Which packets enter the VXLAN tunnel?
In general, when a packet reaches an interface, the interface has to decide two things:
- Which packets are allowed to pass through, based on the configuration.
- How to process a packet that is allowed to pass through.
The same applies to VXLAN: VTEP interfaces have similar responsibilities. These interfaces are logical Layer 2 sub-interfaces.
A default sub-interface accepts all packets. During VXLAN encapsulation and decapsulation it performs no VLAN tag action on the original packets: it does not add, replace, or remove VLAN tags.
The VXLAN sub-interface is added to a BD. In the following example, sub-interface eth1.1 accepts packets with VLAN tag 10 and is added to BD 10:

    interface eth1.1                 /* Create Layer 2 sub-interface eth1.1 */
     encapsulation dot1q vid 10      /* Packets with VLAN tag 10 enter the VXLAN tunnel */
     bridge-domain 10                /* Add eth1.1 to BD 10 */
Which BD does a packet belong to?
- This is determined by configuration: a BD is created and a VNI is mapped to it.
- This can be done on the VTEP using the following command (general form first, then an example):

    bridge-domain <domain instance> vxlan vni <vni id>
    bridge-domain 10 vxlan vni 5000

- Bridge domain 10 is now identified by VNI 5000.
- The BD to which a packet belongs is then determined from the Layer 2 sub-interface configuration, that is, from the BD the receiving sub-interface was added to.
- Once this mapping table exists, the VTEP can add the correct VNI to each incoming packet based on the BD to which the packet belongs.
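The two lookups described above (sub-interface to BD, then BD to VNI) can be modelled with two small tables. This is only an illustrative sketch: the table names and the `classify` helper are hypothetical, and the entries mirror the eth1.1 / VLAN 10 / BD 10 / VNI 5000 example.

```python
# (physical port, VLAN tag) -> Layer 2 sub-interface that accepts the frame
SUBINTERFACES = {("eth1", 10): "eth1.1"}
# sub-interface -> bridge domain it was added to
SUBINTERFACE_TO_BD = {"eth1.1": 10}
# bridge domain -> VNI (1:1 mapping)
BD_TO_VNI = {10: 5000}

def classify(port: str, vlan_tag: int):
    """Return the VNI for an incoming frame, or None if it may not enter a tunnel."""
    subinterface = SUBINTERFACES.get((port, vlan_tag))
    if subinterface is None:
        return None  # no sub-interface accepts this frame
    bd = SUBINTERFACE_TO_BD[subinterface]
    return BD_TO_VNI[bd]

print(classify("eth1", 10))  # 5000
print(classify("eth1", 20))  # None
```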
How is a VXLAN tunnel established?
There are typically two ways to establish a VXLAN tunnel: manual and automatic.
Manual Establishment of Tunnel:
- This method requires the user to manually set the source and destination IP addresses of the tunnel, that is, the IP addresses of the local and remote VTEPs respectively.
- A static VXLAN tunnel is thus established manually between the local and remote VTEPs.
- Several tunnels can originate from the same source VTEP and terminate at different destination VTEPs.
- This is handled by configuring each VTEP with the source IP address, the tunnel destination IP address, and the VNI.
- A peer list is formed from the tunnel endpoint addresses and the VNI.

    interface eth2.1                              /* Create logical interface eth2.1 */
     source 126.96.36.199                         /* Configure the IP address of the source VTEP */
     vni 5000 head-end peer-list 188.8.131.52
     vni 5000 head-end peer-list 184.108.40.206

With the manual configuration above, two tunnels are established, both with source 126.96.36.199, one to destination 188.8.131.52 and one to destination 184.108.40.206.
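A head-end peer list is essentially a per-VNI list of remote VTEP addresses, and each entry corresponds to one static tunnel. The sketch below models this. The class `StaticVtep` is hypothetical, and the addresses are placeholders from the 192.0.2.0/24 documentation range, not the addresses used in the configuration above.

```python
class StaticVtep:
    """Minimal model of a VTEP with a manually configured peer list."""

    def __init__(self, source_ip: str):
        self.source_ip = source_ip
        self.peer_list = {}  # VNI -> list of remote VTEP IPs

    def add_peer(self, vni: int, remote_ip: str) -> None:
        peers = self.peer_list.setdefault(vni, [])
        if remote_ip not in peers:
            peers.append(remote_ip)

    def tunnels(self, vni: int):
        """Each (source, destination) pair is one static VXLAN tunnel."""
        return [(self.source_ip, remote_ip) for remote_ip in self.peer_list.get(vni, [])]

vtep_1 = StaticVtep("192.0.2.1")
vtep_1.add_peer(5000, "192.0.2.2")
vtep_1.add_peer(5000, "192.0.2.3")
print(vtep_1.tunnels(5000))
```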
Automatic Establishment of a VXLAN Tunnel:
This method depends on Ethernet Virtual Private Network (EVPN). It is not covered for now.
Which Tunnel Should a Packet Enter?
- There may be more than one VXLAN tunnel belonging to the same Bridge Domain (BD).
- For example, in the peer list above, the source 126.96.36.199 has two peers, 188.8.131.52 and 184.108.40.206, in the same VNI (5000).
- So when a packet arrives at VTEP-1 (126.96.36.199), which tunnel should it enter? The candidates are the tunnels with:
Destination IP Address: 188.8.131.52
Destination IP Address: 184.108.40.206
- In basic Layer 2 and Layer 3 forwarding, Layer 2 forwarding relies on the MAC address table and Layer 3 forwarding relies on the Forwarding Information Base (FIB) table.
- Here, MAC-based learning is used: the VTEP looks up the destination MAC address in its MAC table to pick the appropriate tunnel.
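As a rough illustration of this MAC-based choice, the sketch below returns a single tunnel when the destination MAC has already been learned and falls back to every peer in the VNI otherwise (broadcast or unknown unicast). The function name, table layout, and placeholder addresses are assumptions made for the example.

```python
def select_tunnels(mac_table, peer_list, vni, dst_mac):
    """Return the remote VTEP IPs that should receive the frame."""
    remote_vtep = mac_table.get((vni, dst_mac.lower()))
    if remote_vtep is not None:
        return [remote_vtep]              # known unicast: exactly one tunnel
    return list(peer_list.get(vni, []))   # broadcast/unknown: replicate to all peers

# Placeholder data: one learned MAC behind 192.0.2.3, two peers in VNI 5000
peer_list = {5000: ["192.0.2.2", "192.0.2.3"]}
mac_table = {(5000, "00:02:04:06:08:0d"): "192.0.2.3"}

print(select_tunnels(mac_table, peer_list, 5000, "00:02:04:06:08:0D"))
print(select_tunnels(mac_table, peer_list, 5000, "ff:ff:ff:ff:ff:ff"))
```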
How Are Packets Forwarded on a VXLAN Network?
- A frame arrives on a switch port from a host. This port is a regular untagged (access) port, which assigns a VLAN to the traffic.
- The switch determines that the frame needs to be forwarded to another location. The remote switch is reached over an IP network and may be close by or many hops away.
- The VLAN is associated with a VNI, so a VXLAN header is applied. The VTEP then encapsulates the traffic in UDP and IP headers, using UDP destination port 4789, and sends it over the IP network.
- The remote switch receives the packet and decapsulates it, leaving a regular Layer 2 frame with a VLAN ID.
- The switch selects an egress port for the frame based on a normal MAC lookup. The rest of the process is ordinary Layer 2 forwarding.
MAC Address Learning:
This section uses centralized VXLAN gateway networking, where VXLAN tunnels are established manually and MAC address learning is used to forward packets.
Two cases are considered:
- Intra-Subnet Communication in Centralized VXLAN Gateway Networking
- Inter-Subnet Communication in Centralized VXLAN Gateway Networking
Intra-Subnet Communication in Centralized VXLAN Gateway Networking
- This section assumes that all the VMs are on the same subnet, 10.1.1.0/24, and belong to the same VNI, 5000.
- VM_A wishes to communicate with VM_C.
- Since this is its first communication with VM_C, VM_A does not have VM_C's MAC address.
- It therefore sends a broadcast ARP request to obtain that MAC address.
Communication between VMs on same subnet
- VM_A creates an ARP request in which the source MAC address is MAC_A, the destination MAC address is the broadcast address (all Fs), the source IP address is IP_A, and the destination IP address is IP_C.
- When the ARP request reaches VTEP_1, VTEP_1 records MAC_A, VNI 5000, and the inbound interface in its MAC table.
- VTEP_1 encapsulates the ARP request with a VXLAN header for VTEP_2: the UDP destination port is 4789, the outer source IP address is IP_1, the outer destination IP address is IP_2, the outer source MAC address is MAC_1, and the outer destination MAC address is that of the next hop towards the destination.
- VTEP_1 sends a second copy to VTEP_3 with the same fields, except that the outer destination IP address is IP_3.
- When the encapsulated packets reach VTEP_2 and VTEP_3, each VTEP decapsulates its copy to obtain the original packet and records MAC_A, VNI 5000, and the IP address of VTEP_1 (IP_1) in its MAC table.
- The ARP request is then delivered to VM_B and VM_C, which check whether the destination IP address of the request is their own IP address.
- VM_B discards the request because its address does not match. VM_C's address matches, so it sends a unicast ARP reply.
- In the ARP reply, the source IP address is IP_C, the destination IP address is IP_A, the source MAC address is MAC_C, and the destination MAC address is MAC_A.
- When VTEP_3 receives the ARP reply from VM_C, it learns the entry (MAC_C, VNI 5000, inbound interface) and updates its MAC table.
- VTEP_3 then encapsulates the ARP reply: the outer source IP address is IP_3, the outer destination IP address is IP_1, the outer source MAC address is MAC_3, and the outer destination MAC address is that of the next hop towards the destination. The packet is forwarded across the IP network according to the outer IP and MAC addresses until it arrives at the remote VTEP (VTEP_1).
- When the encapsulated packet reaches VTEP_1, it is decapsulated to obtain the original packet.
- VTEP_1 learns the entry (MAC_C, VNI 5000, IP_3), updates its MAC table, and sends the original packet to VM_A.
VM_A and VM_C have now learned each other's MAC addresses and communicate in unicast mode from here on.
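The learning steps above can be replayed with a small simulation. This is only a sketch under simplifying assumptions: the class `LearningVtep` and its method names are hypothetical, symbolic names such as "MAC_A" and "IP_1" stand in for real addresses, and delivery to local VMs is not modelled.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningVtep:
    def __init__(self, name, ip):
        self.name, self.ip = name, ip
        self.peers = []      # remote VTEPs in the same VNI (the peer list)
        self.mac_table = {}  # (vni, mac) -> local port or remote VTEP IP

    def from_host(self, vni, src_mac, dst_mac, port):
        """A locally attached VM sent a frame."""
        self.mac_table[(vni, src_mac)] = port  # learn the local MAC
        target = self.mac_table.get((vni, dst_mac))
        if dst_mac == BROADCAST or target is None:
            receivers = self.peers             # flood to every peer
        else:
            receivers = [p for p in self.peers if p.ip == target]
        for peer in receivers:
            peer.from_tunnel(vni, src_mac, self.ip)

    def from_tunnel(self, vni, src_mac, remote_ip):
        """A VXLAN packet arrived; learn the inner source MAC against the sender."""
        self.mac_table[(vni, src_mac)] = remote_ip

vtep_1 = LearningVtep("VTEP_1", "IP_1")
vtep_2 = LearningVtep("VTEP_2", "IP_2")
vtep_3 = LearningVtep("VTEP_3", "IP_3")
vtep_1.peers = [vtep_2, vtep_3]
vtep_2.peers = [vtep_1, vtep_3]
vtep_3.peers = [vtep_1, vtep_2]

vtep_1.from_host(5000, "MAC_A", BROADCAST, "port_a")  # ARP request from VM_A
vtep_3.from_host(5000, "MAC_C", "MAC_A", "port_c")    # unicast ARP reply from VM_C

for vtep in (vtep_1, vtep_2, vtep_3):
    print(vtep.name, vtep.mac_table)
```

After the exchange, VTEP_1 and VTEP_3 hold entries for both MAC_A and MAC_C, while VTEP_2 has only learned MAC_A, because the unicast reply never passes through it.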
Inter-Subnet Communication in Centralized VXLAN Gateway Networking
- In this case the two VMs are on different subnets and therefore belong to different VNIs.
- VM_A is on subnet 10.1.10.0/24 and belongs to VNI 5000; VM_B is on subnet 10.1.20.0/24 and belongs to VNI 6000.
- The Layer 3 gateway interfaces have the addresses 10.1.10.1 and 10.1.20.10 and the MAC addresses MAC_10 and MAC_20 on the VM_A and VM_B sides respectively.
- The gateway has routes for reaching both subnets.
- The required address entries are assumed to be learned already: each VM knows the MAC address of its gateway interface, and the gateway has ARP entries for VM_A and VM_B.
- VM_A wishes to communicate with VM_B.
Communication between VMs on different subnets
Packet Forwarding between VMs on Different Subnets:
- VM_A sends a packet to VM_B with source IP address IP_A, destination IP address IP_B, source MAC address MAC_A, and destination MAC address MAC_10 (the gateway interface).
- When the packet reaches VTEP_1, it is encapsulated: the VNI is 5000, the UDP destination port is 4789, the outer source IP address is IP_1, the outer destination IP address is IP_3, the outer source MAC address is MAC_1, and the outer destination MAC address is that of the next hop.
- When the packet reaches the gateway (the VTEP with address IP_3), it is decapsulated to recover the original packet from VM_A. The gateway then looks up its routing table and finds that the destination (IP_B = 10.1.20.1) is reachable through its interface 10.1.20.10.
- The gateway then looks up its ARP table to get the MAC address for IP_B.
- The inner packet is rewritten so that the source IP address is still IP_A, the destination IP address is IP_B, the source MAC address is the gateway interface MAC_20, and the destination MAC address is MAC_B (from the ARP table).
- This packet is encapsulated again with VNI 6000 and UDP destination port 4789; the outer source IP address is IP_3, the outer destination IP address is IP_2, the outer source MAC address is MAC_3, and the outer destination MAC address is that of the next hop.
- After VXLAN encapsulation, the packet is forwarded across the IP network according to the outer IP and MAC addresses until it arrives at the remote VTEP.
- When the packet reaches VTEP_2, VTEP_2 decapsulates it to obtain the inner packet and sends it to VM_B.
- The reply follows the same process in the opposite direction.
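The gateway's part of this walk-through (route lookup, ARP lookup, MAC rewrite, re-encapsulation in the other VNI) can be sketched as follows. The table contents are assumptions chosen to match the example: VM_A's subnet is taken to be 10.1.10.0/24 so that the gateway address 10.1.10.1 lies inside it, "10.1.10.2" is a made-up address for IP_A, and `route_at_gateway` is a hypothetical helper.

```python
import ipaddress

# Gateway interfaces: directly connected subnet -> (VNI, gateway interface MAC)
GATEWAY_INTERFACES = {
    ipaddress.ip_network("10.1.10.0/24"): (5000, "MAC_10"),
    ipaddress.ip_network("10.1.20.0/24"): (6000, "MAC_20"),
}
ARP_TABLE = {"10.1.20.1": "MAC_B"}        # host IP -> host MAC
MAC_TABLE = {(6000, "MAC_B"): "IP_2"}     # (VNI, MAC) -> remote VTEP IP
GATEWAY_VTEP_IP = "IP_3"

def route_at_gateway(inner: dict) -> dict:
    """Route a decapsulated packet and describe the new VXLAN encapsulation."""
    dst = ipaddress.ip_address(inner["dst_ip"])
    for subnet, (vni, gateway_mac) in GATEWAY_INTERFACES.items():
        if dst in subnet:
            dst_mac = ARP_TABLE[inner["dst_ip"]]
            # IP addresses stay the same; only the inner MAC addresses are rewritten
            rewritten = dict(inner, src_mac=gateway_mac, dst_mac=dst_mac)
            return {
                "vni": vni,
                "outer_src_ip": GATEWAY_VTEP_IP,
                "outer_dst_ip": MAC_TABLE[(vni, dst_mac)],
                "udp_dport": 4789,
                "inner": rewritten,
            }
    raise LookupError("no route to " + inner["dst_ip"])

packet_from_vm_a = {"src_ip": "10.1.10.2", "dst_ip": "10.1.20.1",
                    "src_mac": "MAC_A", "dst_mac": "MAC_10"}
result = route_at_gateway(packet_from_vm_a)
print(result)
```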
Create a VXLAN Packet with Scapy:
Scapy is a powerful Python-based interactive packet manipulation program and library. It can forge or decode packets, send them on the wire, capture them, and match requests and replies.

    # Packages
    from scapy.all import Ether, IP, UDP
    from scapy.layers.vxlan import VXLAN
    from scapy.packet import Raw
    from scapy.contrib.gtp import GTP_U_Header

    # Packet parameters
    dest_ip, src_ip = '126.96.36.199', '188.8.131.52'
    sport = dport = 4789
    inner_src_mac, inner_dst_mac = '00:02:04:06:08:0c', '00:02:04:06:08:0d'
    inner_src_ip, inner_dest_ip = '10.10.1.2', '184.108.40.206'
    inner_sport, inner_dport = 1000, 2000

    # Original (inner) packet
    org_packet = (Ether(dst=inner_dst_mac, src=inner_src_mac) /
                  IP(dst=inner_dest_ip, src=inner_src_ip) /
                  UDP(sport=inner_sport, dport=inner_dport) /
                  'VXLAN packet')


    def encapsulate(org_packet):
        """Encapsulate the original packet within a VXLAN packet."""
        vxlan_packet = (Ether() /
                        IP(dst=dest_ip, src=src_ip) /
                        UDP(sport=sport, dport=dport) /
                        VXLAN(vni=110, flags="Instance"))
        return vxlan_packet / org_packet


    def test_encapsulation(vxlan_packet):
        """Test whether the encapsulation is valid: the I flag (0x08) must be set."""
        assert vxlan_packet[VXLAN].flags == 0x08


    def decapsulation(vxlan_packet):
        """Decapsulate the VXLAN packet and return the inner frame."""
        return vxlan_packet[VXLAN].payload


    # Encapsulate a GTP tunnel within VXLAN traffic
    vxlan_req_pkt = (Ether(dst=inner_dst_mac) /
                     IP(dst=dest_ip, src=src_ip) /
                     UDP(sport=sport, dport=dport) /
                     VXLAN(vni=110, flags="Instance"))
    inner_req = (IP(src=inner_src_ip, dst=inner_dest_ip) /
                 UDP(dport=inner_dport, sport=inner_sport) /
                 "DATA")
    innerraw_req = bytes(inner_req)       # raw bytes of the innermost packet
    exthdr = Raw(b'\x01\x10\x01\x00')     # len + data + next exthdr
    encap_packet = (vxlan_req_pkt /
                    Ether(dst='80:20:30:40:50:60', src='80:60:70:80:90:20') /
                    IP(src='10.0.0.2', dst='10.0.0.1') /
                    UDP(dport=2152, sport=2152) /
                    GTP_U_Header(version=1, gtp_type=0xff, PT=1, E=1,
                                 length=4 + len(exthdr) + len(innerraw_req),
                                 teid=98765432, next_ex=133) /
                    exthdr / innerraw_req)