Link aggregation or IEEE 802.1AX-2008 is a computer networking term which describes using multiple network cables/ports in parallel to increase the link speed beyond the limits of any one single cable or port, and to increase the redundancy for higher availability.
Most implementations now conform to what used to be clause 43 of IEEE 802.3-2005 Ethernet standard, usually still referred to by its working group name of "IEEE 802.3ad". The definition of link aggregation has since moved to a standalone IEEE 802.1AX standard.
Other terms for link aggregation include Ethernet trunk, NIC teaming, port channel, port teaming, port trunking, link bundling, EtherChannel, Multi-link trunking (MLT), NIC bonding, network bonding,[1] and Network Fault Tolerance (NFT).
Link Aggregation between a switch and a server
Description
Link aggregation addresses two problems with Ethernet connections: bandwidth limitations and lack of redundancy.
With regard to the first issue: bandwidth requirements do not scale linearly. Ethernet bandwidths historically have increased by an order of magnitude each generation (10 Megabit/s, 100 Mbit/s, 1000 Mbit/s, 10000 Mbit/s). If one started to bump into bandwidth ceilings, then the only option was to move to the next generation which could be cost prohibitive. An alternative solution, introduced by many of the network manufacturers in the early 1990s, is to combine two physical Ethernet links into one logical link via channel bonding. Most of these solutions required manual configuration and identical equipment on both sides of the aggregation.[2]
The second problem involves the three single points of failure in a typical port-cable-port connection. In either the usual computer-to-switch or in a switch-to-switch configuration, the cable itself or either of the ports the cable is plugged into can fail. Multiple physical connections can be made, but many of the higher level protocols were not designed to failover completely seamlessly.
IEEE Link Aggregation
Standardization process
By the mid 1990s, most network switch manufacturers had included aggregation capability as a proprietary extension to increase bandwidth between their switches. But each manufacturer developed its own method, which led to compatibility problems. The IEEE 802.3 group took up a study group to create an inter-operable link layer standard in November 1997 meeting.[2] The group quickly agreed to include an automatic configuration feature which would add in redundancy as well. This became known as "Link Aggregation Control Protocol".
Initial release 802.3ad in 2000
As of 2009[update] most gigabit channel-bonding uses the IEEE standard of Link Aggregation which was formerly clause 43 of the IEEE 802.3 standard added in March 2000 by the IEEE 802.3ad task force.[3] Nearly every network equipment manufacturer quickly adopted this joint standard over their proprietary standards.
Move to 802.1 layer in 2008
David Law noted in 2006 that certain 802.1 layers (such as 802.1X security) were positioned in the protocol stack above Link Aggregation which was defined as an 802.3 sublayer.[4] This discrepancy was resolved with formal transfer of the protocol to the 802.1 group with the publication of IEEE 802.1AX-2008 on 3 November 2008.
Link Aggregation Control Protocol
Within the IEEE specification the Link Aggregation Control Protocol (LACP) provides a method to control the bundling of several physical ports together to form a single logical channel. LACP allows a network device to negotiate an automatic bundling of links by sending LACP packets to the peer (directly connected device that also implements LACP).
Advantages over static configuration
- Failover when a link fails and there is (for example) a Media Converter between the devices which means that the peer will not see the link down. With static link aggregation the peer would continue sending traffic down the link causing it to be lost.
- The device can confirm that the configuration at the other end can handle link aggregation. With Static link aggregation a cabling or configuration mistake could go undetected and cause undesirable network behavior.[5]
Practical notes
LACP works by sending frames (LACPDUs) down all links that have the protocol enabled. If it finds a device on the other end of the link that also has LACP enabled, it will also independently send frames along the same links enabling the two units to detect multiple links between themselves and then combine them into a single logical link. LACP can be configured in one of two modes: active or passive. In active mode it will always send frames along the configured links. In passive mode however, it acts as "speak when spoken to", and therefore can be used as a way of controlling accidental loops (as long as the other device is in active mode). An example of this is setting a switch to passive and a server to active, this is great when a server is moved or removed because the LACP traffic will stop and the switch will not continue to use CPU and memory on an lacp channel-group that is no longer in use.[3]
Usage
Network backbone
Link aggregation offers an inexpensive way to set up a high-speed backbone network that transfers much more data than any one single port or device can deliver. Although, in the past, various vendors used proprietary techniques, the preference today is to use the IEEE standard, which can either be set up statically or by using the Link Aggregation Control Protocol (LACP). This allows several devices to communicate simultaneously at their full single-port speed while not allowing any one single device to monopolize all available backbone capacity.
The actual benefits vary based on the load-balancing method used on each device (administrators can configure different balancing algorithms at each end and this is actually encouraged to avoid path polarization).
Link aggregation also allows the network's backbone speed to grow incrementally as demand on the network increases, without having to replace everything and buy new hardware.
Most backbone installations install more cabling or fiber optic pairs than initially necessary, even if they have no immediate need for the additional cabling. This is done because labor costs are higher than the cost of the cable, and running extra cable reduces future labor costs if networking needs change. Link aggregation can allow the use of these extra cables to increase backbone speeds for little or no extra cost if ports are available.
Order of frames
When balancing traffic, network administrators often wish to avoid reordering Ethernet frames. For example, TCP suffers additional overheads when dealing with out-of-order packets. This goal is approximated by sending all frames associated with a particular session across the same link[6]. The most common implementations use L3 hashes (i.e. based on the IP address), ensuring that the same flow is always sent via the same physical link.
However, depending on the traffic, this may not provide even distribution across the links in the trunk. It effectively limits the client bandwidth in an aggregate to its single member's maximum bandwidth per session. Principally for this reason 50/50 load balancing is almost never reached in real-life implementations; around 70/30 is more usual. Advanced switches can employ an L4 hash (i.e. using TCP/UDP port numbers), which will bring the balance closer to 50/50 as different L4 flows between two hosts can make use of different physical links.
Efficiency of equipment
Aggregation becomes inefficient beyond a certain bandwidth — depending on the total number of ports on the switch equipment. A 24-port gigabit switch with two 8-gigabit trunks is using sixteen of its available ports just for the two interswitch connections, and leaves only eight of its 1-gigabit ports for other devices. This same configuration on a 48-port gigabit switch leaves 32 1-gigabit ports available, and so it is much more efficient (assuming of course that those ports are actually needed at the switch location).
When a switch utilizes 40-50% of its ports for backbone trunking, upgrading to a switch with either more ports or a higher base-operating speed may be a better option than simply adding more switches, especially if the old switch can be re-used elsewhere on a less performance-critical part of the network.
Use on network interface cards
Network interface cards (NICs) trunked together can also provide network links beyond the throughput of any one single NIC. For example, this allows a central file server to establish an aggregate 2-gigabit connection using two 1-gigabit NICs trunked together. Note the data signaling rate will still be 1Gb/s, which can be misleading depending on methodologies used to test throughput after link aggregation is employed.
Note that Microsoft Windows does not natively support link aggregation (at least up to Windows 2003).[7] However, some manufacturers provide software for aggregation on their multiport NICs at the device-driver layer. Intel, for example, has released a package for Windows called Advanced Networking Services (ANS) to bind Intel Fast Ethernet and Gigabit cards.[8] Nvidia also supports "teaming" with their Nvidia Network Access Manager/Firewall Tool.
Linux, FreeBSD, NetBSD, OpenBSD, Mac OS X, OpenSolaris, Citrix XenServer, VMware ESX, and commercial Unix distributions such as AIX implement Ethernet bonding (trunking) at a higher level, and can hence deal with NICs from different manufacturers or drivers, as long as the NIC is supported by the kernel.
Limitations
Single switch
All physical ports in the link aggregation group must reside on the same logical switch, which in most scenarios will leave a single point of failure when the physical switch to which both links are connected goes offline.
However, almost all vendors have proprietary extensions address this issue: they aggregate multiple physical switches into one logical switch. As of 2009[update], the IEEE has not yet committed resources to standardize this feature.
Same link speed
In most implementations, all the ports used in an aggregation consist of the same physical type, such as all copper ports (CAT-5E/CAT-6), all multi-mode fiber ports (SX), or all single-mode fiber ports (LX). However, all the IEEE standard requires is that the each link be full duplex and all of them have an identical speed (10, 100, 1000 or 10000 Mbps).
Many switches are PHY independent, meaning that a switch could have a mixture of copper, SX, LX, LX10 or other GBICs. While maintaining the same PHY is the usual approach. It is possible to aggregate a 1000BASESX fiber for one link and a 1000BaseLX (longer, diverse path) for the second link, but the important thing is that the speed will be 1 Gbit/s full duplex for both links. One path may have a slightly longer transit time but the standard has been engineered so this will not cause an issue.
See also
References