3   CESNET2 Backbone Network - Operations and Development

The CESNET2 backbone network was entirely reconstructed during 2003 with regard to its logical topology, technology, stability and the general availability of the operated services. We succeeded in fulfilling most of the stated goals; however, in the meantime it became obvious that interim solutions must be used in some cases. The final planned stage should be reached at the beginning of 2004. The problems were partly caused by the delayed delivery of new hardware (the Supervisor Engine Sup720 with HW support for IPv6, ordered for the OSR7609 access routers in the network core). The CESNET2 network is the first customer of Cisco Systems to deploy these new high-performance control cards in a production network.

The development and changes of the backbone network can be briefly summarised as follows:

3.1   Topology of the backbone network

The basic logical topology of the backbone network is formed by ten nodes (GigaPoPs) interconnected by data circuits with a transmission capacity of at least 1 Gbps (see Figure). The backbone circuits use GE (Gigabit Ethernet) and POS STM-16/OC-48, the latter with a transmission capacity of 2.5 Gbps. Most gigabit nodes are connected to the backbone network via two data circuits for redundancy. The optical circuits are operated in several configurations:

[Figure]

Figure 3.1: Current topology of CESNET2 backbone network

GBIC-CWDM-1550

CWDM GBICs can be used only for GE over shorter distances. We use the version for the 1550 nm wavelength, at which signal attenuation is minimal (according to the producer, the typical operating distance is around 100 km with an attenuation limit of 32 dB). They are used, for example, on the links Prague-Pardubice, Prague-Ústí nad Labem, Ústí nad Labem-Liberec and Olomouc-Zlín.

These GBICs are also deployed on the international link Brno-Bratislava where we had to use the Catalyst 3524 switch as a signal regenerator due to the total distance that had to be spanned.
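
The quoted operating distance can be sanity-checked with a simple link-budget estimate. The sketch below is illustrative only: the fibre loss of 0.25 dB/km and the 3 dB allowance for connectors and margin are assumed values, not measurements from our routes, and the function name is hypothetical.

```python
def max_reach_km(budget_db: float, fibre_loss_db_per_km: float,
                 fixed_losses_db: float = 0.0) -> float:
    """Longest span whose total attenuation still fits the optical budget."""
    if fibre_loss_db_per_km <= 0:
        raise ValueError("fibre loss must be positive")
    usable_db = max(budget_db - fixed_losses_db, 0.0)
    return usable_db / fibre_loss_db_per_km


# 32 dB is the attenuation limit quoted for the 1550 nm CWDM GBIC;
# the per-kilometre loss and the fixed losses are assumptions.
reach = max_reach_km(budget_db=32.0, fibre_loss_db_per_km=0.25,
                     fixed_losses_db=3.0)
print(f"estimated reach: {reach:.0f} km")
```

Under these assumptions the estimate lands slightly above the 100 km typical distance stated by the producer; older or heavily spliced fibre would shorten it.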

Optical EDFA amplifiers

Following the extensive tests performed by the project Optical Networks and their Development (see below), we started deploying EDFA amplifiers on the majority of our long-haul optical links (Prague-Pilsen, Pilsen-České Budějovice, České Budějovice-Brno, Brno-Ostrava etc.). In all these cases, active elements are located only at the ends of the routes, hence the name "Nothing in Line" (NIL). Furthermore, the optical routes are protocol transparent.

Optical SDH regenerators STM-16/OC-48

We used Cisco ONS15104 optical regenerators for the first optical routes of the backbone network (for example Prague-Brno). The regenerators perform a complete opto-electro-optic regeneration according to ITU SONET/SDH standards (with a signal delay of approximately 20 µs) on the line and section layer levels. They use a 192 kbps SDCC channel for in-band management (telnet/ssh access and SNMP). The disadvantage of the regenerators is their relatively short operating distance (approximately 80 km), so they must be deployed at intermediate points along an optical route (for example, we used three of them on the 320 km Prague-Brno route). Another drawback is their protocol dependence (they support only SONET/SDH).
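
The regenerator count quoted for the Prague-Brno route follows from dividing the route into spans no longer than the operating distance. The following minimal sketch assumes spans of equal length and terminal equipment at both ends of the route; the function name is hypothetical.

```python
import math


def regenerators_needed(route_km: float, reach_km: float) -> int:
    """Intermediate regenerators needed so that no span exceeds reach_km."""
    if route_km <= 0 or reach_km <= 0:
        raise ValueError("distances must be positive")
    spans = math.ceil(route_km / reach_km)
    return max(spans - 1, 0)


count = regenerators_needed(320, 80)  # Prague-Brno figures from the text
added_delay_us = count * 20           # roughly 20 microseconds per regenerator
print(count, added_delay_us)
```

The delay added by regeneration is therefore in the tens of microseconds, small compared with the propagation delay of the fibre itself.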

From an operational point of view, the regenerators are well suited to a production environment because of their management features, flawless function and reliability - we have not experienced any problem with them since they were deployed (in 1999). We expect to build a new Prague-Brno route using DWDM, which will replace the current technology.

The transition of all backbone links to the dark optical fibres will be finished by the end of January 2004. We were sometimes forced to adapt the network topology to the availability of the optical routes between the individual nodes. Some planned routes could not be implemented because of prohibitive cost (due to long distances), and in some cases it turned out that a particular optical route passes through the location of one of our nodes and thus it made more sense to split the route into two (for example, the route from Ústí nad Labem to Pilsen crosses Prague and we thus split it up into two routes, both terminated in our Prague PoP).

The new routes to be implemented are Hradec Králové-Olomouc and Olomouc-Ostrava. The latter will replace the current route Hradec Králové-Ostrava. Before the transition to leased optical fibres can be completed, some issues remain to be solved, such as delayed deliveries of optical amplifiers and technical problems with certain equipment.

The basic transmission protocol of the backbone network is IP/MPLS. We use OSPFv2 as the IGP of the MPLS network. The logical topology of the network is divided into two functional levels to which the topology of individual GigaPoPs is adapted:

Network Core (Core Backbone Area)

The network core itself is formed by GSR12016 routers (red color in the Figure), on which all backbone data circuits are terminated. They perform only MPLS functions (except for IP multicast) and are transparent from the viewpoint of unicast IP. The IOS version used (12.0(21)S7) supports only TDP, not LDP. The switch-over to backup circuits is controlled by the OSPF protocol (we reduced the timing parameters to achieve faster convergence, for example the OSPF Hello interval to 1 s). The type of HW interface we use and the transition to GE do not allow us to use the more efficient and faster re-routing mechanisms (Fast Reroute or DPT), whose re-routing time is around 50 ms.
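
To put the two time scales side by side, the sketch below estimates how long OSPF may need just to notice a silent neighbour. It assumes the dead interval is four times the Hello interval (a common default, not a value stated in this report) and ignores LSA flooding and SPF computation, so real convergence takes longer.

```python
def worst_case_detection_s(hello_s: float, dead_multiplier: int = 4) -> float:
    """Time until a silent neighbour is declared down (the dead interval)."""
    return hello_s * dead_multiplier


default = worst_case_detection_s(10)  # common default Hello of 10 s
tuned = worst_case_detection_s(1)     # Hello reduced to 1 s, as described above
frr_target_s = 0.050                  # about 50 ms quoted for Fast Reroute or DPT
print(default, tuned, round(tuned / frr_target_s))
```

Even with the reduced timers, detection by Hello timeout remains roughly two orders of magnitude slower than the 50 ms mechanisms; failures signalled directly by the interface (loss of signal) are detected faster than this estimate.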

Access network part (GigaPoP Area)

On the edge of our backbone - as MPLS Provider Edge (PE) devices - we use Cisco OSR7609 access routers (with Supervisor2/PFC2 Engine), except for smaller nodes (Ústí nad Labem, Zlín) where Cisco 7206-VXR access routers with NPE-G1 are used. These routers perform all functions and transport services of the backbone network (MPLS, MPLS VPN, QoS, IPv4 routing, IPv4 multicast, export of NetFlow statistics, access filters) for the connected sites. While academic metropolitan networks (PASNET, BAPS, ...) and university networks are connected over Gigabit Ethernet, other participants and remote sites are often connected with smaller capacities (Ethernet, Fast Ethernet). Each access router is always connected to the network core routers through two GE or POS STM-16/OC-48 interfaces with load distribution at the OSPF level.

The smaller network nodes (PoP) are connected to these access routers by 100 Mbps optical circuits (a single fiber connection, green lines in the Figure), 34 or 10 Mbps microwave links (we use the conversion to 100BASE-T at both ends of the 34 Mbps circuits) and leased fixed circuits. Smaller routers with a limited functionality are used in these nodes as MPLS Customer Edge (CE) devices (Cisco 2621, C2651-XM or C2691).

A typical GigaPoP topology is shown in the Figure, using the architecture of the Liberec node as an example. Due to the limited hardware support of MPLS in the OSR7609 router (with the current Supervisor2/PFC2 engines, MPLS is supported only on OSM modules), the connection to the core router is accomplished by two interfaces of the OSM-4GE-WAN module. Unfortunately, MPLS is not supported on WS-X6516-GBIC and WS-X6548-RJ45 access modules. We had to create an external MPLS Intra-PoP connection for connecting other routers with MPLS support (a port of OSM-4GE-WAN 3/3 module and GE1/2 on the supervisor) and for distributing MPLS signalling to the appropriate LAN port over 802.1Q VLANs.

Each node's functionality is divided into three parts: MPLS support, connection of service servers and connection of end users. Cisco 7500 routers serve as 6PE routers (see below) or as test PE/6PE routers, e.g., for IPv6 multicast experiments. The service segments include OOB access, VoIP gateway, UPS and other devices and servers. End users are connected over 802.1Q virtual LANs. If MPLS VLANs are needed, we use the 802.1Q trunk and take care of mapping MPLS VLANs into 802.1Q.

[Figure]

Figure 3.2: GigaPoP topology example (Liberec)

3.2   IPv4 Routing

We use internal BGPv4 (iBGP) between PE routers as the internal routing protocol conveying information about network prefixes reachable in individual nodes (see Figure). The external routers R84, R85 and R21 are (temporarily) configured as redundant route reflectors RR1, RR2 and RR3 (iBGP neighbours are shown in Figure only for RR1). The other PE routers run as route reflector clients. The use of route reflectors reduces the number of neighbor adjacencies in the backbone network.
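
The saving brought by route reflectors can be quantified by counting iBGP sessions. The sketch below compares a full mesh with a design in which the reflectors are fully meshed among themselves and every client peers with each reflector. The router counts used are hypothetical; the report does not state the number of PE routers.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def reflector_sessions(routers: int, reflectors: int) -> int:
    """Sessions with fully meshed reflectors and clients peering with each reflector."""
    if not 0 < reflectors <= routers:
        raise ValueError("need 1..routers reflectors")
    clients = routers - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


for n in (10, 20, 40):  # hypothetical numbers of PE routers
    print(n, full_mesh_sessions(n), reflector_sessions(n, reflectors=3))
```

With three reflectors each client maintains only three sessions regardless of the network size, whereas in a full mesh every router would need a session to each of the others.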

The GigaPoP access routers propagate only static aggregated prefix blocks to the backbone, i.e., no redistribution from the inner routing protocols takes place. Metropolitan networks use OSPFv2 internally (with an OSPF process identifier different from the one used on the backbone network) and smaller networks often use static routes. Two notable exceptions are the networks of PASNET and ČVUT that act as pseudo-autonomous systems (using private AS numbers 65001 and 65002) and must thus be routed using external BGP.
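
The effect of announcing only aggregated blocks can be illustrated with Python's standard ipaddress module. The prefixes below are documentation addresses chosen for the example, not prefixes used in the CESNET2 network.

```python
import ipaddress

# Hypothetical prefixes used inside one connected site.
site_prefixes = [
    ipaddress.ip_network("192.0.2.0/26"),
    ipaddress.ip_network("192.0.2.64/26"),
    ipaddress.ip_network("192.0.2.128/25"),
]

# The access router would announce only the covering aggregate to the backbone.
aggregates = list(ipaddress.collapse_addresses(site_prefixes))
print([str(a) for a in aggregates])

# Every internal prefix stays reachable through the aggregate.
covered = all(p.subnet_of(aggregates[0]) for p in site_prefixes)
print(covered)
```

Because the aggregate is configured statically, changes inside the site (a flapping internal link, a renumbered subnet) never reach the backbone routing tables.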

[Figure]

Figure 3.3: Internal unicast routing (route reflector RR1 on R84)

3.3   IPv4 Multicast Routing

We use internal MBGP for the exchange of multicast routing information. The configuration of iMBGP is similar to the iBGP configuration of PE routers with three route reflectors on R84, R85 and (temporarily) R21. However, for multicast the route reflector clients must be configured on all P and PE routers (see Figure), or otherwise the RPF (Reverse Path Forwarding) checks would fail on the core routers and all multicast packets would be discarded.
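
The RPF check mentioned above can be sketched as a longest-prefix lookup of the packet's source address followed by a comparison of interfaces. The table contents and interface names below are hypothetical, and the code is a simplified model rather than the router implementation.

```python
import ipaddress
from typing import Optional

# Hypothetical multicast RIB of one router: prefix -> interface towards it.
MRIB = {
    ipaddress.ip_network("192.0.2.0/24"): "pos1/0",
    ipaddress.ip_network("198.51.100.0/24"): "ge2/0",
}


def rpf_interface(source: str) -> Optional[str]:
    """Longest-prefix match of the source address in the multicast RIB."""
    addr = ipaddress.ip_address(source)
    matches = [net for net in MRIB if addr in net]
    if not matches:
        return None  # no route back to the source
    return MRIB[max(matches, key=lambda net: net.prefixlen)]


def rpf_check(source: str, arrival_interface: str) -> bool:
    """Accept a packet only if it arrived on the interface leading back to its source."""
    return rpf_interface(source) == arrival_interface


print(rpf_check("192.0.2.10", "pos1/0"))   # arrives from the expected direction
print(rpf_check("192.0.2.10", "ge2/0"))    # arrives on another interface
print(rpf_check("203.0.113.5", "pos1/0"))  # source unknown to this router
```

The last case corresponds to a core router that does not receive the multicast routes: with no entry for the source, every packet from it fails the check and is discarded.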

[Figure]

Figure 3.4: Multicast Routing (iBGP RR1 on R84)

We use PIMv2-SM for multicast routing. The backbone network is divided into separate PIM domains (see Figure), each having its own rendezvous point (RP). A multicast data source first registers with the nearest RP, which results in creating the so-called SPT (Shortest-Path Tree) from the RP towards the source. Starting from the last-hop designated router, which connects a network where one or more recipients exist, all routers along the way to the RP send join messages (hop by hop). This procedure results in the so-called shared tree (*,G), where the asterisk is a wildcard indicating any source. The initial data transfer proceeds towards the RP via the SPT and then towards the recipient via the shared tree. However, data transfer over the shared tree is often not optimal, so a path optimization can take place - the last-hop router switches over to the optimal shortest-path tree.

The RP in each of the multicast domains is used for all connected networks (with the exception of PASNET and ČVUT). An RP is elected dynamically through the use of the Auto-RP protocol (a Cisco proprietary protocol) or the BSR (Bootstrap Router) protocol. The latter is an IETF standard implemented in the routers of other vendors. While most participants are already connected over a PIMv2-SM interface, some participants are unfortunately still connected over PIMv2-DM, mostly because of the limitations imposed by the technology they use (for example, Extreme Networks switches). As the SM/DM boundary creates a number of problems, we ultimately aim at using PIMv2-SM exclusively.

We use the MSDP protocol in a full-mesh group configuration between all nodes for propagating information about active sources of multicast data. This configuration is rather complicated and difficult to administer. On the other hand, the mesh group reduces the volume of reports that have to be exchanged between the MSDP peers and enables the exchange of SA (Source Active) reports between all iMSDP routers regardless of the RPF check mechanism.

RPF check failures in relation to MSDP, resulting in SA messages being dropped, were the main source of problems with multicast distribution in the backbone network. The reason is that, logically, the unicast and multicast topologies are not congruent, since unicast is encapsulated in MPLS whereas multicast is transported as plain IP packets.

Following the recommendations from Cisco Systems and the GÉANT network, we have also set access lists for MSDP reports filtering out certain reported active sources, thus eliminating the unwanted traffic resulting from misconfigured protocols and applications like Novell NDS, ImageCast and others.
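
The principle of such a filter can be sketched as a deny list of multicast groups applied to each Source Active report. The list below is illustrative only: it contains a few well-known groups (Cisco Auto-RP and the administratively scoped range) and does not reproduce the access lists deployed on the backbone.

```python
import ipaddress

# Illustrative deny list, not the filters actually configured.
DENIED_GROUPS = [
    ipaddress.ip_network("224.0.1.39/32"),  # Cisco Auto-RP announce
    ipaddress.ip_network("224.0.1.40/32"),  # Cisco Auto-RP discovery
    ipaddress.ip_network("239.0.0.0/8"),    # administratively scoped groups
]


def sa_permitted(group: str) -> bool:
    """Return True if an SA report for this group may be accepted or forwarded."""
    address = ipaddress.ip_address(group)
    if not address.is_multicast:
        return False  # not a multicast group at all
    return not any(address in net for net in DENIED_GROUPS)


for g in ("233.1.1.1", "239.1.2.3", "224.0.1.40"):
    print(g, sa_permitted(g))
```

Groups used by locally scoped applications never need to be advertised between domains, so dropping their SA reports removes traffic without affecting legitimate inter-domain multicast.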

[Figure]

Figure 3.5: CESNET2 multicast topology

At present, we are preparing a new backbone setup for IPv4 multicast in cooperation with Cisco NSA. The expected implementation of the Anycast RP mechanism should lead to a more reliable multicast operation - its advantage is the possibility of load-balancing and setting up redundant RPs. In this case, the RP configuration is entirely static, which should make troubleshooting easier and the network more resistant against flooding.
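
The redundancy and load-sharing properties of Anycast RP follow from several RPs sharing one address, so that unicast routing delivers each router to the closest one. The sketch below models this selection; the RP names and IGP costs are hypothetical.

```python
from typing import Dict, Optional, Set


def nearest_rp(igp_cost: Dict[str, float], alive: Set[str]) -> Optional[str]:
    """Pick the reachable RP with the lowest IGP cost, as anycast routing would."""
    candidates = {rp: cost for rp, cost in igp_cost.items() if rp in alive}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Costs as seen from one hypothetical router towards two RPs sharing one address.
costs = {"RP-A": 10.0, "RP-B": 30.0}
print(nearest_rp(costs, alive={"RP-A", "RP-B"}))  # normal operation
print(nearest_rp(costs, alive={"RP-B"}))          # RP-A has failed
```

Routers in different parts of the network see different costs and therefore use different RPs, which spreads the load; when one RP fails, the IGP alone redirects its clients to another.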

Unfortunately, the implementation of the Anycast RP technology will have a considerable impact on the configuration of multicast in the off-backbone nodes and further in the networks of the connected organisations, due to the necessity of a manual RP configuration and the use of MSDP protocol on their upstream interfaces. We also plan to install the HP OpenView NNM Multicast Monitor on our second monitoring station for monitoring the backbone multicast operation. The statistical evaluation of multicast traffic requires that the routers support both SNMP and NetFlow v9.

While multicast distribution on the level of the network backbone is essentially flawless, many problems still persist on the side of the connected participants (diverse technologies, incompatibilities etc.). Therefore, we cannot yet claim to have a reliable end-to-end multicast service and potential users of multicast applications (e.g., videoconferences) thus often resort to alternative unicast-based technologies.

3.4   IPv6 Implementation

In 2003, one of the important goals was to integrate the experimental IPv6 backbone, which was based mainly on PC routers connected by tunnels, into the standard production environment of the backbone network. Although hardware support for IPv6 forwarding was one of the tender requirements as early as mid-2002, at the end of 2003 the selected OSR7609 platform still lacks this functionality, mainly due to the Supervisor 720 engine being seriously delayed. Consequently, we had to look for alternative solutions.

A beta version of IOS with IPv6 software support, which was offered as an interim solution for the current Supervisor2/Engine2, turned out to be unusable in the production network and, as a matter of fact, the producer stopped its development. A reasonable solution thus seemed to be to use the existing Cisco 7500 routers, which had been replaced by the new OSR7609 routers during the backbone upgrade. The 7500 routers, configured as so-called 6PE devices, encapsulate IPv6 in MPLS packets and forward them to the OSR7609 routers via a VLAN - see Figure.

[Figure]

Figure 3.6: Description of the interim IPv4/IPv6 implementation

The advantages of this solution are:

During the deployment of 6PE we encountered a number of various bugs and IOS-related problems. At present, the routers run an engineering version derived from IOS 12.3.

IPv6 multicast is not yet supported on the backbone. The current architecture requires that the external neighbours be treated as CE routers connected to our 6PE routers. This turned out to be impossible for the existing IPv4 peering with the GÉANT network, where we had an OSR7609 router (R85) with a POS STM-16/OC-48 interface. The only option for a native IPv6 peering was to temporarily use a Cisco GSR12008 router (R21) with IOS 12.0(25)S1 supporting IPv4/IPv6 dual-stack. The current IPv4/IPv6 topology is shown in Figure.

[Figure]

Figure 3.7: IPv4/IPv6 MPLS backbone topology

We use iBGP as an internal routing protocol for IPv6 (as well as for IPv4) with two route reflectors on the external routers R1 and R62. The R1 router is also used for IPv6 peerings towards NIX. IPv6 traffic is merged with IPv4 on an L2 switch (Cat43), where the second GE circuit is also terminated. Apart from R21, dual-stack is configured on the routers in Ústí nad Labem and Zlín (R90 and R91) - both are Cisco 7206-VXR with NPE-G1 and IOS version 12.3(1). Connections of end sites are implemented either by tunnels (if the end site is not IPv6-ready), or natively (e.g., over a dedicated VLAN) using static routing.

The connection of smaller PoPs outside of the MPLS core network will also be implemented natively, and OSPFv3 will be used as the interior gateway protocol. At present, we use this configuration between GigaPoP Ostrava and PoPs Karviná and Opava. However, IOS version 12.2(15)T is needed, which requires at least 128 MB of memory and is not available for older routers like the Cisco 2600 that still serve a number of smaller PoPs. Therefore, it will be necessary to upgrade these routers in order to further extend native IPv6 backbone operation.

The necessary condition for the transfer of the entire backbone network to dual-stack IPv4/IPv6 operation is the upgrade of all Cisco 7609 routers to the new Sup720 supervisor engines together with a stable IOS supporting 6PE. Sup720 has higher demands on both input power and cooling than the current Supervisor2/PFC2 engine. The existing OSR7609 chassis cannot cope with these requirements and must therefore be replaced as well.

A stable IOS release supporting Sup720 is not available either, but is planned for early 2004 under the code name 12.2S Tetons-2. The vendor provided us with a beta version of IOS, which we are currently testing on the standby devices. We have been communicating the problems discovered during this Early Field Test (EFT) directly to the developers so that they can be corrected. The aim of our participation in this EFT is first to get acquainted with the features of the new IOS, and second to help debug and finalise the awaited production version. We are ready to deploy it in our backbone as soon as it proves to be stable.

During the upgrade, we will have to rearrange all modules in the chassis because the existing Supervisor2/PFC2 occupies positions 1 and 2 while Sup720 must be placed in positions 5 and 6, originally reserved for the switching matrix. Configurations will also have to be appropriately changed.

After the upgrade, the OSR7609 routers will assume the role of the existing 6PE routers Cisco 7500, though only for the unicast IPv6. IPv6 multicast support is currently unclear and we plan to utilise the old Cisco 7500 routers again, this time as tunnel end-points (this solution is currently being verified).

3.5   External Connectivity

CESNET2 network uses the following external connections and peerings (see Figure):

[Figure]

Figure 3.8: External connectivity of CESNET2 network

3.5.1   Foreign Connectivity (Commodity Internet)

Telia International Carrier provides the global foreign connectivity. The connection capacity is 800 Mbps (software-limited) on a POS STM-16/OC-48 circuit terminated at the R84 router. The backup connection is provided by an analogous circuit from the Telia node in Prague to the R85 router.

Telia provides the unicast connectivity for IPv4 (IPv4 multicast is planned) and IPv6 (a tunnel connection to London).

3.5.2   GÉANT Network Connection

Our connection to the GÉANT network is realised by a POS STM-16/OC-48 circuit to the collocated GÉANT node. The capacity of this connection is limited to 1.2 Gbps. On our side it is temporarily terminated at the dual-stack IPv4/IPv6 router R21 (GSR12008) in order to allow the native IPv6 connection. IPv4 multicast is routed through this connection as well.

The GÉANT node in Prague has a 10 Gbps connection to Germany (Frankfurt am Main) and two POS STM-16/OC-48 circuits to Poland (Poznań) and Slovakia (Bratislava).

The link to GÉANT is mainly useful for communication with national research and education networks (NRENs), but generally not with other pan-European Internet providers and peering centres. Detailed information on the GÉANT network topology can be found at www.geant.net.

3.5.3   National Peering in NIX.CZ

NIX.CZ access is realised by two GE circuits terminated at two border routers R84 and R85 (for back-up purposes). The circuits are set up on leased optical fibres using corresponding GBICs (GBIC-LX/LH and CWDM-GBIC on the longer backup line). Native IPv6 peering is implemented via the 6PE router R1.

3.5.4   Peering with SANET

The connection to the Slovak academic network SANET uses leased optical fibres between Brno and Bratislava. The link is equipped with CWDM-GBIC-1550 and a Catalyst 3524 switch is used in place of a repeater.

We use an 802.1Q VLAN for the device administration and point-to-point links for connecting the border routers in Brno (R89) and on the SANET side. Through this peering we also provide SANET with access to NIX.CZ, and SANET reciprocally provides CESNET2 with access to the Slovak peering centre SIX, saving both networks some capacity on their international links. Slovak networks are advertised to NIX.CZ and Czech ones to SIX, tagged with appropriate BGP communities.

3.6   Backbone network administration

The backbone network is supervised continuously by the CESNET NOC (Network Operating Centre) on a 24/7 basis. Network problems that require a detailed diagnosis and/or reconfiguration of active elements are reported by the NOC operators to the administrator on duty. CESNET has enhanced hardware and software service agreements covering all network devices with a guaranteed time for resolving defects (4 hours for key devices). Cisco NSA (Advanced Services) support is used for escalating pending issues if necessary. As part of this support, two TAC engineers have been assigned to the CESNET2 network, with a detailed insight into the network topology and the services used, and with direct access to network devices.

We use the following tools for the backbone network administration:

Management of the whole backbone network
The real-time network status is monitored and predefined events are observed by means of an HP OpenView NNM 6.31 monitoring station (UltraSparc 420R, Solaris 2.8) equipped with the ET (Extended Topology) license. The backup station for network administration is planned to be used for multicast monitoring. To this end, an installation of the HP OpenView NNM Multicast monitor is currently under preparation.
Management of active network elements (routers, switches, ...)
We use CiscoWorks 2000 RWAN 1.3 over the ssh protocol for secure access to network devices. The most used functions from the CiscoWorks portfolio are RME (Resource Manager Essentials) for creating device inventories and automatic configuration backups, and Syslog Analyzer for evaluating specific records and logs about the operation of network devices.
Network Services Monitoring
We also use the Nagios v1.0 program (www.nagios.org), the successor to the previously used NetSaint. Our installation primarily monitors the availability of the network servers (mail, DNS, WWW). The Nagios server is also used for monitoring the IPv6 network. The advantage of Nagios is that it is an open-source system, so we can easily add our own changes and extensions.
Statistics about devices and network traffic
GTDMS is a system developed in-house. Its main purpose is to gather various statistics about the operation and performance of network devices. We also added a number of alarms that fire when certain router- and circuit-related thresholds are exceeded (CPU overload, free memory size, power supplies, interior temperature, congestion and error rate on circuits). In addition, we use our own system called NetFlow Monitor for processing and viewing the statistics (see below). It is intended first for assessing the traffic volumes recorded for individual member and non-member institutions, and second for tracking security incidents (evaluation of the observed flows according to predefined conditions). All backbone PE routers export NetFlow data. At present, we are able to collect only IPv4 traffic statistics because the implementation of NetFlow v9, which includes statistics about IPv6 and multicast traffic, is not yet available in the production versions of IOS.
Request Tracker (RT)
The role of this trouble ticket system is to process various requests (their creation, monitoring and archiving) pertaining to almost all areas of network operation and administration. A detailed description of the RT software can be found at www.fsck.com. Messages from each of the specialised queues (noc, trouble, admin,...) are distributed to a designated group of receivers (network administrators, NOC or users).
Out-of-Band management (OOB)
Remote access to all active network elements in all backbone nodes is guaranteed even in the case of network failures.
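
The per-institution accounting mentioned for NetFlow Monitor in the list above can be pictured as summing the byte counts of flow records by the owner of the source address. The sketch below is a simplified illustration and not the NetFlow Monitor code; the institution names, prefixes and flow records are hypothetical.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical mapping of institutions to their address blocks.
INSTITUTIONS = {
    "institution-a": ipaddress.ip_network("192.0.2.0/24"),
    "institution-b": ipaddress.ip_network("198.51.100.0/24"),
}


def owner(address: str) -> str:
    """Name of the institution whose block contains the address, else 'other'."""
    addr = ipaddress.ip_address(address)
    for name, net in INSTITUTIONS.items():
        if addr in net:
            return name
    return "other"


def volume_by_institution(flows: Iterable[Tuple[str, str, int]]) -> Dict[str, int]:
    """Sum the octets of (source, destination, octets) records by source owner."""
    totals: Dict[str, int] = defaultdict(int)
    for src, _dst, octets in flows:
        totals[owner(src)] += octets
    return dict(totals)


flows = [
    ("192.0.2.10", "203.0.113.1", 1500),
    ("192.0.2.11", "203.0.113.2", 500),
    ("198.51.100.7", "192.0.2.10", 4000),
    ("203.0.113.9", "192.0.2.10", 100),
]
print(volume_by_institution(flows))
```

The same flow records can be matched against predefined conditions (for example, one source contacting many destinations) when tracking security incidents.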

3.7   Statistical Traffic Analysis

3.7.1   Average Long-term Utilization of Backbone Network Core

At first sight, the core of the CESNET2 backbone seems to have enough free capacity. However, we cannot simply characterise it as an "over-provisioned" network. The gigabit backbone lines have a long-term average utilization below 15 percent. Nevertheless, in 2003 we observed much more frequent traffic peaks that significantly exceed the average level. On certain lines these peaks can no longer be considered a transient and random phenomenon. In the 2002 report we illustrated the dependence of the measured line utilisation on the time step of the measurement. It turned out that the actual line load over short periods greatly exceeds the values obtained from the standard operational measurement, whose time steps are in the range of several minutes. During 2003 we recorded cases where the average load during the whole measurement step (usually 5-10 minutes) greatly exceeded the long-term average and the sustained load on the corresponding lines was more than 30 % of the nominal line rate.
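
The dependence of the measured utilisation on the measurement step can be reproduced with synthetic data. In the sketch below, a line carries 5 % load for 270 seconds and a 90 % burst for 30 seconds; the numbers are invented for illustration and are not measurements from the backbone.

```python
from statistics import mean
from typing import List


def averaged(samples: List[float], step: int) -> List[float]:
    """Average consecutive per-second samples over windows of `step` seconds."""
    return [mean(samples[i:i + step]) for i in range(0, len(samples), step)]


# Synthetic per-second utilisation: mostly idle with one short burst.
per_second = [0.05] * 270 + [0.90] * 30

five_minute = averaged(per_second, 300)  # typical operational step
ten_second = averaged(per_second, 10)    # finer measurement

print(f"5-minute average: {five_minute[0]:.3f}")
print(f"highest 10-second average: {max(ten_second):.3f}")
```

The five-minute value stays below 15 % even though the line was almost saturated for half a minute, which is why a low long-term average alone does not show that a line is over-provisioned.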

[Figure]

Figure 3.9: Long-lasting traffic peaks on the 2.5 Gbps Prague-Brno line in 2003, direction to Brno

In the graph above, the noticeably less frequent appearance of traffic peaks before September 2003 is mainly due to a different aggregation strategy of aged data that was in place at that time.

3.7.2   Traffic trends in 2003

Compared to 2002, we observed very few changes in the general trends of network utilisation. Specifically, the overall traffic volume was slowly but steadily growing, with the typical plateau during the summer months. In the last three months of 2003 we can see a sharp increase, especially on the lines that aggregate several traffic tributaries.

[Figure]

Figure 3.10: Significant traffic growth from September to November 2003 on the 2.5 Gbps Prague-Brno line

3.7.3   CESNET2 network utilization in 2003

Some backbone core lines experienced a noticeable growth of the total traffic volume. For the sake of clarity we plotted volumes for both traffic directions separately including, whenever possible, the total amount of transferred data. The missing parts of some graphs are caused by changes in network architecture that took place during the year (topology, line changes) or by changes in the measurement system configuration. The low frequency of peaks before September 2003 is again an artefact induced by a change in the strategy of aged data aggregation in the measurement system.

[Figure]

Figure 3.11: Load on backbone lines in 2003

[Figure]

Figure 3.12: Load on backbone lines in 2003

[Figure]

Figure 3.13: Load on backbone lines in 2003

[Figure]

Figure 3.14: Load on backbone lines in 2003

[Figure]

Figure 3.15: Load on backbone lines in 2003

3.7.4   External lines

All external lines have enough free capacity, the only exception being our connection to the global Internet in the outgoing direction. This line is rate-limited by software means, hence some traffic peaks can exceed the configured maximum. As before, the gaps in the graphs are caused by problems in the measurement system rather than traffic outages. The data, including the total volumes of transferred data, are again plotted separately for each traffic direction.

[Figure]

Figure 3.16: Load on external lines in 2003

3.8   Future plans for the backbone network development

In the first quarter of 2004 we plan to finish the upgrade of the backbone Cisco 7609 routers to the Supervisor 720 engine and, consequently, reconfigure the PE routers for full IPv4/IPv6 dual-stack operation. Another important change will be the new architecture of multicast operation and administration. Regarding IPv6, we will continue the deployment of native IPv6 unicast in the remaining PoPs and also implement the interim solution for IPv6 multicast using Cisco 7500 routers and tunnels.

We can provide other services that may be demanded either by projects or by connected institutions, such as MPLS VPN, Ethernet over MPLS (see [MTV01]) or Quality of Service. As for the latter, there has so far been very little motivation for implementing QoS because the available bandwidth is sufficient.

Further development of the optical routes involves a gradual transition to 10GE on the route Prague-Brno, including a necessary upgrade of the routers in the network core and external routers. Other technological changes will be mostly driven by specific needs of both research projects and connected institutions.
