In the graphs below, the flat line between Wednesday evening and Thursday afternoon marks the period when the backup Sup720 blade was in place. We were not collecting data during that time.
It turned out that the crash was related to a firmware bug that was triggered under rare circumstances.
It took most of Thursday with Cisco tech support to determine that. Cisco provided us with updated firmware. Once it was installed, we switched the Sup720 module back into 6509b late Thursday afternoon. You can see the data collection begin again at that point.
(More detailed data are further below.)
In the Monthly graph further down, the overall broadcast level can be seen to have held at about 340 kb/s for quite some time (except for Thanksgiving week).
As can be seen in the graph above, after we were back up and running, the broadcasts were over 1000 kb/s, roughly three times the usual level. Presumably they were equally high on Thursday, when we were not collecting data.
This high level of broadcasts severely interfered with network performance. It is not clear how the broadcast levels were related to the firmware bug in the switch and its crash.
The source of the broadcasts was traced to a particular switch in Mead on Friday morning. There is some indication that the broadcasts were being replicated, multiplying the effects of normal broadcast traffic. Once that switch was rebooted, the problem vanished, as can be seen by the dramatic decrease in traffic early Friday.
Broadcasts are used, for example, when a computer wants to send traffic to an IP address it hasn't recently contacted. It first sends out a broadcast asking where it should send traffic for that particular IP address. After receiving a response telling it where the destination is, it sends traffic directly to that destination address.
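The lookup described above is what ARP does on an Ethernet network. As a rough illustration only, the following Python sketch models one network segment and counts broadcast versus direct frames. The `Host` and `Segment` classes, the addresses, and the frame counters are all made up for this example; they are not taken from our network or from any monitoring tool.

```python
class Host:
    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.arp_cache = {}  # IP -> hardware address learned from earlier replies


class Segment:
    """One broadcast domain: every attached host sees every broadcast."""

    def __init__(self):
        self.hosts = []
        self.broadcast_frames = 0
        self.unicast_frames = 0

    def attach(self, host):
        self.hosts.append(host)

    def who_has(self, sender, target_ip):
        """Broadcast a request; the owner of target_ip answers directly."""
        self.broadcast_frames += 1
        for host in self.hosts:
            if host is not sender and host.ip == target_ip:
                self.unicast_frames += 1  # the reply goes straight back to the sender
                return host.mac
        return None

    def send(self, sender, target_ip):
        """Send one packet, broadcasting only when the destination is not cached."""
        mac = sender.arp_cache.get(target_ip)
        if mac is None:
            mac = self.who_has(sender, target_ip)
            if mac is None:
                return False  # nobody answered
            sender.arp_cache[target_ip] = mac
        self.unicast_frames += 1
        return True


# Hypothetical two-host segment: three packets need only one broadcast.
seg = Segment()
a = Host("10.0.0.1", "00:00:00:00:00:01")
b = Host("10.0.0.2", "00:00:00:00:00:02")
seg.attach(a)
seg.attach(b)
for _ in range(3):
    seg.send(a, "10.0.0.2")
print(seg.broadcast_frames, seg.unicast_frames)  # prints "1 4"
```

In this toy model only the first packet triggers a broadcast; the reply and the packets that follow are sent directly. Real hosts also expire cache entries after a while, which is why the broadcast recurs for addresses that have not been contacted recently.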
Another common use is when the computer is requesting its own IP address. Since it doesn't have one yet, it sends a broadcast on the network asking a DHCP server to provide an IP address based on its Ethernet address.
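As a simplified sketch of that exchange, the Python below models a server that hands out addresses keyed on the requester's Ethernet address. It is an illustration under stated assumptions, not how our DHCP service is configured: the `DhcpServer` class, the `request_address` helper, and the address pool are invented for the example, and the real protocol uses a four-message exchange (discover, offer, request, acknowledge) that is collapsed into a single step here.

```python
class DhcpServer:
    """Toy address server: remembers which address each Ethernet address was given."""

    def __init__(self, pool):
        self.free = list(pool)   # addresses not yet handed out
        self.leases = {}         # Ethernet address -> IP address

    def handle_discover(self, mac):
        if mac in self.leases:   # a returning client keeps its address
            return self.leases[mac]
        if not self.free:        # pool exhausted, nothing to offer
            return None
        ip = self.free.pop(0)
        self.leases[mac] = ip
        return ip


def request_address(mac, listeners):
    """Model a broadcast: every listener sees the request, the first offer wins."""
    for listener in listeners:
        ip = listener.handle_discover(mac)
        if ip is not None:
            return ip
    return None


# Hypothetical pool of two addresses.
server = DhcpServer(["10.0.0.10", "10.0.0.11"])
first = request_address("00:00:00:00:00:01", [server])
again = request_address("00:00:00:00:00:01", [server])
second = request_address("00:00:00:00:00:02", [server])
third = request_address("00:00:00:00:00:03", [server])
```

The point of the sketch is that the client cannot address the server directly, because it has no IP address of its own yet, so the request has to reach every machine on the segment.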
High broadcast traffic can interfere with network operations. Next summer we are planning on segmenting the network to reduce the overall number of computers within a segment (VLAN), which will reduce the broadcast traffic within each segment. Such a network topology would have reduced the effects of the problems we've just experienced.
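To give a rough sense of why segmenting helps, the Python sketch below estimates how many broadcasts each computer has to process when the same machines are split evenly across several segments. The function name, the host count, and the per-host broadcast rate are hypothetical figures chosen for illustration; they are not measurements from our network.

```python
def broadcasts_seen_per_host(hosts, rate_per_host, segments=1):
    """Broadcasts per second each host receives from its segment neighbours.

    Assumes hosts are spread evenly across segments and that every host
    emits broadcasts at the same rate (both are simplifications).
    """
    per_segment = hosts / segments
    return (per_segment - 1) * rate_per_host


# Hypothetical campus: 3000 hosts, each sending one broadcast per second.
flat = broadcasts_seen_per_host(3000, 1.0)                   # one big segment
segmented = broadcasts_seen_per_host(3000, 1.0, segments=10)
print(flat, segmented)
```

Under these assumptions, splitting the network into ten segments cuts the broadcast load on each computer by about a factor of ten, and a misbehaving switch would only flood the segment it belongs to.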
| System: | 6509b.mtholyoke.edu in chp |
| Maintainer: | network@mtholyoke.edu |
| Description: | GigabitEthernet7/6 |
| ifType: | ethernetCsmacd (6) |
| ifName: | Gi7/6 |
| Max Speed: | 125.0 MBytes/s |
| Max In: | 0.0 b/s (0.0%) | Average In: | 0.0 b/s (0.0%) | Current In: | 0.0 b/s (0.0%) |
| Max Out: | 1722.7 kb/s (0.2%) | Average Out: | 293.8 kb/s (0.0%) | Current Out: | 54.0 kb/s (0.0%) |
| Max In: | 0.0 b/s (0.0%) | Average In: | 0.0 b/s (0.0%) | Current In: | 0.0 b/s (0.0%) |
| Max Out: | 1321.2 kb/s (0.1%) | Average Out: | 212.9 kb/s (0.0%) | Current Out: | 54.0 kb/s (0.0%) |
| Max In: | 0.0 b/s (0.0%) | Average In: | 0.0 b/s (0.0%) | Current In: | 0.0 b/s (0.0%) |
| Max Out: | 1321.2 kb/s (0.1%) | Average Out: | 212.9 kb/s (0.0%) | Current Out: | 52.5 kb/s (0.0%) |
| Max In: | 0.0 b/s (0.0%) | Average In: | 0.0 b/s (0.0%) | Current In: | 0.0 b/s (0.0%) |
| Max Out: | 1241.1 kb/s (0.1%) | Average Out: | 235.0 kb/s (0.0%) | Current Out: | 57.0 kb/s (0.0%) |
| Max In: | 0.0 b/s (0.0%) | Average In: | 0.0 b/s (0.0%) | Current In: | 0.0 b/s (0.0%) |
| Max Out: | 357.2 kb/s (0.0%) | Average Out: | 228.7 kb/s (0.0%) | Current Out: | 269.8 kb/s (0.0%) |