Monday, April 18, 2011

Understanding Cisco Traffic Storm Control

By Pete Welcher

This post is a quick note about an easily misunderstood set of switch commands, Cisco Traffic Storm Control. The commands are very useful, and they work. However, they seem to be commonly misunderstood -- or else the documentation is wrong. Part of the confusion may be due to the large switches behaving differently from the smaller Cisco switches. I get asked about this a lot when teaching the Nexus classes, and I've seen Networkers slides that seem to assume storm control behaves differently from what the documentation describes.

I'm going by the documentation here; I don't have easy access to a 6500 or Nexus 7000 for lab testing (most N7Ks tend to be in production).

The traffic storm control commands are still very useful for mitigating the effects of a Spanning Tree loop. My co-workers tell me that you do want hardware-based storm control: by the time a 6500 Sup 2 (MSFC2), for example, notices it should be doing software-based storm control, it is already toast. Toast = the CPU is spun up, so it stops processing BPDUs and stops sending UDLD, peers errdisable their connections, etc.

Traffic storm control is most useful in Cisco 6500 and Nexus 7000 switches. The documentation for the two models matches. (One suspects the source code is rather similar too.)

Where does the confusion arise?

Well, the manual says "Traffic storm control (also called traffic suppression) allows you to monitor the levels of the incoming broadcast, multicast, and unicast traffic over a 1-second interval. During this interval, the traffic level, which is a percentage of the total available bandwidth of the port, is compared with the traffic storm control level that you configured. When the ingress traffic reaches the traffic storm control level that is configured on the port, traffic storm control drops the traffic until the interval ends."

It goes on to specify the syntax, which is to configure an interface with:

storm-control {broadcast | multicast | unicast} level percentage[.fraction]

The standard example is:

interface Ethernet1/1
    storm-control broadcast level 40
    storm-control multicast level 40
    storm-control unicast level 40

The problem is that people think they get a separate threshold for each of the three types of traffic: broadcast, multicast, and unicast. WRONG! First hint: the thresholds in the example are all 40.

Now read the syntax introduction carefully: "Traffic storm control uses a bandwidth-based method to measure traffic. You set the percentage of total available bandwidth that the controlled traffic can use. Because packets do not arrive at uniform intervals, the 1-second interval can affect the behavior of traffic storm control."

The key words are "the controlled traffic." Each time you enter a storm-control command, you are adding to the flavors or types of controlled traffic. There is one threshold, and it is applied to the sum total of the controlled traffic.
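
To put numbers on "percentage of total available bandwidth," here is a minimal Python sketch of the arithmetic. It is my own illustration, not anything from the Cisco documentation: the function name `threshold_bits` and the 10 Gbps port speed are assumptions, and it simply takes the manual's 1-second interval at face value.

```python
def threshold_bits(port_bps, level_percent, interval_s=1):
    """Bits of controlled traffic allowed in one interval at the configured level."""
    # Multiply before dividing so that whole-number inputs stay exact.
    return port_bps * level_percent * interval_s / 100


# Assumed example: a 10 Gbps port configured with "level 40".
limit = threshold_bits(10_000_000_000, 40)

# Broadcast and multicast are both controlled, so their sum is what counts.
broadcast_bits = 2_500_000_000
multicast_bits = 2_500_000_000
print(limit)                                     # 4000000000.0
print(broadcast_bits + multicast_bits >= limit)  # True
```

Neither type alone reaches 40% of the port in this example, but the two together do, and under the aggregate reading that is what matters.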

The manual goes on to provide examples of how this works. It starts with:

If you enable broadcast traffic storm control, and broadcast traffic exceeds the level within the 1-second interval, traffic storm control drops all broadcast traffic until the end of the interval.

This is what we would all probably expect, whichever way we interpret the operation of storm control. The manual goes on with:

If you enable broadcast and multicast traffic storm control, and the combined broadcast and multicast traffic exceeds the level within the 1-second interval, traffic storm control drops all broadcast and multicast traffic until the end of the interval.
If you enable broadcast and multicast traffic storm control, and broadcast traffic exceeds the level within the 1-second interval, traffic storm control drops all broadcast and multicast traffic until the end of the interval.
If you enable broadcast and multicast traffic storm control, and multicast traffic exceeds the level within the 1-second interval, traffic storm control drops all broadcast and multicast traffic until the end of the interval.

This makes it fairly clear that the aggregate of all the controlled types is what is being measured against the threshold -- and that there is only one threshold.
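
The manual's examples boil down to a single comparison per interval. Here is a minimal Python sketch of that reading. The function `dropped_types` and its percent-of-bandwidth inputs are my own hypothetical model of the documented behavior, not Cisco's implementation, and I have not verified it on hardware.

```python
def dropped_types(enabled, level_percent, traffic_percent):
    """Return the traffic types dropped for the rest of the 1-second interval.

    enabled         -- set of types with storm control configured on the port
    level_percent   -- the single shared threshold, as % of port bandwidth
    traffic_percent -- per-type ingress traffic seen so far, as % of port bandwidth
    """
    # Only controlled types count toward the threshold, and they count together.
    controlled = sum(traffic_percent.get(t, 0) for t in enabled)
    # "Reaches the level" is modeled as >=; every controlled type is then dropped.
    return set(enabled) if controlled >= level_percent else set()


both = {"broadcast", "multicast"}

# Combined traffic exceeds the level, although neither type does alone.
print(dropped_types(both, 40, {"broadcast": 25, "multicast": 25}))

# Broadcast alone exceeds the level: multicast is dropped along with it.
print(dropped_types(both, 40, {"broadcast": 45, "multicast": 1}))

# Multicast is not controlled here, so it is neither counted nor dropped.
print(dropped_types({"broadcast"}, 40, {"broadcast": 5, "multicast": 90}))
```

The first two calls return both broadcast and multicast, and the third returns an empty set.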

The Nexus 7000 command reference pretty much clarifies it: "Enter the storm-control level command to enable traffic storm control on the interface, configure the traffic storm-control level, and apply the traffic storm-control level to all traffic storm-control modes that are enabled on the interface. Only one suppression level is shared by all three suppression modes. For example, if you set the broadcast level to 30 and set the multicast level to 40, both levels are enabled and set to 40."
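
Read literally, that passage describes one stored level plus a set of enabled modes. The following Python sketch models just that wording. The class `StormControlPort` and its method names are hypothetical, and the "most recent command wins" rule is my interpretation of the reference's 30-then-40 example, not something I have tested on a switch.

```python
VALID_MODES = {"broadcast", "multicast", "unicast"}


class StormControlPort:
    """Toy model of one interface: a set of enabled modes and one shared level."""

    def __init__(self):
        self.modes = set()
        self.level = None  # no level until the first storm-control command

    def storm_control(self, mode, level):
        """Model 'storm-control <mode> level <level>' on this interface."""
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.modes.add(mode)
        # Assumption: the latest level replaces the shared level for all modes.
        self.level = level

    def effective_levels(self):
        """Level in force for each enabled mode (always the same value)."""
        return {mode: self.level for mode in sorted(self.modes)}


port = StormControlPort()
port.storm_control("broadcast", 30)
port.storm_control("multicast", 40)
print(port.effective_levels())  # {'broadcast': 40, 'multicast': 40}
```

In this model there is never a separate broadcast level of 30 to fall back on, which matches "both levels are enabled and set to 40."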

The phrase "both levels" is unfortunate wording: it sounds like two different levels, each set to 40. The other documentation and the command behavior make much more sense if the one and only threshold level is being set to 40.

The blog at http://blog.ipexpert.com/2010/03/15/old-ccie-myths-storm-control/ tackles testing this, albeit on a small Catalyst switch. Unfortunately, the testing methodology at the end of that post turns on only one of broadcast and multicast storm control at a time, which seems to me to defeat the purpose. My tentative conclusion: the behavior of small Catalyst switches may well differ from that of the Catalyst 6500 and Nexus 7000. The small switch documentation is ambiguous either way.

1 comment:

  1. When it comes to storm control, I think the only thing Cisco is consistent about is in producing inconsistent implementations of it.
