

Thursday, March 12, 2015

Cisco ACI CLI Commands "Cheat Sheet"



Introduction
The goal of this document is to provide a concise list of useful commands to be used in the ACI environment. For in-depth information regarding these commands and their uses, please refer to the ACI CLI Guide.

Please note that legacy-style commands (show firmware, show version, etc.) will not be included in this guide; the commands below are new for ACI. Legacy commands may be added later, but the point of this document is to be short and sweet.

Formatting
This document is formatted in the following way: commands are surrounded by <> in bold; possible user-given arguments within commands (if necessary) are surrounded by (), with a | separating multiple options. Brackets [] are used for mandatory verbatim arguments. A dash (-) separates a command from its explanation. For example:

     - shows the status of a given interface as well as statistics
        interface ID is in () because it is a user-specified argument, you can put any interface you want

     - shows the MAC port status
        ns|alp and 0|1 are in brackets because you must use either one of those arguments

Command Completion and Help
Context-sensitive help and command completion in ACI are a bit different from other Cisco command-line interfaces. Since iShell builds mostly on Bash, these features tend to build on the standard Bash programmable completion feature.

  • Tab - Use the Tab key to auto-complete commands.  In cases where multiple commands match the typed characters, all options are displayed horizontally.

    Example Usage:

     
    admin@tsi-apic1-211:~> mo
    moconfig     mocreate     modelete     modinfo      modprobe     modutil      mofind       moprint      more         moset        mostats      mount        mount.fuse   mount.nfs    mount.nfs4   mountpoint   mountstats   mount.tmpfs
    admin@tsi-apic1-211:~> mo

    This is more than just iShell; it includes all Bash commands.  Hitting Tab before typing any CLI command on the APIC results in:
     
    admin@tsi-apic1-211:~> 
    Display all 1430 possibilities? (y or n)
  • Esc Esc - Use double Escape to get context-sensitive help for the available iShell commands.  This displays short help for each command.  [Side note: In early beta code, double Escape after typing a few characters would show only one of the matching commands rather than all of them.  This is addressed via CSCup27989]

    Example Usage:

     
    admin@tsi-apic1-211:~> 
     attach           Show a filesystem object
     auditlog         Display audit-logs
     controller       Controller configuration
     create           create an MO via wizard
     diagnostics      Display diagostics tests for equipment groups
     dn               Display the current dn
     eraseconfig      Erase configuration, restore to factory settings
     eventlog         Display event-logs
     fabricnode       Commission/Decommission/Wipeout a fabric node
     faults           Display faults
     firmware         Add/List/Upgrade firmware
     health           Display health info
     loglevel         Read/Write loglevels
     man              Show man page help
     moconfig         Configuration commands
     mocreate         Create an Mo
     modelete         Delete an Mo
    [snip]
    admin@tsi-apic1-211:~>
  • man  - All commands should have man pages.  [Side note: If you find an iShell command without a man page, open a bug.]  The manual page for a command gives more detailed info on what the command does and how to use it.
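
    Example Usage (moconfig is taken from the double-escape listing above; output omitted):

    admin@tsi-apic1-211:~> man moconfig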

Cisco Application Centric Infrastructure CLI Commands (APIC, Leaf/Spine)


Clustering User Commands
 - shows the current cluster size and state of APICs
- changes the size of the cluster
 - Decommissions the APIC of the given ID
 - Factory resets APIC and after reboot will load into setup script
 - Reboots the APIC of the given ID
 - shows replicas which are not healthy
 - shows the state of one replica
 - large output which will show cluster size, chassisID, if node is active, and summary of replica health
 - shows fabric node vector
 - shows appliance vector
 - verifies APIC hardware
 - shows link status
 - shows the status of bond link
 - shows dhcp client information to confirm dhcp address from APIC
 - commissions, decommissions, or wipes out the given node. Wipeout will completely wipe out the node, including its configuration. Use sparingly.
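
The original command text was stripped from this post during formatting. As a hedged point of reference, a few commonly documented acidiag options on the APIC that map to descriptions like those above are shown below (the apic1 hostname is a placeholder; verify syntax against your ACI release):

    admin@apic1:~> acidiag avread      # appliance vector: cluster size, chassis IDs, replica health summary
    admin@apic1:~> acidiag fnvread     # fabric node vector: registered leaf/spine nodes and their state
    admin@apic1:~> acidiag verifyapic  # verifies APIC hardware and certificates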

SSL Troubleshooting
 - tries to connect ssl between APIC and Node and gives output of SSL information
 - shows DME logs for the node
 - shows policy-element logs for SSL connectivity
Can also check logs in the /var/log/dme/log directory

Switch Cert Verification
 - Next to PRINTABLESTRING, it will list Insieme or Cisco Manufacturing CA. Cisco means the new secure certs are installed; Insieme means the old unsecure certs are installed
- Shows start and end dates of certificate. Must be within range for APIC to accept
- Shows keypairs of specified cert

Switch Diagnostics
 - shows bootup tests and diagnostics of given module
 - shows ongoing tests of given module
 - shows diagnostic result of given module or all modules
 - shows diagnostic result of given test on given module
 - show debug information for the diagnostic modules
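
As a hedged illustration on the leaf/spine NX-OS-style CLI, standard GOLD diagnostic syntax looks like the following (the leaf101 prompt and module numbers are placeholders; exact keywords vary by platform and release):

    leaf101# show diagnostic result module all
    leaf101# show diagnostic result module 1 detail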

Debug Commands
 - shows debug output of given argument
 - enables/disables given argument on all modules
 - gets the interval of given argument
 - EPC mon statistics
 - EPC mon statistics
 - EOBC/EPC switch status (0: EOBC, 1: EPC)
 - SC card broadcom switch status

Insieme ELTM VRF, VLAN, Interface Commands
 - dumps ELTM trace to output file
 - dumps eltm trace to console
 - shows vrf table of given vrf
 - 
 - 
 - vrf summary, shows ID, pcTag, scope
 - shows vlan information. Can substitute (brief) for a vlan ID
 -

OSPF CLI Commands
 - shows OSPF neighbors of given vrf
 - shows OSPF routes of given vrf
 - shows ospf interfaces of given vrf
 - shows ospf information of given vrf
 - shows ospf traffic of given vrf
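
A hedged sketch of the vrf-scoped OSPF show commands these descriptions correspond to; the vrf name common:default is only a placeholder, and exact syntax may differ per release:

    leaf101# show ip ospf vrf common:default
    leaf101# show ip ospf neighbors vrf common:default
    leaf101# show ip ospf interface vrf common:default
    leaf101# show ip route ospf vrf common:default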

External Connectivity
 - shows arp entries for given vrf
 - shows ospf neighbors for given vrf
 - shows bgp sessions/peers for given vrf
 - shows ospf routes for given vrf
 - shows bgp unicast routes for given vrf
 - shows static routes for given vrf
 - shows routes for given vrf
 - shows external LPMs
 - shows next hops towards NorthStar ASIC or external router
 - HigigDstMapTable Indexed using DMOD/DPORT coming from T2. Provides a pointer to DstEncapTable. 
 - DstEncapTable Indexed using the HigigDstMapTable’s result. Gives tunnel forwarding data.
 - RwEncapTable Indexed using the HigigDstMapTable’s result. Gives tunnel encap data.
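
For the vrf-scoped commands described above, a hedged example of the general NX-OS-style syntax (vrf name and prompt are placeholders):

    leaf101# show ip arp vrf common:default
    leaf101# show bgp sessions vrf common:default
    leaf101# show bgp ipv4 unicast vrf common:default
    leaf101# show ip route vrf common:default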

ISIS Fabric Unicast Debugging
 - shows ISIS statistics
 - shows ISIS adjacencies for given vrf. Can also add detail
 - shows lldp neighbor status
 - shows interface status information and statistics
 - shows isis database, can also add detail
 - shows isis route information
 - shows isis traffic information
 - shows all discovered tunnel end points
 - shows isis statistics of given vrf
 - shows isis event history
 - shows isis memory statistics
 - provides isis tech-support output for TAC
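
In ACI the fabric ISIS instance runs in the overlay-1 vrf; a hedged example of the style of commands described above (verify against your release):

    leaf101# show isis adjacency vrf overlay-1
    leaf101# show isis adjacency vrf overlay-1 detail
    leaf101# show isis dteps vrf overlay-1
    leaf101# show lldp neighbors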

ASIC Platform Commands
 - shows the MAC port status
 - shows the MAC port counters
 - shows ASIC block counters for given ASIC. Can also add [detail] for more details
 - shows interrupts for given ASIC

ASIC Platform Commands - T2 Specific
 - shows receive counters for T2
 - shows transmit counters for T2
 - shows per port packet type counters
 - shows ingress drop counters
 - shows egress drop counters
&  - setting register to specific trigger. 9 registers per port (0-8)
    ex -   - sets 4th register to select RFILDR selector (bit 13)
 - checking the stats for above command

ASIC Platform Commands - NS Specific
 - shows port counters
 - shows internal port counters
 - shows vlan counters
 - shows per-tunnel counters
 - shows ASIC block counters
 - shows well-defined tables

Fabric Multicast - General
 - shows current state of FTAG, cost, root port, OIF list
 - shows GM-LSP database
 - shows GIPO routes, Local/transit, OIF list
 - shows topology and compute stats, MRIB update stats, Sync+Ack packet stats, Object store stats
 - shows isis multicast event history logs
 - more detailed than above command, specifically dealing with forwarding events and forwarding updates

Fabric Multicast Debugging - MFDM
 - flood/OMF/GIPi membership
 - per BD

 - GIPi membership
 - specific
 - per BD
 - specific per BD

 - flood membership
 - per BD

 - OMF membership
 - per BD

 - IPMC membership 
 - specific IPMC



Fabric Multicast Debugging - L2 Multicast
 - flood/OMF/GIPi membership
 - per BD

 - GIPi membership
 - specific
 - per BD
 - specific per BD

 - flood membership
 - per BD

 - MET membership
 - specific MET
 - flood MET
 - GIPi MET
 - per BD
 - specific per BD
 - IPMC membership
 - specific IPMC

Fabric Multicast Debugging - MRIB
 - shows IP multicast routing table for given vrf
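
A hedged example of the typical syntax (vrf name is a placeholder):

    leaf101# show ip mroute vrf common:default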

Fabric Multicast Debugging - MFIB
 - shows FTAGs
 - shows GIPO routes

Fabric Multicast Debugging - IGMP
 - shows multicast route information in IGMP
 - shows multicast router information IGMP
 - FD to BD vlan mapping. IGMP gets FD and G from Istack. It needs to know the BD to create (BD, G)
 - verify BD membership of a port in IGMP. Joins are processed only when ports are part of the BD
 - verify the tunnel to IF mapping in IGMP. IGMP uses this to get the groups on VPC and only sync them.
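
For reference, the standard NX-OS IGMP snooping views look like this (a hedged sketch; the ACI leaf equivalents may differ slightly):

    leaf101# show ip igmp snooping groups
    leaf101# show ip igmp snooping mrouter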

Fabric Multicast Debugging - MFDM
 - shows IPv4 multicast routing table for given vrf
 - Verify FD to BD vlan mapping. MFDM gets (FD, port) memberships from vlan_mgr and uses this information to create BD flood lists.
 - BD to GIPO mapping. GIPO is used by Mcast in Fabric
 - FD-vxlan to GIPO mapping
 - tunnel to phy mapping

Fabric Multicast Debugging - M2rib
 - shows multicast route information in M2rib
 - shows multicast route information in M2rib

Fabric Multicast Debugging - PIXM
 - RID to IPMC mapping. IFIDX is RID and LTL is IPMC

Fabric Multicast Debugging - VNTAG Mgr
 - IPMC to DVIF mapping. LTL is IPMC

EP Announce - Debugging







iBash CLI

 - show endpoint information
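
A hedged example of endpoint lookups from the leaf iBash CLI (the IP and MAC values are placeholders):

    leaf101# show endpoint
    leaf101# show endpoint ip 10.0.0.10
    leaf101# show endpoint mac 0050.5699.1a2b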

BCM Table Dump



Fabric QoS Debugging - CoPP CLI

 - CoPP statistics (red = dropped, green = allowed)
 - shows QoS classes configured
 - shows QoS classes/policies configured per vlan
 - shows ppf details
 - shows QoS classes configured in hardware
 - shows the QoS DSCP/dot1p policy configured for a vlan in HW
 - shows QoS DSCP/dot1p policy summary
 - shows QoS DSCP/dot1p policy in detail
 - shows T2 TCAM entries for specified group
 - shows QoS counters on each port
 - shows QoS counters on each port (internal)
 - shows QoS counters for each class for all ports

MCP CLI
 - shows the edge port config on the HIF (FEX) ports, the internal VLAN mapping and the STP TCN packet statistics received on the fabric ports
 - shows mcp information by interface
 - shows stats for all interfaces
 - shows mcp information per vlan
 - shows stats for all vlans
 - shows mcp information per msti region
 - shows stats for all msti regions

iTraceroute CLI
 - node traceroute
 - Tenant traceroute for vlan encapped source EP
 - Tenant traceroute for vxlan encapped source EP

ELAM Setup and debugging (follow commands in order)
 - starts ELAM on given ASIC
 - sets trigger for ELAM
 - sets source and destination mac addresses
 - Starts capture
 - shows capture status
 - shows report of the capture

VMM Troubleshooting
 - shows VM controllers and their attributes such as IP/hostname, state, model, serial number
 - shows hypervisor inventory of given VM controller



TOR Sync Troubleshooting







 - shows which VLANs are learn-disabled
 - shows which VLANs are learn-disabled
 - shows whether the timer is attached on the VLAN/vrf



OpFlex Debugging
 - shows if OpFlex is online (status = 12 means OpFlex is online, remoteIP is anycast IP, intra vlan is vlan used by VTEP, FTEP IP is the iLeaf's IP)
 - check if DPA is running

 - uplinks and vtep should be in forwarding state. PC-LTL of uplink port should be non-zero
 - Check port channel type
 - if port channel type is LACP, can use this command to see the individual uplink LACP state
 - verify if the VTEP received a valid DHCP IP address

SPAN Debugging


BPDU Debugging
 - shows if BPDU Guard/Filter is enabled or disabled
 - check if the bpdu-drop stats are incrementing on the uplinks/virtual ports

VEM Misc Commands
 - show channel status
 - check port status
 - check per EPG flood lists
 - check vLeaf multicast membership
 - show packet stats
 - show packet counters

 - debug vxlan packet path
 - debug vxlan packet path
 - show above logging output









FEX Troubleshooting
 - shows all FEXs and their states
 - gives detailed stats of given FEX
 - gives environmental stats of FEX
 
 - shows FEX version
 - shows FEX fabric interface information
 - shows logging information for FEX
 - shows transceiver information for FEX
 - show FEX reset reason
 - shows FEX module information
 - shows debugging information; you can grep the output to find what you want
 - use this to find out which service is failing in the sequence so you can debug that process further
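
A hedged sketch of the common NX-OS FEX views referenced above (FEX ID 101 and the prompt are placeholders):

    leaf101# show fex
    leaf101# show fex 101 detail
    leaf101# show interface fex-fabric
    leaf101# show environment fex all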



Monday, February 9, 2015

Cisco offers ACI alternative for Nexus 9000 switches



Cisco is adding a new control plane capability to its Nexus 9000 switches for customers not yet opting for or needing a full-blown application policy infrastructure.

Cisco’s BGP Control Plane for VXLAN is designed to appeal to operators of multitenant clouds looking for familiar BGP routing protocol features with which to scale their networks and make them more flexible for the demands of cloud networking. VXLAN, which scales Layer 2 segmentation to 16 million network segments, does not specify a control plane and relies on a flood-and-learn mechanism for host and endpoint discovery, which can limit scalability, Cisco says.

BGP Control Plane for VXLAN can also serve as an alternative to Cisco’s Application Centric Infrastructure (ACI) control plane for the Nexus 9000s. The ACI fabric is based on VXLAN routing and an application policy controller called Application Policy Infrastructure Controller (APIC).

“This is definitely an alternative deployment model,” said Michael Cohen, director of product management in Cisco’s Insieme Networks Business Unit. “It’s a lighter weight (ACI) and some customers will just use this.”

BGP Control Plane for VXLAN runs on the standalone mode versions of the Nexus 9000, which requires a software upgrade to operate in ACI mode.

Cohen sidestepped questions on whether Cisco would now offer another controller just for the BGP Control Plane for VXLAN environments in addition to the ACI APIC and APIC Enterprise Module controllers it now offers.

Cisco says BGP Control Plane for VXLAN will appeal to customers who do not want to deploy multicast routing or who have scalability concerns related to flooding. It removes the need for multicast flood-and-learn to enable VXLAN tunnel overlays for network virtualization.

The new control plane uses the Ethernet virtual private network (EVPN) address-family extension of Multiprotocol BGP to distribute overlay reachability information. EVPN is a Layer 2 VPN technology that uses BGP as a control-plane for MAC address signaling / learning and VPN endpoint discovery.

The EVPN address family carries both Layer 2 and 3 reachability information, which allows users to build either bridged overlays or routed overlays. While bridged overlays might be simpler to deploy, routed ones are easier to scale out, Cisco says.
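
A minimal sketch of what enabling the BGP-EVPN control plane looks like on NX-OS, assuming a hypothetical AS number, neighbor address, and VNI (illustrative only, not Cisco's reference configuration):

    feature bgp
    feature nv overlay
    nv overlay evpn

    router bgp 65000
      neighbor 10.1.1.1 remote-as 65000
        address-family l2vpn evpn
          send-community extended

    evpn
      vni 10000 l2
        rd auto
        route-target import auto
        route-target export auto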

BGP authentication and security constructs provide more secure multitenancy, Cisco says, and BGP policy constructs can enhance scalability by constraining route updates where they are not needed.

The BGP Control Plane for VXLAN now allows the Cisco Nexus 9300 and 9500 switches to support VXLAN with both multicast flood-and-learn and the BGP-EVPN control plane. Cisco says this dual capability allows resiliency in connectivity for servers attached to access or leaf switches with efficient utilization of available bandwidth.

The 9300 leaf switch can also route VXLAN overlay traffic through a custom Cisco ASIC, which the company touts as a benefit over Broadcom Trident II-based platforms from competitors – like Arista. VXLAN routing at the leaf allows customers to bring their boundary between Layer 2 and 3 overlays down to the leaf/access layer, which Cisco says facilitates a more scalable design, contains network failures, enables transparent mobility, and offers better abstract connectivity and policy.

Cisco says BGP Control Plane for VXLAN works with platforms that are consistent with the IETF draft for EVPN. Several vendors, including Juniper and Alcatel-Lucent, have implemented or have plans to implement EVPN in network virtualization offerings. AT&T and Verizon are co-authors of some of the IETF drafts on this capability.

BGP Control Plane for VXLAN is available now on the Nexus 9300 and 9500 switches. It will be available on the Cisco Nexus 7000 switches and ASR 9000 routers in the second quarter.



Friday, November 7, 2014

[Insieme and Cisco ACI] Cisco Nexus 9000 Part 2 – Programmability




Introduction to Application-Centric Infrastructure

In the last post, we discussed the hardware being announced from Cisco’s Insieme spin-in.  While the hardware that comprises the new Nexus 9000 series is certainly interesting, it wouldn’t mean nearly as much without some kind of integration at the application level.

Traditionally, Cisco networking has been relatively inaccessible to developers or even infrastructure folks looking to automate provisioning or configuration tasks. It looks like the release of ACI and the Nexus 9000 switch line is aiming to change that.

The Nexus 9000 family of switches will operate in one of two modes:

NXOS Mode – If you’ve worked with Cisco’s DC switches like the Nexus 7K or 5K, this should be very familiar to you. In this mode, you essentially have a 10GbE or 40GbE switch, with the features that are baked into that platform.

In NXOS Mode, all of the additional custom ASICs that are present on the switch fabric are used primarily for enhancing the functionality of the merchant silicon platform, such as increasing buffer space, etc.

ACI Mode – This is a completely different mode of operation for the Nexus 9000 switch. In this mode, the switch participates in a leaf-spine based architecture that is purely driven by application policy. It is in this mode that we are able to define application relationships, and imprint them onto the fabric.

ACI is meant to provide that translation service between apps and the network. I was very happy to see this video posted on release day, as Joe does a great job at explaining the reasons for the product that he and the rest of the folks at Insieme have created:





This Nexus 9000 product family was built from the ground up, not just to be a cheap 10/40GbE platform, but also to be a custom fit for the idea of ACI. In this post, we’ll discuss this ACI engine (called the Application Policy Infrastructure Controller or APIC), as well as the direction that Cisco’s going from a perspective of programmability.

Programmability in NXOS Mode

Before I get too deep into ACI, I do want to call out some of the programmability features present in the Nexus 9000 series even without ACI (since NXOS mode is all we’ll be able to play with initially). The fact that the list below is so long, even in a mode that really only requires you to purchase the hardware and little else, is impressive, and certainly a refreshing turn for the better from a Cisco perspective.

The folks over at Insieme have gone to great lengths to enable the Nexus 9000 switches with some great programmable interfaces, which is a huge move forward for the Nexus family, and for Cisco products in general, frankly. Here are some of the ways you’ll be able to interact with a Nexus 9000 switch even in the absence of ACI:


  • Power On Auto Provisioning (PoAP)
  • OpenStack Plugin
  • OnePK Capable
  • Puppet/Chef
  • Python Scripting
  • Linux container
  • Many NXOS commands are available via XML and JSON (NX-API; see the sketch after this list)
  • XMPP integration
  • OpenFlow
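
A hedged sketch of the XML/JSON access mentioned above: NX-API is enabled with a feature command and then accepts CLI commands over HTTP/HTTPS. The hostname, credentials, and payload below are placeholders, and the exact payload format depends on the NX-OS release:

    switch(config)# feature nxapi

    $ curl -u admin:password http://switch/ins \
        -H "Content-Type: application/json-rpc" \
        -d '[{"jsonrpc":"2.0","method":"cli","params":{"cmd":"show version","version":1},"id":1}]'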


This is a stark contrast to what we were given in the past. We’ve been able to do really cool stuff with interfaces like these on competing platforms like Juniper and Arista, but not with a Cisco switch. So – this is very good news. Again, these interfaces do not require ACI – though obviously ACI is needed if you want to administer an entire fabric of switches as one.

By the way, the long-term vision is to move all of these features into every NXOS device, which is very good news for those that recently bought a lot of existing Nexus 5000/7000 gear, for instance.


Application-Centric Infrastructure

ACI is all about policy-based fabric automation. When we’re talking about network agility, and creating an environment where the application and developer teams are able to consume network services without having to understand how the network works, this policy abstraction is crucial.

ACI extends the same concept we saw with UCS Service Profiles by using Application Network Profiles. These contain network-related attributes about an application. They can define things like:


  • Application Tiers
  • Connectivity policies
  • L4-L7 services
  • XML/JSON schema


When you’re working with a fabric composed of Nexus 9000 hardware, ACI becomes the layer that sits on top and enables network orchestration, and integration with other tools like OpenStack, etc.

Policy Concepts

Application policies will be configured in terms that make sense to application folks, abstracting away all of the network-specific nerd knobs for those with that skillset. The main focus is to remove the complexity from the network to allow for true network automation.

NOTE: It should be noted that there is still a need for traditional network engineers to understand the fabric design underneath, which we’ll get into in a bit.

The main task is to define your application profiles to essentially describe their respective application’s impact on the network. Once done, policies can be configured by the same application folks to define relationships between these profiles.




Note that zero networking knowledge is needed for this. As an application developer or owner, you’re configuring relationships using verbs like “consume” or “register” or “store”. Those are the words you’re used to using. That’s the idea here – abstract the networking nerd knobs away and let the network engineers maintain them.

All of this is possible through the Application Policy Infrastructure Controller, or APIC. This is the policy server for ACI.  In order to create an ACI fabric, you connect the APIC into a leaf switch. As soon as you plug this in, it discovers the entire topology in an automated fashion. The APIC is responsible for taking created policies and imprinting them onto the network.

ACI is being positioned to act as a translator between the application teams and network teams. It allows for a plug-and-play semantic of network elements, letting policy groups pick and choose from a menu of network structures that they want to utilize (QoS, FW, etc.). This is nothing new for VMware admins who have worked with port groups on a vSwitch or anyone familiar with vNIC or vHBA templates in Cisco UCS – except that with solutions like Cisco ACI or VMware NSX, the connectivity options offered behind the scenes are much richer.

These attributes can be tied to network connectivity policies so that they’re abstracted from the application teams, who end up selecting these policies from a dropdown or similar.

ACI Fabric Design

Now – for all of you network folks, let’s talk about how this fabric works behind the scenes.

The typical ACI fabric will be designed in a traditional leaf-spine architecture, with 9500s serving as the spine, and 9300s serving as the leaf switches.

The workloads and policy services connect directly into the leaf switches of the fabric. This can be baremetal workloads, or hypervisors.




An ACI fabric operates as a L3 routed leaf-spine fabric with VXLAN overlay. There is no TRILL or FabricPath anywhere – so redundancy is accomplished via L3 ECMP.

There are a few attributes that will be used from day 1 to offer application identification:


  • L3 addressing
  • L4 addressing
  • Tagging (VXLAN, NVGRE, VLAN)
  • Virtual or Physical Port


Each ingress port on the fabric is its own classification domain – so using the traditional VLAN model may not be a bad idea, since VLAN 10 on port 1 means something completely different than VLAN 10 on port 2. However, all IP gateways can also be made available across the fabric on any leaf. This means a lot when it comes to workload mobility.

NOTE: Routing integration can also take place between the ACI fabric and an edge router using iBGP, OSPF, or static routing.

Every port on the 9000 fabric is a native hardware VXLAN, NVGRE, and IP gateway. As you can imagine, this means that our flexibility in tagging mechanisms outside the fabric is a lot better. We can really just use whatever we want, and coordinate the tags that are being used via policy. The fabric will rewrite as necessary – allowing a Hyper-V host and an ESXi host to talk using both of their native tagging mechanisms at line rate.





Because classification occurs at the edge, each leaf essentially serves as a gateway for any tagging mechanism. This function is able to translate between VXLAN, VLAN, NVGRE, etc. – all at line rate in hardware. This also means that integration of these application policies between virtual and physical workloads is seamless.

This obviously requires some overall orchestration between the hypervisor and the fabric, because there’s only a small number of attributes usable for this classification. You’d have to reach into the API of – say vSphere – and figure out what tags are being used for what virtual machines, then make sure you use those tags appropriately as they enter the fabric. Eventually, other attributes could include:


  • DNS
  • DHCP
  • VM Attributes
  • LDAP information
  • 3rd party tools


These, as well as others, could all potentially be used for further granularity when identifying application endpoints. Time will tell which ones become most urgent and which ones Cisco adopts. Personally, I’ve had a ton of ideas regarding this classification and will be following up with a post.

Latency and Load Balancing

In normal networks, any time you’re talking about multiple paths, whether it’s a L3 ECMP with Cisco IP CEF or similar, or simple port channels, it’s typically per-flow load balancing, not per-packet. So as long as traffic is going to the same IP address from the same IP address, that traffic is only going to utilize a single link, no matter how many are in the “bundle”.

This is to limit the likelihood of packets being delivered in the wrong order. While TCP was built to handle this event, it is still stressful on the receiving end to dedicate CPU cycles to put packets back in the right order before being delivered to the application. So – by making sure traffic in a single flow always goes over a single link, it forces the packets to stay in order.

The ALE ASIC is able to do something extremely nifty to help get around this. It uses timers and short “hello” messages to determine the exact latency of each link in the fabric.







This is HUGE because it means that you know the exact latency for each link all the time. This allows you to do true link balancing, because you don’t have to do tricks to get packets to arrive in the right order – you simply time it so that they do. As a result, our entire fabric can use each link to its full potential.

This is a big argument for using a hardware solution because the fabric can make the best decisions about where to place the traffic without affecting, or requiring input from the overlay.

Security

An ACI fabric can operate in a whitelist model (default) or a blacklist model. Since the idea is to enable things from an application perspective, then in order to get connectivity you must first set up application profiles and create the policies that allow them to talk. Or, you could change to a blacklist model, denying the traffic you don’t want to exist.

There are a handful of benefits here – first off, ACLs aren’t really needed and wouldn’t be configured on a per-switch basis. This “firewall-esque” functionality simply extends across the entire fabric. Second, it helps prevent the issue of old firewall rules sticking around long after they become irrelevant.

So, the security configuration is always up to date and allows only the traffic that needs to get through, because these rules are configured in application profiles, not in an ACL that’s only updated when things aren’t working.

Software and Programmability

Application Profiler is a custom tool built by Cisco to sweep the network to find applications and their configurations so that a Fabric network can be planned accordingly.

Simulators will be available along with the full API documentation but not necessarily a script library, such as what is provided for UCS (PowerTool/PowerShell). This could be developed on day one using the XML/JSON API being provided.

The list of mechanisms by which we’ll be able to programmatically interact with ACI is quite extensive (a good thing):

  • Scripting: Puppet/Chef, Python, XMPP
  • Monitoring: Tivoli, CA, Netscout, NetQoS, Splunk
  • Orchestration: CA, BMC, Cisco, OpenStack
  • NXOS Northbound APIs
  • PoAP (Power On Auto Provisioning) uses PXE boot
  • OpenStack plugin will be released. Support for Grizzly
  • L2 support for OpenStack plug-in, but not sure about L3
  • UCS Director (Cloupia) will work with compute, storage (NetApp), and now Nexus


[Insieme and Cisco ACI] Cisco Nexus 9000 Part 1 – Hardware



Nexus 9000 Overview

From a hardware perspective, the Nexus 9000 series seems to be a very competitively priced 40GbE switch. As (I think) everyone expected, the basic operation of the switch is to serve up a L3 fabric, using VXLAN as a foundation for overlay networks. The Nexus 9000 family will run in one of two modes: Standalone (or NXOS) mode, or ACI mode. In ACI, we get all of the advantages of a VXLAN-based fabric, with application intelligence built in to provide abstraction, automation and profile based deployment.

Some NXOS mode products are shipping now, more to follow beginning of next year.  ACI will be available by the end of Q1 next year, so I’ll talk about ACI in the next post. For now, assume that the hardware I’m going to talk about is with Standalone mode in mind, meaning this is the hardware the early adopters will be able to get their hands on. I will also write about ACI-compatible hardware that Cisco’s announcing in the second half of this post.

Nexus 9500

The Nexus 9000 series will start off with an 8-slot chassis, the Nexus 9508. The 4-slot and 16-slot models will be released later. The Nexus 9508 is 13RU high, meaning each rack could hold up to 3 chassis. This will be a modular chassis with no mid-plane, for complete front-to-back airflow across each line card.




Some hardware specs:


  • 4x 80 Plus Platinum PSUs (same as used in the UCS fabric interconnects)
  • 3 blanking plates in the front allow for future PSUs (most likely used for 100G)
  • 3 fan trays (9 fans) in rear
  • 6 fabric modules (behind fan tray)


The initial line card with the Nexus 9508 will serve as a 40GbE aggregation line card (36 QSFP+ ports). The linecard was built to support 100GbE ASICs in the future with similar port density.

Nexus 9300

The Nexus 9300 is being positioned either as a small, collapsed access/aggregation switch with no core feature set, or as a ToR switch to be uplinked to a spine of 9500s.




It is initially provided in two models:


  • Nexus 9396PQ – a 48 port non-blocking 10GbE switch. (2RU)
  • Nexus 93128TX – a 96 port 1 or 10 GbE switch (with a 3:1 oversubscription ratio on the latter) (3RU)


Both models include a Generic Expansion Module (GEM). This is a 12 port 40GbE QSFP+ module used to uplink to the 9500 spine when running in ACI mode, or to any other 40GbE device if running in NXOS mode. Only 8 of these ports are used on the 93128TX. This module also provides an additional 40MB buffer, as well as full VXLAN bridging and routing. As will be detailed later, the custom ASIC developed by Insieme (called the Application Leaf Engine, or ALE) provides connectivity between ACI nodes, so the QSFP ports in the GEM module connect directly to the ALE ASIC, while the other ports use the Trident II.

BiDi Optics

The Nexus 9000 hardware comes with some unique optics – they are purpose-built bidirectional optics that were created to provide an easy migration path from 10GbE to 40GbE. These optics will allow customers to re-use their existing 10G multimode fiber cabling to move to 40GbE. This sounds like magic, but it’s actually quite simple. These BiDi optics are still a QSFP form factor, and even use the traditional LC connector that your existing 10GbE cable runs likely use.

These QSFP optics can be installed in any QSFP port, not just N9K, and not even just Cisco. This works by taking 8 lanes from the ALE ASIC and multiplexing them to 2 wavelengths at 20Gbit/s each (each is bidirectional) within the optic. Yes – you guessed it, WDM right in the optic.

This appears to be an offering by Cisco designed to move to 40GbE (a must for the ACI architecture) without drastic changes to existing cable plants. From what I’ve been told, you can even connect the ACI fabric to another non-Cisco switch in this manner. Obviously because of WDM directly in the optic, there must be one of these BiDi optics on both ends, but that should be it. These optics will support 100m on OM3 and 125m+ on OM4.

Chipsets and Forwarding Behavior

The Broadcom Trident II will do the vast majority of the work, such as unicast forwarding, but also including VXLAN tagging and rewrite to/from NVGRE and 802.1Q (more on this in Part 2). Insieme’s own ALE ASIC is specifically designed to provide advanced functionality, including ACI readiness. The ALE is still used in standalone mode to add an additional 40 MB of buffer (the T2 only has 12 MB).

The Trident II ASIC handles all unicast forwarding.  Insieme’s ASIC (ALE) provides additional buffer to the T2, as well as offering VXLAN offloading.  There is no direct path between Trident II ASICs, even on the same line card. Packets are sent to the Fabric Module if the ingress and egress ports are managed by separate ASICs. Fabric Modules have their own CPU and act as separate forwarding engines.

For L3 LPM lookups, the ingress line card forwards the packet to the Fabric Module (configured as the default route), which contains the LPM lookup table and forwards to the line card with the destination port. V1.0 of VXLAN is based on multicast, but the upcoming version will utilize a centralized VXLAN control plane. The new VXLAN control plane will be very similar to how LISP’s control plane works, but VXLAN will retain the original full Ethernet header of the packet.

ACI Mode

As mentioned before, the second mode that the Nexus 9000 series operates in, is ACI mode. This mode allows for enhanced programmability across a complete fabric of Nexus 9000 switches. With ACI as the SDN solution on top, the fabric acts like one big switch – forwarding traffic using a myriad of policies that you can configure. We’ll be talking about these nerd knobs in the second post, but first, let’s look at the hardware that will make this possible – slated for release sometime in Q2 2014.


  • 1/10G Access & 10/40G Aggregation (ACI)
    • 48 1/10G-T & 4 40G QSFP+ (non blocking) – meant to replace end-of-rack 6500’s
    • 36 40G QSFP+ (1.5:1 oversubscribed) – used as a leaf switch, think end of rack
  • 40G Fabric Spine (ACI)
    • 36 40G QSFP+ for Spine deployments (non blocking, ACI only)
    • 1,152 10G ports per switch
  • 36 spine ports x 8 line cards = 288 leaf switches per spine
  • Leaf switches require 40G links to the spine

The line cards that support ACI will not be released until next year.

Spine line cards

  • 36x 40G ports per line card and no blocking

Supervisor Modules



  • Redundant half-width supervisor engine
  • Common for 4, 8 and 16 slot chassis (9504, 9508, and 9516)
  • Sandy bridge quad core 1.8 GHz
  • 16GB RAM
  • 64GB SSD


System controllers


  • Offloads supervisor from switch device management tasks
  • Increased system resilience & scale
  • Dual core ARM 1.3GHz
  • EoBC switch between Sups and line cards
  • Power supplies via SMB (system management bus)


Fabric Extenders


  • Supports 2248TP, 2248TP-E, 2232PP-10G, 2232TM-10G , B22-HP, B22-Dell




