Network engineers and administrators are at a crossroads. On one hand, protocols like BGP, IS-IS and MPLS still play a critical role in networks, so it's vital we maintain our base knowledge of traditional networking. On the other hand, software-defined networking is here to stay, so new skills like network programmability present a slew of new technologies to conquer. This article is designed to help network professionals by diving into a potentially cryptic component of SDN and network programmability: application programming interfaces (APIs).
A network engineer's view of APIs
Simply put, an API is an interface presented by software (such as a network operating system) that provides the capability to collect information from or make a change to an underlying set of resources. The inner workings of software may seem alien, but the concept is quite familiar to us networking pros. Let's use the Simple Network Management Protocol (SNMP) as an analogy. SNMP provides the means to request or retrieve data like interface statistics from forwarding elements. SNMP also allows applications to write configuration to networking devices. Although that's not a common use case for SNMP, it's helpful to keep in mind because APIs provide this same functionality for a wider array of software applications.
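To make the analogy concrete, here is a minimal sketch of both retrieval patterns in Python. The device address, SNMP community string, and REST URL are all assumptions for illustration, not references to any particular product:

```python
# Illustrative only: the address, community string and URL are assumptions.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)
import requests

# SNMP: read ifInOctets for interface index 1 from a forwarding element.
error_indication, error_status, error_index, var_binds = next(getCmd(
    SnmpEngine(), CommunityData('public'),
    UdpTransportTarget(('10.0.0.1', 161)), ContextData(),
    ObjectType(ObjectIdentity('1.3.6.1.2.1.2.2.1.10.1'))))

# REST API: the same idea, but speaking HTTP to a software endpoint.
response = requests.get('http://10.0.0.1:8080/stats/port/1')

print(var_binds, response.json())
```

In both cases we are asking software to hand back a view of an underlying resource; only the transport and encoding differ.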
APIs in the context of SDN
The definition of an API may not, on its own, be overly useful for a network engineer, so let's take that definition and examine it in the context of SDN. In an open SDN model, a common interface discussed is the northbound interface (NBI). The NBI is the interface between software applications, such as operational support systems, and a centralized SDN controller.
Getting hands-on with SDN and REST APIs
Since many engineers learn new technologies best by getting direct experience with them, let's look at some next steps to get our hands on some REST APIs. The following three steps outline how to get up and running.
First, if you have no previous programming experience, acquire a tool to generate REST API calls. The Chrome browser, for example, has multiple plug-ins to generate REST API messages. These include Postman and the Advanced REST Client. Firefox has the RESTClient add-on for the same functionality. For those more comfortable with a command-line interface, the curl utility may also be used.
Second, get access to an SDN controller or a controller-like platform that supports REST APIs. Ryu and ONOS are open source options that fit the bill. For those wanting to align to a particular vendor, there are options such as NEC's ProgrammableFlow Controller or Juniper's OpenContrail.
Lastly, dig up the relevant REST API documentation for the SDN controller or platform, such as those for the Ryu controller or for OpenContrail. Although the formatting of the documentation varies, look for the following items: the URI string for the requested resource, the HTTP method (e.g., GET, POST, PUT, DELETE), and the JSON/XML payload and/or parameters. Both the Ryu and OpenContrail documentation provide examples illustrating how to send a valid REST API message.
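As a sketch of what such a documented call looks like in practice, the Python snippet below queries Ryu's REST API for the list of connected switches. The /stats/switches endpoint appears in Ryu's ofctl_rest documentation; the host and port are assumptions about a local lab setup:

```python
import requests

# Assumption: Ryu running locally with the ofctl_rest app loaded,
# listening on its usual port 8080.
BASE = 'http://127.0.0.1:8080'

# HTTP method: GET, URI: /stats/switches, no payload required.
resp = requests.get(BASE + '/stats/switches')
resp.raise_for_status()

# The response body is a JSON list of datapath IDs for connected switches.
print(resp.json())
```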
To sum it up: we set out to demystify APIs in the context of software-defined networks. Why? Because SDN presents new technologies and novel ways for networking professionals to think about and solve networking challenges.
My recommendation was basically for network engineers to be open to change and to start broadening their horizons. DevOps is coming to networking, and that is a FACT. You might be wondering what skills a Network DevOps Engineer needs; here I attempt to answer that.
It's still about NETWORKING
I'm going to state this upfront: you need to be good at networking for any of the other skills here to be useful. Continue along vendor certification tracks, follow the IETF, join NANOG, experiment with new technologies. This is all invaluable.
Software Engineering Fundamentals
A lot of the DevOps skills have roots in software engineering. Being a Network Guy™, this may seem like a bit of a paradigm shift, but here's something cool. Would you believe that some of these software engineering concepts have more to do with engineering best practice than with software, and are in fact relevant to the work you are doing today? Also, your SysAdmin buddies already know this and started their DevOps pilgrimage a while ago.
Unit/Functional/Integration Testing, Version Control, Agile, Test-Driven Development (TDD) and Behaviour Driven Development (BDD) are all things that you could benefit from today.
Fortunately, there is an easy way to pick these skills up. The folks over at Software Carpentry have put together a set of tutorials to help research scientists get to grips with Python and supporting tools. The lessons are put together in such a way that they are easy to understand for mere mortals (unlike a lot of CS textbooks/lectures).
Know your *nix
An understanding of Linux is going to stand you in good stead in the transition from NetOps to DevOps. As much as people like to talk about "Death of the CLI" they don't realise how much time Developers and SysAdmins spend in the Terminal. Whether this be checking in code with git, extracting information from Open vSwitch or using the OpenStack CLI clients you will likely spend a lot of time in the terminal too. Learning how to be productive here is essential and a good understanding of Linux will help when troubleshooting complex issues.
LPIC
There are vendor-neutral *nix certifications which are worth a look, like LPIC-1. While I haven't gone through this myself, I have read some LPIC study materials and found them infinitely useful. If you want a vendor certification, Red Hat have certifications available too.
Have some fun
Learning Linux doesn't have to be boring. I prefer a more practical approach so you may find attempting one of the following a nice project to hone your Linux-Fu:
Replace your ESXi Lab with KVM, Libvirt and Open vSwitch
Write command aliases to save yourself some typing
Learn vim, and make yourself a .vimrc
Learn some Python
I'm biased towards Python, but I feel it's the most approachable Programming Language for Network Engineers to pick up.
It has an "Interactive Interpreter" which is a lot like a CLI and lets you enter statements to see what happens
It can be used for basic scripting or beautifully designed object-oriented software, but it doesn't force you to do things one way or another
There is a rich ecosystem of libraries that simplify doing everyday tasks
It's being embedded in Network Devices AND network vendors are providing Python libraries for their software.
You don't need to know much Python to start getting real value. Think of how many things you could automate! People joke about automation not saving time (as it takes time to automate), but during that time you are getting a better understanding of Python, so it's not a total loss. Whether it's your weekly report, Mining the Social Web or something more network-centric, undertaking a Python project will be really worthwhile... and if you can, host the result up on GitHub.
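As a trivial, entirely hypothetical example of that kind of early win, a few lines of standard-library Python turn a manual reachability check into a CSV report you could run weekly (the host list and file name below are made up):

```python
# Hypothetical example: hosts and output file are placeholders.
import csv
import subprocess

hosts = ['10.0.0.1', '10.0.0.2', '10.0.0.3']

with open('weekly_report.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['host', 'reachable'])
    for host in hosts:
        # One ICMP echo with a short timeout; returncode 0 means a reply.
        result = subprocess.run(['ping', '-c', '1', '-W', '1', host],
                                capture_output=True)
        writer.writerow([host, result.returncode == 0])
```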
There are hundreds of good tutorials online, but if you are just getting started I would recommend Codecademy.
Get your head around "Infrastructure as Code"
"Infrastructure as Code" is the battle cry of DevOps. To really understand what this is about and to get a handle on the what/why/how for Networking I'd recommend that you spend some time with:
Chef
Puppet
Run through the tutorials, bootstrap a server with Chef, use Puppet to deploy a LAMP server and, if you are feeling brave, write a Chef Cookbook/Puppet Manifest. I couldn't mention this and not mention the awesome work being done on the Netdev library for Puppet and Chef.
What about some SDN?
You could take a course on Coursera, but why not get some practical experience? Download OpenDaylight and follow one of Brent Salisbury's awesome tutorials. You can simulate massive networks using Mininet and have some fun pushing paths using the REST API or experimenting with OpenStack integration. The nice thing about the OpenStack integration piece is that it requires you to get DevStack working, which is not easy, and it gives you some OpenStack experience along the way.
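Once Mininet is up, pushing a path boils down to POSTing JSON at the controller. The sketch below is deliberately generic: the URL and payload shape are illustrative, not OpenDaylight's actual flow-programming API, so check your controller's REST documentation for the real endpoint:

```python
import json
import requests

# Illustrative endpoint and payload; consult your controller's API docs.
url = 'http://127.0.0.1:8080/flows/add'
flow = {
    'dpid': 1,                                   # switch to program
    'match': {'in_port': 1},                     # traffic entering port 1...
    'actions': [{'type': 'OUTPUT', 'port': 2}],  # ...is sent out port 2
}

resp = requests.post(url, data=json.dumps(flow),
                     headers={'Content-Type': 'application/json'})
print(resp.status_code)
```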
Conclusion
Looking into my crystal ball, I predict that the Network DevOps engineer will need:
Strong Networking Skills
Knowledge of Linux System Administration
Experience with Puppet/Chef/Ansible/CFEngine/SaltStack would be desirable
Scripting skills in Bash, PHP, Ruby or Python
Ability to work under Source Control (git)
Experience in consuming (REST) APIs
Experience with OpenStack and OpenStack Networking
NETCONF is defined in RFC 6241 which describes it as follows:
The Network Configuration Protocol (NETCONF) defined in this document provides mechanisms to install, manipulate, and delete the configuration of network devices. It uses an Extensible Markup Language (XML)-based data encoding for the configuration data as well as the protocol messages. The NETCONF protocol operations are realized as remote procedure calls (RPCs).
It's not a new technology, as work started on this approximately 10 years ago, but what it gives us is an extensible and robust mechanism for managing network devices.
NETCONF understands the difference between configuration data and state data. As somebody who has been bitten by trying to perform a create operation and hitting validation issues because I'd mistakenly sent (or worse, edited) a read-only field in a request, I feel this is really valuable.
Another great thing from an operations perspective is the ability to test/validate configuration before it's applied to the device. NETCONF allows you to set a test-option for an edit-config operation that will either test only, or test and then set, the configuration.
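For example, with the Python ncclient library (a widely used open source NETCONF client), the test-option is exposed as an argument to edit-config. The device address, credentials, and empty payload below are placeholders:

```python
# Placeholder device details; the payload would carry real config,
# modeled in the device's YANG.
from ncclient import manager

config = """
<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <!-- configuration payload goes here -->
</config>
"""

with manager.connect(host='10.0.0.1', port=830, username='admin',
                     password='admin', hostkey_verify=False) as m:
    # 'test-then-set' validates the change and applies it only if valid;
    # 'test-only' validates without ever applying.
    m.edit_config(target='candidate', config=config,
                  test_option='test-then-set')
```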
Being XML-based, we can also validate our NETCONF against an XML Schema Document (XSD).
NETCONF supports devices with multiple configuration datastores, e.g. running/startup or candidate/running/startup.
Furthermore, we can also subscribe to notifications or perform other Remote Procedure Calls (RPCs) using NETCONF.
What is YANG?
YANG is defined in RFC 6020 which describes it as follows:
YANG is a data modeling language used to model configuration and state data manipulated by the Network Configuration Protocol (NETCONF), NETCONF remote procedure calls, and NETCONF notifications.
I am going to make a bold assertion here
Machines love XML
Humans do not love XML.
-- Dave Tucker
Unfortunately it's humans that write standards, and standards dictate data models. Therefore it's in our interest to have a modeling language that people unfamiliar with XML can use and this is where YANG really shines.
YANG is hierarchical (like XML) and supports all of the niceties of a programming language, like re-usable types and groupings and, more importantly, extensibility. It has a powerful feature called "augmentation" that allows you to extend an existing tree with additional information. As it's designed for NETCONF, it allows you to model NETCONF-specific items like additional RPCs and the contents of notifications.
YANG is supported by some awesome open source tooling like pyang.
NETCONF <3 YANG
NETCONF is XML-based, which means that somebody (your network vendor) needs to model their configuration structure appropriately (unless they cheat and use a CLI format). YANG is the perfect way to do this, and it also acts as good user documentation when parsed through pyang.
Consider the following YANG snippet (a minimal example; the leaf types shown are illustrative):
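```yang
list interface {
    key "interface-name";

    leaf interface-name {
        type string;
    }
    leaf speed {
        type enumeration {
            enum 10;
            enum 100;
            enum 1000;
        }
    }
    leaf duplex {
        type enumeration {
            enum half;
            enum full;
        }
    }
}
```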
It doesn't take too much brain power to work out that this is a list of interfaces, the unique key is interface-name, and each interface has a speed and duplex. The accompanying XML for one such interface would then be (interface name and values illustrative):
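```xml
<interface>
    <interface-name>GigabitEthernet0/1</interface-name>
    <speed>1000</speed>
    <duplex>full</duplex>
</interface>
```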
My hope for NETCONF and YANG is that the IETF and other SDOs standardize as many data models as they can. That way, we can have a standard set of models that can be used for true multi-vendor network management. We don't want hundreds of proprietary MIB files, and I hope that the ease of modeling in YANG will encourage this.
So what has this got to do with SDN?
Even in SDN, we still have persistent state on network devices. OpenFlow doesn't automatically configure itself, which is why OF-Config, which uses NETCONF, was developed. Open vSwitch, the de facto standard for virtual switching, uses the Open vSwitch Database Management Protocol (OVSDB), defined in the informational RFC 7047, which uses JSON-RPC.
Where I see NETCONF adding value is that we have a single protocol for managing configuration for both the traditional and the software-defined network. I don't want to get into an Open Source vs. Open Standards debate, but where interoperability is concerned, open standards are essential, and having a standard set of YANG models would be advantageous.
It also has one other benefit, enabled by RESTCONF: device-level northbound API standardization.
What is RESTCONF, you say? RESTCONF is currently an IETF draft, which describes it as follows:
This document describes a REST-like protocol that provides a programmatic interface over HTTP for accessing data defined in YANG, using the datastores defined in NETCONF.
Now, device-level NBIs aren't exactly SDN in my book, but they are pretty useful to Network DevOps. What RESTCONF does is enable simple YANG models to be accessed over HTTP in a RESTful-ish style.
Why is this so awesome?
NETCONF is really powerful, but it's a little cumbersome for small tasks. RESTCONF is the more nimble cousin, which allows people who are already well-versed in REST APIs to perform small tasks without needing to learn an entirely new skill set. That's a real win for DevOps in my book.
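To illustrate just how lightweight this is, here is a sketch of fetching one interface from the earlier model over RESTCONF. The URL layout and media type follow the RESTCONF draft's conventions, but the module and key names here are assumptions, so check your device's documentation:

```python
import requests

# Illustrative RESTCONF request: address, credentials and data path
# are assumptions tied to the interface model sketched above.
url = ('http://10.0.0.1/restconf/data/'
       'interfaces/interface=GigabitEthernet0%2F1')  # '/' in key is escaped

resp = requests.get(url, auth=('admin', 'admin'),
                    headers={'Accept': 'application/yang-data+json'})
print(resp.json())
```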
Introduction to Application-Centric Infrastructure
In the last post, we discussed the hardware being announced from Cisco's Insieme spin-in. While the hardware that makes up the new Nexus 9000 series is certainly interesting, it wouldn't mean nearly as much without some kind of integration at the application level.
Traditionally, Cisco networking has been relatively inaccessible to developers or even infrastructure folks looking to automate provisioning or configuration tasks. It looks like the release of ACI and the Nexus 9000 switch line is aiming to change that.
The Nexus 9000 family of switches will operate in one of two modes:
NXOS Mode – If you've worked with Cisco's DC switches like the Nexus 7K or 5K, this should be very familiar to you. In this mode, you essentially have a 10GbE or 40GbE switch, with the features that are baked into that platform.
In NXOS Mode, all of the additional custom ASICs that are present on the switch fabric are used primarily for enhancing the functionality of the merchant silicon platform, such as increasing buffer space, etc.
ACI Mode – This is a completely different mode of operation for the Nexus 9000 switch. In this mode, the switch participates in a leaf-spine based architecture that is purely driven by application policy. It is in this mode that we are able to define application relationships, and imprint them onto the fabric.
ACI is meant to provide that translation service between apps and the network. I was very happy to see this video posted on release day, as Joe does a great job at explaining the reasons for the product that he and the rest of the folks at Insieme have created:
This Nexus 9000 product family was built from the ground up, not just to be a cheap 10/40GbE platform, but also to be a custom fit for the idea of ACI. In this post, we’ll discuss this ACI engine (called the Application Policy Infrastructure Controller or APIC), as well as the direction that Cisco’s going from a perspective of programmability.
Programmability in NXOS Mode
Before I get too deep into ACI, I do want to call out some of the programmability features present in the Nexus 9000 series even without ACI (since NXOS mode is all we’ll be able to play with initially). The fact that the list below is so long, even in a mode that really only requires you to purchase the hardware and little else, is impressive, and certainly a refreshing turn for the better from a Cisco perspective.
The folks over at Insieme have gone to great lengths to equip the Nexus 9000 switches with some great programmable interfaces, which is a huge move forward for the Nexus family, and for Cisco products in general, frankly. Here are some of the ways you'll be able to interact with a Nexus 9000 switch even in the absence of ACI:
Power On Auto Provisioning (PoAP)
OpenStack Plugin
OnePK Capable
Puppet/Chef
Python Scripting
Linux container
Many NXOS commands are available via XML and JSON
XMPP integration
OpenFlow
This is a stark contrast to what we were given in the past. We’ve been able to do really cool stuff with interfaces like these on competing platforms like Juniper and Arista, but not with a Cisco switch. So – this is very good news. Again, these interfaces do not require ACI – though obviously ACI is needed if you want to administer an entire fabric of switches as one.
By the way, the long-term vision is to move all of these features into every NXOS device, which is very good news for those that recently bought a lot of existing Nexus 5000/7000 gear, for instance.
Application-Centric Infrastructure
ACI is all about policy-based fabric automation. When we’re talking about network agility, and creating an environment where the application and developer teams are able to consume network services without having to understand how the network works, this policy abstraction is crucial.
ACI extends the same concept we saw with UCS Service Profiles by using Application Network Profiles. These contain network-related attributes about an application. They can define things like:
Application Tiers
Connectivity policies
L4-L7 services
XML/JSON schema
When you’re working with a fabric composed of Nexus 9000 hardware, ACI becomes the layer that sits on top and enables network orchestration, and integration with other tools like OpenStack, etc.
Policy Concepts
Application policies will be configured in terms that make sense to application folks, abstracting away all of the network-specific nerd knobs for those with that skillset. The main focus is to remove the complexity from the network to allow for true network automation.
NOTE: There is still a need for traditional network engineers to understand the fabric design underneath, which we'll get into in a bit.
The main task is to define your application profiles to essentially describe their respective application’s impact on the network. Once done, policies can be configured by the same application folks to define relationships between these profiles.
Note that zero networking knowledge is needed for this. As an application developer or owner, you’re configuring relationships using verbs like “consume” or “register” or “store”. Those are the words you’re used to using. That’s the idea here – abstract the networking nerd knobs away and let the network engineers maintain them.
All of this is possible through the Application Policy Infrastructure Controller, or APIC. This is the policy server for ACI. In order to create an ACI fabric, you connect the APIC into a leaf switch. As soon as you plug this in, it discovers the entire topology in an automated fashion. The APIC is responsible for taking created policies and imprinting them onto the network.
ACI is being positioned to act as a translator between the application teams and network teams. It allows for a plug-and-play semantic of network elements, letting policy groups pick and choose from a menu of network structures that they want to utilize (QoS, FW, etc.). This is nothing new for VMware admins who have worked with port groups on a vSwitch, or anyone familiar with vNIC or vHBA templates in Cisco UCS – except that with solutions like Cisco ACI or VMware NSX, the connectivity options offered behind the scenes are much richer.
These attributes can be tied to network connectivity policies so that they’re abstracted from the application teams, who end up selecting these policies from a dropdown or similar.
ACI Fabric Design
Now – for all of you network folks, let’s talk about how this fabric works behind the scenes.
The typical ACI fabric will be designed in a traditional leaf-spine architecture, with 9500s serving as the spine, and 9300s serving as the leaf switches.
The workloads and policy services connect directly into the leaf switches of the fabric. These can be bare-metal workloads or hypervisors.
An ACI fabric operates as a L3 routed leaf-spine fabric with VXLAN overlay. There is no TRILL or FabricPath anywhere – so redundancy is accomplished via L3 ECMP.
There are a few attributes that will be used from day 1 to offer application identification:
L3 addressing
L4 addressing
Tagging (VXLAN, NVGRE, VLAN)
Virtual or Physical Port
Each ingress port on the fabric is its own classification domain, so using the traditional VLAN model may not be a bad idea, since VLAN 10 on port 1 can mean something completely different than VLAN 10 on port 2. However, all IP gateways can also be made available across the fabric on any leaf. This means a lot when it comes to workload mobility.
NOTE: Routing integration can also take place between the ACI fabric and an edge router using iBGP, OSPF, or static routing.
Every port on the 9000 fabric is a native hardware VXLAN, NVGRE, and IP gateway. As you can imagine, this means our flexibility in tagging mechanisms outside the fabric is a lot better. We can use whatever we want, and coordinate the tags in use via policy. The fabric will rewrite as necessary, allowing a Hyper-V host and an ESXi host to talk using both of their native tagging mechanisms at line rate.
Because classification occurs at the edge, each leaf essentially serves as a gateway for any tagging mechanism. This function is able to translate between VXLAN, VLAN, NVGRE, etc. – all at line rate in hardware. This also means that integration of these application policies between virtual and physical workloads is seamless.
This obviously requires some overall orchestration between the hypervisor and the fabric, because there’s only a small number of attributes usable for this classification. You’d have to reach into the API of – say vSphere – and figure out what tags are being used for what virtual machines, then make sure you use those tags appropriately as they enter the fabric. Eventually, other attributes could include:
DNS
DHCP
VM Attributes
LDAP information
3rd party tools
These, as well as others, could all potentially be used for further granularity when identifying application endpoints. Time will tell which ones become most urgent and which ones Cisco adopts. Personally, I’ve had a ton of ideas regarding this classification and will be following up with a post.
Latency and Load Balancing
In normal networks, any time you’re talking about multiple paths, whether it’s a L3 ECMP with Cisco IP CEF or similar, or simple port channels, it’s typically per-flow load balancing, not per-packet. So as long as traffic is going to the same IP address from the same IP address, that traffic is only going to utilize a single link, no matter how many are in the “bundle”.
This is to limit the likelihood of packets being delivered in the wrong order. While TCP was built to handle this event, it is still stressful on the receiving end to dedicate CPU cycles to putting packets back in the right order before they are delivered to the application. So, by making sure traffic in a single flow always goes over a single link, it forces the packets to stay in order.
The ALE ASIC is able to do something extremely nifty to help get around this. It uses timers and short “hello” messages to determine the exact latency of each link in the fabric.
This is HUGE because it means that you know the exact latency of each link at all times. This allows you to do true link balancing, because you don't have to do tricks to get packets to arrive in the right order – you simply time it so that they do. As a result, the entire fabric can use each link to its full potential.
This is a big argument for using a hardware solution, because the fabric can make the best decisions about where to place the traffic without affecting, or requiring input from, the overlay.
Security
An ACI fabric can operate in a whitelist model (the default) or a blacklist model. Since the idea is to enable things from an application perspective, in order to get connectivity you must first set up application profiles and create the policies that allow them to talk. Alternatively, you can change to a blacklist model, denying the traffic you don't want to exist.
There are a handful of benefits here – first off, ACLs aren’t really needed and wouldn’t be configured on a per-switch basis. This “firewall-esque” functionality simply extends across the entire fabric. Second, it helps prevent the issue of old firewall rules sticking around long after they become irrelevant.
So a security configuration is always up to date, allowing only the traffic that needs to get through, because connectivity is configured in application profiles, not in an ACL that's only updated when things aren't working.
Software and Programmability
Application Profiler is a custom tool built by Cisco to sweep the network to find applications and their configurations so that a Fabric network can be planned accordingly.
Simulators will be available along with the full API documentation but not necessarily a script library, such as what is provided for UCS (PowerTool/PowerShell). This could be developed on day one using the XML/JSON API being provided.
The list of mechanisms by which we’ll be able to programmatically interact with ACI is quite extensive (a good thing):
Scripting: Puppet/Chef, Python, XMPP
Monitoring: Tivoli, CA, Netscout, NetQoS, Splunk
Orchestration: CA, BMC, Cisco, OpenStack
NXOS Northbound APIs
PoAP (Power On Auto Provisioning) uses PXE boot
OpenStack plugin will be released. Support for Grizzly
L2 support for OpenStack plug-in, but not sure about L3
UCS Director (Cloupia) will work with compute, storage (NetApp), and now Nexus
NX-API has FINALLY made its way to another switch platform. The Nexus 3000 was the first switch series to receive the blessings of NX-API outside of the initial 9000 line. There don't appear to be any differences between the implementations – hopefully there are none. Again, without a published NX-API "standard" it's hard to tell. For features present on the 3000 series that are not present on the 9000 series, there are likely unique data structures for those features, obviously.
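For reference, the interaction model looks roughly like the sketch below: a JSON-wrapped CLI command POSTed to the switch's /ins endpoint. This follows NX-API's documented message format as I understand it, but treat the address, credentials and field values as placeholder assumptions and verify them against the NX-API sandbox on your platform:

```python
import requests

# Placeholder switch address and credentials.
url = 'http://192.0.2.1/ins'
payload = {
    'ins_api': {
        'version': '1.0',
        'type': 'cli_show',      # run a show command, get structured output
        'chunk': '0',
        'sid': '1',
        'input': 'show version',
        'output_format': 'json',
    }
}

resp = requests.post(url, json=payload, auth=('admin', 'admin'))
print(resp.json())
```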
Regardless of what level of certification engineers seek from Cisco or other hardware vendors, they must also start learning programming basics -- and Python is the first step.
"Python allows you to start moving in your mind between procedural programming to object-oriented programming," said McNamara. "Switching into that mind of a software developer allows you start working with SDN controllers like OpenDaylight, as well as OpenStack and, to a point, NSX."
Along those same lines, network engineers should learn Linux, and begin playing with APIs. "You will need those basic interaction skills," McNamara said.
MacVittie advises engineers to seek training that is relevant to IT management, such as Project Management Professional or ScrumMaster certifications.
Odom suggests that admins start their career path with a vendor's basic routing and switching certification. Then they should learn data center virtualization, and finally move on to OpenStack Neutron.
From there, network engineers should pick an SDN focus. Each of the vendors and open source organizations are now differentiated enough in their strategies that engineers will have to pick a road, he said.
Network pros will need to look further than traditional vendor certifications if they want to build and manage SDN and programmable infrastructure.
The Cisco certification ladder is no longer a foolproof career path.
In fact, career-level networking certifications from most hardware vendors don't offer the SDN and programming skills that network engineers will inevitably need, according to a group of experts who spoke on a certs panel at Interop NYC last week.
Instead of plodding through every level of a vendor certification program, engineers should take the lower-level classes for networking basics, and then chart their own course in learning programming and management for automated, software-driven networks.
"You have to fundamentally understand how to route packets and use protocols to use equipment," said Lori MacVittie, principal technical evangelist at F5 Networks. "But things are changing and it's going to be more important to understand APIs and the tool sets around them, than it is to know how to interact with particular devices."
Network engineers are changing from "consumers of network technology to creators of network technology" as they move away from using CLI and GUI for proprietary, static infrastructure, said Colin McNamara, chief cloud architect at Nexus IS, a Dimension Data company. Now that networks are being abstracted and virtualized, engineers will need programming skills and the ability to interact with SDN controllers and orchestration systems to provision networking resources as an integrated part of IT, he said.
As networking becomes an integrated part of IT resource provisioning, it's no longer feasible for network vendors to train engineers to solely specialize in their technology, McNamara explained.
Meanwhile, it could be years before hardware vendors like Cisco include network programmability concepts even for their own systems in career-level certification programs, said Wendell Odom, a CCIE, longtime Cisco Press author and well-known certifications expert.