Sunday, December 28, 2014

Time for an SDN Sequel? Scott Shenker Preaches SDN Version 2


Scott Shenker, one of the minds behind the creation of SDN, has some misgivings about the technology. It’s time for SDNv2, he says.

Speaking Wednesday at the Internet2 Technology Exchange conference in Indianapolis, Shenker explained that the first take on software-defined networking (SDN), which started taking form six to eight years ago, relied on some key “misconceptions.” That’s not an indictment of the SDN concept, just a sign that it could use some tweaking.

Shenker — who, along with Nick McKeown and Martin Casado at Stanford University, and others, developed OpenFlow and the ideas behind SDN in the mid/late 2000s — is working with other researchers (including McKeown again) on a set of technologies that he’s calling SDNv2, a second draft that takes into account Layer 4-7 equipment, the prevalence of virtual switching, and the rise of network functions virtualization (NFV).

But there’s an important caveat: SDNv2 targets carrier networks. They differ from data-center networks in that they’re filled with legacy equipment that won’t go away soon, and they’re equipped with a network core that’s built to forward packets quickly. His talk didn’t discuss the data center, where SDN-as-we-know-it might be working just fine.

At a time when we’re telling each other things are moving so fast, Shenker is disappointed that SDN has matured so slowly — more slowly than he expected, anyway.

The problem lies with SDN’s roots, he said. He and other researchers simply misunderstood the nature of carrier networks.

The Tenets of SDNv2

SDNv2 would still separate the control and data planes, and it would still aim for programmatic control of the network.

It differs from older SDN conceptions in three major ways (plus one more point of Shenker’s that I’m adding to this list). Keeping in mind that this is all about carrier networks, and not about settings such as Google data centers, those differences are:

1. Software goes to the edge; the core stays dumb. Implied in this statement: Switching at the edge becomes largely virtual, handled on x86 cores.

This can be done, Shenker insists. His calculations suggest that a major ISP’s traffic can be handled by $150,000 worth of x86 cores. (Other microprocessor architectures such as ARM’s would be welcome here, of course, but Intel’s x86 dominates the market.) Power consumption would increase compared with switch ASICs, but it would still be “infinitesimal” compared with the entire carrier network, he said.
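
To get a feel for how that kind of estimate takes shape, here’s a back-of-envelope sketch in Python. Every number in it (aggregate edge traffic, per-core throughput, cores per server, server price) is an illustrative assumption of mine, not a figure from Shenker’s talk.

```python
import math

# Back-of-envelope sketch: could commodity x86 servers absorb a carrier's edge traffic?
# Every figure below is an illustrative assumption, not a number from the talk.

aggregate_edge_traffic_gbps = 10_000   # assumed total edge traffic for a large ISP
throughput_per_core_gbps = 10          # assumed software-switching rate per x86 core
cores_per_server = 32                  # assumed cores per commodity server
price_per_server_usd = 5_000           # assumed street price per server

cores_needed = aggregate_edge_traffic_gbps / throughput_per_core_gbps
servers_needed = math.ceil(cores_needed / cores_per_server)
total_cost_usd = servers_needed * price_per_server_usd

print(f"cores needed:   {cores_needed:.0f}")     # 1000
print(f"servers needed: {servers_needed}")       # 32
print(f"ballpark cost:  ${total_cost_usd:,}")    # $160,000
```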

Why is this different? Because Shenker and others started out assuming the network was homogeneous. The differences between core and edge switches — the existence of MPLS, essentially — weren’t taken into account.

“This one is unforgivable. We just ignored current systems,” Shenker said. “One of the secrets, when you teach networking, [is that] nobody covers MPLS.”

2. “Middleboxes” get included in SDN. “Middleboxes” refers to the Layer 4-7 appliances, real or virtual, found all over the network.

SDN’s earliest incarnations didn’t take these into account, as Layer 4-7 vendors will tell you repeatedly. This was a shortcoming of the early jump to OpenFlow, which primarily manipulated Layer 2.

Under SDNv2, functions such as firewalls, load balancers, and WAN optimization would be included in the SDN edge, in virtual form. Sometimes they would be in the form of lookup tables, but more often, they would be virtualized network functions (VNFs) — which is where NFV gets folded into SDNv2.
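
As a rough illustration of what folding NFV into the SDN edge could look like in software, here is a minimal sketch of a per-tenant service chain. The function names, packet fields, and chain layout are hypothetical, invented for this example rather than taken from any SDNv2 design.

```python
# Minimal sketch of per-tenant chains of virtual network functions (VNFs)
# applied at the SDN edge. All names and fields here are hypothetical.

from typing import Callable, Dict, List, Optional

Packet = dict  # e.g. {"src": "10.0.0.5", "dst": "...", "dst_port": 80, "payload": b"..."}
VNF = Callable[[Packet], Optional[Packet]]  # returns the (possibly rewritten) packet, or None to drop

def firewall(pkt: Packet) -> Optional[Packet]:
    # Drop anything aimed at a blocked port.
    return None if pkt["dst_port"] in {23, 445} else pkt

def load_balancer(pkt: Packet) -> Optional[Packet]:
    # Rewrite the destination to one of a pool of back-end servers.
    backends = ["192.0.2.10", "192.0.2.11"]
    pkt["dst"] = backends[hash(pkt["src"]) % len(backends)]
    return pkt

def wan_optimizer(pkt: Packet) -> Optional[Packet]:
    # Stand-in for compression/dedup of the payload.
    pkt["payload"] = pkt["payload"][:64]
    return pkt

# Each tenant gets its own ordered chain of VNFs at the edge.
tenant_chains: Dict[str, List[VNF]] = {
    "tenant-a": [firewall, load_balancer],
    "tenant-b": [firewall, wan_optimizer],
}

def process(tenant: str, pkt: Packet) -> Optional[Packet]:
    """Run a packet through the tenant's VNF chain; None means dropped."""
    for vnf in tenant_chains[tenant]:
        pkt = vnf(pkt)
        if pkt is None:
            return None
    return pkt

pkt = {"src": "10.0.0.5", "dst": "198.51.100.1", "dst_port": 80, "payload": b"GET /"}
print(process("tenant-a", pkt))
```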

This corrects the original SDN assumption that the data plane was dominated by routers and switches. In fact, Berkeley researchers including Sylvia Ratnasamy did a simple count of boxes and found carrier networks were just about evenly divided among routers, switches, and middleboxes.

“If SDN claims it is going to provide programmatic control over the network, then it has to incorporate these middleboxes,” Shenker said.

3. The network is opened up to third-party services. This is the biggie, and something that speaks to the purpose of SDN rather than the mechanics of it.

It means the edge of the carrier network would become analogous to Amazon Web Services: Anybody can check in, for a fee, and start using the network. Shenker calls this “service virtualization.”

It’s the crux of SDNv2, because it addresses carriers’ No. 1 problem: how to generate more revenue. New services are an obvious answer, but carriers are slow to develop new services and conservative about deploying them — and a service has to be a huge revenue generator, on paper, to even be considered.

By providing low-level interfaces into the network and a web-based, self-service portal, carriers could make money by having people pay to use the network as their own. Internet2 just announced a capability like this, in fact; its network virtualization lets any member take advantage of the 100-Gb/s Internet2 backbone.

Just as startups use AWS to avoid buying big chunks of computing power, they could use SDNv2 to rent a carrier-sized network that they otherwise couldn’t afford. “If you had a setup like this, two kids in a garage could build a competitor to Akamai,” Shenker said.
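
To make the AWS analogy concrete, a self-service portal might expose a request along the lines of the sketch below. The endpoint, fields, and pricing knobs are entirely hypothetical; the point is only to show the kind of low-level, pay-as-you-go interface being described.

```python
# Hypothetical request to a carrier's self-service "service virtualization" portal.
# The endpoint, fields, and pricing model are invented for illustration only.

import json
import urllib.request

slice_request = {
    "tenant": "two-kids-in-a-garage",
    "endpoints": ["edge-pop-sfo", "edge-pop-nyc", "edge-pop-lon"],
    "bandwidth_gbps": 10,
    "vnf_chain": ["firewall", "cache"],   # functions to attach at each edge PoP
    "duration_hours": 24,
    "billing": "pay-as-you-go",
}

req = urllib.request.Request(
    "https://portal.example-carrier.net/v1/network-slices",   # hypothetical endpoint
    data=json.dumps(slice_request).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# A real deployment would authenticate and inspect the response; this only
# shows the shape of the call.
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```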

4. Closed interfaces are not allowed. Shenker didn’t list this as an SDNv2 bullet point, but he was adamant about it later in his talk, and it seemed to resonate with the audience. “You do not have vendors offering vendor-specific interfaces at the edge,” he said — possibly a dig at Cisco’s OpFlex, which became a multivendor effort but really did originate with Cisco.

SDNv2 Piece-Parts

Of course, Berkeley is working on a few elements to get all this done:


  • Recursive SDN, which combines the best aspects of SDN and of global networks. SDN is being used on tens of thousands of switches at a time — but they’re all in a data center, all in the same place. Global networks have a wider reach, by definition, but they lack the fast failure recovery that’s vital to packet processing. Recursive SDN applies programmatic control to the hierarchy of a global network, initiating local responses to failures and using network virtualization to create new paths.
  • The elastic switch (a name Shenker doesn’t like, but too bad). This would be a self-managing group of equipment (not necessarily one switch) that sits at the edge and uses a combination of virtual switches and ASIC-based switching — because you’ll want both, Shenker said. The ASIC switches would be good for cases of fast-and-dumb forwarding that don’t need much packet processing; a rough sketch of that split appears after this list.
  • Various packet-processing improvements, such as better methods for fault tolerance. Shenker didn’t have time to get into details.
  • Network verification for middleboxes. With routers, there’s a way to check if the routing tables are valid and don’t cause loops. Layer 4-7 boxes have no equivalent because they don’t operate on simple tables. They’re more like software programs, “and when you check collections of programs, you use model checking,” which really doesn’t scale, Shenker said. His team has found a framework that does work at scale, though; it can verify 35,000 middleboxes in five minutes, he said.
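
To illustrate the split the elastic switch is after, here is a minimal dispatch sketch: flows that need nothing more than fast, dumb forwarding go to hardware tables, while flows that need richer packet processing go to a virtual switch. The classification rule and the two stand-in back ends are assumptions of mine, not part of Shenker’s design.

```python
# Minimal sketch of the "elastic switch" idea: steer each flow either to
# ASIC-style forwarding or to a software (virtual) switch, depending on how
# much packet processing it needs. The rule below is an illustrative
# assumption, not part of Shenker's design.

def needs_rich_processing(flow: dict) -> bool:
    # Assume flows that hit a VNF chain or need per-packet rewriting
    # must go through the virtual switch.
    return bool(flow.get("vnf_chain")) or flow.get("rewrite_headers", False)

def forward_in_asic(flow: dict) -> str:
    # Stand-in for programming a hardware forwarding-table entry.
    return f"ASIC: {flow['match']} -> port {flow['out_port']}"

def forward_in_vswitch(flow: dict) -> str:
    # Stand-in for handing the flow to an x86 virtual switch.
    return f"vSwitch: {flow['match']} -> chain {flow.get('vnf_chain', [])}"

def install(flow: dict) -> str:
    return forward_in_vswitch(flow) if needs_rich_processing(flow) else forward_in_asic(flow)

flows = [
    {"match": "10.0.0.0/8 -> 0.0.0.0/0", "out_port": 3},                          # dumb forwarding
    {"match": "tenant-a web traffic", "out_port": 1, "vnf_chain": ["firewall"]},  # needs processing
]
for f in flows:
    print(install(f))
```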


A Word About OpenFlow

So where does OpenFlow fit into all this?

OpenFlow itself isn’t a bad protocol, Shenker said. But OpenFlow was meant to communicate new ideas to networking equipment. The limitations of ASICs — table sizes, for instance — prevented OpenFlow from getting some of those ideas across.

“It’s not that the OpenFlow design is wrong. It’s that it was given an impossible task,” he said.

Shenker was also a founder of Nicira, a startup whose arc mirrors his misgivings about OpenFlow. Nicira began life as an OpenFlow startup but quickly shifted strategies toward network virtualization; it was famously acquired by VMware in 2012 and became the basis for the NSX product line.

Shenker thinks OpenFlow could eventually be a candidate to run SDNv2’s packet-forwarding core, but MPLS would suffice for now.

Changing the Innovation Model

You can see how SDNv2 is tailored to the carriers, because it opens the door to new revenue sources without displacing the mass of legacy equipment that’s not going to be decommissioned any time soon. Specifically, the network core wouldn’t have to be touched at first.

But preserving the legacy network isn’t what SDNv2 is about, Shenker insisted. “Most importantly, this is about changing the innovation model” for SDN, he said.

So far, he’s not getting a good reception from the U.S. carriers, despite their need for new services. Foreign carriers have been more inviting, though, and he thinks that could spur some demand. “My guess is that once it gets deployed there, it will be much easier for American carriers to pick up,” he said.

Like the P4 protocol that could be a candidate for OpenFlow 2.0, SDNv2 is an academic exercise for now. But Shenker’s talk revealed ways to plug the gaps between the real world (especially the carrier reality) and SDN’s original assumptions. Even if SDNv2 doesn’t get off the ground, the points he brought up could prove valuable.

