Sunday, March 3, 2013

Juniper Launches Cloud-Based Security Intelligence Service

Juniper's Junos Spotlight Secure service will give businesses greater insight into attackers, threats and the devices used in attacks.

Juniper Networks is looking to give organizations the up-to-date security intelligence they need to protect themselves against cyber attacks on their networks.

At the RSA Conference this week, Juniper unveiled a number of new and enhanced security products, in particular Junos Spotlight Secure, a cloud-based service designed to give users greater intelligence about attackers, threats and individual devices, and to spread that intelligence across various network and security products.

In addition, Juniper and RSA, the security unit of storage giant EMC, are expanding their relationship through a technology partnership that will result in a high level of intelligence sharing between Juniper's Junos Spotlight Secure service and RSA's Live threat delivery service, and that will use RSA's Security Analytics tool to give customers greater insight into the security threats they're facing.

"Next-generation security must be built on automated and actionable intelligence that can be quickly shared to meet the demands of modern and evolving networks," Nawaf Bitar, senior vice president and general manager of Juniper's security business unit, said in a statement. "This is only possible if you are able to collect definitive information about attackers. Junos Spotlight Secure provides the platform to deliver advanced intelligence with device-level attacker tracking. This integrated approach improves security intelligence, provides collective defense against attackers and delivers true defense in-depth for the data center."

Juniper officials argue that businesses need to know more than just the IP addresses of their attackers: they also need to identify the devices used in the attacks, and then be able to take the intelligence gathered and quickly bring it into the data center and into the network. They are positioning their Junos Spotlight Secure intelligence service as the place where security intelligence on attackers and threats is consolidated and then sent in real time to Juniper's security offerings to act on.

Using the cloud database, organizations can identify attackers by more than 200 unique attributes, and then keep track of them. Through Junos WebApp Secure software, an attacker can be "fingerprinted" and that information can be shared with other subscribers, bringing real-time security to disparate networks. The goal is to enable businesses to keep out the bad guys while letting legitimate users securely access the network.

Juniper's Junos WebApp Secure and the company's SRX Series service gateways are the first Juniper products to leverage the Spotlight Secure offering, according to officials.

Junos WebApp Secure (formerly called Mykonos, a company Juniper bought in 2012) is used to integrate the intelligence from other sources that arrives through Spotlight Secure, and then lets other Juniper products leverage that intelligence to more accurately defend against attacks. At the same time, Junos WebApp Secure uses Juniper's Intrusion Deception technology to not only profile and fingerprint attackers, but also to misdirect them, according to Juniper.

In addition, Junos WebApp Secure will be integrated into the SRX Series service gateways, enabling the gateways to leverage the intelligence from Spotlight Secure to block attackers, company officials said, noting it will be particularly effective against botnets and large-scale Web attacks.

Along with Spotlight Secure, Juniper also announced Junos DDoS Secure, a distributed denial-of-service protection system aimed at guarding Websites and Web applications against high-volume attacks as well as “low and slow” app attacks. It can either be deployed as software through a virtual machine in cloud environments or as a hardware appliance.

Juniper officials said they will also be able to take advantage of their software-defined networking (SDN) strategy to more quickly bring intelligence to the networks and deploy security services across networks. Organizations will be able to take advantage of Juniper's Junos Space Security Director, which provides centralized management in an SDN environment.

In extending their partnership, Juniper and RSA not only will focus on protecting corporate networks against attacks, but also on bringing greater mobile security services that tie strong authentication with secure remote access. The companies want to get RSA's mobile authentication technologies to interoperate with Juniper's Junos Pulse SSL Secure for remote access to a business' infrastructure.

The companies also want to make sure that the Junos Pulse SSL Secure offering can authenticate native mobile application access, creating a single access point for VPNs and mobile applications.

"Two of the most common requests we get from customers are about improving information sharing and enabling them to deploy greater security on mobile devices," Art Coviello, executive chairman of RSA and executive vice president at EMC, said in a statement. "This expanded technology partnership would enable RSA and Juniper to help address both of those key requirements for customers, and set the stage for increased collaboration on a wider range of advanced security challenges."

Thursday, January 10, 2013

Juniper Brings 80 Tbps Router to the Edge

Juniper is updating its MX edge router family with two new large scale hardware platforms and new virtualization software capabilities.

The new MX 2020 router is a 20-slot, 7-foot chassis that can deliver up to 80 Tbps of edge routing capacity in a single platform. Also new is the MX 2010, a 10-slot chassis delivering up to 40 Tbps of capacity. Both platforms eclipse the MX 960, which had been the top end of Juniper's edge routing portfolio, coming in at 8.8 Tbps.

Luc Ceuppens, VP of product marketing at Juniper, told EnterpriseNetworkingPlanet that the new MX hardware is compatible with all existing MX line cards that Juniper customers already have in place. In terms of physical size, the MX 2020 is three times larger than the MX 960, though Ceuppens noted that the MX 2020 delivers more than three times the performance.

He explained that the MX 2020 is based on the same Juniper Trio silicon as the existing MX 960. That said, Juniper has been able to integrate new advances in packaging and cooling in order to achieve the greater density in the MX 2020.

Virtual Chassis

Juniper does not expect that all of its customers will need or want to move to the bigger MX 2010 or MX 2020 platforms, though some might still want more scale. That's where the MX Virtual Chassis comes in.

Ceuppens explained that with the virtual chassis, existing MX 960 customers can continue to scale. The technology will enable up to eight MX 960 routers to be clustered together to form a single logical domain.

"Virtual Chassis allows you to grow the number of slots in a node, while maintaining one single control plane," Ceuppens said. "So the operational complexity is not increased when you increase the number of slots."

Mike Marcellin, SVP of product marketing and strategy at Juniper, added that Virtual Chassis isn't just for local groups of MX 960s either. With Virtual Chassis, the MX 960s can be separated by a few hundred kilometers, which can provide near instantaneous disaster recovery failover.

Software Defined Networking

Juniper is also advancing the MX platform with a technology called Path Computation Element (PCE). Marcellin noted that PCE can be used to help enable a Software Defined Networking type of deployment. Juniper also has some support for the OpenFlow protocol to help enable the programmability of the network.

"OpenFlow can be used to select the specific flow that you want to do some kind of operation on," Marcellin said. "PCE is how you take those flows and steer them through the optimal path across a network."

Linux KVM

The JunosV App engine further enhances the MX routers with a Linux powered KVM virtual hypervisor. Marcellin noted that Juniper is using CentOS as the Linux base. CentOS is a clone of Red Hat Enterprise Linux.

The JunosV App engine is a hypervisor that enables service providers to run applications such as load balancing and security services at the edge of the network.

Monday, August 13, 2012

Juniper Networks OpenLab and the AT&T Foundry® to Co-Host Software Development Workshop/Hackathon

Workshop Provides University Students With Hands-On Experience With Software Defined Networking Tools and Innovations Around the Programmable Network

SUNNYVALE, CA--(Marketwire - Aug 13, 2012) - Juniper Networks, the industry leader in network innovation, with the AT&T Foundry®, today announced a university workshop and software development hackathon being offered through the AT&T Foundry® and Juniper's OpenLab - The Junos® Center for Innovation. The week-long event will be hosted at the AT&T Foundry® in Texas, in conjunction with the Institute for Innovation and Entrepreneurship at the University of Texas at Dallas starting on August 13, 2012.

The workshop and competitive challenge is designed to provide next-generation software developers access to the tools and guidance needed to design new innovations for programmable networks. The workshop will give university students hands-on experience with concepts around software defined networking (SDN) and channel their interest in technology to help solve real-world challenges in the networking industry and unlock new opportunities.

News Highlights

The SDN-themed workshop, utilizing elements from Juniper Networks' Academic Alliances (JNAA) Program, will be the first of two workshops that will focus on giving students an overview of the AT&T Foundry® and a corporate overview of Juniper Networks in addition to providing them with several days of training on Juniper's programmable network assets such as the Junos SDK and Junos Space Platform.
During the workshop students will build an application on top of the Juniper Networks® Junos Space SDK to monitor real-time traffic, aggregate and produce data for operator reports, and dynamically provision network elements based on the mined data.
At the end of the workshop, students will compete in a hackathon, presenting and demonstrating their application to AT&T and Juniper representatives. Prizes will be awarded to the students based on solution innovation.
The University of Texas at Dallas will play a prominent role via the delivery of entrepreneurial content with the objective of fostering the spirit of innovation among students and promoting an entrepreneurial culture.
A second workshop and competitive challenge with a similar agenda will be held at Juniper's OpenLab facility in New Jersey in the fall.

Supporting Quotes

"AT&T Foundry® is focused on generating new ideas and we foster a culture of innovation. We are excited to join with Juniper Networks and co-sponsor these workshops, which we believe will help students get first-hand insight on the inner-workings of the networking industry and broaden their knowledge on software-defined networking."
-Mike Berry, director of operations, AT&T Foundry®

"At Juniper, we are deeply committed to innovation and we understand that the impact of software-defined networking stretches beyond the network to applications that depend on them. Through these workshops we hope to drive and facilitate collaboration and education, accelerate new network-integrated applications for our connected society and build software intensive applications that can help overcome some of the shortcomings in today's legacy networks."
-Judy Beningson, general manager and vice president, Management and Virtualization Platforms Group, Juniper Networks

"We are pleased to have joined with leading technology companies such as Juniper Networks and AT&T to help us develop next-generation leaders. We strongly believe that workshops such as these, which are competitive in nature, provide students a chance to improve their skill set, understand their strengths and encourage team work while working on today's leading technologies."
- Dr. Joseph C. Picken, executive director, The Institute for Innovation and Entrepreneurship at the University of Texas at Dallas

Additional Resources:

Juniper Networks Academic Alliances Program
Juniper Networks OpenLab
Juniper on Twitter
Juniper on Facebook

About Juniper Networks

Juniper Networks is in the business of network innovation. From devices to data centers, from consumers to cloud providers, Juniper Networks delivers the software, silicon and systems that transform the experience and economics of networking. Additional information can be found at Juniper Networks (www.juniper.net) or connect with Juniper on Twitter and Facebook.

Sunday, May 27, 2012

MTU Myth Busters

MTU (Maximum Transmission Unit) is rarely given much importance until someone is hit by its hard-to-see and unpredictable effects, which can break communication. That is what my team and I faced at a leading service provider in Pakistan. MTU is normally described as the maximum amount of information that can be sent in a packet, but this is not the right way to think about it. MTU is a physical layer characteristic, so it is better to say that it is the maximum amount of information (data) that can be sent in a frame (e.g. an Ethernet frame). With a standard frame, the maximum packet size that an Ethernet frame can accommodate is 1500B.

But if the packet size is more than 1500B for any reason, then Layer 2 informs Layer 3 to fragment the information, as it cannot fit into the Ethernet frame. In the early days, physical media technology was not as stable and reliable as it is today, so the Internet architects preferred fragmentation, since only the small affected fragment had to be retransmitted rather than the complete information. But this puts a lot of load on the Layer 3 device responsible for fragmentation.

Why does normal HTTP or other application traffic sometimes fail to communicate? Where does MTU hit? Let's check it out.

Here are some overhead figures for carrying Application/Presentation/Session layer information (normally termed Data):

•TCP Header = 20B
•GRE = 24B
•IPv4 Header = 20B or IPv6 Header = 40B
•MPLS Header = 4B to 16B (Including L3VPN, FRR TE, AToM Control Word)
•Ethernet Header = 14B
•VLAN/Trunk = 4B & Q-in-Q = 8B

Here are some examples where end-to-end communication breaks for certain customers/applications, while all other services work well.

When everything goes well,

Consider a network with the default config, i.e. MTU 1500 on most of the FastEthernet interfaces (nowadays Gigabit Ethernet interfaces have jumbo frames enabled by default on some vendors' equipment).

If any PC behind Router A wants to send data and the MTU configured on the interfaces is 1500, then the maximum data coming from the A/P/S layers should be calculated as follows,

Data = 1500 – 20 (TCP) – 20 (IPv4) – 14 (Ethernet) = 1446B

This 1446B is usually considered a safe payload for customer devices: application data of this size passes without being dropped anywhere between source and destination. So if the customer sets the MTU on its CE WAN interface, its CE router will usually do the fragmentation (if required) and the traffic will usually not be dropped in transit. There are also ways for the Service Provider to set the DF (Don't Fragment) bit on incoming customer traffic, so that its core routers are not overloaded with fragmentation processing.

But there are scenarios where traffic with 1446B of data can be dropped. Let's discuss those,

1) If the Service Provider supports an MTU of 1500B and uses a VLAN trunk on any intermediate node connectivity:

In this scenario Routers B & C are connected over an Ethernet trunk link, which means another 4B of VLAN tag overhead. Now if the same 1446B of traffic comes in from customer Router A, then it cannot pass over the B-C link. Here is the calculation,

1446 (Data) + 20 (TCP) + 20 (IPv4) + 14 (Ethernet) + 4B (VLAN TAG) = 1504B (Required MTU)

If the customer application sets the DF bit, or the SP sets it on incoming customer traffic, then Router B will not do the fragmentation and the traffic will be dropped. To resolve this issue, the B-C link should support at least 1504B.

Let’s discuss another scenario as an example:

2) If the Service Provider supports MPLS along with VLAN tagging:

In this scenario the Service Provider network B-C-D supports an MTU of 1504B. Routers C & D are connected over an Ethernet trunk link and are also running MPLS, which means 4B of VLAN tag overhead and 4B of MPLS label overhead. Now if the same 1446B of traffic comes in from customer Router A, then it can pass over the B-C link, but not over the C-D link. Here is the calculation,

1446 (Data) + 20 (TCP) + 20 (IPv4) + 4 (MPLS Label) + 14 (Ethernet) + 4B (VLAN TAG) = 1508B (Required MTU)

Similarly, if the customer application sets the DF bit, or the SP sets it on incoming customer traffic, then Router C will not do the fragmentation and the traffic will be dropped. To resolve this issue, the C-D link should support at least 1508B.

The case is worse when the Service Provider runs MPLS Traffic Engineering and customer traffic is carried over a VPN: this adds up to 12B of additional overhead. If Q-in-Q is supported, that adds another 4B; if IPv6 is the transport protocol, the IP header overhead increases to 40B instead of the 20B of an IPv4 header. Further, if the customer is using GRE tunneling, then 24B of GRE overhead will be added.
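The scenarios above all follow the same bookkeeping, which can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are made up for this post, the overhead figures are the ones from the list earlier, and the accounting follows this post's convention of counting the Ethernet header against the MTU.

```python
# Per-layer overhead in bytes, copied from the list earlier in this post.
# The names below are illustrative, not taken from any vendor tool.
OVERHEAD = {
    "tcp": 20,
    "ipv4": 20,
    "ipv6": 40,
    "gre": 24,
    "mpls_label": 4,   # one label; the list above allows 4B to 16B of MPLS overhead
    "ethernet": 14,
    "vlan": 4,
    "qinq": 8,
}


def safe_payload(mtu, layers):
    """Largest data size that fits in `mtu` once every listed overhead is subtracted."""
    return mtu - sum(OVERHEAD[name] for name in layers)


def required_mtu(data_bytes, layers):
    """MTU a link must support to carry `data_bytes` plus every listed overhead."""
    return data_bytes + sum(OVERHEAD[name] for name in layers)


base = ["tcp", "ipv4", "ethernet"]
print(safe_payload(1500, base))                           # 1446
print(required_mtu(1446, base + ["vlan"]))                # 1504 (scenario 1)
print(required_mtu(1446, base + ["vlan", "mpls_label"]))  # 1508 (scenario 2)
```

Adding "gre", "qinq" or more "mpls_label" entries to the list shows how quickly the required MTU grows past 1500B.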

So in a nutshell, it is the Service Provider's responsibility to support the maximum MTU that can accommodate all sorts of customer services, including its own services such as MPLS. To be on the safe side, if the service provider enables jumbo MTU (9192B) in its access & core network, then almost all possible services can run w/o issue.

Vendors & MTU:

Now let's look at MTU from the perspective of vendors (Cisco, Juniper and Windows/Linux machines). The Cisco & Juniper implementations of MTU are a bit different, especially when we try to verify the supported MTU using PING.

Juniper Implementation:

Let's discuss MTU on a Gigabit Ethernet interface (other interface types have different default/maximum MTUs). By default the physical interface MTU is 1514, and if we configure a physical MTU other than the default value, then the underlying protocols will inherit their MTU from the physical interface. We can also configure a protocol-level MTU that differs from the inherited one. One reason to do that is to match the MTU on the remote device, especially in the case of OSPF neighborship, which cannot be established until the IP MTU is the same at both ends. The protocol MTU cannot be more than the physical MTU, and it is important to maintain the header difference between IP and Layer 2, otherwise Juniper will not allow the configuration to commit. Here is an example from my M320 router, showing a physical interface MTU of 9100 (configured) with the IP protocol MTU derived from it (9100-18 = 9082B). Since the interface is also configured with a VLAN tag, 4B of overhead is added to the 14B Layer 2 overhead, which is why we deduct 18 from the physical interface MTU to get the IP MTU.

falikhan@sydlab@M320-m2-re0> show interfaces ge-0/0/1

Physical interface: ge-0/0/1, Enabled, Physical link is Up
Link-level type: Ethernet, MTU: 9100, Speed: 1000mbps, MAC-REWRITE Error: None, Loopback:
Logical interface ge-0/0/1.621 (Index 95) (SNMP ifIndex 531)
Flags: SNMP-Traps 0x4000 VLAN-Tag [ 0x8100.621 ] Encapsulation: ENET2
Protocol inet, MTU: 9082
Protocol inet6, MTU: 9082
Protocol mpls, MTU: 9070

If the interface is also configured with the MPLS address family, then 12B (3 labels) of overhead is accounted for as well, which is why the MPLS MTU above is 9070 (9082-12).
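The derivation of the protocol MTUs in the output above can be written down as a short Python sketch. It assumes the protocol MTUs are inherited from the physical MTU (not configured explicitly), and the function name and parameters are only illustrative.

```python
ETHERNET_HEADER = 14  # bytes of Layer 2 header
VLAN_TAG = 4          # bytes added by a single VLAN tag
MPLS_LABEL = 4        # bytes per MPLS label


def protocol_mtus(physical_mtu, vlan_tagged, mpls_labels=3):
    """Return the inet/inet6/mpls MTUs derived from a physical interface MTU."""
    l2_overhead = ETHERNET_HEADER + (VLAN_TAG if vlan_tagged else 0)
    ip_mtu = physical_mtu - l2_overhead
    return {
        "inet": ip_mtu,
        "inet6": ip_mtu,
        # room is reserved for `mpls_labels` labels (3 labels = 12B in this post)
        "mpls": ip_mtu - mpls_labels * MPLS_LABEL,
    }


print(protocol_mtus(9100, vlan_tagged=True))
# {'inet': 9082, 'inet6': 9082, 'mpls': 9070}
```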

When we PING from the Juniper CLI with size 1000B, it means that 1000B is the ICMP payload, which is encapsulated in an 8B ICMP header, then in a 20B IPv4 header, and finally in 14+4B of Layer 2 Ethernet frame overhead. So the actual bytes on the wire will be 1000+8+20+14+4=1046B.

Now, what is the maximum size PING we can send to a remote host via interface ge-0/0/1 (over an IP network, no MPLS)? The answer is 9082 (IP MTU) - 20 (IP header) - 8 (ICMP header) = 9054B. Let's test it,

falikhan@sydlab@M320-m3-re0> ping 10.250.22.1 logical-system SD31 source 10.250.23.1 size 9054 do-not-fragment

PING 10.250.22.1 (10.250.22.1): 9054 data bytes

9062 bytes from 10.250.22.1: icmp_seq=0 ttl=64 time=8.923 ms
9062 bytes from 10.250.22.1: icmp_seq=1 ttl=64 time=8.888 ms

^C

--- 10.250.22.1 ping statistics ---

2 packets transmitted, 2 packets received, 0% packet loss

round-trip min/avg/max/stddev = 8.888/8.905/8.923/0.017 ms

Note: I have configured logical routers on the M320 to simulate a network of multiple routers.

falikhan@sydlab@M320-m3-re0> ping 10.250.22.1 logical-system SD31 source 10.250.23.1 size 9055 do-not-fragment

PING 10.250.22.1 (10.250.22.1): 9055 data bytes

ping: sendto: Message too long
ping: sendto: Message too long

^C

--- 10.250.22.1 ping statistics ---

2 packets transmitted, 0 packets received, 100% packet loss

This test shows that when the Juniper router tries to PING with an ICMP payload of 9055, it needs the IP MTU to be at least 9083. But since the IP MTU currently supported on the interface is 9082, the maximum ICMP payload that can pass through this interface (w/o fragmentation) is 9054.
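The arithmetic behind these two tests can be checked with a small Python sketch. The helper names are hypothetical; the header sizes are the ones used throughout this post, and the size argument has the Juniper meaning (ICMP payload only).

```python
IPV4_HEADER = 20
ICMP_HEADER = 8
ETHERNET_HEADER = 14
VLAN_TAG = 4


def max_ping_size(ip_mtu):
    """Largest Juniper-style ping size (ICMP payload) that fits without fragmentation."""
    return ip_mtu - IPV4_HEADER - ICMP_HEADER


def needs_fragmentation(size, ip_mtu):
    """True when payload + ICMP header + IPv4 header exceeds the IP MTU."""
    return size + ICMP_HEADER + IPV4_HEADER > ip_mtu


def wire_bytes(size, vlan_tagged=True):
    """Bytes on the wire for a ping of `size`, following this post's accounting."""
    l2_overhead = ETHERNET_HEADER + (VLAN_TAG if vlan_tagged else 0)
    return size + ICMP_HEADER + IPV4_HEADER + l2_overhead


print(max_ping_size(9082))              # 9054
print(needs_fragmentation(9054, 9082))  # False
print(needs_fragmentation(9055, 9082))  # True
print(wire_bytes(1000))                 # 1046
```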

Just to clarify, by default the router performs fragmentation for IPv4 traffic, i.e. if I remove the do-not-fragment knob from the PING, it can send an ICMP packet with a payload of 9055 or higher over the same interface.

falikhan@sydlab@M320-m3-re0> ping 10.250.22.1 logical-system SD31 source 10.250.23.1 size 9055

PING 10.250.22.1 (10.250.22.1): 9055 data bytes

9063 bytes from 10.250.22.1: icmp_seq=0 ttl=64 time=9.685 ms

^C

--- 10.250.22.1 ping statistics ---

1 packets transmitted, 1 packets received, 0% packet loss

round-trip min/avg/max/stddev = 9.685/9.685/9.685/0.000 ms

falikhan@sydlab@M320-m3-re0> ping 10.250.22.1 logical-system SD31 source 10.250.23.1 size 1000

PING 10.250.22.1 (10.250.22.1): 1000 data bytes

1008 bytes from 10.250.22.1: icmp_seq=0 ttl=64 time=1.569 ms
1008 bytes from 10.250.22.1: icmp_seq=1 ttl=64 time=1.552 ms

^C

--- 10.250.22.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss

round-trip min/avg/max/stddev = 1.552/1.560/1.569/0.008 ms

falikhan@sydlab@M320-m3-re0> ping 10.250.22.1 logical-system SD31 source 10.250.23.1 size 6000

PING 10.250.22.1 (10.250.22.1): 6000 data bytes

6008 bytes from 10.250.22.1: icmp_seq=0 ttl=64 time=6.179 ms
6008 bytes from 10.250.22.1: icmp_seq=1 ttl=64 time=6.173 ms

^C

--- 10.250.22.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss

round-trip min/avg/max/stddev = 6.173/6.176/6.179/0.003 ms

Cisco Implementation:

The Cisco implementation is a bit different from Juniper's. There you can specify the MTU for different families, and if the IP MTU is larger than the physical interface MTU, it will not give you an error like Juniper does. But if only the physical interface MTU is defined, the underlying protocols will inherit their MTU settings from the physical interface. Another difference to understand is that the “show interface” command on the Cisco CLI shows only the physical interface MTU; to check the IP MTU on the same interface, we need to run the “show ip interface” command.

Here is an example from my Cisco router, showing the physical interface MTU of 1500 (default) and the IP protocol MTU configured as 1300B (1500 by default).

Router(config)# interface f0/0
Router(config-if)# ip mtu 1300

Router# show interface f0/0

FastEthernet0/0 is up, line protocol is up
Hardware is Gt96k FE, address is c200.5867.0000 (bia c200.5867.0000)
Internet address is 10.0.0.1/24
MTU 1500 bytes, BW 10000 Kbit, DLY 1000 usec,
reliability 255/255, txload 1/255, rxload 1/255

Router# show ip interface f0/0

FastEthernet0/0 is up, line protocol is up
Internet address is 10.0.0.1/24
Broadcast address is 255.255.255.255
Address determined by setup command
MTU is 1300 bytes

When we PING from the Cisco CLI with size 1000B, it means,

This 1000B consists of the ICMP payload, the ICMP header (8B) and the IP header (20B), which will be encapsulated in 14B of Layer 2 Ethernet frame overhead. So the actual bytes on the wire will be 1000+14=1014B. But the important point here is that the payload actually transferred is only 1000B - 8B (ICMP) - 20B (IP) = 972B. NOTE: if we are testing a customer application/service via IXIA or another testing tool (not via PING), then we need to consider the ICMP & IP header bytes along with the DATA payload.
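To make the difference from Juniper concrete, here is a minimal Python sketch of the Cisco-style accounting, where the ping size already includes the IP and ICMP headers. The function names are made up for illustration, and the conversion helper simply applies the 28B header difference described above.

```python
IPV4_HEADER = 20
ICMP_HEADER = 8
ETHERNET_HEADER = 14


def cisco_ping_breakdown(size):
    """Split a Cisco-style ping size into ICMP payload and bytes on the wire."""
    return {
        "icmp_payload": size - IPV4_HEADER - ICMP_HEADER,
        "wire_bytes": size + ETHERNET_HEADER,
    }


def cisco_size_for_juniper_size(juniper_size):
    """Cisco ping size that carries the same ICMP payload as a Juniper ping size."""
    return juniper_size + ICMP_HEADER + IPV4_HEADER


print(cisco_ping_breakdown(1000))         # {'icmp_payload': 972, 'wire_bytes': 1014}
print(cisco_size_for_juniper_size(1000))  # 1028
```

So a "size 1000" ping means different things on the two CLIs, which matters when comparing MTU test results between vendors.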

Now, what is the maximum size PING we can send to a remote host via interface f0/0 (over an IP network, no MPLS)? The answer is pretty simple: the packet size equals the configured IP MTU value = 1300B, because it already contains all the overheads of IP & ICMP. Let’s test it,

Router#ping 10.0.0.100 size 1001 df-bit

Type escape sequence to abort.

Sending 5, 1001-byte ICMP Echos to 10.0.0.100, timeout is 2 seconds:

Packet sent with the DF bit set
.…

Success rate is 0 percent (0/3)

Router#ping 10.0.0.100 size 1000 df-bit

Type escape sequence to abort.

Sending 5, 1000-byte ICMP Echos to 10.0.0.100, timeout is 2 seconds:

Packet sent with the DF bit set

!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 9/18/39 ms

Router#

Friday, December 23, 2011

JUNOS Link Aggregation Configuration

Objective: Creating an aggregated interface between two switches (Cisco calls it EtherChannel)

First step
Define how many aggregated interfaces you want to have in JUNOS (for resource allocation)

command is: set chassis aggregated-devices ethernet device-count x

root# set chassis aggregated-devices ethernet device-count ?
Possible completions:
Number of aggregated Ethernet devices (1..32)
[edit]
root# set chassis aggregated-devices ethernet device-count 1
[edit]
root# commit
commit complete
[edit]
root#

Second step

Delete unit 0 (the plain L2 switching configuration) from each member interface, so that spanning tree is disabled on the individual links that will join the aggregated ethernet.

root# delete interfaces ge-0/0/1 unit 0
[edit]
root# delete interfaces ge-0/0/2 unit 0

Third Step

root# set interfaces ge-0/0/1 ether-options 802.3ad ae0
[edit]
root# set interfaces ge-0/0/2 ether-options 802.3ad ae0

ae0 is aggregated ethernet 0; we have simply linked ge-0/0/1 and ge-0/0/2 to ae0.
To see the spanning tree information, exit from [edit] or use the "run" command as a prefix to your show command:

root# exit
Exiting configuration mode

root> show spanning-tree interface
Spanning tree interface parameters for instance 0
Interface Port ID Designated Designated Port State Role
port ID bridge ID Cost
ge-0/0/0.0 128:513 128:513 32768.001f1231b840 20000 BLK DIS
ge-0/0/3.0 128:516 128:516 32768.001f1231b840 20000 BLK DIS

As we can see, ge-0/0/1 and ge-0/0/2 are no longer listed, and ae0 is not participating in spanning tree yet.

Fourth Step

root# set interfaces ae0 unit 0 family ethernet-switching
[edit]
root# commit
commit complete
root# exit
Exiting configuration mode

root> show spanning-tree interface

Spanning tree interface parameters for instance 0
Interface Port ID Designated Designated Port State Role
port ID bridge ID Cost
ae0.0 128:1 128:514 32768.001f1231b780 10000 FWD ROOT
ge-0/0/0.0 128:513 128:513 32768.001f1231b840 20000 BLK DIS
ge-0/0/3.0 128:516 128:516 32768.001f1231b840 20000 BLK DIS

root> show interfaces terse

Interface Admin Link Proto Local Remote
ge-0/0/0 up down
ge-0/0/0.0 up down eth-switch
ge-0/0/1 up up
ge-0/0/1.0 up up aenet --> ae0.0
ge-0/0/2 up up
ge-0/0/2.0 up up aenet --> ae0.0
ge-0/0/3 up down
ge-0/0/3.0 up down eth-switch

root> show interfaces ae0

Physical interface: ae0, Enabled, Physical link is Up
Interface index: 155, SNMP ifIndex: 143
Link-level type: Ethernet, MTU: 1514, Speed: 2000mbps, …

root> show configuration
## Last commit: 2008-07-27 14:21:15 UTC by root
version 9.0R2.10;
chassis {
aggregated-devices {
ethernet {
device-count 1;
}
}
}
interfaces {
ge-0/0/0 {
unit 0 {
family ethernet-switching;
}
}
ge-0/0/1 {
ether-options {
802.3ad ae0;
}
}
ge-0/0/2 {
ether-options {
802.3ad ae0;
}
}
ae0 {
unit 0 {
family ethernet-switching;
}
}
vme {
unit 0 {
family inet {
address 192.168.1.253/24;
}
}
}
}
root> show ethernet-switching interfaces
Interface State VLAN members Blocking
ae0.0 up default unblocked
ge-0/0/0.0 down default blocked - blocked by STP/RTG
ge-0/0/3.0 down default blocked - blocked by STP/RTG

To enable LACP for link aggregation [passive/active]:

ae0 {
aggregated-ether-options {
lacp {
passive;
}
}
}

Fifth Step
to enable trunk:

root# set interfaces ae0 unit 0 family ethernet-switching port-mode trunk
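If you build bundles like this often, the five steps can be generated instead of typed. The following Python sketch only assembles the same configuration-mode commands used in the steps above; the function and its parameters are hypothetical helpers, and the LACP line is simply the set-form of the hierarchy shown earlier, so check the output against your own JUNOS release before pasting it.

```python
def lag_set_commands(ae_name, members, device_count=1, lacp_mode=None, trunk=False):
    """Build the configuration-mode commands for one aggregated ethernet bundle."""
    # Step 1: reserve resources for the aggregated devices.
    commands = [
        f"set chassis aggregated-devices ethernet device-count {device_count}"
    ]
    for member in members:
        # Step 2: remove the plain L2 switching unit from the member link.
        commands.append(f"delete interfaces {member} unit 0")
        # Step 3: attach the member link to the bundle.
        commands.append(f"set interfaces {member} ether-options 802.3ad {ae_name}")
    # Steps 4 and 5: enable switching on the bundle, optionally as a trunk.
    family = f"set interfaces {ae_name} unit 0 family ethernet-switching"
    commands.append(family + (" port-mode trunk" if trunk else ""))
    if lacp_mode is not None:
        if lacp_mode not in ("active", "passive"):
            raise ValueError("lacp_mode must be 'active' or 'passive'")
        commands.append(
            f"set interfaces {ae_name} aggregated-ether-options lacp {lacp_mode}"
        )
    return commands


for line in lag_set_commands("ae0", ["ge-0/0/1", "ge-0/0/2"],
                             lacp_mode="passive", trunk=True):
    print(line)
```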

Overview configuration (with multiple VLANs)

chassis {
aggregated-devices {
ethernet {
device-count 1;
}
}
}
interfaces {
ge-0/0/0 {
unit 0 {
family ethernet-switching;
}
}
ge-0/0/1 {
ether-options {
802.3ad ae0;
}
}
ge-0/0/2 {
ether-options {
802.3ad ae0;
}
}
ae0 {
aggregated-ether-options {
lacp {
passive;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
}
}
}
vlan {
unit 10 {
family inet {
address 192.168.10.1/24;
}
}
unit 11 {
family inet {
address 192.168.11.1/24;
}
}
}
vme {
unit 0 {
family inet {
address 192.168.1.253/24;
}
}
}
}
protocols {
lldp {
interface all;
}
rstp;
}
vlans {
PC1 {
vlan-id 10;
interface {
ge-0/0/10.0;
}
l3-interface vlan.10;
}
PC2 {
vlan-id 11;
interface {
ge-0/0/11.0;
}
l3-interface vlan.11;
}
PC3 {
vlan-id 30;
interface {
ge-0/0/13.0;
}
}
}
virtual-chassis {
member 0 {
mastership-priority 255;
}
}
poe {
interface all;
}
root>

Note the output of show commands:

root# show vlans
PC1 {
vlan-id 10;
interface {
ge-0/0/10.0;
}
l3-interface vlan.10;
}
PC2 {
vlan-id 11;
interface {
ge-0/0/11.0;
}
l3-interface vlan.11;
}
PC3 {
vlan-id 30;
interface {
ge-0/0/13.0;
}
l3-interface vlan.30;
}

root# run show vlans
Name Tag Interfaces
PC1 10
ge-0/0/10.0
PC2 11
ge-0/0/11.0*
PC3 30
ge-0/0/13.0
default
ge-0/0/0.0, ge-0/0/3.0…
mgmt
me0.0
[edit]

Friday, November 25, 2011

Junosphere: the first impressions

Courtesy - Ivan Pepelnjak

Junosphere is a great network-in-the-clouds solution from Juniper. You might be familiar with Olive, the “non-existent” way of running Junos on an x86 machine (including a VM); Junosphere is the supported version of the same concept, including a real forwarding plane (it’s my understanding Olive lacks that, which makes certain protocols behave in unexpected ways).

Compared to other similar offerings (including our remote labs and Cisco’s IOS-in-a-Cloud), Junosphere has several significant advantages:

You can create your own topologies that include as many routers as you need, allowing you to recreate complex routing/migration scenarios. Although the topology file format is a bit arcane at the moment, I had no problems creating my own topologies (but you already know I’m more than a bit crazy). For saner people, there’s a tool that can take your OSPF or IS-IS database and turn it into a Junosphere topology.

The VJX1000 router (Junos-in-a-VM) supports Gigabit Ethernet interfaces and Junosphere allows you to connect them together with simple virtual bridges. Think of those bridges as cables/hubs; unlike Dynamips, Junosphere bridges have no explicit VLAN support or access/trunk links. Serial or POS interfaces are also not available.

Just keep adding topologies

You can access your virtual routers directly using Juniper’s SSL VPN solution. After starting the SSL VPN connection, you can use SSH to connect to the devices or SCP to copy files to/from them.
Obviously, once you have SSH up and running, you can test all sorts of Junos automation/SDK tricks (starting with NETCONF).

You can connect physical devices to Junosphere. A Junosphere connector (a VM running in VMware environment – be it VMware Player, Workstation or ESX) can establish a link between any Junosphere bridge and an interface (vNIC) in your workstation/hypervisor host. You can use it to connect Junosphere LAN to your physical interface or to anything else VMware player can use (including a tap interface of a Linux box ... you do know why that’s interesting, don’t you?).

Unfortunately, somehow the SSL VPN Java applet didn’t work on my Linux machine (Fedora 14 with Firefox 3.6.23) where I run all the other simulation stuff; I had to use Internet Explorer on my Windows laptop to connect to the labs.

You can load and save your topologies and configurations. This was one of the best features (from my perspective). The default configuration of the VJX routers includes an event trigger that transfers the current configuration to an FTP server every time you execute a commit. Regardless of what you do, a copy of the configuration is always in a safe place and can be saved through the web-based UI and later copied (as a .tgz file) to your workstation.

You can choose the Junos release you want to run. At the moment, the set of releases you can choose from is fixed, but it does include a stable-and-supported release, an experimental release (11.4) and a few others.

You can run other VMs in the same sandbox, including CentOS servers, Junos Space and a few test tools.

Do I like Junosphere? Absolutely. Are there any drawbacks? Sure, like every other system Junosphere has a few glitches, from UI that could use some improvements to minor configuration nuisances that can play havoc with the configuration saving feature ... but the major roadblock is the current pricing and go-to-market strategy.

The current list price for Junosphere is $5/router/day (Amazon’s small EC2 instance costs $2.04 per day and is charged by the minute), and you can only purchase it through Juniper’s regular sales channels (including partners). That makes perfect sense if you’re working on a customer demo, proof-of-concept or a migration scenario for a large enterprise network ... and you have direct contact with Juniper or got Junosphere access as a deal-closing sweetener. But do you really think a Juniper partner would be interested in getting a $250 purchase order for 10-day access to a 5-router Junosphere environment? How about the simple use-your-credit-card approach Cisco is using with its e-learning labs?

The per-day charging model is another pain point. With proper preparation, planning and scheduling, the current model could work for me or someone who has to get fluent with Junos really fast to support the next project. Obviously I would be throwing away more than two thirds of the allotted time because I’m too old to work on the routers for more than 8-10 hours a day, but paying $50/day (10-router topology) for something that helps you earn real money shouldn’t be a showstopper.

However, I really like the ability to run a lab for an hour or so to test the next idea that hatched in the back of my brains while I was working on something else. Paying for the whole day just to be able to test a few things might not be too expensive in absolute terms, but definitely feels like a total waste of money.

Juniper’s marketing is doing a great job trying to persuade networking engineers to embrace Junos – from Day One books to Junos as a second language and FastTrack programs. It’s too bad they’re not taking the final step and giving everyone interested in kicking some Junos tires (or working really hard on mastering another platform) simple on-demand access to a live Junos environment.

Wednesday, November 23, 2011

How To Forcibly Log a User out of a Juniper Router

user@router> show system users
5:32PM up 197 days, 4:11, 2 users, load averages: 0.41, 0.16, 0.06
USER      TTY  FROM          LOGIN@  IDLE  WHAT
shafiqul  d0   -             5:09PM  23    -cli (cli)
mustafa   p0   172.16.16.14  5:25PM  -     -cli (cli)

user@router> request system logout user shafiqul
OR
user@router> request system logout terminal d0

user@router> show system users
5:33PM up 197 days, 4:11, 1 user, load averages: 0.27, 0.14, 0.06
USER      TTY  FROM          LOGIN@  IDLE  WHAT
mustafa   p0   172.16.16.14  5:25PM  -     -cli (cli)
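
When there are more than a couple of sessions, picking the right TTY by eye gets tedious. The following is a minimal Python sketch, not a Juniper tool, that parses captured "show system users" text and builds the matching logout command. The function names are hypothetical, and it assumes every row has all six columns filled in, as in the output above (the console session shows "-" in the FROM column).

```python
def parse_system_users(output):
    """Return one dict per session found in 'show system users' text."""
    lines = output.strip().splitlines()
    rows = []
    # Everything after the "USER TTY ..." header row is a session row.
    for idx, line in enumerate(lines):
        if line.split()[:2] == ["USER", "TTY"]:
            rows = lines[idx + 1:]
            break
    sessions = []
    for row in rows:
        fields = row.split(None, 5)  # WHAT may contain spaces
        if len(fields) < 6:
            continue  # assumption: skip rows that do not have all six columns
        user, tty, source, login, idle, what = fields
        sessions.append({"user": user, "tty": tty, "from": source,
                         "login": login, "idle": idle, "what": what})
    return sessions


def logout_command(session):
    """Build the operational-mode command that logs out one session by TTY."""
    return "request system logout terminal " + session["tty"]


SAMPLE = """\
user@router> show system users
5:32PM up 197 days, 4:11, 2 users, load averages: 0.41, 0.16, 0.06
USER TTY FROM LOGIN@ IDLE WHAT
shafiqul d0 - 5:09PM 23 -cli (cli)
mustafa p0 172.16.16.14 5:25PM - -cli (cli)
"""

if __name__ == "__main__":
    for s in parse_system_users(SAMPLE):
        print(s["user"], "->", logout_command(s))
```

Logging out by TTY rather than by user name targets a single session, which matters when the same user is logged in more than once.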

Wednesday, November 16, 2011

How OSPF SPF Adaptive Timers are implemented in IOS and JUNOS

Cisco Systems and Juniper Networks have both achieved strong market penetration, and most operators and providers deploy platforms from both of them. Networking engineers, especially those working in operator environments, therefore need to know how each vendor’s platforms are architected, how their operating systems are structured, and how to configure them. That is still not enough for design engineers, who have to make sure their multi-vendor networks interoperate and converge without issues; they have to dig deeper and understand how each of the leading vendors implements the technologies and follows the RFCs.

Today we will start by explaining how Cisco Systems and Juniper Networks implement OSPF SPF adaptive timers, also called SPF throttling (Cisco) or SPF hold-down (Juniper).
Before we dig into that, let’s talk a little bit about what OSPF SPF adaptive timers are designed to do for us, and then we’ll take a look at how each vendor implements the concept.

Recall from our OSPF background that the SPF algorithm is designed to run upon the arrival of LSAs. So, if each LSA triggers a full or incremental SPF run and LSAs are arriving fast, SPF can begin eating up the majority of your CPU.

The challenge in large-scale networks is to quickly react to network changes while at the same time not allowing SPF calculations to dominate the route processors. This is the goal of SPF delay, also called SPF hold-down or SPF throttling.

Rather than kick off an SPF calculation every time a new LSA/LSP arrives, SPF delay forces the router to wait a bit between SPF runs. If a large number of LSA/LSPs are being flooded, a delay between SPF runs means that more LSA/LSPs are added to the link state database during the hold-down period. Efficiency is then increased because when the hold-down period expires and SPF is run, more network changes are included in a single calculation.
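
The batching effect described above can be illustrated with a minimal Python sketch. It is a toy model, not vendor code: the function name is hypothetical, SPF is treated as instantaneous, and any LSA that arrives while a run is already scheduled is assumed to be covered by that run.

```python
def spf_run_times(lsa_times, delay):
    """Return the times at which SPF runs for a fixed SPF delay (seconds)."""
    runs = []
    for t in sorted(lsa_times):
        # An LSA arriving before the pending run fires is absorbed by that run.
        if runs and t <= runs[-1]:
            continue
        runs.append(t + delay)
    return runs


if __name__ == "__main__":
    burst = [0.0, 0.1, 0.2, 0.3, 5.0]  # four LSAs in a burst, one straggler
    print("no delay :", len(spf_run_times(burst, 0.0)), "SPF runs")
    print("1 s delay:", len(spf_run_times(burst, 1.0)), "SPF runs")
```

With no delay every LSA costs one SPF run; with a one-second delay the burst collapses into a single run, at the price of reacting one second later.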

But the efficiency you get from SPF delay has a cost: it increases your network convergence time. So, the challenge is to make the delay interval long enough when abnormal things happen while keeping it short when the network is stable, so that you get quick convergence. This leads to the concept of adaptive SPF timers.

Both Cisco and Juniper offer adaptive SPF timers, but with different approaches. In the coming sections, we explain the mechanism used by each vendor.

Adaptive SPF Timers in JUNOS

Juniper Networks uses a linear fast/slow algorithm for adaptive SPF timers. The first parameter is the SPF delay timer, which is the minimum delay between the detection of a topology change and the moment the SPF algorithm actually runs. This period is 200ms by default and is configurable with the delay statement (under spf-options) to between 50 and 8000ms.

The second parameter is rapid-runs. If three (the default) SPF runs are triggered in quick succession, indicating instability in the network, the router enters “slow mode” and a third parameter, the hold-down timer, starts. Any subsequent SPF calculation is not run until the hold-down timer expires. The router remains in this “slow mode” until the hold-down period has passed since the last SPF run, indicating that the network has converged; it then switches back to “fast mode” and reverts to the configured values for the delay and rapid-runs statements.

The default values for SPF calculations in JUNOS can be seen below:

Default SPF timers values in JUNOS

r2@r2> show ospf overview | match SPF
Full SPF runs: 280
SPF delay: 0.200000sec, SPF holddown: 5 sec, SPF rapid runs: 3

Changing SPF Timers in JUNOS

The configuration stanza for JunOS shows how these settings may be changed.

spf-options {
        delay milliseconds;
        holddown milliseconds;
        rapid-runs number;
}

These default values can be changed with the following command:

[edit protocols ospf]
r1@r1# set spf-options delay milliseconds holddown milliseconds rapid-runs number

Now we are going to play with the timers, run the debugs, and examine the behavior. We will set the delay to 1 sec and the hold-down timer to 20 sec while keeping rapid-runs at its default.

spf-options {
        delay 1000;
        holddown 20000;
}

The log entry below shows, on lines 2, 6 and 10, that each SPF run is scheduled 1 second after the LSA update. Once three SPF runs have completed, the router moves into a slower mode of operation.

01:50.465905 OSPF full SPF refresh scheduled for topology default

01:50.466445 OSPF SPF scheduled for topology default in 1s

01:51.467761 Starting full SPF refresh for topology default

02:04.540150 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

02:04.541073 OSPF full SPF refresh scheduled for topology default

02:04.541581 OSPF SPF scheduled for topology default in 1s

02:05.543546 Starting full SPF refresh for topology default

02:12.886187 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

02:12.892787 OSPF full SPF refresh scheduled for topology default

02:12.893285 OSPF SPF scheduled for topology default in 1s

02:13.894226 Starting full SPF refresh for topology default

The next log entry shows that SPF started 20 sec after the previous SPF run (at t=12:02:13). The default number of SPF calculations that can occur in succession is 3; the range that you can configure is from 1 through 5. Each SPF calculation runs after the configured SPF delay. When the maximum number of SPF calculations has occurred, the hold-down timer begins. We previously configured this to be 20 seconds. Any subsequent SPF calculation is not run until the hold-down timer expires. This is why the LSA update received on line 4 does not immediately trigger an SPF run.

02:20.739927 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

02:20.747717 OSPF full SPF refresh scheduled for topology default

02:20.756118 OSPF SPF scheduled for topology default in 13.140569s

02:26.990677 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

02:33.896073 Starting full SPF refresh for topology default
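
To check the 20-second spacing without doing the subtraction by hand, here is a small Python sketch that pulls the "Starting full SPF" timestamps out of a trace and prints the gaps between consecutive runs. The function names are hypothetical, and it assumes the timestamps have the minutes:seconds.microseconds form seen in these logs, with the hour omitted.

```python
def spf_start_times(trace):
    """Return SPF start times, in seconds, from trace text."""
    times = []
    for line in trace.splitlines():
        if "Starting full SPF" not in line:
            continue
        stamp = line.split()[0]  # e.g. "02:13.894226"
        minutes, seconds = stamp.split(":")
        times.append(int(minutes) * 60 + float(seconds))
    return times


def gaps(times):
    """Return the spacing between consecutive times, rounded to milliseconds."""
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


TRACE = """\
02:13.894226 Starting full SPF refresh for topology default
02:20.739927 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)
02:20.756118 OSPF SPF scheduled for topology default in 13.140569s
02:26.990677 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)
02:33.896073 Starting full SPF refresh for topology default
"""

if __name__ == "__main__":
    print(gaps(spf_start_times(TRACE)))
```

The single gap it reports for these lines is just over 20 seconds, which matches the configured hold-down value.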

Next, the log shows the router once again enters the fast mode…

02:59.734614 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

02:59.753923 OSPF full SPF refresh scheduled for topology default

02:59.754409 OSPF SPF scheduled for topology default in 1s

03:00.755847 Starting full SPF refresh for topology default

03:07.494415 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

03:07.501625 OSPF full SPF refresh scheduled for topology default

03:07.502166 OSPF SPF scheduled for topology default in 1s

03:08.503663 Starting full SPF refresh for topology default

03:57.215931 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

03:57.223481 OSPF full SPF refresh scheduled for topology default

03:57.223998 OSPF SPF scheduled for topology default in 1s

03:58.225848 Starting full SPF refresh for topology default

We can also observe from the previous log that although 3 more SPF runs have taken place, the router does not move into slow mode again. This is because almost a minute (03:00 to 03:58) passed between the first and the last SPF run in the set of 3. If the 3 SPF runs happen within 3 x “delay value”, or in our case 3 seconds, the router will start to throttle the number of SPF runs and start the holddown timer countdown. If the SPF runs fall outside 3 x the configured delay value, the rapid-run counter is reset to 0 and no back-off algorithms are run.

Now, shown in the next log snippet, the router will enter the slow mode and the holddown timer will start, because three SPF runs have occurred in succession.

04:03.364745 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

04:03.378123 OSPF full SPF refresh scheduled for topology default

04:03.378655 OSPF SPF scheduled for topology default in 1s

04:04.379888 Starting full SPF refresh for topology default

04:15.329694 OSPF rcvd LSUpdate 91.198.180.250 -> 224.0.0.5 (fxp1.100 IFL 95 area 0.0.0.0)

04:15.349992 OSPF full SPF refresh scheduled for topology default

04:15.350510 OSPF SPF scheduled for topology default in 1s

04:16.352016 Starting full SPF refresh for topology default

And finally, the following log shows that SPF again started 20 sec after the last SPF run (at t=12:04:16).

The figure below charts the debug output above, which should help in understanding how JunOS behaves with the SPF timers.
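
As a textual complement to the charted debug output, the fast/slow behaviour can be replayed with a short Python sketch. This is a simplified model under stated assumptions, not Juniper's implementation: the function name is hypothetical, SPF is treated as instantaneous, and the `window` that decides whether two runs count as "quick succession" is a guess. The text above gives 3 x delay for that window, but with that value the model would not enter slow mode for the timings in the debug output, so the sketch defaults the window to the hold-down value, which does reproduce them.

```python
def junos_spf_runs(lsa_times, delay, holddown, rapid_runs=3, window=None):
    """Return SPF run times (seconds) for a list of LSA arrival times."""
    if window is None:
        window = holddown  # assumption: chosen because it matches the logs
    runs = []
    streak = 0  # fast-mode runs counted as being in quick succession
    for t in sorted(lsa_times):
        if runs and t <= runs[-1]:
            continue  # absorbed by the run that is already scheduled
        last = runs[-1] if runs else None
        if streak >= rapid_runs and t < last + holddown:
            # Slow mode: the next run waits for the hold-down timer.
            runs.append(last + holddown)
            continue
        if streak >= rapid_runs:
            streak = 0  # hold-down passed with no SPF run: back to fast mode
        run_at = t + delay
        if last is not None and run_at - last > window:
            streak = 0  # runs too far apart to count as rapid
        streak += 1
        runs.append(run_at)
    return runs


if __name__ == "__main__":
    # Rounded timings in the spirit of the first two log excerpts.
    print(junos_spf_runs([0, 14, 22, 30, 36], delay=1, holddown=20))
```

With delay 1 and holddown 20, the first three LSAs each trigger a run one second later, and the fourth is held until 20 seconds after the third run, which is the pattern seen in the debug output.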

Adaptive SPF Timers in IOS

Cisco Systems introduced an exponential backoff algorithm for the adaptive SPF timers, using three configurable timers.
This exponential backoff limits the number of SPF computations during times of network instability by doubling the delay associated with the SPF run, up to a maximum hold delay, for the period of instability. When the period of instability ends, the delay is reset to the original value. Three timers are associated with SPF exponential backoff: Start Time, Initial-Hold Time, and Max-Hold Time.
IOS keeps an internal timer called the waiting-interval, and the SPF computation is delayed until it expires. When a topology change is received for the first time, the waiting-interval is set to the start timer, which is similar to the SPF delay in JUNOS, and the SPF computation is delayed by the value of the start timer. When the SPF computation completes, a new waiting-interval starts with the value of the initial-hold timer and the router enters “slow mode”. If there is a topology change during the waiting-interval, the SPF computation runs when the initial-hold timer expires. At the completion of that SPF computation the waiting-interval is set to twice the value of the initial-hold timer, and the cycle repeats. So for example, if the start timer is 100ms and the initial-hold timer is 1000ms, the router delays the first SPF run by 100ms, the second by 1000ms, the third by 2000ms, the fourth by 4000ms, and so on.
The waiting-interval grows exponentially as 2^t * initial-hold until it reaches the max-hold time. After this, any topology change during the current waiting-interval causes the next SPF computation to run at the expiration of the max-hold time, with the next waiting-interval equal to the constant max-hold timer. This ensures that exponential growth is limited. If SPF has not run for twice the time specified by the max-hold timer, the router switches back to “fast” mode, in which the start delay timer is used and the waiting-interval is reset to its initial value.
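
The growth of the waiting-interval is easy to tabulate. The Python sketch below is only an illustration of the doubling-and-capping rule as described here; the function name is hypothetical and all values are in milliseconds.

```python
def ios_wait_intervals(start, hold, max_wait, count):
    """Return the first `count` waiting-intervals during sustained instability."""
    intervals = []
    wait = start
    for i in range(count):
        intervals.append(wait)
        if i == 0:
            wait = min(hold, max_wait)      # after the first run: initial-hold
        else:
            wait = min(wait * 2, max_wait)  # then double, capped at max-hold
    return intervals


if __name__ == "__main__":
    # The example from the text: start 100 ms, initial-hold 1000 ms.
    print(ios_wait_intervals(100, 1000, 10000, 4))
    # Start 1 s, initial-hold 5 s, max-hold 50 s.
    print(ios_wait_intervals(1000, 5000, 50000, 7))
```

The first call reproduces the 100, 1000, 2000, 4000 ms sequence from the example; the second shows the interval flattening out once it reaches the max-hold value.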
The default values for SPF calculations in IOS can be seen below:

Default SPF timers values in IOS

R2#sh ip ospf | i SPF
Initial SPF schedule delay 5000 msecs
Minimum hold time between two consecutive SPFs 10000 msecs
Maximum wait time between two consecutive SPFs 10000 msecs

Changing SPF Timers in IOS

These default values can be changed with the following command:

R1(config)# router ospf 100
R1(config-router)# timers throttle spf spf-start spf-hold spf-max-wait

As we did above with JunOS, we will play with the SPF throttle timers, run the debugs, and examine the behavior. We will set the start delay timer to 1 sec, the initial-hold timer to 5 sec and the max-hold timer to 50 sec.
The log entry below shows, on line 2, that SPF runs at t=21:30, which is 1 second after the LSA update, and that the next wait_interval is set to the initial-hold time of 5 sec, as shown on line 5.

21:29: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

21:30: OSPF: Begin SPF at 54881.208ms, process time 13316ms

21:30:       spf_time 15:14:43.672, wait_interval 1000ms

21:30: OSPF: End SPF at 54889.600ms, Total elapsed time 84ms

21:30:       Schedule time 15:14:44.672, Next wait_interval 5000ms

The next log entry shows that the waiting_interval doubles after each SPF run. Starting with a waiting_interval of 5 sec, the initial-hold timer, as shown on line 3, the waiting_interval on line 8 is 10 sec, then 20 sec on line 14 and 40 sec on line 22.
While the router is in slow mode, no SPF runs until the wait_interval elapses, no matter how many topology changes have been detected. This is why the LSA updates received on lines 11 and 12, and also on lines 17 to 20, do not immediately trigger an SPF run.

21:32: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

21:35: OSPF: Begin SPF at 54889.600ms, process time 13424ms

21:35:       spf_time 15:14:44.672, wait_interval 5000ms

21:35: OSPF: End SPF at 54889.672ms, Total elapsed time 72ms

21:35:       Schedule time 15:14:49.672, Next wait_interval 10000ms 

21:41: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

21:45: OSPF: Begin SPF at 54899.672ms, process time 13516ms

21:45:       spf_time 15:14:49.672, wait_interval 10000ms

21:45: OSPF: End SPF at 54899.720ms, Total elapsed time 48ms

21:45:       Schedule time 15:14:59.720, Next wait_interval 20000ms  

21:58: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

22:03: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

22:05: OSPF: Begin SPF at 54919.720ms, process time 13580ms

22:05:       spf_time 15:14:59.720, wait_interval 20000ms

22:05: OSPF: End SPF at 54919.776ms, Total elapsed time 56ms

22:05:       Schedule time 15:15:19.776, Next wait_interval 40000ms 

22:22: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

22:27: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

22:32: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

22:39: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

22:45: OSPF: Begin SPF at 54959.776ms, process time 13684ms

22:45:       spf_time 15:15:19.776, wait_interval 40000ms

22:46: OSPF: End SPF at 54959.884ms, Total elapsed time 108ms

22:46:     Schedule time 15:15:59.884, Next wait_interval 50000ms

The next log entry shows that the waiting_interval has reached the max-hold time (50 sec) and that each subsequent waiting_interval equals the constant max-hold timer, as on lines 3 and 8.

23:17: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

23:36: OSPF: Begin SPF at 55009.884ms, process time 13808ms

23:36:       spf_time 15:15:59.884, wait_interval 50000ms

23:36: OSPF: End SPF at 55009.928ms, Total elapsed time 44ms

23:36:       Schedule time 15:16:49.928, Next wait_interval 50000ms 

24:26: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

24:26: OSPF: Begin SPF at 55059.928ms, process time 13872ms

24:26:       spf_time 15:16:49.928, wait_interval 50000ms

24:26: OSPF: End SPF at 55059.968ms, Total elapsed time 40ms

24:26:       Schedule time 15:17:39.968, Next wait_interval 50000ms

We can also observe from the previous log that although the LSA on line 6 arrived about 50 sec after the last SPF run (23:36 to 24:26), by which time the waiting_interval had run out, the router does not move into fast mode again. This is because the router only reverts to fast mode when SPF has not run for twice the time specified by the max-hold timer.
In the next log snippet, the router is back in fast mode and the start timer is used again (wait_interval 1000ms), because SPF has not run for more than 100 sec, which is twice the time specified by the max-hold timer.

26:22: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

26:23: OSPF: Begin SPF at 55177.420ms, process time 13932ms

26:23:       spf_time 15:19:36.420, wait_interval 1000ms

26:23: OSPF: End SPF at 55177.488ms, Total elapsed time 68ms

26:23:       Schedule time 15:19:37.488, Next wait_interval 5000ms

26:29: OSPF: Detect change in LSA type 1, LSID 2.2.2.2, from 2.2.2.2 area 0

26:28: OSPF: Begin SPF at 55177.488ms, process time 14016ms

26:28:       spf_time 15:19:37.488, wait_interval 5000ms

26:28: OSPF: End SPF at 55184.660ms, Total elapsed time 108ms

26:28:       Schedule time 15:19:42.660, Next wait_interval 10000ms

For more clarity, I have reflected the debugs in the following figure, so you can use both the debugs and the figure to examine the behavior.
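
The same sequence of events can also be replayed in a few lines of Python. This is a simplified model of the behaviour described in this post, not Cisco's implementation: the function name is hypothetical, times are whole seconds counted from 21:00 in the debugs, SPF is treated as instantaneous, and the return to fast mode follows the "twice the max-hold timer" rule stated above.

```python
def ios_spf_runs(lsa_times, start, hold, max_wait):
    """Return SPF run times (seconds) for a list of LSA detection times."""
    runs = []
    wait = None  # current waiting-interval, set after the first run
    for t in sorted(lsa_times):
        if runs and t <= runs[-1]:
            continue  # absorbed by the run that is already scheduled
        if not runs or t - runs[-1] >= 2 * max_wait:
            # Fast mode: only the start delay applies.
            runs.append(t + start)
            wait = min(hold, max_wait)
        else:
            # Slow mode: run when the waiting-interval expires (or now if it has).
            runs.append(max(t, runs[-1] + wait))
            wait = min(wait * 2, max_wait)
    return runs


if __name__ == "__main__":
    # LSA detection times from the debugs, as seconds after 21:00.
    lsas = [29, 32, 41, 58, 63, 82, 87, 92, 99, 137, 206, 322, 329]
    for run in ios_spf_runs(lsas, start=1, hold=5, max_wait=50):
        print("SPF at %d:%02d" % (21 + run // 60, run % 60))
```

The run times it prints line up with the "Begin SPF" entries to within a second in all but the last case (the 26:28 entry, which precedes its triggering LSA in the debugs), including the return to the 1-second start delay at 26:23.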

Tuesday, November 15, 2011

Best Practices for cleaning the Juniper router

1) request system zeroize

This erases all data, including configuration and log files. The "show system commit" data was not deleted. On the other hand, all the old configuration files (juniper.conf.[1-3].gz) were deleted and the router will come up with the factory-default configuration.

2) To delete the "show system commit" information, use the following technique:

root> start shell sh

# echo "" > /var/db/commits

# exit

3) rm -rf /var/home/*

Friday, November 11, 2011

Reasons for the Junos MX Series MPC Crash

View Bulletin PSN-2011-08-327

Title: MX Series MPC crash in Ktree::createFourWayNode after BGP UPDATE

Products Affected: This issue can affect any MX Series router with port concentrators based on the Trio chipset -- such as the MPC or embedded into the MX80 -- with active protocol-based route prefix additions/deletions occurring.

Platforms Affected: Security, JUNOS 11.x, MX-series, JUNOS 10.x, SIRT Security Advisory, SIRT Security Notice

Revision Number 1

Issue Date: 2011-08-08

PSN Issue :

MPCs (Modular Port Concentrators) installed in an MX Series router may crash upon receipt of very specific and unlikely route prefix install/delete actions, such as a BGP routing update. The set of route prefix updates is non-deterministic and exceedingly unlikely to occur. Junos versions affected include 10.0, 10.1, 10.2, 10.3, 10.4 prior to 10.4R6, and 11.1 prior to 11.1R4. The trigger for the MPC crash was determined to be a valid BGP UPDATE received from a registered network service provider, although this one UPDATE was determined to not be solely responsible for the crashes. A complex sequence of preconditions is required to trigger this crash. Both IPv4 and IPv6 routing prefix updates can trigger this MPC crash.

There is no indication that this issue was triggered maliciously. Given the complexity of conditions required to trigger this issue, the probability of exploiting this defect is extremely low.

The assertions (crash) all occurred in the code used to store routing information, called Ktree, on the MPC. Due to the order and mix of adds and deletes to the tree, certain combinations of address adds and deletes can corrupt the data structures within the MPC, which in turn can cause this line card crash. The MPC recovers and returns to service quickly, and without operator intervention.

This issue only affects MX Series routers with port concentrators based on the Trio chipset, such as the MPC or embedded into the MX80. No other product or platform is vulnerable to this issue.

Solution:

The Ktree code has been updated and enhanced to ensure that combinations and permutations of routing updates will not corrupt the state of the line card. Extensive testing has been performed to validate an exceedingly large combination and permutation of route prefix additions and deletions.

All Junos OS software releases built on or after 2011-08-03 have fixed this specific issue. Releases containing the fix specifically include: 10.0S18, 10.4R6, 11.1R4, 11.2R1, and all subsequent releases (i.e. all releases built after 11.2R1).

This issue is being tracked as PR 610864. While this PR may not be viewable by customers, it can be used as a reference when discussing the issue with JTAC.

KB16765 - "In which releases are vulnerabilities fixed?" describes in which releases vulnerabilities are fixed, as per our End of Engineering and End of Life support policies.

Workarounds

No known workaround exists for this issue.

Network Enhancers - "Delivering Beyond Boundaries"
