Monday, January 14, 2013

Storage 101 - Part 1

 

Introduction

It seems that everywhere you turn these days, you hear someone talking about Storage Area Network (SAN) storage. You might have heard someone say that SANs are complicated or expensive, but you might have wondered how a SAN differs from a traditional network. In this article series, I will discuss the basics of Storage Area Networking. My plan is to start out by talking about what a SAN is (and is not). From there, I want to talk about some of the hardware that is used in a SAN, as well as some common SAN architectures. Reading this article series won’t make you a SAN expert, but it should give you a much better understanding of Storage Area Networking.

What is a Storage Area Network?

I will never forget the first time I ever heard someone mention a SAN. Many years ago, a friend called me on the phone, excited because he had just implemented a SAN. When I asked him what a SAN was, he told me that his storage was tied directly to the network. I remember wondering what the big deal was, since networked storage had been around for years.
 
At that point I had never heard of a SAN, so perhaps I misunderstood my friend's explanation. It's also possible, however, that my friend didn't fully understand what a SAN was either. In either case, his definition of a SAN was somewhat accurate, but completely inadequate.
 
A SAN is a collection of networked storage devices, as my friend had said, but a SAN is completely different from Network Attached Storage (NAS), which is also a form of networked storage.
 
There are three main ways in which a SAN differs from NAS. First, a SAN uses different hardware than NAS does. Second, SANs utilize different protocols than NAS devices do. Third, SANs read and write data in a different way than NAS does.
 
To show you what I mean, consider the nature of a NAS device. There are a lot of different types of NAS devices on the market, and some are more sophisticated than others, but generally speaking, a NAS device is an appliance that connects to the network via one or more Ethernet cables. A NAS appliance contains one or more disks, and is usually configurable through a Web interface. This interface usually allows the device’s storage to be partitioned or to be used as a RAID array.
 
Once the NAS device is put into production, the device is treated much like a typical file server. Users connect to the NAS device through an Ethernet connection using the TCP/IP protocol.
 
Depending on the type of NAS, it might support file sharing protocols such as SMB (often carried over NetBIOS over TCP/IP) or NFS. In any case, you can think of a NAS device as a self-contained file server.
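To make the file-level model concrete, here is a minimal Python sketch. Once a NAS share is mounted (or mapped), reading and writing files on it uses exactly the same file APIs as local storage; the SMB client and the appliance handle everything behind the scenes. The share path shown in the comment is hypothetical, and a temporary directory stands in for the mounted share so the sketch is runnable anywhere.

```python
import os
import tempfile

# In production this would be a mounted NAS share, e.g. r"\\nas01\public"
# (a hypothetical name); a temporary directory stands in for it here.
share = tempfile.mkdtemp()

# Writing and reading a file on the share uses ordinary file APIs --
# the application neither knows nor cares that the bytes live on a NAS.
path = os.path.join(share, "report.txt")
with open(path, "w") as f:
    f.write("quarterly numbers")

with open(path) as f:
    print(f.read())  # -> quarterly numbers
```

The point of the sketch is that with NAS, the unit of access is the file, and the appliance's own operating system decides how those files map onto its disks.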
 
SAN storage works completely differently from NAS storage. When a NAS is in use, a user with the proper permissions can connect directly to a NAS volume (through a file share) and read and write files.
 
SANs can be configured to provide similar functionality, but there is a lot going on behind the scenes. For starters, users cannot generally connect directly to SAN storage, because user workstations communicate with other computers on the network using TCP/IP. Although there are exceptions (iSCSI, for example, carries block storage traffic over ordinary TCP/IP networks), SAN storage is usually accessed through Fibre Channel.
 
Even though this difference in protocols might seem trivial, it actually hints at the very essence of a SAN. Networks that depend on TCP/IP and SMB are primarily designed to access file system data. In other words, these types of networks are ideally suited for reading and writing files that are stored on file servers, Web servers, etc.
 
In contrast, Fibre Channel doesn’t work at the file level, but rather at the storage block level. As such, you wouldn’t use Fibre Channel to read a file that is stored on a file share. Instead, Fibre Channel reads and writes individual storage blocks.
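The block-level model can be illustrated with a short Python sketch. A block protocol such as Fibre Channel addresses the device as a flat array of fixed-size blocks identified by number; there are no files or directories at this layer. Here an ordinary temporary file stands in for the block device, and the 512-byte block size is just a typical sector size chosen for illustration.

```python
import tempfile

BLOCK_SIZE = 512  # a typical sector size; 4 KiB sectors are also common

# An ordinary file stands in for the block device a SAN would expose.
dev = tempfile.TemporaryFile()
dev.truncate(BLOCK_SIZE * 8)  # a tiny eight-block "disk", zero-filled

def write_block(device, block_no, data):
    """Write one fixed-size block at its byte offset -- no files,
    no directories, just a block number."""
    assert len(data) == BLOCK_SIZE
    device.seek(block_no * BLOCK_SIZE)
    device.write(data)

def read_block(device, block_no):
    """Read back exactly one block by number."""
    device.seek(block_no * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)

write_block(dev, 3, b"x" * BLOCK_SIZE)
print(read_block(dev, 3)[:4])  # -> b'xxxx'
```

Deciding which blocks make up which file is the job of whatever file system sits on top of the device, which is exactly why the operating system can treat a SAN volume like a local disk.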
 
There are a couple of reasons why this seemingly trivial distinction is important. First, Fibre Channel offers much higher performance than a traditional TCP/IP network does. Although raw network throughput does play a role in the overall speed of the connection, the main reason Fibre Channel is so much faster than TCP/IP is that it is a more efficient protocol with less overhead, which allows it to move data more quickly.
 
The other reason why Fibre Channel's block-level storage interaction is significant is that Fibre Channel communicates directly (and natively) with the storage device. This means that in a SAN environment, it is possible to treat a remote storage device as if it were a local storage resource.
 
To give you a better idea of what I am talking about, consider what happens when you connect a Windows machine to a NAS device. The NAS storage gets mapped to a network drive. In the case of a SAN however, it is possible to get Windows to treat a SAN volume as local storage (as opposed to a network drive), even if the physical storage device is located remotely.
 
This is an important distinction because the Windows operating system treats local and networked storage differently. For example, there are Windows applications that can be installed to local storage, but not to a network drive. These types of applications can be installed to SAN storage however, because the Windows operating system does not distinguish between true local storage and SAN storage (at least not as far as the application is concerned).
 
Keep in mind that I am not saying that SAN storage is always treated as local storage, or that it cannot be used for anything else. Oftentimes the end users actually see SAN storage as a mapped network drive.
 
So how can this be? It all has to do with the fact that users' workstations do not normally connect directly to SAN storage. It is usually servers (or virtual workstations) that make use of SAN storage. Imagine, for example, that a file server is configured to use SAN storage instead of true local storage. The file server is connected to the SAN in a way that allows the storage to be treated as local. When end users attach to the file server, they might be accessing files that are stored on the SAN, but they are not directly connecting to the SAN. Instead, the users are connecting to the file server via TCP/IP. The file server is the only machine that accesses the SAN storage directly.
 
This architecture really isn't much different from what it would be if the file server were using direct-attached storage. Even if the file server's storage were truly local, the users wouldn't access the storage directly. The users communicate with the file server's operating system, and it is the operating system that hands off disk requests to the storage subsystem. Exactly the same thing happens if the file server is connected to a SAN. The only difference is that the storage is not local to the file server.
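The layering described above can be sketched in a few lines of Python. The class and method names here are purely illustrative (nothing in the article prescribes them); the point is that clients ask the file server for files, and only the file server translates those requests into block operations against the SAN volume.

```python
class SanVolume:
    """Stands in for a SAN volume: nothing but numbered blocks."""
    def __init__(self, blocks=8, block_size=512):
        self.block_size = block_size
        self.blocks = [b"\x00" * block_size for _ in range(blocks)]

    def write_block(self, n, data):
        self.blocks[n] = data

    def read_block(self, n):
        return self.blocks[n]


class FileServer:
    """Talks to clients in terms of files, and to the SAN in terms of blocks."""
    def __init__(self, volume):
        self.volume = volume
        self.index = {}     # filename -> block number (a toy "file system")
        self.next_free = 0

    def save(self, name, data):
        # Pad the data out to one block; a real file system would
        # of course span files across many blocks.
        padded = data.ljust(self.volume.block_size, b"\x00")
        self.volume.write_block(self.next_free, padded)
        self.index[name] = self.next_free
        self.next_free += 1

    def load(self, name):
        raw = self.volume.read_block(self.index[name])
        return raw.rstrip(b"\x00")


# A client only ever talks to the file server (over TCP/IP in real life);
# the file server is the only party that touches the SAN's blocks.
server = FileServer(SanVolume())
server.save("notes.txt", b"meeting at noon")
print(server.load("notes.txt"))  # -> b'meeting at noon'
```

Swap the toy `SanVolume` for direct-attached disks and nothing changes from the client's point of view, which is exactly the observation made above.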

Conclusion

In this article, I have explained that there are some major differences between Storage Area Networks (SAN) and Network Attached Storage (NAS). In Part 2 of this series, I will begin discussing the hardware that is used in a SAN.
