Storage Extension over SDH for Business
Continuity and Disaster Recovery Solutions
Compliance with Regulation of Distance Requirements
Between Main and Backup Sites
Introduction and Scope:
Tremendous growth in enterprise storage needs has created demand for reliable and
cost-effective backup and data protection solutions. Public awareness of the
need for business continuity and disaster recovery solutions is growing
dramatically, aided by reams of new data protection legislation. One aspect of
new legislation in the US and Europe (SEC, Health Insurance Portability
and Accountability Act (HIPAA), Sarbanes-Oxley, Basel II) is the strict
regulation of the distance required between the main data storage site and
backup sites. The emergence of storage extension solutions is a natural
evolution of the growing use of Storage Area Networks (SANs) within the
enterprise. These solutions provide data connectivity between separate SAN
islands within the enterprise using the Wide Area Network (WAN). Large amounts
of data pass between SAN islands, and a variety of storage protocols such as
Fibre Channel, ESCON, FICON and GbE are often used to facilitate the data
transfer.
In this white paper, we present the business case for using SDH as the WAN between
SAN islands. We describe the advantages of using SDH compared to other
alternatives, and review the latest standards (FC-BB-SDH, FC-BB-GFPT, GFP-F,
VCAT) that were developed to fulfill the special needs that arise when using
the SDH network for the transfer of large amounts of data. Finally, we discuss
implementation aspects and provide some examples of storage over SDH solutions.
Disaster Recovery and Business Continuity – The Objectives
The cost of downtime for an
enterprise varies dramatically from industry to industry. Each enterprise needs
to identify its critical data and define the amount of time it can allow to
elapse before recovering the data. Some sectors, such as finance and
healthcare, need to maintain
synchronous business continuity at all times to preserve their integrity,
maintain customer loyalty and avoid litigation. Other sectors can
endure hours of recovery time and some level of data loss. Regardless of a
particular enterprise’s needs, there is a clear trade-off between the cost
of the solution (including equipment and communication costs) and the recovery
objectives of the enterprise. Two main quantitative measures are used to
define these objectives:
Recovery Time Objective (RTO) – The time that elapses between the point of
data/application loss and the point where the data/application is available
again.
Recovery Point Objective (RPO) – The freshness of the data (in the backup site)
at the moment the disaster occurs, i.e. how much data has been lost since the
last backup.
These two objectives are illustrated in Figure-1.

Figure-1: Illustration of RTO and RPO
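The two definitions can be made concrete with a short Python sketch. It is only an illustration: the function name `recovery_metrics` and the three timestamps are hypothetical and do not come from any PacketLight tool. It measures the RPO and RTO actually achieved in one incident from the time of the last backup, the time of the disaster and the time service was restored.

```python
from datetime import datetime, timedelta
from typing import Tuple


def recovery_metrics(last_backup: datetime,
                     disaster: datetime,
                     restored: datetime) -> Tuple[timedelta, timedelta]:
    """Return (rpo, rto) achieved for a single incident (hypothetical helper)."""
    if not (last_backup <= disaster <= restored):
        raise ValueError("expected last_backup <= disaster <= restored")
    rpo = disaster - last_backup  # data written in this window is lost
    rto = restored - disaster     # data/application is unavailable in this window
    return rpo, rto


# Illustrative timestamps only: nightly backup at 02:00, disaster at 09:30,
# service restored at 13:30 on the same day.
rpo, rto = recovery_metrics(
    datetime(2024, 1, 1, 2, 0),
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 1, 13, 30),
)
print("RPO achieved:", rpo)
print("RTO achieved:", rto)
```

In this example 7.5 hours of data are lost and the service is down for 4 hours; an enterprise compares such achieved values with the objectives it has set.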
The capacity of the data to be
protected and the RPO/RTO dictate the speed of the link between the main site
and the backup sites. Since the amount of data that requires protection varies
in each enterprise and also reflects its individual risk management policies,
there is a range of software and hardware backup solutions that are tailored to
each specific application. Each of these requires different processing power
and communication capacity.
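As a rough illustration of how capacity and RTO translate into link speed, the sketch below computes the minimum sustained rate needed to move a given volume of data within a given time. The function name `required_link_mbps` is hypothetical, gigabytes are taken as decimal (10^9 bytes), and protocol overhead and recovery steps other than data transfer are ignored, so the result is a lower bound rather than a sizing rule.

```python
def required_link_mbps(capacity_gb: float, rto_hours: float) -> float:
    """Minimum sustained payload rate in Mb/s to move capacity_gb within rto_hours.

    Assumptions (not from the paper): decimal gigabytes, no protocol overhead,
    and the whole RTO is available for data transfer.
    """
    if capacity_gb < 0 or rto_hours <= 0:
        raise ValueError("capacity must be >= 0 and RTO must be > 0")
    bits = capacity_gb * 8e9
    seconds = rto_hours * 3600
    return bits / seconds / 1e6


# The same 500 GB volume under three different recovery time objectives.
for hours in (1, 8, 24):
    print(f"500 GB within {hours:>2} h needs about "
          f"{required_link_mbps(500, hours):.1f} Mb/s")
```

Tightening the RTO from a day to an hour raises the required rate from tens of Mb/s to more than 1 Gb/s, which is the trade-off between recovery objectives and communication cost described above.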
However, the basic storage needs
for enterprises can be broken down into three main categories.
Figure-2 demonstrates the trade-off between the functionality needed
and the cost required.

Figure-2: SAN solutions and their respective costs
Business Continuity: This is the highest level of data maintenance. It requires
synchronous mirroring, coupled in some cases with a need to share processing
resources between the main site and the secondary site. From a bandwidth perspective,
this is the most demanding application. The typical medium for this solution is
dark fiber with CWDM/DWDM and SDH fat pipes. The recovery time is close to zero
and the freshness of the data is nearly 100%.
Disaster Recovery: This mid-level data maintenance
solution requires high-speed replication that is not fully synchronous (i.e. there
is some gap between what is written at any time on the main site and on the backup
site). Recovery can take between a few minutes and a few hours, and some data
can be lost. This solution is less costly than business continuity and can
typically be implemented over CWDM/DWDM, SDH and IP networks.
Data Protection: The simplest method of data
maintenance, this solution can tolerate a few days for recovery and more significant
data loss. These applications can rely on SDH and IP networks. A few
standards (FCIP, iFCP) were developed to use the IP network for such
applications, where the RTO/RPO requirements are less stringent.
Figure-3 illustrates the use of
different WAN solutions with the associated RTO/RPO objectives.

Figure-3: WAN solutions associated with specific RTO/RPO objectives and applications
Figure-3 illustrates the
versatility of the SDH network as the WAN solution. Its flexible rates allow
usage in a variety of situations.
SDH – Network of Choice
Many factors need to be considered when choosing the WAN medium. These include
RTO/RPO, costs (lease cost, ownership cost and operational cost), distance
between sites, and scalability. Table-1 summarizes some of the features of each
WAN medium.
Dark Fiber and CWDM/DWDM: Dark fiber is an expensive
resource that can be used in cases where fiber is privately owned or dark fiber
capacity can be leased. Distance requirements may make this solution
impractical and the cost of fiber can be high when its capacity is not shared
by other applications. When using DWDM/CWDM, the capacity of each strand of
fiber is greatly enhanced. Service providers now supply the CWDM/DWDM
infrastructure over the WAN using Metro DWDM gear. This is the most practical
and economical solution when the required bandwidth is over 2.5Gb/s.
SDH: In contrast to dark fiber, the
public SDH network is widely available. It is the network of choice when the
required rates are less than 2.5Gb/s. The latest GFP/VCAT standards allow
greater network efficiency (NxVC4-n, NxVC4) by virtually concatenating free
time slots and forming a virtual fat container. SDH networks are highly
reliable due to stringent equipment design standards and replication planning
based on known and reliable protection schemes.
Operational support systems for the monitoring and activation of
services are in place, as well as trained manpower. Carriers are eager to
utilize the network for additional data services, and enterprises are keen to
outsource the operational task of maintaining a reliable link to the carriers –
thereby creating a win-win situation.
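The efficiency gain from virtual concatenation can be illustrated with a little arithmetic. The sketch below is a simplification under stated assumptions: each VC-4 member is taken to carry 149.76 Mb/s of payload, GFP framing overhead is ignored, and the client rates used (850 Mb/s for the data rate of 1Gb/s Fibre Channel, 1000 Mb/s for GbE) are supplied here for illustration rather than taken from the paper. The function names are hypothetical.

```python
import math

VC4_PAYLOAD_MBPS = 149.76  # payload capacity of one VC-4 (C-4 container)


def vc4_members_needed(client_mbps: float) -> int:
    """Smallest N such that an N x VC-4 virtual group carries client_mbps."""
    if client_mbps <= 0:
        raise ValueError("client rate must be positive")
    return math.ceil(client_mbps / VC4_PAYLOAD_MBPS)


def fill_ratio(client_mbps: float, members: int) -> float:
    """Fraction of the group's payload capacity used by the client signal."""
    return client_mbps / (members * VC4_PAYLOAD_MBPS)


# Illustrative client rates in Mb/s (assumptions, see the text above).
for name, rate in (("FC 1Gb/s full rate", 850.0),
                   ("FC sub-rate", 425.0),
                   ("GbE full rate", 1000.0)):
    n = vc4_members_needed(rate)
    print(f"{name}: {n} x VC-4, fill {fill_ratio(rate, n):.0%}")
```

Without virtual concatenation the same 1Gb/s client would typically occupy a much larger contiguous container, so sizing the group in VC-4 steps is where the efficiency comes from.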
The Next Generation SDH Network
supports new data services by using new standards such as FC-BB_3 with GFP-F
and FC-BB_3 with GFP-T. These standards create two significant advantages:
Rate Limiting of Fibre Channel:
Instead of using the 1Gb/s or 2Gb/s standard rates, it is possible to reduce
the rate to sub-rates without any packet loss by taking advantage of
rate limiting of Fibre Channel. Note
that this is different from some forms of Ethernet rate limiting and traffic
policing, which are not concerned with the integrity of the traffic flow.
Fibre Channel Distance Extension:
The standards provide a means for extending the distance between the main and
backup sites. Most importantly, SDH can be used in applications where hundreds
(and even thousands) of kilometers separate the sites.
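Distance matters mainly because of propagation delay. The sketch below estimates the round-trip time over fiber for a few distances. It assumes roughly 5 microseconds per kilometer one way, a common rule of thumb for optical fiber that is not stated in this paper, and it ignores equipment and mapping latency. The function name is hypothetical.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumption: about 5 microseconds per km, one way


def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds over distance_km of fiber."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0


for km in (50, 300, 1000, 3000):
    print(f"{km:>5} km: about {round_trip_ms(km):.1f} ms round trip")
```

With synchronous mirroring every write waits for at least one such round trip before it is acknowledged, which is why very long distances are usually paired with asynchronous replication.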
IP Network: The use of IP networks for
disaster recovery and data protection is based on the ability to tolerate
unexpected delays and modest RTO/RPO requirements. In turn, data can travel
over longer distances at a lower cost. The FCIP standard provides a means to
extend distance and provide rate limiting without packet loss.
| Storage Over | Service Required | Protocol | Latency | Typical Distance |
|---|---|---|---|---|
| Fiber | Dark Fiber | Native | Propagation delay | Tens of kilometers |
| CWDM/DWDM | Wavelength | Native | Propagation delay | Up to 120-180 mi (200-300 km) (DWDM); throughput decreases with distance |
| SONET/SDH | OC-n/VC4-n/NxVT1.5, VCAT | GFP mapped over SDH: FC-BB_SDH, FC-BB_GFPT | Low latency | Hundreds/thousands of miles/kilometers; full or sub-rates; synchronous or asynchronous applications |
| Ethernet/IP | GbE (full rate or sub-rate) | FCIP | Low to high latency (depends on QoS – Quality of Service) | Hundreds/thousands of miles/kilometers; full or sub-rates; unpredictable latency; asynchronous applications |
Table-1: WAN mediums for business continuity and disaster recovery solutions
Table-2 outlines some examples of SDH network usage at different rates based on the
capacity of the data that requires protection.
| Storage Volume Required for Duplication | Solution | Technology |
|---|---|---|
| 0 – 500GB | Data Protection/Disaster Recovery | NxVT1.5/NxVC4-1/IP/Ethernet |
| 500GB – 1.5TB | Disaster Recovery | NxVC4-1 |
| 1.5TB – 6TB | Disaster Recovery | NxVC4-1 |
| Over 6TB | Business Continuity | NxVC4-16, CWDM/DWDM |
Table-2: Using different SDH rates in different applications
Storage over SDH – Standards Ensure
Interoperability and Performance
The new SDH standards resolve some
of the disadvantages of the proprietary solutions that used to dominate storage
transport over DWDM/CWDM and SDH. The proprietary methods did not have the
ability to interoperate, or use GFP/VCAT capabilities (they are usually based
on POS). Furthermore, they were customized for specific FC-switch
manufacturers. The growing interest of carriers and vendors in the transport of
Fibre Channel over next generation SDH resulted in the development of the
FC-BB_SDH with GFP-F standard. For ESCON services, the use of GFP-T was specified.
Development efforts aimed at improving the transparency and flexibility of
FC-BB_SDH continued and resulted in a new standard called FC-BB_GFPT. This
standard allows direct connectivity to Storage Arrays (E-Port) in addition to
connectivity to the FC switches (B-Port).
Storage over SDH – Implementation Examples
There are two primary scenarios in which SDH is used:
(A) Service providers/carriers provide the storage interface to their customers
(FC, ESCON, FICON, GbE). They can also provide end-to-end support of the
installation and operation of the disaster recovery solution, up to the
demarcation point with the customer. This mode of operation is used when
service providers provide Ethernet services over SDH (EoS). By using the
PacketLight devices they can extend the offering to FC and ESCON. Moreover,
they can provide a single WAN pipe over SDH to carry SAN (FC interface) and LAN
(GbE interface) traffic.
(B) Enterprises install and operate the storage transport equipment,
implementing their own disaster recovery solution. They buy bandwidth pipes
(e.g. STM-1, STM-4, STM-16, NxVC4) from the carriers.
PacketLight has developed a suite
of products that cater to the needs of both carriers and enterprises. Using the
latest VCAT technology, these products support both FC-BB_SDH with GFP-F and
FC-BB_GFPT with GFP-T. With these standards, service providers make better use
of their existing SDH network, and the enterprise can use the same SDH
connectivity to transport all its data, both SAN and LAN. Additional
features, including power redundancy, link redundancy, standard performance
monitoring, a DCC management channel, web server element managers, and CORBA
interfaces, are all provided by PacketLight products. The PacketLight products are
unique in being compact Customer Located Equipment (CLE) that can take both
GbE and FC in a flexible way. The enterprise
gets a single reliable connection with guaranteed bandwidth and no delay to its
LAN and SAN traffic. By using the new LCAS standard, it can change the
amount of bandwidth it is using without causing any service disruption. For
example, LAN traffic can be carried in most of the bandwidth during the day and
SAN traffic during the night.
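The day/night example can be sketched as a simple planning function. This is only an illustration of the idea: the function `split_members`, the 75% daytime LAN share and the 08:00 to 20:00 day window are hypothetical choices, and the sketch only computes target group sizes. The actual hitless resizing is carried out by LCAS signalling in the SDH equipment.

```python
def split_members(total_members: int, hour: int,
                  day_lan_share: float = 0.75,
                  day_start: int = 8, day_end: int = 20) -> dict:
    """Target number of VC-4 members for LAN and SAN at a given hour (0-23).

    Hypothetical policy: LAN gets day_lan_share of the members during the day,
    SAN gets that share at night. Each service always keeps at least one member.
    """
    if total_members < 2:
        raise ValueError("need at least two members to share")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    is_day = day_start <= hour < day_end
    lan_share = day_lan_share if is_day else 1.0 - day_lan_share
    lan = round(total_members * lan_share)
    lan = min(max(lan, 1), total_members - 1)
    return {"lan": lan, "san": total_members - lan}


# Example: an STM-16 uplink treated as 16 VC-4 members.
print("noon:", split_members(16, 12))
print("02:00:", split_members(16, 2))
```

With these illustrative settings the LAN group holds 12 of 16 members at noon and 4 of 16 at 02:00, with the SAN group taking the remainder.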
The PacketLight devices are CLEs that
are typically connected to the FC/ESCON switch/director. In case (A) above, the
CLEs are part of the service provider network and also serve as the demarcation
point to the customers. They provide required features of a demarcation point,
such as line and service loop-backs that are operated via the SDH overhead.
Figure-4 depicts such a typical implementation.

Figure-4: PacketLight Storage over SDH access devices for disaster recovery applications
In case (B) above, the enterprises will obtain the standard OC-n/STM-n services
from the service provider. The CLEs are managed as part of the local SAN/LAN by
the enterprise management system. The PacketLight devices support web-based
management, and also include an SNMP agent for management through the enterprise
SNMP suite.
PacketLight offers a range of products that cater to different throughput and bandwidth
needs:
| Product | Services | SDH Uplink | Comments |
|---|---|---|---|
| PL-10 | Single Fiber Channel 1Gb/s | STM-1/ 1 GFP/VCAT | NxVC4, NxVC12, VCAT |
| PL-20 | Single Fiber Channel 1Gb/s | STM-4/ GFP/VCAT | NxVC4 VCAT |
| PL-100 | Up to 4XFC/GbE; up to 12X ESCON; combinations of FC, ESCON, GbE | STM-16/STM-4/STM-1; GFP-F VCAT or FC-BB-GFPT | NxVC4 VCAT; using one WAN connection to mix GbE and FC services; GbE with GFP-F |
Table-3: PacketLight suite of products
PL-10 and PL-20 are ideal for applications that require less bandwidth with a single
Fiber Channel interface. PL-100 is ideal for applications that require several
interfaces for connecting both the SAN and the LAN over the WAN. By providing
both GbE and FC/ESCON interfaces, it is possible to consolidate the
communication needs of an enterprise in a single device using a public network.
Table-4 provides an example of the costs involved in a typical disaster recovery
solution using an STM-1 link with PacketLight’s PL-10. It is assumed that in
the case of a disaster, 500GB will need to be transported from site to
site to resume operation.
| Cost of 1 Hour Downtime* | Capacity of Data to Recover (GB) | RTO Using STM-1 (hours) | Cost of RTO (USD) | Annual Cost of SDH Service (USD)** |
|---|---|---|---|---|
| $15,000 | 500 | 7.168459 | $107,527 | $36,000 |
Table-4: Cost calculation for a disaster recovery solution with PL-10
* This value varies significantly between enterprises.
** Based on a service fee of $3000 per month. CLE pricing is not presented here,
as it is not material to the calculation. Note also that carriers may adjust
their pricing if they offer Fiber Channel interfaces directly. By paying $36,000
per year in communication costs, the enterprise can implement an incremental
asynchronous DR plan that will ensure the freshness of the data and meet recovery
time objectives based on its storage needs.
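The figures in Table-4 can be reproduced with a few lines of arithmetic. The sketch below assumes decimal gigabytes and a flat 155 Mb/s for the STM-1 link, which is the rate that yields the RTO printed in the table; the function name `dr_costs` is hypothetical. Using the VC-4 payload rate of 149.76 Mb/s instead would give a slightly longer transfer, about 7.4 hours.

```python
STM1_MBPS = 155.0  # nominal STM-1 rate; matches the RTO shown in Table-4


def dr_costs(downtime_cost_per_hour: float, capacity_gb: float,
             monthly_fee: float, link_mbps: float = STM1_MBPS) -> dict:
    """Recompute the Table-4 columns from their inputs (illustrative sketch)."""
    transfer_seconds = capacity_gb * 8e9 / (link_mbps * 1e6)
    rto_hours = transfer_seconds / 3600
    return {
        "rto_hours": rto_hours,
        "cost_of_rto_usd": rto_hours * downtime_cost_per_hour,
        "annual_service_usd": monthly_fee * 12,
    }


result = dr_costs(downtime_cost_per_hour=15_000, capacity_gb=500, monthly_fee=3_000)
print(f"RTO using STM-1: {result['rto_hours']:.6f} hours")
print(f"Cost of RTO:     ${result['cost_of_rto_usd']:,.0f}")
print(f"Annual service:  ${result['annual_service_usd']:,.0f}")
```

The same function can be rerun with an enterprise's own downtime cost, data volume and link rate to see how the cost of a single outage compares with the annual service fee.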