Data Reduction: Deduplication, Compression and Replication with the Dell DR4000 Disk Backup Appliance
By Deni Connor, Founding Analyst
Storage Strategies NOW
January 2012
Data is growing by leaps and bounds, and the quantity of data that needs to be stored is causing problems for IT managers, who fear that they won’t be able to meet backup windows or may lose data because of faulty or ill-designed backup processes. To accommodate this data growth, IT managers must invest in more storage capacity, dedicate more floor space and energy to it, and spend more money and time managing it, all with continually constrained budgets.
Snapshots, mirrors, replicas, and clones taken for migration purposes all contribute to the amount of data that must be stored and protected against loss. Add to that the data that is replicated for disaster recovery and the data that is archived for regulatory and compliance purposes. Then consider the amount of data that is copied to tape and shuttled offsite for long-term preservation. All these processes and forms of data contribute to the overwhelming amount of data that IT needs to back up and protect.
To address this data growth, IT managers are looking to data deduplication and compression appliances, which reduce the amount of data that is stored and backed up. By deduplicating either primary storage or secondary backup data, IT managers figure they can reduce their storage footprint by a ratio of as much as 15:1, according to industry estimates. They are looking for appliances that eliminate multiple copies of the same data and keep more data online longer, readily available in the event of a disaster or data loss. They also want to reduce their dependence on tape backup and cut backup storage costs while lowering power and cooling costs in the data center.
IT is deploying data deduplication and compression technologies quickly and they are becoming a checklist item on any manager’s RFP for data protection and data reduction. According to Storage Strategies NOW research, over 75% of respondents have deployed or plan to deploy data deduplication and compression. Of those deploying deduplication and compression, 54.4% are using it to reduce their footprint for secondary backup data.
Dell’s recently introduced DR4000 Disk Backup Appliance does just that: it lets administrators reduce the amount of backup data stored and so increase efficiency in the data center.
Intent
The DR4000 Disk Backup Appliance is Dell’s first target-based deduplication and compression backup product, aimed at the company’s sweet spot: SMBs and mid-sized enterprises. In July 2010, Dell acquired Ocarina, a startup in the data optimization space, and from the outset indicated that deduplication and compression of primary and secondary backup data would be integrated in some fashion within its Compellent Storage Center, EqualLogic and PowerVault arrays and its DX Object Storage system. In February 2011, Dell formalized this integration vision with the unveiling of the Dell Fluid Data Architecture, with deduplication and compression continuing to play an integral role throughout the storage portfolio.
In October 2011, the company introduced the DX6000G Storage Compression Node, an option for the Dell DX Object Storage Platform. It compresses unstructured data for long-term archive and retention based on policies and content-aware algorithms, drastically reducing file sizes and the footprint of archival and cloud infrastructure storage. At that time, Dell promised to introduce deduplication into its Fluid Data storage architecture in early 2012.
Now, in January 2012, the company has done so. At the Dell Storage Forum in Europe, it unveiled the DR4000 Disk Backup Appliance for deduplicating and compressing structured and unstructured secondary backup data in remote offices or data centers.
The DR4000
The Dell DR4000 Disk Backup Appliance is the first disk-to-disk backup product built exclusively on Dell intellectual property. The appliance is a 2U (3.5-inch high) implementation that includes as many as twelve 1TB Serial ATA (SATA) drives or twelve 300GB or 600GB Serial Attached SCSI (SAS) drives. It attaches to the network via four 1GbE or two 10GbE ports. The appliance, which supports CIFS and NFS and will support Symantec Open Storage Technology (OST) in an upcoming service pack release, is available in three SKUs with 2.7TB, 5.4TB and 9TB of usable storage capacity after RAID overhead. After deduplication, it is estimated that the appliances will be able to scale to more than 100TB of logical storage capacity.
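The relationship between usable and logical capacity is simple arithmetic, and a short sketch makes the implied reduction ratios explicit. The usable capacities and the 100TB logical figure come from the text above, and the 15:1 ratio is the industry estimate quoted earlier; the function names are illustrative only, and actual ratios depend heavily on the data being backed up.

```python
# Usable capacities of the three SKUs in TB, as quoted in the article.
USABLE_TB = (2.7, 5.4, 9.0)


def logical_capacity_tb(usable_tb: float, reduction_ratio: float) -> float:
    """Logical (pre-reduction) data that fits at a given reduction ratio."""
    return usable_tb * reduction_ratio


def required_ratio(usable_tb: float, logical_target_tb: float) -> float:
    """Reduction ratio needed to hold logical_target_tb in usable_tb."""
    return logical_target_tb / usable_tb


if __name__ == "__main__":
    for usable in USABLE_TB:
        at_15 = logical_capacity_tb(usable, 15.0)    # 15:1 industry estimate
        needed = required_ratio(usable, 100.0)       # to reach 100TB logical
        print(f"{usable:>4} TB usable: {at_15:6.1f} TB logical at 15:1, "
              f"{needed:4.1f}:1 needed for 100 TB")
```

Under these assumptions, the 9TB model reaches 100TB of logical capacity at a ratio of roughly 11:1, which sits inside the 15:1 range cited as an industry estimate.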
The appliance sits in-line and deduplicates data in variable-length mode. Like the DX6000G, it uses algorithms that are, in this case, optimized for performance and high throughput. According to preliminary results, the DR4000 has an ingest performance of as much as 3TB per hour.
Also included at no cost is support for most data protection and backup software, including Symantec NetBackup and Backup Exec and CommVault Simpana. A future service pack release will bring support for EMC NetWorker, CA ARCserve, IBM Tivoli Storage Manager and two virtual machine backup packages: Veeam Backup & Replication and ASG Software Solutions’ Atempo Time Navigator.
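To illustrate what variable-length deduplication means in general, here is a minimal sketch of content-defined chunking with a fingerprint index. It is not Dell's algorithm, which the article does not describe; the rolling hash, the chunk-size limits and the `DedupStore` class are all assumptions chosen for illustration.

```python
import hashlib
import random

# Hypothetical tuning values, chosen only for illustration.
MIN_CHUNK = 64               # never cut a chunk shorter than this
MAX_CHUNK = 1024             # always cut once a chunk reaches this size
BOUNDARY_MASK = 0xFF000000   # cut when these hash bits are all zero

# Fixed table of pseudo-random values, one per byte value, for the rolling hash.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk_stream(data: bytes):
    """Yield variable-length chunks whose boundaries depend on content."""
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the 32-bit hash, so a boundary decision
        # depends only on the most recent bytes, not on absolute position.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        at_pattern = length >= MIN_CHUNK and (h & BOUNDARY_MASK) == 0
        if at_pattern or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]   # trailing partial chunk


class DedupStore:
    """Keeps one copy of each unique chunk, keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}     # fingerprint -> chunk bytes

    def write(self, data: bytes):
        """Store data and return the list of fingerprints that rebuilds it."""
        recipe = []
        for chunk in chunk_stream(data):
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)   # only new chunks use space
            recipe.append(fp)
        return recipe

    def read(self, recipe):
        """Reassemble the original data from a list of fingerprints."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())
```

Writing the same backup twice to a `DedupStore` leaves `stored_bytes()` unchanged after the second write, because every chunk fingerprint is already in the index. Cutting chunks by content rather than at fixed offsets is what lets this style of deduplication keep matching chunks even when data shifts position between backups.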
All-inclusive licensing
Perhaps the most significant part of the announcement is Dell’s inclusion of features such as replication at no additional cost to the customer. When a customer buys a DR4000, they get support for asynchronous replication over IP. This is unlike other vendors of deduplication gear, which charge a premium for replication and other advanced features.
In addition, the all-inclusive licensing model, first adopted for the Dell EqualLogic product line, will include future capabilities such as retention and compliance policies and integrated connectors to the Dell cloud. The company will also support deduplication at the source (backup server) via the Symantec OST protocol at no charge to the customer.
Our Take
Dell’s roadmap for its family of deduplication- and compression-based backup-to-disk products includes tighter integration with its other storage products, consistent with the Fluid Data architecture. The DR4000 is most closely aligned with its EqualLogic family of storage arrays, both in terms of its target market and its all-inclusive licensing model. Future capabilities such as retention and compliance policies and integrated connectors to the Dell cloud will help the company align these offerings with the Fluid Data tenets of value, ease of use, economies of scale and reduced CAPEX.
Dell’s DR4000 is not just a platform that operates independently of other Dell storage products, but one that is tightly integrated to deliver a lower total cost of ownership, a smaller footprint and simplified management. These attributes will allow Dell to deliver incremental value, ease of use and better economics to its customers.
The replication capability included in the DR4000 is one of the best proof points of the Fluid Data architecture. Any data that traverses the wire between DR4000 appliances is deduplicated, so the benefits accrue to WAN connectivity as well: customers using the DR4000 to replicate data over slower 5 or 10Mb lines can easily create disaster recovery and business continuity solutions.
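A rough calculation shows why deduplicated replication matters on slow links. The sketch below assumes that "5 or 10Mb lines" means 5 or 10 megabits per second, uses a hypothetical 100GB nightly backup, and applies the 15:1 industry estimate quoted earlier as the reduction ratio. It ignores protocol overhead and link contention, so treat the results as order-of-magnitude figures only.

```python
def transfer_hours(data_gb: float, link_mbps: float,
                   reduction_ratio: float = 1.0) -> float:
    """Hours to send data_gb over a link_mbps link after data reduction.

    Uses decimal units (1 GB = 8,000 megabits) and assumes the link is
    fully available to replication traffic.
    """
    megabits_on_wire = data_gb * 8_000 / reduction_ratio
    return megabits_on_wire / link_mbps / 3_600


if __name__ == "__main__":
    nightly_gb = 100.0   # hypothetical backup size, not from the article
    for mbps in (5.0, 10.0):
        raw = transfer_hours(nightly_gb, mbps)
        reduced = transfer_hours(nightly_gb, mbps, reduction_ratio=15.0)
        print(f"{mbps:>4.0f} Mbps: {raw:5.1f} h raw, {reduced:4.1f} h at 15:1")
```

Under these assumptions, a transfer that would take about 22 hours on a 10Mbps line without reduction takes about an hour and a half at 15:1, which is the difference between missing and meeting a nightly replication window.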
Whatever direction Dell takes with the rest of its deduplication and compression technologies in its NAS and SAN products is sure to be tailor-made for the platform itself, whether accommodating different performance expectations, different levels of content awareness or resource availability.
The Dell DR4000 is a clear indication that Dell knows what it is doing in the deduplication and compression market and that it has taken the needs of its customers to heart. For large data centers, the company will be able to introduce models with more scale and throughput; for distributed enterprises, an edge-to-core architecture made up of entry-level to enterprise-size appliances; and, for SMBs and mid-size enterprises, a family of appliances such as the DR4000 that meet customers’ needs and help them tackle the problem of unrelenting data growth.