Data Deduplication
- Space savings of up to 90%
- Cost savings through reduced storage capacity requirements
- Improved storage resource efficiency
Sacramento County Dept. of Human Assistance
“Due to our new storage virtualization capability, we’ve realized an 80% space savings over tape. Backups now finish 16 times faster. Restore times have improved as well, up to six times faster on average.”
Data deduplication is an important new technology that users are quickly embracing as they struggle with data proliferation. Eliminating redundant data objects yields an immediate benefit in space efficiency.
In the context of disk storage, deduplication refers to any algorithm that searches for duplicate data objects (e.g., blocks, chunks, files) and discards those duplicates. When duplicate data is detected, it is not retained, but instead a "data pointer" is modified so that the storage system references an exact copy of the data object already stored on disk.
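As a rough illustration of the pointer mechanism described above, the following Python sketch stores each unique chunk once and records duplicates as references. The `DedupStore` class, its SHA-256 content key, and the fixed chunk size are assumptions made for this example, not a description of any vendor's implementation.

```python
import hashlib


class DedupStore:
    """Toy chunk store: each unique chunk is kept once (hypothetical design)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # content key -> chunk bytes (stored once)
        self.files = {}    # file name -> list of content keys ("data pointers")

    def write(self, name, data):
        pointers = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            # Duplicate chunk: keep only a pointer to the copy already stored.
            self.chunks.setdefault(key, chunk)
            pointers.append(key)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original data.
        return b"".join(self.chunks[key] for key in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.write("a.txt", b"AAAABBBBAAAA")
store.write("b.txt", b"BBBBCCCC")
print(store.stored_bytes())  # 12 bytes stored for 20 bytes written
```

Both files read back unchanged, yet only three unique 4-byte chunks occupy space in the store.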
Why Deduplication?
Data deduplication is popular in disk storage today because it reduces the amount of disk space needed to store data. The average UNIX® or Windows® enterprise disk volume contains thousands or even millions of duplicate data objects. As these objects are modified, distributed, backed up, and archived, the duplicates are stored again and again. The result is inefficient use of storage resources, which deduplication helps to prevent.
How Much Space Does Deduplication Actually Save?
Deduplication vendors often claim that their products offer 20:1, 50:1, or even greater data reduction ratios. These claims actually refer to the "time-based" space savings effect of deduplication on repetitive data backups. Since data backups contain largely unchanged data, once the first full backup has been stored, all subsequent full backups will see a very high occurrence of deduplication.
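The arithmetic behind such ratios can be sketched as follows. The function name and the figures (a 1,000 GB full backup and a 2% change rate between backups) are hypothetical, chosen only to show how the ratio grows with the number of retained full backups.

```python
def time_based_ratio(full_size_gb, change_rate, num_backups):
    """Logical data backed up divided by physical data stored.

    Assumes the first full backup is stored whole and each later full
    backup adds only its changed fraction (a simplification).
    """
    logical = full_size_gb * num_backups
    physical = full_size_gb + full_size_gb * change_rate * (num_backups - 1)
    return logical / physical


for n in (1, 10, 30, 60):
    print(f"{n:>2} full backups -> {time_based_ratio(1000, 0.02, n):.1f}:1")
```

Under these assumptions the ratio starts at 1:1 for a single backup and climbs as more full backups are retained, which is why quoted ratios depend heavily on retention period and change rate.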
In non-backup data environments, such as file archival or infrequently accessed unstructured data, the rules of time-based data reduction ratios do not apply. In these environments, volumes do not receive a steady supply of redundant full backups, but may still contain a large amount of resident duplicate data objects. The ability to reduce the space requirements in these volumes through deduplication is measured in "spatial" terms. In other words, if a 500GB data archival volume can be reduced to 300GB through deduplication, the spatial reduction is 40%.
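The spatial calculation from the example above can be written out in a few lines of Python. The function names are illustrative only.

```python
def spatial_reduction(before_gb, after_gb):
    """Fraction of capacity freed by deduplication on a single volume."""
    return (before_gb - after_gb) / before_gb


def as_ratio(before_gb, after_gb):
    """The same saving expressed as a before:after reduction ratio."""
    return before_gb / after_gb


print(f"{spatial_reduction(500, 300):.0%}")   # 40%
print(f"{as_ratio(500, 300):.2f}:1")          # 1.67:1
```

Expressed as a ratio, a 40% spatial reduction is only about 1.67:1, which shows why spatial figures should not be compared directly with the time-based ratios quoted for backup workloads.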
Space savings through deduplication can provide substantial cost benefits through reduced storage capacity requirements. As with any new technology, however, there are many approaches and techniques being implemented to accomplish data deduplication. Because of this, users should look beyond the actual deduplication of data and also consider design factors such as data reliability and performance overhead.
Deduplication Benefits Using NetApp Technology
- NetApp deduplication operates with a high degree of granularity. Newly stored data is divided into small blocks. Each block of data has a digital "signature," which is compared to all other signatures in the volume. If an exact block match exists on the disk volume, the duplicate block is discarded and its disk space is reclaimed.
- NetApp deduplication is tightly integrated with Data ONTAP® software and the WAFL® file system, which lets it run with little overhead. Complex hashing algorithms and look-up tables are not required. Instead, NetApp deduplication leverages existing Data ONTAP internal characteristics to create and search digital fingerprints, redirect data pointers, and free up redundant data areas, all with minimal impact on user performance.
- Another key advantage of NetApp deduplication's integration with Data ONTAP is the ability to utilize the error checking and recovery procedures that are inherent to Data ONTAP. This includes recovery from power failures, file inconsistencies, and file-system corruption.
- NetApp deduplication is extremely easy to use and does not require any external software or additional appliances. A simple command line instruction invokes the deduplication process.
- NetApp deduplication can be implemented across a wide variety of applications and file types, including data backup, data archival, and primary data. Other deduplication products are largely confined to a single application, such as data backup.
- NetApp deduplication is available on a wide variety of platforms, including the NearStore R200 and FAS® 2000, 3000 and 6000 models. These platforms offer customers more scalability in capacity and performance than any other deduplication product.
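The block-level signature matching described in the first bullet can be sketched generically as below. This is not NetApp's implementation: the 4 KB block size, the use of CRC32 as a lightweight signature, and the byte-for-byte verification step are assumptions chosen for illustration, and the function name is hypothetical.

```python
import zlib
from collections import defaultdict


def dedupe_blocks(blocks):
    """Return (block_map, reclaimed) for a list of equal-sized byte blocks.

    block_map[i] is the index of the block that logical block i points to.
    """
    by_signature = defaultdict(list)   # signature -> indices of kept blocks
    block_map = []
    reclaimed = 0
    for i, block in enumerate(blocks):
        sig = zlib.crc32(block)
        # A matching signature is only a candidate; confirm byte for byte.
        match = next((j for j in by_signature[sig] if blocks[j] == block), None)
        if match is None:
            by_signature[sig].append(i)
            block_map.append(i)
        else:
            block_map.append(match)   # redirect pointer to the existing block
            reclaimed += 1            # this block's space can be freed
    return block_map, reclaimed


blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]
block_map, reclaimed = dedupe_blocks(blocks)
print(block_map, reclaimed)  # [0, 1, 0, 0] 2
```

Verifying candidate matches byte for byte means a signature collision can never cause two different blocks to be merged, at the cost of one extra comparison per duplicate.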
We offer competitive prices, award-winning configuration support, great delivery and world-class implementation and support services. Contact a Champion Solution Specialist at 800-771-7000 or submit a quick quote.