Advantages and Disadvantages of Inline Deduplication

In a previous post, we went over the advantages and disadvantages of inline vs. postprocess deduplication. For those of you who haven’t read that article, here’s all you need to remember:

  • Inline deduplication occurs BEFORE the file is written
  • Postprocess deduplication occurs AFTER the file is written

The main advantage of the inline process is that it minimizes I/O operations.

With the postprocess method, you need to:

  • Input data
  • Write it to disk
  • Read the disk
  • Delete the data
  • Write a pointer to the hash table

As you can imagine, this is a very resource-intensive way to handle large amounts of data during the backup process. Inline deduplication reduces this to just “input and write the data”. With fewer disk operations, the deduplication process as a whole is much faster.
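Here is a rough sketch of the inline approach in Python, where each chunk is hashed before anything is written. The names `inline_dedup` and `restore`, the dictionary standing in for the disk, and the use of SHA-256 are assumptions made for illustration; real products differ in how they chunk and index data.

```python
import hashlib

def inline_dedup(chunks):
    """Hypothetical inline pass: hash each chunk BEFORE it is written."""
    store = {}      # hash -> stored chunk (stand-in for the disk)
    pointers = []   # one hash per original chunk, in backup order
    writes = 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk   # only new data is ever written
            writes += 1
        pointers.append(digest)     # duplicates cost just a pointer
    return store, pointers, writes

def restore(store, pointers):
    """Rebuild the original stream by following the pointers."""
    return b"".join(store[d] for d in pointers)

store, pointers, writes = inline_dedup([b"aaaa", b"bbbb", b"aaaa"])
print(writes, restore(store, pointers))
```

With three chunks and one duplicate, only two chunks are written, and nothing has to be read back or deleted afterwards.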

Another benefit of the inline method is that you don’t need a large buffer to hold the raw, not-yet-deduplicated data while it’s being processed, as you might with the postprocess method.

And of course, the simplicity of the inline method makes it very easy to implement and operate.

However, there are also a few downsides to the inline method.

Because this method is very processor-intensive, it might slow down the input speed of your backup server. For most businesses, this shouldn’t be a problem, since speeds are still very fast. But if you need to back up extremely large amounts of data very quickly, this may be something to think about.

Another disadvantage is that it gives you less control over how your backups are deduplicated.
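If you want a feel for the processor cost mentioned above, one rough way is to time how fast a machine can hash data, since hashing incoming chunks is typically a large part of the inline workload. The sketch below is an assumption-laden illustration: the function name `hashing_throughput_mb_s`, the 64 KB chunk size, and SHA-256 are all choices made for this example, and the number you get will vary from machine to machine.

```python
import hashlib
import os
import time

def hashing_throughput_mb_s(total_mb=16, chunk_kb=64):
    """Estimate SHA-256 hashing speed in MB/s (illustrative only)."""
    chunk = os.urandom(chunk_kb * 1024)        # one random test chunk
    n_chunks = (total_mb * 1024) // chunk_kb   # chunks needed for total_mb
    start = time.perf_counter()
    for _ in range(n_chunks):
        hashlib.sha256(chunk).digest()         # the per-chunk CPU work
    elapsed = time.perf_counter() - start
    return total_mb / max(elapsed, 1e-9)       # guard against a zero timer

print(f"{hashing_throughput_mb_s():.0f} MB/s")
```

If the result is well above the rate at which data reaches your backup server, hashing alone is unlikely to be the bottleneck. If it is close to that rate, the slowdown described above becomes more plausible.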

But probably the biggest downside to this approach is that inline deduplication can optimize for storage, but not for restorability. This means that it might store your data in such a way that backup recoveries are slow or inefficient. We’ll go into more detail about this in another post.

For most business applications, inline deduplication will work just fine, especially if it is only being used for long-term backup or archival.

In another post, we’ll go over the pros and cons of deduplicating backup data using the postprocess method.

Image Credit: http://www.flickr.com/photos/question_everything/450953112/sizes/s/