How Dropbox Tricks Us: An Illusion of Private Space

Most of us use the popular cloud storage service Dropbox. In fact, some of us love it so much that we store files directly in our Dropbox folder while working with them. One of the reasons why Dropbox and other cloud storage services have become popular is that they give us an illusion of private space. We tend to assume that the folder we see on our computers, phones and tablets is a nice little folder replicated as-is in the data centers of these storage companies. Not quite. While these companies do have a humongous amount of storage available in their data centers, maintaining such centers costs them a lot. They are constantly looking for ways to optimize, compress and simplify how files are stored, both to save costs and to improve performance.

One of these techniques is file deduplication. Without going into too many technical details, the concept can be explained in three simple steps:

  • User uploads a file to the cloud storage.
  • The storage service realizes that some other user has already uploaded the same file.
  • The service shrugs its shoulders (not really!) and just keeps one copy of the file instead of storing a duplicate copy for the two users.

This illustration should make things clear:

[Figure: File Deduplication]

But the process doesn’t end there. The service maintains a list of users who have uploaded the same file. So when a user wants to download the file they ‘uploaded’, the service simply checks if the user is listed as one of the ‘owners’ of the file and if yes, allows them to download the file.
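For the curious, here is a minimal sketch of the idea in Python. It is not how Dropbox or any other service actually implements this. It assumes duplicates are detected by comparing a SHA-256 hash of the file contents, and the names `DedupStore`, `upload` and `download` are made up purely for illustration.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct file."""

    def __init__(self):
        self.blobs = {}   # content hash -> file bytes (stored only once)
        self.owners = {}  # content hash -> set of users who uploaded it

    def upload(self, user, data):
        # Assumption: two files are "the same" if their SHA-256 hashes match.
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = data  # first uploader: actually store the bytes
        # Every uploader is recorded as an owner, duplicate or not.
        self.owners.setdefault(key, set()).add(user)
        return key

    def download(self, user, key):
        # Only users listed as owners may fetch the shared copy.
        if user not in self.owners.get(key, set()):
            raise PermissionError(f"{user} is not an owner of this file")
        return self.blobs[key]


store = DedupStore()
ebook = b"contents of a popular ebook"

key_a = store.upload("alice", ebook)
key_b = store.upload("bob", ebook)  # same bytes, so no second copy is stored
key_c = store.upload("carol", ebook + b" plus my annotations")

print(key_a == key_b)    # True: alice and bob share one stored copy
print(len(store.blobs))  # 2: carol's annotated copy differs, so it is stored separately
```

Note that in this sketch even a tiny change to the file (such as a few annotations in an ebook) produces a different hash, so the modified copy is treated as a brand new file.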

Of course, the above process is only useful if the file is popular enough to be owned and uploaded by multiple users. Think mp3s, movie files or ebooks. So even though most of our files on Dropbox, Google Drive, Box or SkyDrive are personal documents, file deduplication is still a useful technique for saving space on the popular ones. But wait, what about privacy?! For that and more, check out this wonderful blog post.

Hope this short post was interesting! As usual, please leave your feedback in the comments section. Thanks for reading! 🙂

– Omkar



4 thoughts on “How Dropbox Tricks Us: An Illusion of Private Space”

  1. Agreed that this process is possible only for popular files, but a check will still have to be made that the files are exact duplicates, correct? Consider the example of a popular ebook. Although the name, size and other relevant attributes are the same, what if I made a few annotations in that book? Now my copy is not an exact duplicate. So in cases like these I suppose there are algorithms that check for exact duplicates, correct?

  2. They must be checking it at the binary level, probably with a binary diff algorithm. I don’t see this as a problem. Isn’t this the same as downloading paid content? The content is going to be just a copy of the original file. It's not like buying a CD, where you get a physically different object. Hope my analogy made sense.

    • Probably yes, some sort of a binary diff algorithm would work.
      I agree, but the whole ‘Dropbox folder’ paradigm is that you have a personal storage space in the cloud. That’s certainly not the case for popular files at least.
