Dropbox – It’s Scale, Jim, But Not As We Know It

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Dropbox recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

What’s That In Your Pocket?

James Cowling spent some time talking to us about Dropbox’s “Magic Pocket” system architecture. Despite the naff name, it’s a pretty cool bit of tech. Here’s a shot of James answering a question.

 

Magic Pocket

Dropbox uses Magic Pocket to store users’ file content:

  • 1+ EB of user file data currently stored
  • Growing at over 10PB per month
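To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 EB = 1,000 PB) and treats "1+ EB" and "over 10PB per month" as exactly 1 EB and 10 PB, so the result is only indicative. The function and constant names are mine, not Dropbox's.

```python
def monthly_growth_percent(stored_pb: float, growth_pb_per_month: float) -> float:
    """Return monthly growth as a percentage of the data already stored."""
    return growth_pb_per_month / stored_pb * 100


STORED_PB = 1_000.0  # "1+ EB", taken as exactly 1 EB in decimal units (assumption)
GROWTH_PB = 10.0     # "over 10PB per month", taken as exactly 10 PB (assumption)

growth_pct = monthly_growth_percent(STORED_PB, GROWTH_PB)
print(f"Growth: {growth_pct:.1f}% of the stored total per month")
print(f"Added per day (30-day month): {GROWTH_PB / 30:.2f} PB")
```

Under those assumptions the platform grows by about 1% of its stored total every month, or roughly a third of a petabyte per day.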

Customising the stack end-to-end allowed them to:

  • Improve performance and reliability for their unique use case
  • Improve economics

 

Inside the Magic Pocket

Brief history of development

  • Prototype and development
  • Production validation
    • Ran in dark phase to find any unknown bugs
    • Deleted first byte of data from third-party cloud provider in February 2015
  • Scale out and cut over
    • 600,000+ disks
    • 3 regions in the USA, expanding to EU
  • Migrated more than 500PB of user data from third-party cloud provider into MP in 6 months
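The migration figure is easier to appreciate as a sustained transfer rate. The sketch below is my own arithmetic, not something Dropbox presented: it assumes decimal units (1 PB = 1,000,000 GB), takes "6 months" as half a year (182.5 days) and "more than 500PB" as exactly 500 PB, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def average_throughput_gb_per_s(petabytes: float, days: float) -> float:
    """Average sustained transfer rate in GB/s, using decimal units."""
    total_gb = petabytes * 1_000_000  # 1 PB = 1,000,000 GB
    return total_gb / (days * SECONDS_PER_DAY)


MIGRATED_PB = 500.0     # "more than 500PB", taken as exactly 500 (assumption)
MIGRATION_DAYS = 182.5  # "6 months", assumed to be half a year

rate = average_throughput_gb_per_s(MIGRATED_PB, MIGRATION_DAYS)
print(f"Average rate: {rate:.1f} GB/s (~{rate * 8:.0f} Gbit/s)")
print(f"Per day: {MIGRATED_PB / MIGRATION_DAYS:.2f} PB")
```

Under those assumptions the move averages out at a little under 32 GB/s, around the clock, for the whole six months. The real traffic would have been burstier than a flat average suggests.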

It’s worth watching the video to get a feel for the scale of the operation. You can also read more on the Magic Pocket here and here. Chan also did a nice write-up that you can access here.

 

Beyond Public Cloud

Much has been made of Dropbox’s move from public cloud back to its own infrastructure, but Dropbox were careful to point out that they used third parties where it made sense for them, and still leveraged various public cloud and SaaS offerings as part of their daily operations. The key for them was understanding whether building their own solution made sense or not. To that end, they asked themselves three questions:

  • At what scale is investment in infrastructure cost effective?
  • Will this scale enable innovation by building custom services and integrating hardware / software more tightly?
  • Can that innovation add value for users?

From a scale perspective, it was fairly simple, with Dropbox being one of the oldest, largest and most used collaboration platforms around. From an integration perspective, they needed a lot of network and storage horsepower, which set them apart from some of the other web-scale services out there. They were able to add value to users through an optimised stack, increased reliability and better security.

 

It Makes Sense, But It’s Not For Everyone

That all sounds pretty good, but one of the key things to remember is that they haven’t just cobbled together a bunch of tin and a software stack and become web-scale overnight. While the time to production was short, all things considered, there was still investment (in terms of people, infrastructure and so forth) in making the platform work. When you commit to going your own way, you need to be mindful that there are a lot of ramifications involved, including the requirement to invest in people who know what they’re doing, the capacity to do what you need to do from a hardware perspective, and the right minds to come up with the platform to make it all work together. The last point is probably the hardest for people to understand. I’ve ranted before about companies not being anywhere near the scale of Facebook, Google or the other hyperscalers and expecting that they can deliver similar services, for a similar price, with minimal investment.

Scale at this level is a hard thing to do well, and it takes investment in terms of time and resources to get it right. And to make that investment it has to make sense for your business. If your company’s main focus is putting nuts and bolts together on an assembly line, then maybe this kind of approach to IT infrastructure isn’t really warranted. I’m not suggesting that we can’t all learn something from the likes of Dropbox in terms of how to do cool infrastructure at scale. But I think the key takeaways should be that Dropbox have:

  • Been around for a while;
  • Put a lot of resources into solving the problems they faced; and
  • Spent a lot of time deciding what did and did not make sense to do themselves.

I must confess I was ignorant of the scale at which Dropbox is operating, possibly because I saw them as a collaboration piece and didn’t really think of them as an infrastructure platform company. The great thing, however, is they’re not just a platform company. In the same way that Netflix does a lot of really cool stuff with their tech, Dropbox understands that users value performance, reliability and security, and has focused its efforts on ensuring that the end user experience meets those requirements. The Dropbox backend infrastructure makes for a fascinating story, because the scale of their operations is simply not something we come across every day. But I think the real success for Dropbox is their relentless focus on making the end user experience a positive one.
