Pavilion Data Systems Overview

I recently had the opportunity to hear about Pavilion Data Systems from VR Satish, CTO, and Jeff Sosa, VP of Products. I thought I’d put together a brief overview of their offering, as NVMe-based systems are currently the new hotness in the storage world.

 

It’s a Box!

And a pretty cool looking one at that. Here’s what it looks like from the front.

[image courtesy of Pavilion Data]

The storage platform is built from standard components, including x86 processors and U.2 NVMe SSDs. A big selling point, in Pavilion’s opinion, is that there are no custom ASICs and no FPGAs in the box. There are three different models available, with different connectivity and capacity options.

From a capacity perspective, you can start at 14TB and get all the way to 1PB in 4RU. The box starts at 18 NVMe drives and grows in increments of 18 to a maximum of 72 drives. It runs RAID 6 and presents the drives as virtual volumes to the hosts. Here’s a look at the box from a top-down perspective.

[image courtesy of Pavilion Data]
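As a back-of-the-envelope illustration of the RAID 6 arithmetic (a sketch with assumed geometry – Pavilion haven’t told me how they group drives, so the 18-drive group with double parity below is my guess):

```python
# Rough usable-capacity estimate for a RAID 6 layout. The 18-drive
# group size (16 data + 2 parity) is an assumption for illustration,
# not Pavilion's published geometry.

def usable_tb(drive_count: int, drive_tb: float,
              group_size: int = 18, parity: int = 2) -> float:
    """Usable TB across full RAID 6 groups, before any other overhead."""
    groups = drive_count // group_size
    return groups * (group_size - parity) * drive_tb

for drives in (18, 36, 54, 72):
    print(f"{drives} x 14TB drives: ~{usable_tb(drives, 14):.0f}TB usable")
```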

There’s a list of supported NVMe SSDs that you can use with the box, if you want to source those elsewhere. On the right-hand side (the back of the box) are the IO controllers. You can start with 4 and go up to 20 in a box. There are also 2 management modules and 4 power supplies for resiliency.

[image courtesy of Pavilion Data]

You can see in the above diagram that connectivity is also a big part of the story, with each pair of controllers offering 4x 100GbE ports.
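If my maths is right, that connectivity adds up quickly; here’s a quick sanity check on the aggregate line rate (assuming all 20 controllers are populated, i.e. 10 pairs):

```python
# Aggregate front-end line rate, using the figures above: up to 20
# controllers (10 pairs), each pair offering 4x 100GbE ports.
controllers = 20
ports_per_pair = 4
gbe_per_port = 100

pairs = controllers // 2
total_ports = pairs * ports_per_pair
print(f"{total_ports} ports, {total_ports * gbe_per_port / 1000:.1f} Tbps aggregate line rate")
# -> 40 ports, 4.0 Tbps
```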

 

Software? 

Sure. It’s a box but it needs something to run it. Each controller runs a customised flavour of Linux and delivers a number of the features you’d expect from a storage array, including:

  • Active-active controller support;
  • Space-efficient snapshots and clones; and
  • Thin provisioning.

There are also plans afoot for encryption support in the near future. Pavilion have also focused on making operations simple, providing support for RESTful API orchestration, OpenStack Cinder, Kubernetes, DMTF Redfish and SNIA Swordfish. They’ve also gone to some lengths to ensure that standard NVMe-oF drivers will work for host connectivity.
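I haven’t seen Pavilion’s API documentation, so the base URL, endpoint and payload below are entirely hypothetical, but RESTful provisioning of a volume generally looks something along these lines:

```python
# Hypothetical sketch of RESTful volume provisioning. The host name,
# endpoint path and payload fields are invented for illustration and
# won't match Pavilion's actual API.
import requests

ARRAY = "https://array.example.com/api/v1"  # hypothetical base URL

payload = {
    "name": "oracle-data-01",   # volume name
    "size_gb": 2048,            # requested capacity
}

resp = requests.post(f"{ARRAY}/volumes", json=payload, timeout=30)
resp.raise_for_status()
print("Created volume:", resp.json())
```

The same pattern applies whether the caller is an orchestration tool, a Cinder driver, or a Kubernetes provisioner.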

 

Thoughts and Further Reading

Pavilion Data has been around since 2014 and the leadership group has some great heritage in the storage and networking industry. They tell me they wanted to move away from the traditional approach to storage arrays (the dual-controller, server-based platform) to something that delivered great performance at scale. There are more similarities with high-performance networking devices than with high-performance storage arrays, and this is deliberate. They tell me they really wanted to deliver a solution that isn’t the bottleneck when it comes to realising the performance capabilities of the NVMe architecture. The numbers being punted around are certainly impressive, and I’m a big fan of the approach, in terms of both throughput and footprint.

The webscale folks running apps like MySQL and Cassandra and MongoDB (and other products with similarly awful names) are doing a few things differently to the enterprise bods. Firstly, they’re more likely to wear jeans and sneakers to the office (something that drives me nuts), and they’re leveraging DAS heavily because it gives them high performance storage options for latency-sensitive situations. The advent of NVMe and NVMe over Fabrics takes away the requirement for DAS (although I’m not sure they’ll start to wear proper office attire any time soon) by delivering storage at the scale and performance they need. As a result, you can buy 1RU servers full of compute instead of 2RU servers full of fast disk. There’s an added benefit: organisations tend to assign longer lifecycles to their storage systems, so systems like the one from Pavilion are going to have a place in the DC for five years, not 2.5 – 3 years. Suddenly lifecycling your hosts becomes simpler as well. This is good news for the jeans and t-shirt set and the beancounters alike.

NVMe (and NVMe over Fabrics) has been a hot topic for a little while now, and you’re only going to hear more about it. Those bright minds at Gartner are calling it “Shared Accelerated Storage” and you know if they’re talking about it then the enterprise folks will cotton on in a few years and suddenly it will be everywhere. In the meantime, check out Chris M. Evans’ article on NVMe over Fabrics and Chris Mellor also did an interesting piece at El Reg. The market is becoming more crowded each month and I’m interested to see how Pavilion fare.

Pure//Accelerate 2018 – (Fairly) Full Disclosure

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my notes on gifts, etc, that I received as a conference attendee at Pure//Accelerate 2018. This is by no stretch an interesting post from a technical perspective, but it’s a way for me to track and publicly disclose what I get and how it looks when I write about various things. I’m going to do this in chronological order, as that was the easiest way for me to take notes during the week. While everyone’s situation is different, I took 5 days of unpaid leave to attend this conference.

 

Saturday

My wife dropped me at the BNE domestic airport and I had some ham and cheese and a few coffees in the Qantas Club. I flew Qantas economy class to SFO via SYD. The flights were paid for by Pure Storage. Plane food was consumed on the flight. It was a generally good experience, and I got myself caught up with Season 3 of Mr. Robot. Pure paid for a car to pick me up at the airport. My driver was the new head coach of the San Francisco City Cats ABA team, so we talked basketball most of the trip. I stayed at a friend’s place until late Monday and then checked in to the Marriott Marquis in downtown San Francisco. The hotel costs were also covered by Pure.

 

Tuesday

When I picked up my conference badge I was given a Pure Storage and Rubrik co-branded backpack. On Tuesday afternoon we kicked off the Analyst and Influencer Experience with a welcome reception at the California Academy of Sciences. I helped myself to a Calicraft Coast Kolsch and 4 Aliciella Bitters. I also availed myself of the charcuterie selection, cheese balls and some fried shrimp. The most enjoyable part of these events is catching up with good folks I haven’t seen in a while, like Vaughn and Craig.

As we left we were each given a shot glass from the Academy of Sciences that was shaped like a small beaker. Pure also had a small box of Sweet 55 chocolate delivered to our hotel rooms. That’s some seriously good stuff. Sorry it didn’t make it home kids.

After the reception I went to dinner with Alastair Cooke, Chris Evans and Matt Leib at M.Y. China in downtown SF. I had the sweet and sour pork and rice and 2 Tsingtao beers. The food was okay. We split the bill 4 ways.

 

Wednesday

We were shuttled to the event venue early in the morning. I had a sausage and egg breakfast biscuit, fruit and coffee in the Analysts and Influencers area for breakfast. I need to remind myself that “biscuits” in their American form are just not really my thing. We were all given an Ember temperature control ceramic mug. I also grabbed 2 Pure-flavoured notepads and pens and a Pure Code t-shirt. Lunch in the A&I room consisted of chicken roulade, salmon, bread roll, pasta and Perrier sparkling spring water. I also grabbed a coffee in between sessions.

Christopher went down to the Solutions Expo and came back with a Quantum sticker (I am protecting data from the dark side) and Veeam 1800mAh keychain USB charger for me. I also grabbed some stickers from Justin Warren and some coffee during another break. No matter how hard I tried I couldn’t trick myself into believing the coffee was good.

There was an A&I function at International Smoke and I helped myself to cheese, charcuterie, shrimp cocktail, ribs, various other finger foods and 3 gin and tonics. I then skipped the conference entertainment (The Goo Goo Dolls) to go with Stephen Foskett and see Terra Lightfoot and The Posies play at The Independent. The car to and from the venue and the tickets were very kindly covered by Stephen. I had two 805 beers while I was there. It was a great gig. 5 stars.

 

Thursday

For breakfast I had fruit, a chocolate croissant and some coffee. Scott Lowe kindly gave me a printed copy of ActualTech’s latest Gorilla Guide to Converged Infrastructure. I also did a whip around the Solutions Expo and grabbed:

  • A Commvault glasses cleaner;
  • 2 plastic Zerto water bottles;
  • A pair of Rubrik socks;
  • A Cisco smart wallet and pen;
  • Veeam webcam cover, retractable charging cable and $5 Starbucks card; and
  • A Catalogic pen.

Lunch was boxed. I had the Carne Asada, consisting of Mexican style rice, flat iron steak, black beans, avocado, crispy tortilla and cilantro. We were all given 1GB USB drives with copies of the presentations from the A&I Experience on them as well. That was the end of the conference.

I had dinner at ThirstBear Brewing Co with Alastair, Matt Leib and Justin. I had the Thirstyburger, consisting of Richards Ranch grass-fed beef, mahón cheese, chorizo-andalouse sauce, arugula, housemade pickles, panorama bun, and hand-cut fried kennebec patatas. This was washed down with two glasses of The Admiral’s Blend.

 

Friday

As we didn’t fly out until Friday evening, Alastair and I spent some time visiting the Museum of Modern Art. vBrownBag covered my entry to the museum, and the Magritte exhibition was terrific. We then lunched in Chinatown at a place (Maggie’s Cafe) that reminded me a lot of the Chinese places in Brisbane. Before I went to the airport I had a few beers in the hotel bar. These were kindly paid for by Justin Warren. On Friday evening Pure paid for a car to take Justin and me to SFO for our flight back to Australia. Justin gets extra thanks for having me as his plus one in the fancier lounges that I normally don’t have access to.

Big thanks to Pure Storage for having me over for the week, and big thanks to everyone who spent time with me at the event (and after hours) – it’s a big part of why I keep coming back to these types of events.

Pure//Accelerate 2018 – Wednesday – Chat With Charlie Giancarlo

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my notes from the “Chat with Charlie Giancarlo” session for Analysts and Influencers at Pure//Accelerate.

 

Chat with Charlie Giancarlo

You’ve said culture is important. How do you maintain it? “It’s the difference between hiring missionaries and hiring mercenaries”. People are on a mission, out there to prove something, drive the company forward. There’s not an exact, formulaic way of doing it. Hire people you have experience with in the industry. Pure does have a good interview process. It tends to bring out different sides of the interviewee at the same time. We use objective tests for engineering people. Check on the cultural backgrounds of sales talent.

Are there any acquisitions on the horizon? Gaps you want to fill? We have an acquisition strategy. We’ve decided where we’re going, identified the gaps, looked at buy versus build versus partner. There’s a lot of research to do around strengths and weaknesses, fit, culture. There are different types of companies in the world: get rich quick, play in your own sandbox, people who are on a mission. We have gaps in our product lines. We could be more cloud, more hyper-converged. FlashBlade is still not a 3.0 product.

Other companies are under pressure to be more software or cloud. Given your hardware background, how’s that for you? Our original product was software on commodity hardware. All except one SDS vendor sells hardware. At the end of the day, selling pure software that goes on any box is insanely hard to achieve. The majority of SDS vendors still sell on hardware – one throat to choke. Some customers, at scale, might be able to do this with us. Why build our own hardware? 4RU and 1PB versus 1.5 racks. We understood the limitations of commodity hardware. We’re not wedded to the hardware – we’re wedded to providing more value-add to our customers.

Has anyone taken you up on the offer? Some are enquiring.

One of your benefits has been focus, one thing to sell. You just mentioned your competitors don’t have that. Now you’re looking at other stuff? We’re making data easier and more economic to consume. Making the entire stack easier to consume. When I say more HCI, what do I mean? Box with compute, storage and network and you can double it, etc. Another way to look at HCI is a single pane of glass for orchestration, automated discovery, ease of use issue. Customers want us to extend beyond storage.

Single throat to choke and HCI. You provide the total stack, or become an OEM. I have no intention of selling compute. It’s controlled by the semiconductor company or the OS company.

How about becoming an OEM provider? If they were willing, I’d be all ears. But it’s rare. Dell, Cisco, they’re not OEM companies. Margin models are tough with this.

Developing international sales? Our #2 goal is to scale internationally. Our goals align the company around a few things. It’s not just more sales people. It’s the feature set (e.g. ActiveCluster). ActiveCluster is more valuable in Europe than anywhere else. In the US, the distances involved make it difficult. In Europe there are a lot of DCs 100km apart. Developing support capability. Scaling marketing, legal, finance. It’s a goal for the whole company.

The last company to get to $1B was NetApp. What chance does Pure have to make it to $5B? Well, I hope we do. It’s my job. Storage? That’s a terrible business! Friends in different companies have a lot of different opinions about it. Pure could be the Arista of storage? The people who are engaged in storage don’t believe in storage anymore. They’re not investing in the business. It’s a contrarian model. Compete in business, not just tech. We’re investing 20% in R&D. You need to invest a certain amount in a product line. They have a lot of product lines. We could be bought – we’re a public company. But Dell won’t buy. HPE have bought Nimble. Hitachi don’t really buy people. Who does that leave? I think we have a chance of staying independent.

You could buy somebody. I believe we have a very good sales force. There are a lot of ways to build an acquisition strategy. We have a good sales force.

You’re a public company. You haven’t been doing well. What if Mr Elliott comes into your company? (Activist investor). Generally they like companies with lots of cash. Or companies spending too much on R&D without getting results. We’re growing fast. We just posted 40% growth. Puritans might believe our market cap should be higher. The more we can show that we grow, the more exciting things might be. I don’t think we’re terribly attractive to an activist right now.

Storage is not an interesting place to be. But it’s all about data. Do you see that shifting with investors? What would cause that? I believe we need to innovate too. I think that the investors would need to believe that some of the messages we’re sending today, and over the next year, create an environment where our long term growth path is higher and stronger than it is today. Sometimes it’s sheer numbers, not storyline. The market believes that NetApp, EMC, etc. can cause pricing and growth challenges for us for a long time. We need them to believe we’re immune to those challenges.

How about China as a marketplace? China as a competitive threat with new technologies? China is a formidable country in every respect. Competition, market. It’s more difficult than it was 10 years ago as a market. Our administration hasn’t helped, and China has put a lot of rules and regulations in place. I wish we’d focus on those, not the trade deficit. It’s a market we’re experimenting in. If it only works out as well as our competitors can achieve, it may not be worthwhile. And then there’s the issue of competition. I worry about Huawei, particularly in third world countries. Viable, long-lasting commercial concerns. In Europe it’s a bit different. The Chinese are very innovative. The US does well because of a market of 300 million; China has 1.4 billion people.

Joe Tucci said 4-5 years ago that the industry was merging. He said you can’t survive as a small player. How many times have we seen this picture? Huge conglomerates falling apart under their own weight. I hate to disagree with Joe. It’s a misunderstanding of scale. It’s about individual products and capabilities, not the size of the business. If you’re just big, and not growing, you no longer have scale. All you’ve done is create a large company with a lot of under scaled products. Alan Kay “perspective is worth 40 IQ points” [note: it’s apparently 80, maybe I misheard].

Interesting session. 4 stars.

Pure//Accelerate 2018 – Thursday General Session – Rough Notes

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from Thursday’s General Session at Pure//Accelerate 2018.

 

Dave Hatfield

Dave Hatfield takes the stage and reports that there have been over 10,000 viewers and participants for the show. Cast your minds back to 1968, the year after the “Summer of Love”. This was also the time of the first big tech demo – “The Mother of All Demos” by Doug Engelbart – and it included the introduction of the mouse, network computing, hypertext linking, collaboration, and multiple windows. You can see the clip here.

Pure is about embracing the tools of transformation.

 

Dr Kate Darling

Dr Kate Darling (MIT Media Lab) then takes the stage. She is a researcher with expertise in AI and robotics. She just flew in from Hong Kong. She mentions she had a baby 6 months ago. People say to her “Kate, it must be so interesting to watch your baby develop and compare it to AI development”. She says “[m]y baby is a million times more interesting than anything we’ve developed”.

AI is going to shape the world her baby’s growing up in. Like electricity, we don’t know how it will shape things yet. Some of the applications are really cool, and a lot of it is happening behind the scenes. For example, she took a Lyft to the airport and the driver was using Waze (which uses AI). There’s a bit of hype that goes on, and fear that AI might self-evolve and kill us all. This distracts from the benefits, and from the actual problems we face right now (privacy, security, etc). It also leads people to over-estimate where we are right now in terms of development.

She works in robotics, a field we’ve been developing for centuries, and we’re a long way from robots taking over the world and killing us all. If you search for AI (via Google Images) you see pictures of human brains and robots. We’re constantly comparing AI to human intelligence, an image heavily influenced by sci-fi and pop culture. Automation will have an impact on labour markets. But AI is not like human intelligence. We’ve developed AI that is much smarter than people, but the AI is also a lot dumber. E.g. “Siri, I’m bleeding, call me an ambulance.” “OK, I’ll call you ‘an ambulance’ from now on.”

[image source http://www.derppicz.com/siri-call-me-an-ambulance/]

We’ve been using animals for thousands of years, and we still use them (e.g. dolphins for echolocation). They’re autonomous and unpredictable agents. Their skills are different to ours, and they can partner with us and extend our abilities. We should be thinking outside of the “human replacement” box.

Examples:

  • Japan looks to AI to simplify patent screening
  • Recognise patterns in peoples’ energy consumption
  • Spam filters

She works in human-robot interaction, studying people’s psychological reactions to physical robots. People treat them like they’re alive, even though they’re machines, and perceive movement in our personal space as intent. The Roomba is really dumb. Military robots – soldiers become attached to bomb disposal robots. Paro Robotics – a seal used in nursing homes. A lot of people don’t like the idea of robots for this, but it replaces animal therapy, not human care.

AI can shape how we relate to our tools, and how we relate to each other. The possibilities are endless.

If you’re interested in AI, be aware it’s kind of a “hypey buzzword” thrown around at conferences – it’s not so much a method as a goal. Most of what we do is machine learning (e.g. the hot dog example from Silicon Valley). If you’re into AI, you’ll need data scientists, and they’re in high demand. If you want to use AI in your business, it’s important to educate yourself.

You also need to be aware of some of the pitfalls; check out “Weapons of Math Destruction” by Cathy O’Neil.

There are so many amazing new tools being developed. OSS machine learning libraries. There’s a lot to worry about as a parent, but there’s a lot to look forward to as well. eg. AI that sorts LEGO. Horses replaced by cars. Cars now being replaced by a better version of an autonomous horse.

 

Dave Hatfield

Dave Hatfield takes the stage again. How can you speed up tasks that are mundane so you can do things that are more impactful? You need a framework and a way to ask the questions about the pitfalls. DevOps – institutionalised knowledge of how to become software businesses. Introduces Jez Humble.

 

Jez Humble

Why does DevOps matter? 

The enterprise comprises business, engineering, and operations. The idea for a project occurs, it’s budgeted, delivered and thrown over the wall to ops. Who’s practicing Agile? It’s all about more collaboration. Business people don’t really like that. Now we’re delivering into production all the time and Operations aren’t super happy about that. Operations then create a barrier (through change management), ensuring nothing ever changes.

How does DevOps help?

No real definition. The DevOps Movement is “a cross-functional community of practice dedicated to the study of building, evolving and operating rapidly changing, secure, resilient systems at scale”. There’s some useful reading (Puppet’s State of DevOps Reports) here, here, and here.

Software delivery as a competitive advantage

High performers were more than twice as likely to achieve or exceed the following objectives:

  • Quantity of products or services
  • Operating efficiency
  • Customer satisfaction
  • Quality of products or services provided
  • Achieving organisational and mission goals
  • Measures that demonstrate to external parties whether or not the organisation is achieving intended results

IT Performance

  • Lead time for changes
  • Release frequency
  • Time to restore service
  • Change fail rate

We’re used to thinking of throughput and stability as a trade-off – that’s not really the case. High performers do both.
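As a rough sketch of how you might measure the four metrics yourself (my own illustration, not something from the presentation – the record fields are invented):

```python
# Toy calculation of the four IT performance metrics from deployment
# records. Field names and data are invented for illustration.
from datetime import datetime
from statistics import mean

deploys = [
    {"commit": datetime(2018, 5, 1, 9), "live": datetime(2018, 5, 1, 10), "failed": False},
    {"commit": datetime(2018, 5, 1, 11), "live": datetime(2018, 5, 1, 13), "failed": True,
     "restored": datetime(2018, 5, 1, 13, 30)},
    {"commit": datetime(2018, 5, 2, 9), "live": datetime(2018, 5, 2, 9, 45), "failed": False},
]

days_observed = 2
failures = [d for d in deploys if d["failed"]]

print("Release frequency:", len(deploys) / days_observed, "deploys/day")
print("Mean lead time (h):", mean((d["live"] - d["commit"]).total_seconds() / 3600 for d in deploys))
print("Change fail rate (%):", 100 * len(failures) / len(deploys))
print("Mean time to restore (min):", mean((d["restored"] - d["live"]).total_seconds() / 60 for d in failures))
```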

2016 IT performance by Cluster 

(From the 2016 report)

Deployment frequency (for the primary application or service you work on, how often does your organisation deploy code?)

  • High IT performers: On demand (multiple deploys per day)
  • Medium IT performers: Between once per week and once per month
  • Low IT performers: Between once per month and every 6 months

Lead time for changes (how long does it take to go from code commit to code successfully running in production?)

  • High IT performers: Less than an hour
  • Medium IT performers: Between one week and one month
  • Low IT performers: Between one month and 6 months

Mean time to recover (MTTR) (how long does it generally take to restore service when a service incident occurs, e.g. unplanned outage or service impairment?)

  • High IT performers: Less than an hour
  • Medium IT performers: Less than one day
  • Low IT performers: Less than one day

Change failure rate (what percentage of changes either result in degraded service or subsequently require remediation, e.g. service impairment or outage, hotfix, rollback, fix forward, patch?)

  • High IT performers: 0-15%
  • Medium IT performers: 31-45%
  • Low IT performers: 16-30%

 

“It’s about culture and architecture”. DevOps isn’t about hiring “DevOps experts”. Go solve the boring problems that no-one wants to do. Help your people grow. Grow your own DevOps experts. Re-orgs suck the energy out of a company, and they often don’t produce better outcomes. Have people who need to work together sit together. The cloud’s great, but you can do continuous delivery with mainframes. Tools are great, but buying “DevOps tools” doesn’t change the outcomes. “Please don’t give developers access to Prod”. DevOps is learning to work in small batches (product dev and org change). You can’t move fast with water / scrum / fall.

Architectural Outcomes

Can my team …

  • Make large-scale changes to the design of its system without the permission of somebody outside the team or depending on other teams?
  • Complete its work without needing fine-grained communication and coordination with people outside the team?
  • Deploy and release its product or service on demand, independently of other services the product or service depends on?
  • Do most of its testing on demand, without requiring an integrated test environment?
  • Perform deployments during normal business hours with negligible downtime?

Deploying on weekends? We should be able to deploy during the day with negligible downtime

  • DevOps is learning to build quality in. “Cease dependence on mass inspection to achieve quality. Improve the process and build quality into the product in the first place”. W. Edwards Deming.
  • DevOps is enabling cross-functional collaboration through value streams
  • DevOps is developing a culture of experimentation
  • DevOps is continually working to get better

Check out the Accelerate book from Jez.

The Journey

  • Agree and communicate measurable business goals
  • Give teams support and resources to experiment
  • Talk to other teams
  • Achieve quick wins and share learnings
  • Never be satisfied, always keep going

 

Dave Hatfield

Dave Hatfield takes the stage again. Don’t do re-orgs? We had 4 different groups of data scientists pop up in a company of 2300. All doing different things. All the data was in different piggy banks. We got them all to sit together and that made a huge difference. “We need to be the ambassadors of change and transformation. If you don’t do this, one of your competitors will”.

Please buy our stuff. Thanks for your time. Next year the conference will be in September. We’re negotiating the contracts right now and we’ll let you know soon.

Solid session. 4.5 stars.

Pure//Accelerate 2018 – Wednesday General Session – Rough Notes

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from Wednesday’s General Session at Pure//Accelerate 2018.

[A song plays. “NVMe” to the tune of Naughty By Nature’s OPP]

 

Charlie Giancarlo

Charlie Giancarlo (Pure Storage CEO) takes the stage. We share a mission: to power innovation. Storage is a really important part of making that mission happen. We’re in the zettabyte era; we don’t even talk about EB anymore. It’s not the silicon age, or the Internet age, or the social age. We’re talking about the original gold rush of 1849. The amount of gold in data is unlimited. We need the tools to pick the gold out of the data. The data heroes are us. And we’re announcing a bunch of tools to mine the gold.

Who am I? What has allowed us to get to success? Where are we going?

I’m new here, I’ve just gotten through my 3rd quarter. I’ve been an engineer, entrepreneur, CTO, and equity partner – entirely tech focused. I’ve made a living looking at innovation as a 3-legged stool.

What are the components that advance tech?

  • Networking
  • Compute (Processing)
  • Storage

They advance on their own timelines, don’t always keep pace with each other, and the industry sometimes gets out of balance. Data centre and system architectures adjust to accommodate this.

Compute

Density has multiplied by a factor of 10 in 10 years (a slowdown of Moore’s Law); massive scale in the DC has made up for this.

Networking

Multiplied by 10 in 8 years, 10Gbps about 10 years ago, and 100Gbps about 2 years ago

Data

  • Multiplied by a factor of 1000
  • Storage vendors just haven’t kept up
  • Storage on white boxes?

Pure came in and brought balance to this picture, allowing storage to keep up with networking and compute.

It’s all about

  • Business model
  • Customer experience
  • Technology

“Data is the most important asset that you have”

Pure Storage became a billion dollar company in revenue last year (8 years in, 5 years after it introduced its first product). It’s cashflow positive and “growing like a bat out of hell”, with over 4800 customers – up from fewer than 500 customers just 4 years ago. A large chunk of those customers are cloud providers, and Pure is also in 30% of the Fortune 500.

Software

The software economy sits on top of the compute / network / storage stool. Companies are becoming more digital. Last year they talked about Domino’s, and this year they’re using AI to analyse your phone calls. Your calls are being answered by an AI engine that takes your order. An investment bank has more computer engineers and developers than they have investment bankers. Companies need to feed these apps with data. Data is where the money is.

DC Architectures

  • Monolithic scale-up – client / server (1990s)
  • Virtualised – x86 + virtualisation (2000s)
  • Scale-out  – cloud (2010s)

Previously it was big compute and apps were rigid; now it’s big data – apps are fluid and data is shared.

“Data-centric” architecture

  • Faster
  • 100% shared
  • Simpler
  • Built for rapid innovation

Dedicated storage and stateless compute

Examples

  • Large booking, travel, e-commerce site
  • PAIGE.AI – cancer pathology – digitised samples from the last decade
  • Man AHL – economy and stock market modelling

Behind all these companies is a “data hero”

Over 80% of CxOs believe that the speed of analysing data will be one of their biggest competitive issues, but CIOs are worried about not being able to keep up with data coming into the enterprise.

“We empower innovators to build a better world with data”

Beyond AFA

  • Modern data pipeline
  • A whole new model
  • Pure “on-demand”
  • The AI Era

“New meets Now”

It takes great people to make a great company – the amazing “Puritans”. Pure have an NPS score of 83.7 – best in B2B.

 

Matt Kixmoeller

Matt Kixmoeller takes the stage. We need a new architecture to unlock the value of data. Back in 2009, Michael Jackson died, Obama was in, and Fusion-io had just started. Pure came along and had the idea of building an AFA. Today we’re going to bring you the sequel.

  • There’s basically SAN / NAS and DAS (which has seen a resurgence in the web scale era)
  • DAS reality – many apps, rigid scaling, either too much storage or too much compute

New technologies to re-wire the DC

  • Diverse, fast compute (CPU, GPU, FPGA)
  • Fast networks and protocols (RoCE, NVMe-oF)
  • Diverse SSD
  • Eliminates the outside the box penalty
  • Gets CPUs totally focussed on applications

What if we can finally unite SAN and DAS into a data-centric architecture?

Gartner have identified “Shared accelerated storage”. “The NVMe-oF protocol … will help balance the performance and simplicity of direct-attached storage (DAS) with the scalability and manageability of shared storage”.

“Tier 0”? – they’re making the same mistake again. Pure are focused on shared accelerated storage available for all.

Tomorrow can look like this

  • Diskless, stateless, elastic compute (containers, VMs, bare metal)
  • Shared accelerated storage (block, file, object)
  • Fast, converged networks
  • Open, full-stack orchestration

 

Keith Martin

Keith Martin (ServiceNow) takes the stage

  • Dealing with high volumes of data
  • Tremendous growth in net new data
  • 18 months ago, doing basic web scale, DAS architecture
  • Filling up DCs at a very fast clip
  • Stopped and analysed everything there was

What happens in an hour in the DC?

In one hour our customers:

  • 7.5 million performance analytics scores computed
  • 730,000 configuration items added
  • 274,000 notifications sent
  • 76,000 assets added
  • 49,200 live feed messages
  • 36,300 change requests
  • 15,600 workflow activities

Every hour of the day our engineering teams:

  • Develop code in 9 global development locations (SD, SC, SF, Kirkland, London, Amsterdam, Tel Aviv, Hyderabad, Bangalore)
  • Use 450 copies of ServiceNow for quality engineering testing
  • Run 100,000 automated quality engineering tests

In one hour on our infrastructure

  • 25 billion database queries
  • 112 million HTTP requests
  • 2.5 million emails
  • 25.3 million API calls
  • 493TB of backups

We were going through

  • 30K hard drives
  • 3500+ servers
  • >2000 failed HDDs per year

CPU time was being consumed by backup data movement, and restore times were becoming longer and longer. They started to look at the FlashBlade. With its small footprint and low power it was a really interesting option for them, and it was really easy to set up and use. They let the engineers out of their cages to play with it in the lab and found it was surprisingly hard to break. So they’ve decided to start using FlashBlade in production as their standard for data protection.

Achieving 3x density now

Each rack has:

  • 30 1RU servers
  • 1000 compute cores
  • 1.5PB effective Flash

Decided to test and implement FlashArray as well and they’re excited about FlashArray//X. ServiceNow cares about uptime. Pure has the best non-disruptive upgrade, expansion and repair model. DAS can prove to be expensive at scale.

 

Matt Kixmoeller

Kix takes the stage again

  • 2016: FlashBlade – the world’s first AFA for big data
  • 2017: FlashArray//X

Introducing the FlashArray//X Family

  • //X10
  • //X20
  • //X50
  • //X70
  • //X90

 

Bill Cerreta

Bill Cerreta takes the stage.

  • The FlashArray was launched in 2012, Purity was built to optimise Flash
  • //M chassis designed for NVMe
  • Deep integration of software and hardware

Where are we going with Flash?

SCM, QLC. We’ve eliminated translation layers. The //X90, for example, has:

  • Dual-protocol controllers – speak to both SAS SSDs and NVMe
  • The //X10 through //X90 have 25GbE onboard
  • Everything’s NVMe-oF ready, and this will be added via software later in the year
  • Double the write bandwidth of //M
  • This year, they’re all in on //X
  • 7 generations of evergreen, non-disruptive upgrades [photo]
  • //X makes everything faster (compared to //M)

Neil Vachharajani takes the stage briefly to talk MongoDB on shared accelerated storage.

Kix continues.

Priced for mainstream adoption

  • Early attempts at NVMe cost 10x more than AFAs
  • //X, when introduced last year, was 25% more than //M
  • $0 premium for //X over //M on an effective capacity basis

[Customer video – Berrios]

 

Jason Nadeau

Jason Nadeau takes the stage. Most infrastructure wasn’t built to allow data to flow freely.

  • 10s of products
  • Complex design
  • Silos, difficult to share

“Data-as-a-Service”

Data-centric Architecture

  • Consolidated and simplified
  • Real-time
  • On-demand and self-driving
  • Ready for tomorrow
  • Multi-cloud

Foundation

  • FlashArray
  • FlashBlade
  • FlashStack
  • AIRI

API-first model and software at the heart of the architecture.

 

Sandeep Singh

Sandeep Singh takes the stage. A lot of companies have managed to virtualise. A lot have managed to “flash-ify”. But a lot of them have yet to automate and “service-ize”, to “container-ize”, or to adopt multi-cloud.

Automate and service-ize – on every cloud platform

  • VMware SDDC – VMware SDDC validated design
  • Open automation – pre-built open full-stack automation toolkits
  • Openshift PaaS – container-based reference architecture

Simon Dodsley takes the stage to talk with Sandeep about MongoDB deployments in less than a minute (down from 5 days).

Sandeep continues. Container adoption is increasing quickly, but there’s a lack of storage support for persistent containers. Pure have container plug-ins for Docker and Kubernetes. Containerised apps want to consume storage as-a-service. Introducing Pure Service Orchestrator.
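For context (my sketch, not anything Pure demonstrated), “consuming storage as-a-service” from Kubernetes usually means a PersistentVolumeClaim against a storage class; the class name below is an assumption:

```python
# Minimal sketch: requesting persistent storage in Kubernetes via a
# PersistentVolumeClaim, using the official Python client. The storage
# class name "pure-block" is an assumption for illustration.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="demo-claim"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="pure-block",  # assumed class name
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc,
)
```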

Multi-cloud

Introduced ActiveCluster last year. Snapshots and snapshot mobility (portable snapshots introduced last year) are important.

  • Snap to NFS is generally available now
  • CloudSnap to AWS S3 (available in late 2018)
  • DeltaSnap open API (Veeam, Catalogic, Actifio, Commvault, Rubrik, Cohesity)

 

Jason Nadeau

Jason Nadeau comes back on stage to talk about data as-a-service consumption. Leases aren’t pay-per-use and aren’t a service-like experience.

Introducing Evergreen Storage Service (ES2)

  • Pay per used GB
  • True opex
  • Terms as short as 12 months
  • Always evergreen
  • Onboard in days
  • Always “better-than-cloud” economics

Capex with Evergreen storage, Opex with ES2

[Video on PAIGE.AI]

 

Matt Burr

Matt Burr takes the stage. Unlocking the value of what was once cold data. A new era demands a new data mindset.

  • How has the value of data changed?
  • How can you extract that value?
  • How can you get started today?

A robot will replace a human surgeon. A machine has learned to adapt faster than the human brain can. More and more data will live in the hotter tier. What tools can make this valuable? Data is like change in a piggy bank – there’s value in it, but it’s stuck in silos.

  • Data warehouse
  • Data lake
  • Modern data pipeline
  • AI data pipeline

$/GB used to make sense. We need new metrics. $/flops? $/simulation? Real value is generated by simplifying and accelerating the data flow. Build a data hub on FlashBlade. FlashBlade is 16 months old (GA in January 2017).

Invites NVIDIA’s Rob Ober on stage

 

Rob Ober

“The time has come for GPU computing”

  • Moore’s Law is flattening an awful lot
  • NVIDIA as “the AI computing platform”
  • “The more you buy, the more you save”

A traditional hyperscale cluster is 300 dual-CPU servers drawing 180KW; or you can deploy 1 DGX-2 at 10KW.

Science fiction is being made possible

  • Ultrasound retrofit
  • 5G beam
  • Molecule modelling 1/10 millionth $

Scaling AI

  • Design guesswork
  • Deployment complexity
  • Multiple points of support

AI scaling is hard, “not like your traditional infrastructure”

AIRI

  • Jointly-validated solution
  • Faster, simplified deployment
  • Trusted expertise and support

 

Matt Kixmoeller

Kix takes the stage again. There’s a big gap in AI infrastructure, with customers spread across varying stages of the journey from single server to scale-out infrastructure. He introduces AIRI Mini, and they’re also extending AIRI to Cisco.

 

Data Warehouse pitfalls

  • Performance not keeping up with data
  • Pricing extortions and over-provisioning
  • Inflexible appliances built for a single workload

Progress has to have a foundation.

Customer example of a telco in Asia moving from Exadata to FlashBlade

Introducing FlashStack for Oracle Data Warehouse

Set your data free

 

Dave Hatfield

Dave Hatfield takes the stage. Thanks for coming. Over 5000 people in the Bill Graham Civic Auditorium and a lot watching on-line. Customers, partners. Be sure to check out the “petting zoo” (Solutions Pavilion). We wanted to have something that was “not your father’s storage show. Your father’s storage show happened last month”. Anyone been to a Grateful Dead show? It’s a community experience, you don’t know what will happen next.

And that’s a wrap.

Burlywood Tech Announces TrueFlash

Burlywood Tech came out of stealth late last year and recently announced their TrueFlash product. I had the opportunity to speak with Mike Tomky about what they’ve been up to since emerging from stealth and thought I’d cover the announcement here.

 

Burlywood TrueFlash

So what is TrueFlash? It’s a “modular controller architecture that accelerates time-to-market of new flash adoption”. The idea is that Burlywood can deliver a software-defined solution that will sit on top of commodity Flash. They say that one size doesn’t fit all, particularly with Flash, and this solution gives customers the opportunity to tailor the hardware to better meet their requirements.

It offers the following features:

  • Multiple interfaces (SATA, SAS, NVMe)
  • FTL Translation (Full SSD to None) – see the sketch below
  • Capacity up to 100TB
  • Traffic optimisation
  • Multiple Protocols (Block (NVMe, NVMe/F), File, Object, Direct Memory)

[image courtesy of Burlywood Tech]
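To unpack the FTL point above (a generic sketch of what a flash translation layer does, not Burlywood’s implementation): the FTL keeps a logical-to-physical map so hosts see stable block addresses while the controller relocates data underneath for wear levelling and garbage collection.

```python
# Generic flash translation layer (FTL) sketch. Flash pages can't be
# overwritten in place, so each write lands on a fresh page and the
# logical-to-physical map is updated. Illustrative only - not
# Burlywood's actual design.

class SimpleFTL:
    def __init__(self):
        self.l2p = {}        # logical block address -> (page, data)
        self.next_page = 0   # naive append-only page allocator

    def write(self, lba: int, data: bytes) -> None:
        self.l2p[lba] = (self.next_page, data)
        self.next_page += 1

    def read(self, lba: int) -> bytes:
        _page, data = self.l2p[lba]
        return data

ftl = SimpleFTL()
ftl.write(0, b"hello")
ftl.write(0, b"hello v2")  # LBA 0 silently remapped to a new page
print(ftl.read(0))         # b'hello v2'
```

Offering “Full SSD to None” presumably means a customer can have the controller do all of this translation, or push it further up the stack.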

 

Who’s Buying?

This isn’t really an enterprise play – enterprises aren’t the types of companies that buy Flash at the scale where this makes sense. This is really aimed at the hyperscalers, cloud providers, and AFA / HCI vendors. They sell the software, controller and SSD Reference Design to the hyperscalers, but treat the cloud providers and AFA vendors a little differently, generally delivering a completed SSD for them. All of their customers benefit from:

  • A dedicated support team (in-house drive team);
  • Manufacturing assembly & test;
  • Technical & strategic support in all phases; and
  • Collaborative roadmap planning.

The key selling point for Burlywood is that they claim to be able to reduce costs by 10 – 20% through better capacity utilisation, improved supply chain and faster product qualification times.

 

Thoughts

You know you’re doing things at a pretty big scale if you’re thinking it’s a good idea to be building your own SSDs to match particular workloads in your environment. But there are reasons to do this, and from what I can see, it makes sense for a few companies. It’s obviously not for everyone, and I don’t think you’ll be seeing this in the enterprise anytime soon. Which is the funny thing, when you think about it. I remember when Google first started becoming a serious search engine and they talked about some of their earliest efforts with DIY servers and battles with doing things at the scale they needed. Everyone else was talking about using appliances or pre-built solutions “optimised” by the vendors to provide the best value for money or best performance or whatever. As the likes of Dropbox, Facebook and LinkedIn have shown, there is value in going the DIY route, assuming the right amount of scale is there.

I’ve said it before: very few companies really qualify for the “hyper” in hyperscalers, so a company like Burlywood Tech isn’t necessarily going to benefit the average enterprise directly. That said, these kinds of companies, if they’re successful in helping the hyperscalers drive the cost of Flash downwards, will indirectly help enterprises by forcing the major Flash vendors to look at how they can do things more economically. And sometimes it’s just nice to peek behind the curtain to see how this stuff comes about. I’m oftentimes more interested in how networks put together their streaming media services than in a lot of the content they actually deliver on those platforms. I think Burlywood Tech falls into that category as well. I don’t care for some of the services that the hyperscalers deliver, but I’m interested in how they do it nonetheless.

Storbyte Come Out Of Stealth Swinging

I had the opportunity to speak to Storbyte‘s Chief Evangelist and Design Architect Diamond Lauffin recently and thought I’d share some information on their recent announcement.

 

Architecture

ECO-FLASH

Storbyte have announced ECO-FLASH, positioning it as “a new architecture and flash management system for non-volatile memory”. Its integrated circuit, ASIC-based architecture abstracts independent SSD memory modules within the flash drive and presents the unified architecture as a single flash storage device.

 

Hydra

Each ECO-FLASH module comprises 16 mSATA modules, running in RAID 0. Each Hydra manages 4 modules, and 4 “sub-master” Hydras are in turn managed by a master Hydra. This makes up one drive that supports RAID 0, 5, 6 and N, so if you’re only running a single-drive solution (think out at the edge), you can configure the modules to run in RAID 5 or 6. The arithmetic works out as per the sketch below.
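Here’s the topology as I understood it, modelled in a few lines (my own illustration of the description above, not Storbyte’s code; the 2TB module size is an assumption chosen to match the 32TB drive option):

```python
# ECO-FLASH drive hierarchy as described to me: a master Hydra manages
# 4 sub-master Hydras, each of which manages 4 mSATA modules, for 16
# modules per drive. Module capacity below is an assumed value.

MODULES_PER_SUBMASTER = 4
SUBMASTERS_PER_MASTER = 4

def modules_per_drive() -> int:
    return SUBMASTERS_PER_MASTER * MODULES_PER_SUBMASTER

def drive_capacity_tb(module_tb: float) -> float:
    # Within a drive the modules run in RAID 0, so capacities add.
    return modules_per_drive() * module_tb

print(modules_per_drive())      # 16
print(drive_capacity_tb(2.0))   # 32.0 - consistent with the 32TB model
```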

 

[image courtesy of Storbyte]

 

Show Me The Product

[image courtesy of Storbyte]

 

The ECO-FLASH drives come in 4, 8, 16 and 32TB configurations, and these fit into a variety of arrays. Storbyte is offering three ECO-FLASH array models:

  • 131TB raw capacity in 1U (using 4 drives);
  • 262TB raw capacity in 2U (using 16 drives); and
  • 786TB raw capacity in 4U (using 48 drives).

Storbyte’s ECO-FLASH supports a blend of Ethernet, iSCSI, NAS and InfiniBand primary connectivity simultaneously. You can also add Storbyte’s 4U 1.18PB spinning disk JBOD expansion units to deliver a hybrid solution.

 

Thoughts

The idea behind Storbyte came about because some people were working in forensic security environments that had a very heavy write workload, and they needed to find a better way to add resilience to the high performance storage solutions they were using. Storbyte are offering a 10 year warranty on their product, so they’re clearly convinced that they’ve worked through a lot of the problems previously associated with the SSD Write Cliff (read more about that here, here, and here). They tell me that Hydra is the primary reason that they’re able to mitigate a number of the effects of the write cliff and can provide performance for a longer period of time.

Storbyte’s is not a standard approach by any stretch. They’re talking some big numbers out of the gate and have a pretty reasonable story to tell around capacity, performance, and resilience as well. I’ve scheduled another session with Storbyte to talk some more about how it all works and I’ll be watching these folks with some interest as they enter the market and start to get some units running workload on the floor. There’s certainly interesting heritage there, and the write cliff has been an annoying problem to solve. Couple that with some aggressive economics and support for a number of connectivity options, and I can see this solution going into a lot of DCs and being used for some cool stuff. If you’d like to read another perspective, check out what Rich over at Gestalt IT wrote about them, and you can read the full press release here.

Nexenta Announces NexentaCloud

I haven’t spoken to Nexenta in some time, but that doesn’t mean they haven’t been busy. They recently announced NexentaCloud in AWS, and I had the opportunity to speak to Michael Letschin about the announcement.

 

What Is It?

In short, it’s a version of NexentaStor that you can run in the cloud. It’s ostensibly an EC2 instance running in your virtual private cloud, using EBS for storage on the backend. It’s:

  • Available in the AWS Marketplace;
  • Deployed on preconfigured Amazon Machine Images; and
  • Able to deliver unified file and block services (NFS, SMB, iSCSI).
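Deployment follows the usual Marketplace pattern. As a hedged sketch (the AMI ID, instance type and volume sizing below are placeholders, not Nexenta’s published values), launching an appliance like this into your VPC with boto3 might look like:

```python
# Sketch of launching a Marketplace AMI into a VPC with boto3. The AMI
# ID, instance type, subnet and EBS sizing are placeholders; take the
# real values from the NexentaCloud Marketplace listing.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

resp = ec2.run_instances(
    ImageId="ami-00000000000000000",   # placeholder Marketplace AMI
    InstanceType="m5.2xlarge",         # placeholder sizing
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-00000000",        # a subnet in your VPC
    BlockDeviceMappings=[{
        "DeviceName": "/dev/sdf",
        "Ebs": {"VolumeSize": 500, "VolumeType": "gp2"},  # backend EBS pool
    }],
)

print(resp["Instances"][0]["InstanceId"])
```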

According to Nexenta, the key benefits include:

  • Access to a fully-featured file (NFS and SMB) and block (iSCSI) storage array;
  • Improved cloud resource efficiency through
    • data reduction
    • thin provisioning
    • snapshots and clones
  • Seamless replication to/from NexentaStor and NexentaCloud;
  • Rapid deployment of NexentaCloud instances for test/dev operations;
  • Centralised management of NexentaStor and NexentaCloud;
  • Advanced Analytics across your entire Nexenta storage environment; and
  • Migration of legacy applications to the cloud without re-architecting them.

There’s an hourly or annual subscription model, and I believe there are also capacity-based licensing options available.

 

But Why?

Some of the young people reading this blog who wear jeans to work every day probably wonder why on earth you’d want to deploy a virtual storage array in your VPC in the first place. Why would your cloud-native applications care about iSCSI access? It’s very likely they don’t. But one of the key reasons why you might consider the NexentaCloud offering is because you’ve not got the time or resources to re-factor your applications and you’ve simply lifted and shifted a bunch of your enterprise applications into the cloud. These are likely applications that depend on infrastructure-level resiliency rather than delivering their own application-level resiliency. In this case, a product like NexentaCloud makes sense in that it provides some of the data services and resiliency that are otherwise lacking with those enterprise applications.

 

Thoughts

I’m intrigued by the NexentaCloud offering (and by Nexenta the company, for that matter). They have a solid history of delivering interesting software-defined storage solutions at a reasonable cost and with decent scale. If you’ve had the chance to play with NexentaStor (or deployed it in production), you’ll know it’s a fairly solid offering with a lot of the features you’d look for in a traditional storage platform. I’m curious to see how many enterprises take advantage of the NexentaCloud product, although I know there are plenty of NexentaStor users out in the wild, and I have no doubt their CxOs are placing a great amount of pressure on them to don the cape and get “to the cloud” post haste.

Storage Field Day 15 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is a quick post to say thanks once again to Stephen and Ben, and the presenters at Storage Field Day 15. I had a super fun and educational time. For easy reference, here’s a list of the posts I did covering the events (they may not match the order of the presentations).

Storage Field Day – I’ll Be At Storage Field Day 15

Storage Field Day 15 – Day 0

Storage Field Day 15 – (Fairly) Full Disclosure

IBM Spectrum Protect Plus Has A Nice Focus On Modern Data Protection

Dropbox – It’s Scale Jim, But Not As We Know It

StarWind VTL? What? Yes, And It’s Great!

WekaIO – Not The Matrix You’re Thinking Of

Cohesity Understands The Value Of What Lies Beneath

Western Digital – The A Is For Active, The S Is For Scale

Come And Splash Around In NetApp’s Data Lake

Huawei – Probably Not What You Expected

Datrium Cloud DVX – Not Your Father’s Cloud Data Protection Solution

Hedvig’s Evolution

 

Also, here’s a number of links to posts by my fellow delegates (in no particular order). They’re all very smart people, and you should check out their stuff, particularly if you haven’t before. I’ll attempt to keep this updated as more posts are published. But if it gets stale, the Storage Field Day 15 landing page will have updated links.

 

Josh De Jong (@EuroBrew)

The Challenge Of Scale

Convergence Without Compromise

 

Glenn Dekhayser (@GDekhayser)

#SFD15: Datrium impresses

 

Chan Ekanayake (@ChanEk81)

Storage Field Day 15 – Introduction

Dropbox’s Magic Pocket: Power Of Software Defined Storage

A Look At The Hedvig Distributed Hybrid Cloud Storage Solution

Cohesity: A Secondary Storage Solution For The Hybrid Cloud?

NetApp’s & Next Generation Storage Technologies

 

Chin-Fah Heoh (@StorageGaga)

Always serendipitous Storage Field Days

Storage dinosaurs evolving too

Magic happening

Cohesity SpanFS – a foundational shift

NetApp and IBM gotta take risks

Own the Data Pipeline

Huawei Dorado – All about Speed

 

Mariusz Kaczorek (@Settlersoman)

 

Ray Lucchesi (@RayLucchesi)

Western Digital at SFD15: ActiveScale object storage

Huawei presents OceanStor architecture at SFD15

 

Dukagjin Maloku (@DugiDM)

Storage Field Day 15 … #SFD15

 

Michael Stanclift (@VMStan)

 

Lino Telera (@LinoTelera)

Back to Silicon Valley for Storage Field Day 15

Storage Field Day 15: Dropbox the high availability in a pocket

Storage Field Day 15: Cohesity the solution for secondary data

Storage Field Day 15: Weka.io

 

Arjan Timmerman (@ArjanTim)

Starwind software: SFD15 preview

 

Dr Rachel Traylor (@Mathpocalypse)

Commentary: White Papers Dont Impress Me Much

Dialogue: What Do We Mean By Predictive Analytics?

Little’s Law: For Estimation Only

 

Vendor Posts

Datrium @ Storage TechFieldDay

Storage Field Day Wrap-up: How Cohesity is Disrupting Legacy Backup

 

Thanks.

Hedvig’s Evolution

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Hedvig recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

More Hybrid Than Ever

It’s been a little while since I’ve spoken to Hedvig. Since that time they’ve built on a platform that was already pretty robust and feature-rich.

[image courtesy of Hedvig]

 

Features

If you’re unfamiliar with Hedvig, this post by Ray Lucchesi provides a nice overview of the offering. There are a number of nice features, including the fact that it’s hypervisor agnostic. You can also run the proxy on bare metal deployed as a KVM instance. Each host requires a proxy, and there are 2 proxies per host (active / passive) for HA. It provides protocol consolidation on a single platform and can do deduplication, compression and encryption at a virtual disk level. Workloads map to a virtual disk, and the deduplication is global (and can be toggled on / off at a virtual disk level). Deduplication is performed at block level with 4K granularity.
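To make the 4K granularity concrete, here’s a toy sketch of content-addressed deduplication in general (my illustration, not Hedvig’s implementation):

```python
# Toy block-level deduplication at 4K granularity: identical blocks
# hash to the same key and the payload is stored once. Illustrative
# only - not Hedvig's actual data path.
import hashlib

BLOCK = 4096
store = {}   # sha256 digest -> block payload (stored once)
refs = []    # virtual-disk view: ordered list of block references

def write(data: bytes) -> None:
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK].ljust(BLOCK, b"\0")
        key = hashlib.sha256(block).hexdigest()
        store.setdefault(key, block)  # store payload only if unseen
        refs.append(key)

write(b"A" * BLOCK * 3)        # three identical 4K blocks
print(len(refs), len(store))   # 3 references, but only 1 stored block
```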

The default replication policy is “Agnostic” (let the system decide where to put the data), but you can also tell it that you need it to be “Rack Aware” or even “DC Aware”. The cool thing is that the same policies apply whatever protocol you use.
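A placement policy like “Rack Aware” is essentially a constraint on where replicas may land; something like the sketch below (again mine, not Hedvig’s code – swap the domain key to "dc" for a “DC Aware” flavour):

```python
# Sketch of failure-domain-aware replica placement: choose targets so
# that no two copies share the same domain (rack, or DC). My
# illustration of the concept, not Hedvig's implementation.

nodes = [
    {"name": "node1", "rack": "rack-a", "dc": "dc-1"},
    {"name": "node2", "rack": "rack-a", "dc": "dc-1"},
    {"name": "node3", "rack": "rack-b", "dc": "dc-1"},
    {"name": "node4", "rack": "rack-c", "dc": "dc-2"},
]

def place_replicas(nodes, copies=3, domain="rack"):
    chosen, used = [], set()
    for node in nodes:
        if node[domain] not in used:
            chosen.append(node["name"])
            used.add(node[domain])
        if len(chosen) == copies:
            return chosen
    raise RuntimeError(f"not enough distinct {domain}s for {copies} copies")

print(place_replicas(nodes))                         # ['node1', 'node3', 'node4']
print(place_replicas(nodes, copies=2, domain="dc"))  # ['node1', 'node4']
```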

Hedvig uses a concept called Containers (no, not those containers, or those containers). These are assigned to storage pools, and striped across 3 disks.

There is demarcation between metadata and data.

Data Process:

  • Local data persistence
  • Replication

Metadata Process:

  • Global knowledge of everything happening in the cluster

The solution can integrate with external KMS infrastructure if you’re into that sort of thing, and there’s a real focus on “correctness” of data in the system.

 

Hedvig’s Evolution

Hedvig already had a good story to tell in terms of scalable, software-defined storage by the time I saw them in 2016. Their recent presentation demonstrated not just some significant re-branding, but also increased maturity around the interface and data protection features on offer with the platform. Most of the demonstration time was spent in the Hedvig GUI, in stark contrast to the last time I saw them, when there was an almost constant requirement to drop into the CLI to do a variety of tasks. At the time this made sense, as the platform was relatively new in the market. Don’t misunderstand me, I’m as much a fan as anyone of the CLI, but it feels like you’re in with a better chance of broad adoption if you can also present a usable GUI for people to leverage.

Of course, whether or not you have a snazzy HTML 5 UI means nothing if you don’t have a useful product sitting behind that interface. It was clear from Hedvig’s presentation that they certainly do have something worthy of further consideration, particularly given its focus on data protection, geo-resilience and storage efficiency. The fact that it runs on pretty much anything you can think of is also a bonus. I don’t think too many people would dispute that SDS has a lot of advantages over traditional storage deployments. It’s often a lot more accessible and provides an easier, cheaper entry point for deployment. It can often be easier to get changes and improvements made to the platform that aren’t necessarily tied to particular hardware architectures, and, depending on the software in play, it can often run on just about any bit of x86 compute you want it to. The real value of solutions like Hedvig’s is the additional data protection and efficiency features that provide performance, scalability and resilience beyond the standard 2-node, 1000-disk midrange offerings.

Hedvig seem to be listening to their current and (potential) customers and are making usability and reliability a key part of their offering. I look forward to seeing how this develops over the next 12 months.