Disclaimer: I recently attended Pure//Accelerate 2018. My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
Here are my rough notes from Wednesday’s General Session at Pure//Accelerate 2018.
[A song plays. “NVMe” to the tune of Naughty By Nature’s OPP]
Charlie Giancarlo (Pure Storage CEO) takes the stage. We share a mission: to power innovation. Storage is a really important part of making that mission happen. We’re in the zettabyte era; we don’t even talk about EB anymore. It’s not the silicon age, or the Internet age, or the social age. We’re talking about the original gold rush of 1849. The amount of gold in data is unlimited. We need the tools to pick the gold out of the data. The data heroes are us. And we’re announcing a bunch of tools to mine the gold.
Who am I? What has allowed us to get to success? Where are we going?
I’m new here, I’ve just gotten through my 3rd quarter. I’ve been an engineer, entrepreneur, CTO, equity partner – entirely tech focused. I’ve made a living looking at innovation as a 3-legged stool.
What are the components that advance tech?
- Compute (Processing)
- Network
- Storage
They advance on their own timelines, don’t always keep pace with each other, and the industry sometimes gets out of balance. Data centre and system architectures adjust to accommodate this.
- Compute: density has multiplied by a factor of 10 in 10 years (slowdown of Moore’s Law); made up for this by massive scale in the DC
- Network: multiplied by 10 in 8 years – 10Gbps about 10 years ago, 100Gbps about 2 years ago
- Storage: multiplied by a factor of 1000, but storage vendors just haven’t kept up – storage on white boxes?
Pure came in and brought balance to this picture, allowing storage to keep up with networking and compute.
It’s all about
- Business model
- Customer experience
“Data is the most important asset that you have”
Pure Storage became a billion-dollar company in revenue last year (8 years in, 5 years after it introduced its first product). It’s cashflow positive and “growing like a bat out of hell” with over 4800 customers. It had fewer than 500 customers just 4 years ago. And a large chunk of its customers are cloud providers. It’s also in 30% of the Fortune 500.
The software economy sits on top of the compute / network / storage stool. Companies are becoming more digital. Last year they talked about Domino’s, and this year they’re using AI to analyse your phone calls. Your calls are being answered by an AI engine that takes your order. An investment bank has more computer engineers and developers than they have investment bankers. Companies need to feed these apps with data. Data is where the money is.
- Monolithic scale-up – client / server (1990s)
- Virtualised – x86 + virtualisation (2000s)
- Scale-out – cloud (2010s)
Previously it was big compute and apps were rigid; now it’s big data – apps are fluid and data is shared
- 100% shared
- Built for rapid innovation
Dedicated storage and stateless compute
- Large booking, travel, e-commerce site
- PAIGE.AI – cancer pathology – digitised samples from the last decade
- Man AHL – economy and stock market modelling
Behind all these companies is a “data hero”
Over 80% of CxOs believe that the speed of analysing data will be one of their biggest competitive issues, while CIOs worry about being able to keep up with the data coming into the enterprise.
“We empower innovators to build a better world with data”
- Modern data pipeline
- A whole new model
- Pure “on-demand”
- The AI Era
“New meets Now”
It takes great people to make a great company – the amazing “Puritans”. Pure have an NPS score of 83.7 – best in B2B.
Matt Kixmoeller takes the stage. We need a new architecture to unlock the value of data. Back in 2009, Michael Jackson died, Obama was in, and Fusion-io had just started. Pure came along and had the idea of building an AFA. Today we’re going to bring you the sequel.
- There’s basically SAN / NAS and DAS (which has seen a resurgence in web scale era)
- DAS reality – many apps, rigid scaling, either too much storage or too much compute
New technologies to re-wire the DC
- Diverse, fast compute (CPU, GPU, FPGA)
- Fast networks and protocols (RoCE, NVMe-oF)
- Diverse SSD
- Eliminates the outside the box penalty
- Gets CPUs totally focussed on applications
What if we can finally unite SAN and DAS into a data-centric architecture?
Gartner have identified “Shared accelerated storage”. “The NVMe-oF protocol … will help balance the performance and simplicity of direct-attached storage (DAS) with the scalability and manageability of shared storage”.
“Tier 0”? – they’re making the same mistake again. Pure are focused on shared accelerated storage available for all.
Tomorrow can look like this
- Diskless, stateless, elastic compute (containers, VMs, bare metal)
- Shared accelerated storage (block, file, object)
- Fast, converged networks
- Open, full-stack orchestration
[A speaker from ServiceNow takes the stage]
- Dealing with high volumes of data
- Tremendous growth in net new data
- 18 months ago, doing basic web scale, DAS architecture
- Filling up DCs at a very fast clip
- Stopped and analysed everything there was
What happens in an hour in the DC?
In one hour our customers:
- 7.5 million performance analytics scores computed
- 730,000 configuration items added
- 274,000 notifications sent
- 76,000 assets added
- 49,200 live feed messages
- 36,300 change requests
- 15,600 workflow activities
Every hour of the day our engineering teams:
- Develop code across the globe in 9 development locations (SD, SC, SF, Kirkland, London, Amsterdam, Tel Aviv, Hyderabad, Bangalore)
- Use 450 copies of ServiceNow for quality engineering testing
- Run 100,000 automated quality engineering tests
In one hour on our infrastructure
- 25 billion database queries
- 112 million HTTP requests
- 2.5 million emails
- 25.3 million API calls
- 493TB of backups
We were going through
- 30K hard drives
- 3500+ servers
- >2000 failed HDDs per year
CPU time was being consumed with backup data movement and restore times were becoming longer and longer. They started to look at the FlashBlade. With its small footprint and low power it was a really interesting option for them. It was really easy to set up and use. They let the engineers out of their cages to play with it in the lab and found it was surprisingly hard to break. So they’ve decided to start using FlashBlade in production as their standard for protection data.
Achieving 3x density now
Each rack has:
- 30 1RU servers
- 1000 compute cores
- 1.5PB effective Flash
Decided to test and implement FlashArray as well and they’re excited about FlashArray//X. ServiceNow cares about uptime. Pure has the best non-disruptive upgrade, expansion and repair model. DAS can prove to be expensive at scale.
Kix takes the stage again
- 2016: FlashBlade – the world’s first AFA for big data
- 2017: FlashArray//X
Introducing the FlashArray//X Family
Bill Cerreta takes the stage.
- The FlashArray was launched in 2012, Purity was built to optimise Flash
- //M chassis designed for NVMe
- Deep integration of software and hardware
Where are we going with Flash?
SCM, QLC. We’ve eliminated translation layers. The //X90, for example, has
- Dual-protocol controllers – speak to both SAS SSDs and NVMe SSDs
- //X10 through //X90 have 25GbE onboard
- Everything’s NVMe-oF ready, and this will be enabled via software later in the year
- Double the write bandwidth of //M
- This year, they’re all in on //X
- 7 generations of evergreen, non-disruptive upgrades [photo]
- //X makes everything faster (compared to //M)
Priced for mainstream adoption
- Early attempts at NVMe cost 10x more than AFAs
- //X, when introduced last year, was 25% more than //M
- $0 premium for //X over //M on an effective capacity basis
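As a hedged sketch of what a “$0 premium on an effective capacity basis” means, here’s the arithmetic with purely hypothetical prices, capacities, and a hypothetical 5:1 data-reduction ratio (none of these figures are Pure’s):

```python
# $/effective-GB comparison. All prices, capacities, and the 5:1
# data-reduction ratio below are hypothetical, for illustration only.

def price_per_effective_gb(price_usd, raw_tb, reduction_ratio):
    """Cost per effective GB: effective capacity = raw capacity x reduction ratio."""
    effective_gb = raw_tb * 1000 * reduction_ratio
    return price_usd / effective_gb

# A hypothetical //M-class array vs a //X-class array that costs 10% more
# but also yields 10% more effective capacity.
m = price_per_effective_gb(price_usd=500_000, raw_tb=90, reduction_ratio=5.0)
x = price_per_effective_gb(price_usd=550_000, raw_tb=99, reduction_ratio=5.0)

# Same $/effective-GB, i.e. a "$0 premium" on an effective capacity basis.
assert abs(m - x) < 1e-12
```

The point of the metric is that a higher sticker price can still be a zero premium once data reduction and capacity are factored in.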
[Customer video – Berrios]
Jason Nadeau takes the stage. Most infrastructure wasn’t built to allow data to flow freely.
- 10s of products
- Complex design
- Silos, difficult to share
- Consolidated and simplified
- On-demand and self-driving
- Ready for tomorrow
API-first model and software at the heart of the architecture.
Sandeep Singh takes the stage. A lot of companies have managed to virtualise. A lot have managed to “flash-ify”. But a lot of them have yet to automate and “service-ize”, to “container-ize”, or to adopt multi-cloud.
Automate and service-ize – on every cloud platform
- VMware SDDC – VMware SDDC validated design
- Open automation – pre-built open full-stack automation toolkits
- Openshift PaaS – container-based reference architecture
Simon Dodsley takes the stage to talk with Sandeep about MongoDB deployments in less than a minute (down from 5 days).
Sandeep continues. Container adoption is increasing quickly but there’s a lack of storage support for persistent containers. Pure have container plug-ins for Docker, Kubernetes. Containerized apps want to consume storage as-a-service. Introducing Pure Service Orchestrator.
Introduced ActiveCluster last year. Snapshots and snapshot mobility (portable snapshots introduced last year) are important.
- Snap to NFS is generally available now
- CloudSnap to AWS S3 (available in late 2018)
- DeltaSnap open API (Veeam, Catalogic, Actifio, Commvault, Rubrik, Cohesity)
Jason Nadeau comes back on stage. Data as-a-service consumption. Leases aren’t pay per use and aren’t a service-like experience
Introducing Evergreen Storage Service (ES2)
- Pay per used GB
- True opex
- Terms as short as 12 months
- Always evergreen
- Onboard in days
- Always “better-than-cloud” economics
Capex with Evergreen storage, Opex with ES2
[Video on PAIGE.AI]
Matt Burr takes the stage. Unlocking the value of what was once cold data. New era demands a new data mindset.
- How has the value of data changed?
- How can you extract that value?
- How can you get started today?
A robot will replace a human surgeon. A machine has learned to adapt faster than the human brain can. More and more data will live in the hotter tier. What tools can make this valuable? Change in the piggy bank – like data. But data is stuck in silos.
- Data warehouse
- Data lake
- Modern data pipeline
- AI data pipeline
$/GB used to make sense. We need new metrics. $/FLOPS? $/simulation? Real value is generated by simplifying and accelerating the data flow. Build a data hub on FlashBlade. FlashBlade is 16 months old (GA in January 2017).
Invites NVIDIA’s Rob Ober on stage
“The time has come for GPU computing”
- Moore’s Law is flattening an awful lot
- NVIDIA as “the AI computing platform”
- “The more you buy, the more you save”
Traditional hyperscale cluster – 300 dual-CPU servers, 180kW of power – or you can deploy 1 DGX-2 at 10kW.
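The consolidation claim is straightforward arithmetic on the numbers quoted on stage (server count and power figures from the slide; the per-server wattage is derived):

```python
# Power arithmetic for the comparison quoted on stage: a traditional
# hyperscale cluster of 300 dual-CPU servers at 180kW total, vs a
# single DGX-2 drawing 10kW.
cluster_servers = 300
cluster_power_kw = 180
dgx2_power_kw = 10

watts_per_server = cluster_power_kw * 1000 / cluster_servers  # 600.0 W each
power_reduction = cluster_power_kw / dgx2_power_kw            # 18.0x

print(f"{watts_per_server:.0f} W per server, {power_reduction:.0f}x less power")
# → 600 W per server, 18x less power
```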
Science fiction is being made possible
- Ultrasound retrofit
- 5G beam
- Molecule modelling 1/10 millionth $
- Design guesswork
- Deployment complexity
- Multiple points of support
AI scaling is hard, “not like your traditional infrastructure”
- Jointly-validated solution
- Faster, simplified deployment
- Trusted expertise and support
Kix takes the stage again. There’s a big gap in AI infrastructure, with customers spread across varying stages of journey from single server -> scale-out infrastructure. Introduces AIRI Mini and they’re also extending AIRI to Cisco.
Data Warehouse pitfalls
- Performance not keeping up with data
- Pricing extortions and over-provisioning
- Inflexible appliances built for a single workload
Progress has to have a foundation.
Customer example of telco in Asia moving from Exadata to FlashBlade
Introducing FlashStack for Oracle Data Warehouse
Set your data free
Dave Hatfield takes the stage. Thanks for coming. Over 5000 people in the Bill Graham Civic Auditorium and a lot watching online. Customers, partners. Be sure to check out the “petting zoo” (Solutions Pavilion). We wanted to have something that was “not your father’s storage show. Your father’s storage show happened last month”. Anyone been to a Grateful Dead show? It’s a community experience, you don’t know what will happen next.
And that’s a wrap.