Friday, November 18, 2011

NetApp Case Study in Big Data

Two speakers: an Enterprise Architect from NetApp and a Principal Architect from Think Big Analytics

Big Data is not really a problem of particular interest for CAE; however, the approach taken to tackle NetApp's Big Data problem might be. They use a lot of Open Source systems to address their issues.

Presentation slides: not published

Agenda
  • NetApp big data challenge
  • Technology assessment
  • Drivers
AutoSupport Family
  • catch issues before they become critical
  • secure automated call-home service
  • system monitoring and non-intrusive alerting
  • RMA request without customer action
  • enables faster incident management
AutoSupport - Why does it matter?
see slide


Business challenges

Gateways
  • Scale out and respect their SLAs
  • 600K ASUPs per week
  • 40% coming over the weekend
  • 5% growth per week

ETL
  • data parsed and loaded in 15 min.
Data Warehouse
  • only 5% goes into the data warehouse, but even then it is growing 6-8 TB per month
  • Oracle DBMS struggling
  • no easy access to this unstructured data
Reporting
  • numerous mining requests not currently met
Incoming AutoSupport Volumes and TB consumption
  • Exponential flat file storage requirement

The Petri Dish
  • NetApp has its own BigData challenges with AutoSupport Data
  • Executive guidance was to choose the solution that solved our business problem first, independent of our own technologies
  • NetApp product teams very interested in our decision-making process (the in-house customer petri dish)
ASUP Next - Proof of concept (POC) strategy
  • Don't swallow the elephant whole
  • Use technical POCs to validate solution components
  • Review the POC results and determine the end-state solution
  • Size the implementation costs using the results of the POCs
  • Make vendor decisions based on facts!
Requirements Used for POC and RFP
  • Cost effective
  • Highly scalable
  • Adaptive
  • New Analytical capabilities
Technology Assessment
Essentially, they took the following steps to validate their solution:
  1. Defined POC use cases
  2. Did a solution comparison based on their technology survey
  3. Prototyped
  4. Benchmarked
AutoSupport: Hadoop Use case in POC
With a single 10-node Hadoop cluster (on E-Series storage with 60 disks of 2 TB), a 24-billion-record load that previously took 4 weeks now completes in 10.5 hours, and a previously impossible 240-billion-record query now runs in only 18 hours!

This is why they chose Hadoop!

Solution Architecture
multiple slides on architectures, requirements, considerations.

Physical Hadoop Architecture
see slide for all details on disks, networks, machines, RAM, etc.
All their machines run Red Hat Enterprise Linux (RHEL) 5.6, using the ext3 filesystem

Hadoop Architecture components
Flume: Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Hadoop’s HDFS.
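No Flume code was shown in the talk, but to give a feel for the application side of that pipeline, here is a minimal hypothetical sketch that pushes one log record to a Flume agent using the Flume NG Java client SDK (which may be newer than the Flume release NetApp actually deployed); the host name, port, and payload are invented:

  // Hypothetical: send one ASUP log line to a Flume agent over Avro RPC.
  import java.nio.charset.Charset;

  import org.apache.flume.Event;
  import org.apache.flume.EventDeliveryException;
  import org.apache.flume.api.RpcClient;
  import org.apache.flume.api.RpcClientFactory;
  import org.apache.flume.event.EventBuilder;

  public class AsupFlumeClient {
      public static void main(String[] args) throws EventDeliveryException {
          // Connect to a Flume agent's Avro source (host and port are made up)
          RpcClient client = RpcClientFactory.getDefaultInstance("flume-gw-01", 41414);
          try {
              // Wrap one ASUP log line as a Flume event and send it
              Event event = EventBuilder.withBody("asup payload line", Charset.forName("UTF-8"));
              client.append(event);
          } finally {
              client.close();
          }
      }
  }

Typically the receiving agent is configured with an Avro source and an HDFS sink, which is how the events end up in HDFS.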

  • HDFS: the Hadoop Distributed File System, the primary storage system used by Hadoop applications.
  • HBase: the Hadoop database to be used when you need random, realtime read/write access to your Big Data.
  • MapReduce: a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers (a minimal sketch follows this list).
  • Pig: from Apache, it is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.
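None of NetApp's actual jobs were shown. Purely to illustrate what a MapReduce job over ASUP data could look like, here is a minimal hypothetical example that counts records per ASUP type; the tab-separated input layout and the column position are assumptions, not their real schema:

  // Hypothetical: count ASUP records per type, assuming tab-separated
  // input lines with the ASUP type in the first column.
  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class AsupTypeCount {

      // Map: emit (asupType, 1) for every input record
      public static class TypeMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
          private static final LongWritable ONE = new LongWritable(1);
          private final Text type = new Text();

          @Override
          protected void map(LongWritable key, Text value, Context ctx)
                  throws IOException, InterruptedException {
              String[] fields = value.toString().split("\t");
              if (fields.length > 0 && !fields[0].isEmpty()) {
                  type.set(fields[0]); // assumption: type is the first column
                  ctx.write(type, ONE);
              }
          }
      }

      // Reduce: sum the counts for each type
      public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
          @Override
          protected void reduce(Text key, Iterable<LongWritable> counts, Context ctx)
                  throws IOException, InterruptedException {
              long total = 0;
              for (LongWritable c : counts) {
                  total += c.get();
              }
              ctx.write(key, new LongWritable(total));
          }
      }

      public static void main(String[] args) throws Exception {
          Job job = Job.getInstance(new Configuration(), "asup-type-count");
          job.setJarByClass(AsupTypeCount.class);
          job.setMapperClass(TypeMapper.class);
          job.setCombinerClass(SumReducer.class);
          job.setReducerClass(SumReducer.class);
          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(LongWritable.class);
          FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. raw ASUP directory in HDFS
          FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
          System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
  }

The same aggregation would be only a few lines of Pig Latin, which is where Pig fits for ad-hoc analysis.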
Ingestion data flows
3 types of ASUPs (AutoSUPport)
  1. Actionable
  2. Weekly logs
  3. Performance data
10 Flume servers for 200 GB/hour
less than 1 minute for case creation
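The case-creation path was not detailed beyond the under-a-minute figure. Purely as an illustration of where HBase's random, realtime writes (listed above) could fit, here is a hypothetical sketch that stores an actionable ASUP in an HBase table using the classic HTable client API; the table name, column family, and row-key scheme are made up:

  // Hypothetical: write an "actionable" ASUP into an HBase table so a
  // downstream case-creation process can read it back with low latency.
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class ActionableAsupWriter {
      public static void main(String[] args) throws Exception {
          Configuration conf = HBaseConfiguration.create();
          HTable table = new HTable(conf, "asup_actionable"); // made-up table name
          try {
              // Row key: system id + timestamp keeps a system's ASUPs together
              Put put = new Put(Bytes.toBytes("system-1234_20111118T1200"));
              put.add(Bytes.toBytes("d"), Bytes.toBytes("severity"), Bytes.toBytes("critical"));
              put.add(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes("...asup body..."));
              table.put(put);
          } finally {
              table.close();
          }
      }
  }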

Performance benchmarks
see slides

DAS vs E-Series performance benchmark
E-Series multi-LUN configuration changes improved read/write I/O by 65%

Hadoop Benefits
  • Scalability - by decoupling computing from the storage
  • Improved Performance - improved response time to meet their SLAs (faster than the Oracle DBMS on unstructured data and faster than traditional server-based storage)
  • Enterprise Robustness - enterprise-grade protection of HDFS metadata
Takeaways
  • NetApp assessed multiple traditional DB technologies to solve its Big Data problem and determined Hadoop was the best fit.
  • HW & SW short- and long-term costs were key in the decision to move to Hadoop on E-Series
