Thursday, October 29 • 11:50am - 12:30pm
Hadoop on OpenStack: Scaling Hadoop-SwiftFS for Big Data

Elastically scalable big data clusters that respond to varying workload demands while efficiently using and sharing cloud resources are attainable with Hadoop on OpenStack. Achieving this requires separating cluster compute from cluster storage so that compute can scale independently of data. In this session we discuss how OpenStack Swift can serve as the basis for an elastically scalable Hadoop cluster on OpenStack and detail the challenges of using Swift as the primary data store for big data. We describe the cluster storage design and the enhancements to the Hadoop Swift file system implementation that are necessary to achieve performance at big data scale.

Successful approaches to a number of the challenges are presented:

  • Storage architecture design addressing object, block, and transient storage

  • Hadoop SwiftFS enhancements to handle tens of thousands to millions of objects

  • Vendor-specific support for Swift API implementations (Ceph)

  • Tool ecosystem interoperability
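
As background on why object counts matter: the Swift API returns container listings in pages (up to 10,000 names per request by default) and continues from a `marker`, so enumerating millions of objects takes many sequential requests. The sketch below illustrates that paging loop only; it is not the speakers' implementation. The names `list_all_objects`, `fetch_page`, and `make_fake_container` are hypothetical, and an in-memory stand-in replaces the real HTTP call.

```python
from bisect import bisect_right
from typing import Callable, Iterator, List, Optional

# Hypothetical signature: fetch_page(marker, limit) returns up to `limit`
# object names that sort after `marker`, mirroring Swift's `marker` and
# `limit` query parameters on a container listing.
FetchPage = Callable[[Optional[str], int], List[str]]


def list_all_objects(fetch_page: FetchPage, limit: int = 10000) -> Iterator[str]:
    """Yield every object name in a container, one page at a time."""
    marker: Optional[str] = None
    while True:
        page = fetch_page(marker, limit)
        if not page:
            return  # an empty page means the listing is exhausted
        yield from page
        marker = page[-1]  # the next page starts after the last name seen


def make_fake_container(names: List[str]) -> FetchPage:
    """In-memory stand-in for a container listing (illustration only)."""
    ordered = sorted(names)

    def fetch_page(marker: Optional[str], limit: int) -> List[str]:
        start = 0 if marker is None else bisect_right(ordered, marker)
        return ordered[start:start + limit]

    return fetch_page


if __name__ == "__main__":
    container = make_fake_container([f"part-{i:05d}" for i in range(25)])
    names = list(list_all_objects(container, limit=10))
    print(len(names), names[0], names[-1])
```

With a page size of 10 and 25 objects, the loop issues four listing requests (10, 10, 5, and one empty page), which shows how the request count grows with the number of objects.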


Speakers
Andrew Leamon

Director, Engineering Analysis, Comcast
Drew Leamon started his career at Microsoft while studying Computer Science at Princeton University. In his studies, he delved into Computer Graphics, Artificial Intelligence and Computational Neurobiology. At Microsoft, he collaborated with Microsoft Research on one of the first...
Chris Power

Principal Engineer, Comcast
At Comcast I work on many things cloud native including cloud architecture, operational visibility and data platforms.


Thursday October 29, 2015 11:50am - 12:30pm JST
Matsuba
