High-Performance Transactional Queues on HBase
Andreas Neumann
Software Engineer


Wednesday, August 21, 2013
02:15 PM - 03:00 PM

Level:  Technical - Intermediate

In the Continuuity AppFabric, a flow processes events in real time with an exactly-once guarantee, passing interstitial data downstream via queues. All operations involved in processing a data object, from dequeuing it, through the data operations performed during processing, to the emission of new data objects, are performed with ACID properties: atomically, consistently, in isolation, and durably.

Queues in a flow are demanding: they must support rollback of queue operations for transactions, reading of a queue by multiple partitioned and distributed consumers, retention of data objects after dequeue, and deletion once all consumers have successfully processed them. At the same time, they must be durably persistent.

We have developed the Continuuity Data Fabric as a unified, transactional queuing and columnar table system that supports all of the above requirements. This talk will discuss its implementation on top of HBase, evaluate its performance, scalability, and reliability, and share experiences, best practices, and lessons learned.

Andreas develops big data software at Continuuity, and previously did so at places known for massive scale. He was the chief architect for Hadoop at Yahoo! and also for the foundational content management system that Yahoo! built on Hadoop. Before that, he was a research engineer at Yahoo! and a search architect at IBM. Andreas holds a doctoral degree in computer science for his work on querying XML documents.
