Real Time Analytics with NoSQL and Combinators
  Greg Lindahl


Thursday, August 22, 2013
11:15 AM - 11:45 AM

Level:  Technical - Intermediate

blekko's home-grown NoSQL database has a compute-in-the-database feature named "Combinators", which is useful for (batch-oriented) search engine crawling and indexing. Combined with RAM and SSD storage, this feature is also useful for real-time analytics and computation. It powers our abuse detection system, trending queries and news stories, a real-time performance dashboard, and our real-time Map/Reduce system. We will cover:
  • the concept of "combinators", and examples of their use
  • transparent caching of data in RAM/SSD, while minimizing writes to extend MLC flash lifetimes
  • various applications using these features
  • experiences and lessons learned
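
As a rough illustration of the "combinator" idea listed above, the sketch below treats a combinator as a merge function that the store applies when a value is written, so that clients never have to read, modify, and write back a cell themselves. This is an assumption made for illustration only: the abstract does not describe blekko's actual API, and every name here (TinyStore, comb_add, make_topn) is hypothetical.

```python
import heapq


def comb_add(old, new):
    """Hypothetical 'add' combinator: counters merge by summation."""
    return new if old is None else old + new


def make_topn(n):
    """Build a hypothetical top-N combinator over (score, item) pairs."""
    def comb_topn(old, new):
        merged = (old or []) + new
        # Keep only the n highest-scoring pairs, highest first.
        return heapq.nlargest(n, merged)
    return comb_topn


class TinyStore:
    """Hypothetical in-memory stand-in for a key-value store.

    Writes go through combine(), which merges the incoming value into
    the existing cell instead of overwriting it.
    """

    def __init__(self):
        self._cells = {}

    def combine(self, key, value, combinator):
        self._cells[key] = combinator(self._cells.get(key), value)

    def get(self, key):
        return self._cells.get(key)


store = TinyStore()

# Count queries as they arrive (e.g. for a trending-queries view).
for query in ["cats", "dogs", "cats"]:
    store.combine(("count", query), 1, comb_add)

# Maintain a running top-2 list from independent partial updates.
top2 = make_topn(2)
for update in ([(5, "cats")], [(3, "dogs")], [(9, "news")]):
    store.combine("trending", update, top2)

print(store.get(("count", "cats")))
print(store.get("trending"))
```

Both merge functions in this sketch are associative and commutative, so updates can be applied in any order and still produce the same cell value. That property is what would make such operations suitable for a distributed store, though whether blekko's combinators work exactly this way is not stated in the abstract.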

Greg Lindahl is CTO and Founder at blekko, where he works on datacenter operations, blekko's home-grown NoSQL datastore, and diverse things. Formerly, Greg was Founder and Distinguished Engineer at PathScale, where he was the architect of the InfiniPath low-latency InfiniBand HCA, used to build tightly-coupled supercomputing clusters. PathScale was successfully sold to QLogic in 2006, and the InfiniPath technology is now being used by Intel for their exascale computing efforts. Prior to PathScale's founding in 2001, he worked on commodity Linux clusters at HPTi, including the 1999 Forecast Systems Lab system, which was the first time a Linux cluster won a conventional supercomputing procurement. Prior to this, he worked on the Legion "grid" distributed OS project at the University of Virginia, and for D. E. Shaw & Co., a New York investment bank. Greg holds an MA in Astronomy from the University of Virginia, and a BA in Math and Physics from Brandeis University.