Hokusai | Sketching Streams in Real Time

Sergiy Matusevych
Alex Smola
Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI) (2012)

Abstract

We describe 北斎 Hokusai, a real-time system that captures frequency information for streams of arbitrary sequences of symbols. The algorithm is built on the Count-Min sketch and exploits the fact that sketching is linear. It provides real-time statistics of arbitrary events, e.g. streams of queries as a function of time. We use a factorizing approximation to provide point estimates at arbitrary (time, item) combinations.
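
To make the linearity property concrete, the following is a minimal Count-Min sketch in Python. It is an illustrative sketch, not the authors' implementation: the class name `CountMinSketch`, the `width` and `depth` defaults, and the salted BLAKE2 hashing are all assumptions chosen for the example. It shows that adding two sketches cell by cell yields the same table as sketching the combined stream, which is the property that allows sketches for separate time intervals to be aggregated.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch (hypothetical helper, not from the paper)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def query(self, item):
        # The minimum over rows never underestimates the true count.
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )

    def merge(self, other):
        # Linearity: the sketch of a combined stream is the cell-wise sum.
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must have identical dimensions")
        merged = CountMinSketch(self.width, self.depth)
        merged.table = [
            [a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(self.table, other.table)
        ]
        return merged


# Two hypothetical time intervals of a query stream.
interval_1 = ["cats", "dogs", "cats"]
interval_2 = ["cats", "birds"]

sketch_1 = CountMinSketch()
for q in interval_1:
    sketch_1.add(q)

sketch_2 = CountMinSketch()
for q in interval_2:
    sketch_2.add(q)

# Aggregate the two intervals without revisiting the raw stream.
merged = sketch_1.merge(sketch_2)

# Sketch the concatenated stream directly, for comparison.
direct = CountMinSketch()
for q in interval_1 + interval_2:
    direct.add(q)
```

Here `merged.table` and `direct.table` are identical, and `merged.query("cats")` is at least 3, the true count across both intervals. Estimates can exceed the true count because of hash collisions, but they never fall below it.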