The count–min sketch is a probabilistic data structure that serves as a frequency table of events in a stream of data. It uses hash functions to map events to frequencies, but unlike a hash table it uses only sub-linear space, at the expense of overcounting some events due to collisions.
Bloom filters represent sets, while CM sketches represent multisets.
http://ift.tt/1v7NBNg
CM sketches store a numerical value associated with each element, for example the number of occurrences of the element in a stream.
To obtain the count of an element, we take the minimum of the k fields that correspond to that element (as given by the k hashes). This makes intuitive sense: of the k values, some have probably also been incremented by other elements (if there were collisions on the hash values), so the smallest value is the least inflated one.
Collisions can only increase a field, so the real count can never be larger than the reported number.
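The update and query steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from any of the linked articles: the class name `CountMinSketch`, the parameters `width` and `k`, and the use of salted BLAKE2b from `hashlib` to stand in for k independent hash functions are all choices made here for the example.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch: k rows of `width` counters, one hash per row."""

    def __init__(self, width=1000, k=4):
        self.width = width  # number of counters per row (assumed parameter)
        self.k = k          # number of rows, i.e. hash functions (assumed parameter)
        self.table = [[0] * width for _ in range(k)]

    def _index(self, item, row):
        # Assumption: salting one hash with the row number stands in for
        # k different hash functions.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment the one field per row that the item hashes to.
        for row in range(self.k):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Take the minimum of the k fields: the least inflated value.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.k)
        )


sketch = CountMinSketch(width=50, k=4)
stream = ["a"] * 5 + ["b"] * 3 + ["c"]
for event in stream:
    sketch.add(event)

print(sketch.estimate("a"))  # at least 5, never less
```

With a table this small, collisions are possible, so the printed value may exceed the true count of 5, but it cannot fall below it. A wider table or more rows makes overcounting less likely at the cost of more space.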
http://ift.tt/1cLZCbr
http://ift.tt/1zMY9B6