Welcome to EverybodyWiki ! Sign in or create an account to improve, watchlist or create an article, a company page or a bio (yours ?)...
Follow us on https://twitter.com/EverybodyWiki !
Cuckoo filter
The following Wikipedia contributors may be personally or professionally connected to the subject of the article. Relevant policies and guidelines may include conflict of interest, autobiography, and neutral point of view. |
- (non-AfC-reviewer comment) If possible, it would be great to get a layman explanation in the introduction, the guideline Wikipedia:Make technical articles understandable might be of help. Would also be useful to mention what the use cases are. – Þjarkur (talk) 00:52, 17 March 2019 (UTC)
A cuckoo filter is a space-efficient probabilistic data structure to serve approximated set-membership queries like a Bloom filter. A cuckoo filter can also delete existing items, which is not supported by Bloom filters. In addition, for applications that store many items and target moderately low false positive rates, cuckoo filters can achieve lower space overhead than space-optimized Bloom filters.^{[1]}
Cuckoo filters were first described in 2014.^{[2]}
Contents
Algorithm description[edit]
A cuckoo filter uses a 4-way set-associative hash table based on cuckoo hashing to store the fingerprints of all items. Particularly, the two potential buckets in the table for a given item <math>x</math> required by cuckoo hashing are calculated by the following two hash functions (termed as partial-key cuckoo hashing^{[2]}):
- <math>h_1\left(x\right)=\text{hash}\left(x\right)</math>
- <math>h_2\left(x\right)=h_1\left(x\right)\mod\text{hash}\left(\text{fingerprint}\left(x\right)\right)</math>
Applying the above two hash functions to construct a cuckoo hash table enables item relocation based only on fingerprints when retrieving the original item is impossible. As a result, when inserting a new item that requires to relocate an existing item <math>y</math>, the other possible location <math>j</math> in the table for this item <math>y</math> kicked out from bucket <math>i</math> is calculated by
- <math>j = i\mod\text{hash}\left(\text{y's fingerprint}\right)</math>
Based on partial-key cuckoo hashing, the hash table can achieve both highly-utilization (thanks to cuckoo hashing), and compact because only fingerprints are stored. Lookup and delete operations of cuckoo filter are straightforward. There are a maximum of two locations to check by <math>h_1(x)</math> and <math>h_2(x)</math>. If found, the appropriate lookup or delete operation can be performed in <math>O(1)</math> time. More theoretical analysis of cuckoo filter can be found in related literatures.^{[3]}^{[4]}
Comparison to Bloom filters[edit]
A cuckoo filter is similar to a Bloom filter in way that they both are very fast and compact, and they may both return false positives answers to set-membership queries:
- A space-optimal Bloom filters use <math>1.44\log_2(1/\epsilon)</math> bits of space per inserted key, where <math>\epsilon</math> is the false positive rate. A cuckoo filter requires <math>(\log_2(1/\epsilon) + 2)/\alpha</math> where <math>\alpha</math> is the hash table load factor which can be <math>95.5\%</math> based on cuckoo filters' setting. Note that, the information theoretical lower bound requires <math>\log_2(1/\epsilon)</math> bits for each item.
- On a positive lookup, a space-optimal Bloom filter requires a constant <math>\log_2(1/\epsilon)</math> memory accesses into the bit array, whereas a cuckoo filter require constant two lookups at most.
- Cuckoo filters have degraded insertion speed after reaching a load threshold when table expanding is recommended. In contrast, Bloom filters can keep inserting new items at the cost of higher false positive rate before expansion.
Limitations[edit]
- A cuckoo filter can only delete items that are known to be inserted before.
- Insert can fail and rehashing is required like other cuckoo hash table. Note that, the amortized insertion complexity is still <math>O(1)</math>^{[5]}
References[edit]
Others articles of the Topic Computer programming : Ada Lovelace, Lua (programming language), Peachpie (compiler), Koseven (framework)
Some use of "" in your query was not closed by a matching "".Some use of "" in your query was not closed by a matching "".
- ↑ Michael D. Mitzenmacher. "Bloom Filters, Cuckoo Hashing, Cuckoo Filters, Adaptive Cuckoo Filters, and Learned Bloom Filters".
- ↑ ^{2.0} ^{2.1} Fan, Bin; Andersen, Dave G.; Kaminsky, Michael; Mitzenmacher, Michael D. (2014). Cuckoo filter: Practically better than Bloom. Proc. 10th ACM Int. Conf. Emerging Networking Experiments and Technologies (CoNEXT '14). pp. 75–88. doi:10.1145/2674005.2674994. ISBN 9781450332798.
- ↑ David Eppstein (22 June 2016). Cuckoo filter: Simplification and analysis. In 15th Scandinavian Symposium and Workshops on Algorithm Theory, SWAT 2016. Reykjavik, Iceland. pp. 8:1–8:12.
- ↑ Noah Fleming. "Cuckoo Hashing and Cuckoo Filters" (PDF).
- ↑ Pagh, Rasmus; Rodler, Flemming Friche (2001). "Cuckoo Hashing". Algorithms — ESA 2001. Lecture Notes in Computer Science. 2161. pp. 121–133. doi:10.1007/3-540-44676-1_10. ISBN 978-3-540-42493-2.
External links[edit]
This article "Cuckoo filter" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:Cuckoo filter. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.