E - the type of elements that this filter accepts@Beta public final class CuckooFilter<E> extends Object implements ProbabilisticFilter<E>, Serializable
E that implements the ProbabilisticFilter
interface.
"Cuckoo filters can replace Bloom filters for approximate set membership tests. Cuckoo filters support adding and removing items dynamically while achieving even higher performance than Bloom filters. For applications that store many items and target moderately low false positive rates, cuckoo filters have lower space overhead than space-optimized Bloom filters. Cuckoo filters outperform previous data structures that extend Bloom filters to support deletions substantially in both time and space." - Fan, et. al.
Cuckoo filters offer constant time performance for the basic operations add(Object),
remove(Object), contains(Object) and sizeLong().
This class does not permit null elements.
Cuckoo filters implement the Serializable interface. They also support a more compact
serial representation via the writeTo(OutputStream) and readFrom(InputStream,
Funnel) methods. Both serialized forms will continue to be supported by future versions of this
library. However, serial forms generated by newer versions of the code may not be readable by
older versions of the code (e.g., a serialized cuckoo filter generated today may not be
readable by a binary that was compiled 6 months ago).
ref: Cuckoo Filter: Practically Better Than Bloom Bin Fan, David G. Andersen, Michael Kaminsky†, Michael D. Mitzenmacher‡ Carnegie Mellon University, †Intel Labs, ‡Harvard University
ProbabilisticFilter,
Serialized Form| Modifier and Type | Method and Description |
|---|---|
boolean |
add(E e)
Adds the specified element to this filter.
|
boolean |
addAll(Collection<? extends E> c)
Adds all of the elements in the specified collection to this filter.
|
boolean |
addAll(ProbabilisticFilter<E> f)
Combines
this filter with another compatible filter. |
long |
capacity()
Returns the number of elements this filter can represent at its requested
FPP. |
void |
clear()
Removes all of the elements from this filter.
|
boolean |
contains(E e)
Returns
true if this filter might contain the specified element, false
if this is definitely not the case. |
boolean |
containsAll(Collection<? extends E> c)
Returns
true if this filter might contain all of the elements of the specified
collection. |
boolean |
containsAll(ProbabilisticFilter<E> f)
Returns
true if this filter might contain all elements contained in the
specified filter. |
CuckooFilter<E> |
copy()
Returns a new
CuckooFilter that's a copy of this instance. |
static <T> CuckooFilter<T> |
create(com.google.common.hash.Funnel<? super T> funnel,
long capacity)
Creates a filter with the expected number of insertions and a default expected false positive
probability of 3.2%.
|
static <T> CuckooFilter<T> |
create(com.google.common.hash.Funnel<? super T> funnel,
long capacity,
double fpp)
Creates a filter with the expected number of insertions and expected false positive
probability.
|
double |
currentFpp()
Returns the current false positive probability (
FPP) of this filter. |
boolean |
equals(Object object) |
double |
fpp()
Returns the approximate
FPP limit of this filter. |
int |
hashCode() |
boolean |
isCompatible(ProbabilisticFilter<E> f)
Returns
true if f is compatible with this filter. |
boolean |
isEmpty()
Returns
true if this filter contains no elements. |
static <T> CuckooFilter<T> |
readFrom(InputStream in,
com.google.common.hash.Funnel<T> funnel)
Reads a byte stream, which was written by
writeTo(OutputStream), into a CuckooFilter. |
boolean |
remove(E e)
Removes the specified element from this filter.
|
boolean |
removeAll(Collection<? extends E> c)
Removes from this filter all of its elements that are contained in the specified collection.
|
boolean |
removeAll(ProbabilisticFilter<E> f)
Subtracts the specified filter from
this filter. |
long |
size()
Returns the number of elements contained in this filter (its cardinality).
|
long |
sizeLong()
Returns the number of elements contained in this filter (its cardinality).
|
String |
toString() |
void |
writeTo(OutputStream out)
Writes this cuckoo filter to an output stream, with a custom format (not Java serialization).
|
@CheckReturnValue public CuckooFilter<E> copy()
CuckooFilter that's a copy of this instance. The new instance is equal to
this instance but shares no mutable state.@CheckReturnValue public boolean contains(E e)
true if this filter might contain the specified element, false
if this is definitely not the case.contains in interface ProbabilisticFilter<E>e - element whose containment in this filter is to be testedtrue if this filter might contain the specified element, false
if this is definitely not the case.NullPointerException - if the specified element is null and this filter does not
permit null elementscontainsAll(Collection),
containsAll(ProbabilisticFilter),
add(Object),
remove(Object)public boolean containsAll(Collection<? extends E> c)
true if this filter might contain all of the elements of the specified
collection. More formally, returns true if contains(Object) == true
for all of the elements of the specified collection.containsAll in interface ProbabilisticFilter<E>c - collection containing elements to be checked for containment in this filtertrue if this filter might contain all elements of the specified
collectionNullPointerException - if the specified collection contains one or more null
elements, or if the specified collection is nullcontains(Object),
containsAll(ProbabilisticFilter)public boolean containsAll(ProbabilisticFilter<E> f)
true if this filter might contain all elements contained in the
specified filter. isCompatible(ProbabilisticFilter) must return true for the
given filter.containsAll in interface ProbabilisticFilter<E>f - cuckoo filter containing elements to be checked for probable containment in this
filtertrue if this filter might contain all elements contained in the
specified filter, false if this is definitely not the case.NullPointerException - if the specified filter is nullIllegalArgumentException - if isCompatible(ProbabilisticFilter) == false
given fcontains(Object),
containsAll(Collection)@CheckReturnValue public boolean add(E e)
true if e was successfully
added to the filter, false if this is definitely not the case, as would be the
case when the filter becomes saturated. Saturation may occur even if sizeLong() < capacity(), e.g. if e has already been added 2*b times to the
cuckoo filter, it will have saturated the number of entries per bucket (b) allocated
within the filter and a subsequent invocation will return false. A return value of
true ensures that contains(Object) given e will also return true.add in interface ProbabilisticFilter<E>e - element to be added to this filtertrue if e was successfully added to the filter, false if this
is definitely not the caseNullPointerException - if the specified element is nullcontains(Object),
addAll(Collection),
addAll(ProbabilisticFilter)@CheckReturnValue public boolean addAll(ProbabilisticFilter<E> f)
this filter with another compatible filter. The mutations happen to this instance. Callers must ensure this filter is appropriately sized to avoid
saturating it or running out of space.addAll in interface ProbabilisticFilter<E>f - filter to be combined into this filter - f is not mutatedtrue if the operation was successful, false otherwiseNullPointerException - if the specified filter is nullIllegalArgumentException - if isCompatible(ProbabilisticFilter) ==
falseadd(Object),
addAll(Collection),
contains(Object)public boolean addAll(Collection<? extends E> c)
c may have been added to the filter even when false
is returned. In this case, the caller may remove(Object) the additions by comparing
the filter sizeLong() before and after the invocation, knowing that additions from
c occurred in c's iteration order.addAll in interface ProbabilisticFilter<E>c - collection containing elements to be added to this filtertrue if all elements of the collection were successfully added, false
otherwiseNullPointerException - if the specified collection contains a null element, or if
the specified collection is nulladd(Object),
addAll(ProbabilisticFilter),
contains(Object)public void clear()
clear in interface ProbabilisticFilter<E>sizeLong(),
isEmpty()@CheckReturnValue public boolean remove(E e)
false is returned, this is definitely an indication that
the specified element wasn't contained in the filter prior to invocation. This condition is an
error, and this filter can no longer be relied upon to return correct false responses
from contains(Object), unless isEmpty() is also true.remove in interface ProbabilisticFilter<E>e - element to be removed from this filtertrue if this filter probably contained the specified element, false
otherwiseNullPointerException - if the specified element is null and this filter does not
permit null elementscontains(Object),
removeAll(Collection),
removeAll(ProbabilisticFilter)@CheckReturnValue public boolean removeAll(Collection<? extends E> c)
false is returned, this is definitely an indication that the specified
collection contained elements that were not contained in this filter prior to invocation, and
this filter can no longer be relied upon to return correct false responses from contains(Object), unless isEmpty() is also true.
Some elements of c may have been removed from the filter even when false is
returned. In this case, the caller may add(Object) the additions by comparing the
filter sizeLong() before and after the invocation, knowing that removals from c occurred in c's iteration order.removeAll in interface ProbabilisticFilter<E>c - collection containing elements to be removed from this filtertrue if all of the elements of the specified collection were successfully
removed from the filter, false if any of the elements was not successfully removedNullPointerException - if the specified collection contains one or more null
elements, or if the specified collection is nullcontains(Object),
remove(Object),
removeAll(ProbabilisticFilter)@CheckReturnValue public boolean removeAll(ProbabilisticFilter<E> f)
this filter. The mutations happen to this
instance. Callers must ensure that the specified filter represents elements that are currently
contained in this filter.
If false is returned, this is definitely an indication that the specified filter
contained elements that were not contained in this filter prior to invocation and this filter
can no longer be relied upon to return correct false responses from contains(Object), unless isEmpty() is also true.removeAll in interface ProbabilisticFilter<E>f - filter containing elements to remove from this filter - f is not
mutatedtrue if the operation was successful, false otherwiseNullPointerException - if the specified filter is nullIllegalArgumentException - if isCompatible(ProbabilisticFilter) == false
given fcontains(Object),
remove(Object),
removeAll(Collection)public long sizeLong()
Long.MAX_VALUE elements, returns Long.MAX_VALUE.sizeLong in interface ProbabilisticFilter<E>capacity(),
isEmpty(),
size()public long size()
Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.size in interface ProbabilisticFilter<E>capacity(),
isEmpty(),
sizeLong()public long capacity()
FPP. It's
sometimes possible to add more elements to a cuckoo filter than its capacity since the load
factor used to calculate its optimal storage size is less than 100%.capacity in interface ProbabilisticFilter<E>FPP.fpp(),
currentFpp(),
sizeLong(),
optimalLoadFactor(int)public double fpp()
FPP limit of this filter. This is not a hard limit, however a
cuckoo filter will not exceed its FPP by a significant amount as the filter becomes
saturated.fpp in interface ProbabilisticFilter<E>FPP limit of this filter.currentFpp()public double currentFpp()
FPP) of this filter.currentFpp in interface ProbabilisticFilter<E>contains(Object) will erroneously return true
given an element that has not actually been added to the filter. Unlike a bloom filter, a
cuckoo filter cannot become saturated to the point of significantly degrading its FPP.fpp()public boolean isEmpty()
true if this filter contains no elements.isEmpty in interface ProbabilisticFilter<E>true if this filter contains no elementssizeLong()public boolean isCompatible(ProbabilisticFilter<E> f)
true if f is compatible with this filter. f is
considered compatible if this filter can use it in combinatoric operations (e.g. addAll(ProbabilisticFilter)).isCompatible in interface ProbabilisticFilter<E>f - The filter to check for compatibility.true if f is compatible with this filter.ProbabilisticFilter.addAll(ProbabilisticFilter),
ProbabilisticFilter.containsAll(ProbabilisticFilter),
ProbabilisticFilter.removeAll(ProbabilisticFilter)@CheckReturnValue public static <T> CuckooFilter<T> create(com.google.common.hash.Funnel<? super T> funnel, long capacity, double fpp)
Note that overflowing a CuckooFilter with significantly more
objects than specified, will result in its saturation causing add(Object) to reject
new additions.
The constructed CuckooFilter will be serializable if the
provided Funnel<T> is.
It is recommended that the funnel be implemented as a
Java enum. This has the benefit of ensuring proper serialization and deserialization, which is
important since equals(java.lang.Object) also relies on object identity of funnels.
funnel - the funnel of T's that the constructed CuckooFilter will usecapacity - the number of expected insertions to the constructed CuckooFilter; must
be positivefpp - the desired false positive probability (must be positive and less than 1.0).CuckooFilter@CheckReturnValue public static <T> CuckooFilter<T> create(com.google.common.hash.Funnel<? super T> funnel, long capacity)
Note that overflowing a CuckooFilter with significantly
more objects than specified, will result in its saturation causing add(Object) to
reject new additions.
The constructed CuckooFilter will be serializable if the
provided Funnel<T> is.
It is recommended that the funnel be implemented as a
Java enum. This has the benefit of ensuring proper serialization and deserialization, which is
important since equals(java.lang.Object) also relies on object identity of funnels.
funnel - the funnel of T's that the constructed CuckooFilter will usecapacity - the number of expected insertions to the constructed CuckooFilter; must
be positiveCuckooFilterpublic void writeTo(OutputStream out) throws IOException
readFrom(InputStream, Funnel) to reconstruct the written CuckooFilter.IOException@CheckReturnValue public static <T> CuckooFilter<T> readFrom(InputStream in, com.google.common.hash.Funnel<T> funnel) throws IOException
writeTo(OutputStream), into a CuckooFilter. The Funnel to be used is not encoded in the stream, so it must be
provided here. Warning: the funnel provided must behave identically to the one
used to populate the original Cuckoo filter!IOException - if the InputStream throws an IOException, or if its data does not
appear to be a CuckooFilter serialized using the writeTo(OutputStream) method.Copyright © 2016. All rights reserved.