Bloom filters in Ethereum

2 minute read



Bloom filters are compact data structures for probabilistic representation of a set in order to support membership queries (i.e. queries that ask: “Is element X in set Y?”).

An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash functions defined, each of which maps or hashes some set element to one of the m array positions, generating a uniform random distribution. Typically, k is a constant, much smaller than m, which is proportional to the number of elements to be added; the precise choice of k and the constant of proportionality of m are determined by the intended false positive rate of the filter.

To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at all these positions to 1.

To query for an element (test whether it is in the set), feed it to each of the k hash functions to get k array positions. If any of the bits at these positions is 0, the element is definitely not in the set – if it were, then all the bits would have been set to 1 when it was inserted. If all are 1, then either the element is in the set, or the bits have by chance been set to 1 during the insertion of other elements, resulting in a false positive. In a simple Bloom filter, there is no way to distinguish between the two cases, but more advanced techniques can address this problem


  • space efficient


  • can’t store an associated object
  • deletions are not allowed
  • small false positive probability


  • weak password dictionary
  • cache sharing
  • query filtering and routing
  • blockchain (logs)

Ethereum bloom filter function

Here’s excerpt from the yellow paper:


If you didn’t understand anything, no worries, I will try to describe its implementation in the next section.

Elixir implementation

  @spec bloom(integer(), binary()) :: integer()
  def bloom(data, number \\ 0) do
    bits =
      |> sha3_hash            \\ (1)
      |> bit_numbers          \\ (2)

    number |> add_bits(bits)  \\ (3)

The bloom/2 method accepts 2 aguments:

  • data - binary. a new object to add to the bloom filter
  • number - integer. the bloom filter. default value is zero (for new filter)

Bloom filter generation can be divided into three parts:

  • calculate sha3 hash of an input data (1)
  • get three bit indices from this hash (2)
  • set these indices to one in the filter (3)
  @spec bit_numbers(binary()) :: [integer()]
  defp bit_numbers(hash) do
    {result, _} =
      |> Enum.reduce({[], hash}, fn(_, acc) ->
        {bits, <<head::integer-size(16), tail::bitstring>>} = acc
        new_bit = head &&& 2047

        {[new_bit|bits], tail}


The bit_numbers/1 method gets indices from the low order 11-bits of the first three double-bytes of the SHA3 hash.

  @spec add_bits(integer(), [integer()]) :: integer()
  defp add_bits(bloom_number, bits) do
    |> Enum.reduce(bloom_number, fn(bit_number, bloom) ->
      bloom ||| (1 <<< bit_number)

The last thing left to do is to define method to check if filter has an object:

  @spec contains?(integer(), binary()) :: boolean()
  def contains?(current_bloom, val)
      when is_integer(current_bloom) and
           is_binary(val) do
    temp_bloom = bloom(val)

    (temp_bloom &&& current_bloom) == temp_bloom

Now we can use our Bloom filter:

iex(1)> filter = EthBloom.create("rock")

iex(2)> new_filter = EthBloom.add(filter, "blues")

iex(3)> EthBloom.contains?(new_filter, "punk")

iex(4)> EthBloom.contains?(new_filter, "rock")

iex(5)> EthBloom.contains?(new_filter, "blues")

Described code has a GitHub repository -

See also