Bitsets

Track a lotta things is a smaller space

Published at

Table of Contents

Bitsets are cool, little data structures that track membership of a set by doing bit flipping. I also think they're a great illustration of how an alternate implementation of a structure can improve memory and/or process usage far more than continuing to fine tune an existing implementation.

Set interface

Let's start with defining our set interface. Sets only track membership so we need methods for inserting, removing and testing membership. We'll also want some set operations which combine two sets in several ways, I picked intersection, difference and union. Finally a method to check if two sets are equal in terms of storing exactly the same keys.

There are more operations that test two sets such as subset and strict subset testing but these are the only ones we'll be implementing. I've also decided to some what arbitrarily restrict our inputs to uint7, but more on this later.

1
2
3
4
5
6
7
8
9
type Set interface {
    Set(uint)
    Unset(uint)
    IsSet(uint)
    Intersect(Set) Set
    Difference(Set) Set
    Union(Set) Set
    Eq(Set) bool
}

We'll ignore that many types could implement this interface and we'll always cast an incoming Set to our current type.

Hashset

Go doesn't have a hashset type, but it does expose map which is backed by a hash table8. We'll use this as a convenience over implementing our hash table which isn't what I'm interested in today. The type itself just houses a map and the first few operations are similar thin repackaging of map behavior.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// since we need a value we'll use an empty struct
// however any singleton value is fine
var sentinel struct{} = struct{}

type HashSet struct {
    members map[uint]struct{}
}

func(h HashSet) Set(i uint) {
    h.members[i] = sentinel
}

func(h HashSet) Unset(i uint) {
    delete(h.members, i)
}

func(h HashSet) IsSet(i uint) bool {
    _, ok := h.members[i]
    return ok
}

func(h Hashset) Eq(s Set) bool {
    s, ok := s.(Hashset)
    if !ok {
        return false 
    }
    return reflect.DeepEqual(h, s)
}

Interacting with maps is easy so these methods are easy to implement around one. We don't need to use reflect.DeepEqual in our Eq(Set) bool other than continuing the brevity, we could compare the maps' lengths and if they're the same iterate through the keys of one and ensure they're all present in the other.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
func (h Hashset) Intersect(s Set) Set {
    r := make(map[uint]struct{})

    for k, i := range h.members {
        if s.IsSet(k) {
            r[k] = i
        }
    }

    return Hashset{r}
}

func(h Hashset) Difference(s Set) Set {
    r := make(map[uint]struct{})

    for k, i := range h.members {
        if !s.IsSet(k) {
            r[k] = i
        }
    }

    return Hashset{r}
}

func(h Hashset) Union(s Set) Set {
    r := make(map[uint]struct{})

    for k, i := range h.members {
        r[k] = i
    }

    for k, i := range s.members {
        r[k] = i
    }

    return Hashset{r}
}

I don't find these methods particularly interesting either, we loop through our current set and construct a new set by checking if a given key is present or not in the other set. For union we just dump everything into the new set. map's easy to use interface is still putting in work here, especially getting iteration that hands up both the key and value. We could write r[k] = sentinel but r[k] = i is shorter.

Interesting or not, we have a working set that fulfills our interface. Let's run a few benchmarks to get a feel for its performance.

Hash Benchmarks

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
go test -bench=.  -benchmem
goos: linux
goarch: amd64
pkg: sets
cpu: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
BenchmarkHashIntersection/input_size=10-4                1740823               703.2 ns/op           290 B/op          3 allocs/op
BenchmarkHashIntersection/input_size=100-4                 92962             13855 ns/op            6113 B/op         17 allocs/op
BenchmarkHashIntersection/input_size=1000-4                12140             97511 ns/op           46785 B/op         53 allocs/op
BenchmarkHashIntersection/input_size=10000-4                1288            931386 ns/op          382260 B/op        166 allocs/op
BenchmarkHashIntersection/input_size=100000-4                 73          13916640 ns/op         5930163 B/op       2867 allocs/op
BenchmarkHashIntersection/input_size=1000000-4                 4         252172806 ns/op        47847564 B/op      23549 allocs/op
BenchmarkHashIntersection/input_size=10000000-4                1        2628033324 ns/op        389551976 B/op    189851 allocs/op
BenchmarkHashDifference/input_size=10-4                  4119684               281.3 ns/op           128 B/op          2 allocs/op
BenchmarkHashDifference/input_size=100-4                  191556              6062 ns/op            2871 B/op         11 allocs/op
BenchmarkHashDifference/input_size=1000-4                  27051             45186 ns/op           23565 B/op         32 allocs/op
BenchmarkHashDifference/input_size=10000-4                  2668            423950 ns/op          192235 B/op         99 allocs/op
BenchmarkHashDifference/input_size=100000-4                  199           6143656 ns/op         2963950 B/op       1366 allocs/op
BenchmarkHashDifference/input_size=1000000-4                  10         105350987 ns/op        23897523 B/op      11814 allocs/op
BenchmarkHashDifference/input_size=10000000-4                  1        1190046673 ns/op        194338008 B/op     94705 allocs/op
BenchmarkHashUnion/input_size=10-4                       1375946               865.1 ns/op           162 B/op          1 allocs/op
BenchmarkHashUnion/input_size=100-4                        76006             15905 ns/op            5985 B/op         15 allocs/op
BenchmarkHashUnion/input_size=1000-4                        9514            113752 ns/op           46654 B/op         51 allocs/op
BenchmarkHashUnion/input_size=10000-4                       1077           1065363 ns/op          382112 B/op        163 allocs/op
BenchmarkHashUnion/input_size=100000-4                        57          17652993 ns/op         5934321 B/op       2882 allocs/op
BenchmarkHashUnion/input_size=1000000-4                        4         254964793 ns/op        47863844 B/op      23666 allocs/op
BenchmarkHashUnion/input_size=10000000-4                       1        3175715966 ns/op        389455064 B/op    188982 allocs/op
PASS

I'm not a benchmarking expert by any means, I didn't ensure the benchmarks were running in an isolated environment with consistent memory and cpu access. Instead these are running inside my editor, along side several browser tabs including a playing video and I'm pretty sure the plex server on this machine is in active use right now. That's how my code usually runs so I'm fine to include that noise in my benchmarks.

All that said, I'm not at all surprised by these results. For reference, 3031971678ns is roughly 3.03s which for iterating through 10 million items twice and throwing them all into a single map sounds about right to me. To say that the hash set has hot loops I think is an understatement, and branching certainly doesn't help. More accurate benchmarking would likely introduce some fuzziness or even more pathological conditions in the sets, but this is good enough for me.

3.03 seconds to combine twenty million items might as well be infinitely faster than I could do that myself but it does seem kind of slow for what's ultimately a bunch of integer comparisons.

Bitmask

Before getting into a Bitset, it's useful to look at it's little sibling the bit mask. Under the hood, integers are actually just a clump of bits that are interpreted as a binary number: uint64(7) == 0b00..1111. There's differences in how a uint64 and int64 are interpreted, namely with unsigned integers we get the full field and with signed integers we get 1 less bit because the highest bit is reserved for the sign bit, int64(7) == 0b100..00111. This also has implications for how large of an absolute number a signed integer can represent compared to an unsigned integer, that's less interesting for us but useful to know if you don't mess with unsigned and signed integers. Also note I've said "integer" because floats are handled much differently and I won't be covering them here.

The biggest reason we're using uint64 instead of int64 is to avoid negative numbers which are invalid indicies and result in a compiler error when used as shift argument.

We can take advantage of this underlying bit field to encode up to 642 individual flags. While we can't actually index into this field like field[54] = 1 we can effectively do that via shifting. 1 << 54 creates an integer where only the 54th bit is set. From there we can use other operators to combine this mask with our existing field.

If we want to set this bit and only the bit, we bitwise OR the two values together: field | (1 << 54). This operator says "take every value that's present in either bit field" which sets our single bit while leaving the rest of the field alone. Since 1 << 54 is mostly zeros, we'll take the values of the original field for those.

We can check if this bit is set by using a bitwise AND operator: field & (1 << 54) will return either 1 << 54 if the field also has the 54th bit set or 0 if it doesn't because we're taking only the bits present in both the field and the mask.

Finally for us we can unset a single bit by combining a bitwise COMPLEMENT and bitwise AND. First we create our mask 1 << 54 and then we invert the resulting mask ^(1 << 54). This turns our mask with only the 54th bit set to the opposite, every bit except the 54th is set. Finally we AND these to take every bit from the original field except the 54th since it is zero in our mask: field & ^(1 << 54).

These three operations are our set, is set and unset operations. If we use two fields with OR and AND then we also have union and intersect. We can also use the same COMPLEMENT AND combination to represent difference. But what if we want to store more than 64 things, what if we want to store 128 things? Even more than that?

Bitsets

That's where bitsets come into play. By dividing our input value by the number of bits in the field, again 64 in this case, we determine which bucket we need. From there we modulo our input to determine which bit we need to twiddle. Take a moment to convince yourself of this.

With that in mind, let's start looking at the implementation.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
type Bitset64 struct {
    buckets []uint64
}

func bucket64(i uint) int {
    // slice idx are int
    return int(i / 64) 
}

func bit64(i uint) uint64 {
    return i % 64
}

The set type isn't very different from the hash set, just the container type is different and we have two helper functions. I prefer to keep these separate even in actual implementation because it helps me keep which value is which. Our set, unset and isset implementations are slightly more complicated than just finding a bucket and interrogating a single bit.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
func (b *Bitset64) Set(i uint) {
    bucket := bucket64(i)
    if bucket >= len(b.buckets) {
        buckets := make([]uint64, bucket+1)
        copy(buckets, b.buckets)
        b.buckets = buckets
    }

    bit := bit64(i)
    b.buckets[bucket] |= (1 << bit)
}

Set handles finding the bucket and setting the correct bit, but it also handles ensuring the slice is big enough for the largest value. There's lots of strategies to grow a slice, going to just big enough is easy enough to implement without adding too much noise to the code we're actually interested in examining.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
func (b *Bitset64) Unset(i uint) {
    bucket := bucket64(i)
    if bucket >= len(b.buckets) {
        return
    }

    bit := bit64(i)
    b.buckets[bucket] &= ^(1 << bit)
}

func (b *Bitset64) IsSet(i uint) bool {
    bucket := bucket64(i)
    if bucket >= len(b.buckets) {
        return false
    }
    bit := bit64(i)
    mask := 1 << bit
    return mask == (mask & uint(b.buckets[bucket]))
}

Unset and IsSet also take advantage of the number of buckets, if our target is bigger than we can store then we either have nothing to unset or we simply don't have it in our set. Otherwise we use the techniques described above to unset or interrogate the single bit.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
func (b *Bitset64) Intersect(o Set) Set {
    a := o.(*Bitset64)
    lo := min(len(a.buckets), len(b.buckets))

    r := new(Bitset64)
    r.buckets = make([]uint64, lo)

    for i := range r.buckets {
        r.buckets[i] = b.buckets[i] & a.buckets[i]
    }

    return r
}

Since intersection is defined as only the elements common to two sets we can't simply copy one of the existing sets since its could have more buckets than the other. Instead we create a bitset with the least amount of buckets necessary and then since we have at least that many buckets in both sets, we can collect their indicies and use bitwise AND to combine the two sets.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
func (b *Bitset64) Union(o Set) Set {
    a := o.(*Bitset64)
    r := new(Bitset64)

    lo, hi := a, b
    if len(lo.buckets) > len(hi.buckets) {
        lo, hi = hi, lo
    }

    r.buckets = make([]uint64, len(hi.buckets))
    copy(r.buckets, hi.buckets)

    for i := range lo.buckets {
        r.buckets[i] |= lo.buckets[i]
    }

    return r
}

For union we want every value in both sets so we can copy the set with more buckets. Aliasing a and b to low and high makes keeping track of things a little easier than flipping a and b around directly. Once we have the longer buckets copied in, we gather up the values in the second set and OR them into our values which gives us the complete union.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
func(b *Bitset64) Difference(o Set) Set {
    a := o.(*Bitset64)
    r := new(Bitset64)
    r.buckets = make([]uint64, len(b.buckets))
    copy(r.buckets, b.buckets)

    lo := min(len(b.buckets), len(a.buckets))

    for i := range r.buckets[:lo] {
        r.buckets[i] = b.bucket[i] & ^a.bucket[i]
    }

    return r
}

Difference and intersect are almost the same except we want to begin with every value in our current set b rather than least amount of buckets, and our operation is the AND-COMPLEMENT construct rather than just AND.

Finally, to complete the interface we need to implement Eq(Set) bool which is actually non-trivial. We can't just compare the lengths of both underlying slices because the higher buckets could just be 0 -- a high value was set at one point but isn't any longer.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
func (b *Bitset64) Eq(o Set) bool {
    a := o.(*Bitset64)
    hi, lo := b.buckets, a.buckets

    if len(lo) > len(hi) {
        hi, lo = lo, hi
    }

    if len(lo) != len(hi) {
        for _, bucket := range hi[len(lo):] {
            if bucket != 0 {
                return false
            }
        }
    }

    for i := range lo {
        if lo[i] != hi[i] {
            return false
        }
    }

    return true
}

Bitset Benchmarks

Okay, we've done the implementation let's look at the benchmarks. Again, I've taken no particular care to ensure these ran in isolation, etc.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
go test -bench=.  -benchmem
goos: linux
goarch: amd64
pkg: sets
cpu: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
BenchmarkBitIntersection/input_size=10-4                61197063                19.45 ns/op            8 B/op          1 allocs/op
BenchmarkBitIntersection/input_size=100-4               36194941                31.10 ns/op           16 B/op          1 allocs/op
BenchmarkBitIntersection/input_size=1000-4              19380409                61.08 ns/op          128 B/op          1 allocs/op
BenchmarkBitIntersection/input_size=10000-4              2688464               425.0 ns/op          1280 B/op          1 allocs/op
BenchmarkBitIntersection/input_size=100000-4              333483              3641 ns/op           13568 B/op          1 allocs/op
BenchmarkBitIntersection/input_size=1000000-4              32014             37906 ns/op          131072 B/op          1 allocs/op
BenchmarkBitIntersection/input_size=10000000-4              2706            453040 ns/op         1253377 B/op          1 allocs/op
BenchmarkBitDifference/input_size=10-4                  56505009                20.10 ns/op            8 B/op          1 allocs/op
BenchmarkBitDifference/input_size=100-4                 38064043                30.02 ns/op           16 B/op          1 allocs/op
BenchmarkBitDifference/input_size=1000-4                17978188                63.96 ns/op          128 B/op          1 allocs/op
BenchmarkBitDifference/input_size=10000-4                2642118               443.4 ns/op          1280 B/op          1 allocs/op
BenchmarkBitDifference/input_size=100000-4                282554              3850 ns/op           13568 B/op          1 allocs/op
BenchmarkBitDifference/input_size=1000000-4                30898             39630 ns/op          131072 B/op          1 allocs/op
BenchmarkBitDifference/input_size=10000000-4                2480            490636 ns/op         1253378 B/op          1 allocs/op
BenchmarkBitUnion/input_size=10-4                       21759050                55.01 ns/op           32 B/op          2 allocs/op
BenchmarkBitUnion/input_size=100-4                      18084280                66.34 ns/op           40 B/op          2 allocs/op
BenchmarkBitUnion/input_size=1000-4                     11940280               105.0 ns/op           152 B/op          2 allocs/op
BenchmarkBitUnion/input_size=10000-4                     2272080               526.8 ns/op          1304 B/op          2 allocs/op
BenchmarkBitUnion/input_size=100000-4                     307348              3968 ns/op           13592 B/op          2 allocs/op
BenchmarkBitUnion/input_size=1000000-4                     29458             40671 ns/op          131096 B/op          2 allocs/op
BenchmarkBitUnion/input_size=10000000-4                     2488            500384 ns/op         1253400 B/op          2 allocs/op
PASS

These are massive improvements. Even the fastest hash set results were at least an order of magnitude larger than their bitset counterparts. It's probably unfair to compare the 10 million tests since the hash set benchmarks only ran once each, but:

It's possible to squeeze even more performance out of bitsets with techniques like loop unrolling and "single instruction, multiple data" or SIMD which would allow us to compare multiple buckets at the same time9.

The allocations are also far far far less, but why it isn't twice for each benchmark I'm not sure since every benched method creates both a new bitset and a new underlying slice for it.

However, I do want to also say that map[K]V is a general purpose collection and it needs to several general purpose needs. We could make a specialized implementation of a hash table that was well suited for unsigned integer keys. Effectively that's exactly what we've done; our hashing method is just division and modulo rather than another technique.

Bonus

There's a few more interesting methods I think worth exploring with bitsets. That's a Len() method that calculates from the current state of the bitset and iterating through the bitset. If you were as unfamiliar with bit twiddling as I was when I started implementing this change in my ECS you might be you might be thinking both operations grow linearly with each bucket added. Maybe something like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
func(b *Bitset) Len() int {
    var count int

    for _, bucket := range b.buckets {
        for i := 0; i < 64; i++ {
            if (bucket & (1 << i)) == (1 << i) {
                count++
            }
        }
    }

    return count
}

The benefit of this approach is its simple, requires little explanation and we can easily unroll the inner loop to hard code the 64 comparisons if we wanted. Some creativity could avoid the branch as well. Regardless we'll always check all 64 bits of each bucket, which is fine for very dense bitsets you won't be losing much between that and other solutions. However, if we have primarily sparse bitsets then we might want to explore other options. I found myself in the primarily sparse bitsets situation.

First let's look at one that uses an interesting property of binary numbers:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
func(b *Bitset64) Len() int {
    var count int

    for _, bucket := range b.buckets {
        for bucket != 0 {
            bucket &= bucket - 1
            count++
        }
    }

    return count
}

What the heck does n & (n-1) do? You may have encountered it as a way to check if a number if a power of 2 but that's just one application of this characteristic. Subtracting 1 from a binary number causes all the bits following the least significant bit to flip. Consider a bitset of {0, 8, 11}, that is 1st, 9th and 12th bits set, and how that would be processed:3

1
2
3
4
5
6
7
8
n            | n - 1        | n & (n-1)
---------------------------------------
        2305 |         2304 |         2304
100100000001 | 100100000000 | 100100000000
        2304 |         2303 |         2048
100100000000 | 100011111111 | 100000000000
        2048 |         2047 |            0
100000000000 | 011111111111 | 000000000000

Instead of checking against all 64 bits we just counted how iterations it takes to reduce the bucket to 0. We can actually do even better with some complicated bit flipping but the Go standard library already did that for us and put it behind bits.OnesCount644 so our length count can actually just be:

1
2
3
4
5
6
7
8
9
func(b *Bitset) Len() int {
    var count int

    for _, bucket := range b.buckets {
        count += bits.OnesCount64(bucket)
    }

    return count
}

We can use the same naive method to locate every member set, instead of incrementing a count we'd calculate bucketNum * 64 + i to reconstruct the original integer toss it into a slice. Instead, we can use find first set techniques to figure out what individual bits are set in each bucket, in particular bits.TrailingZeros64 which will let us do exactly the same algorithm but with a different i: k * 64 + TrailingZeros64(bucket). A trailing zeros operation starts counting at the first bit and counts every 0 until it reaches the first 1, the least significant bit5.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
func(b *Bitset64) Elems() []uint64 {
    var elems []uint64

    for k, bucket := range b.buckets {
        k := uint64(k)
        for bucket != 0 {
            tz := uint64(bits.TrailingZeros64(bucket))
            elems = append(elems, k*64+tz)
            bucket ^= (1 << tz)
        }
    }

    return elems
}

bucket ^= (1 << tz) looks confusingly like our friend bitwise complement but it is actually XOR, or eXclusive OR. XOR records false or 0 when inputs are the same and true or 1 when the inputs are different.

1
2
3
4
5
6
A | B | R
---------
T | T | F
T | F | T
F | T | T
F | F | F

Let's work through our {0, 8, 11} with this algorithm:

1
2
3
4
5
6
7
8
n            |1 << tz       | n ^ (1 << tz)
-------------------------------------------
             |       1 << 0 |             
100100000010 | 000000000001 | 100100000000
             |       1 << 8 |             
100100000000 | 000100000000 | 100000000000
             |      1 << 11 |             
100000000000 | 100000000000 | 000000000000

At each step the number of trailing zeros represents the original value we stored6 and the XOR with (1 << tz) removes that single bit from the set, leaving us with more zeroes to count.

Full Code Listing

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
package sets

import (
    "reflect"
)

type Set interface {
    Set(int)
    Unset(int)
    IsSet(int) bool
    Intersect(Set) Set
    Difference(Set) Set
    Union(Set) Set
    Eq(Set) bool
}

var sentinel struct{}

type HashSet struct {
    members map[int]struct{}
}

func (h HashSet) Set(i int) {
    h.members[i] = sentinel
}

func (h HashSet) Unset(i int) {
    delete(h.members, i)
}

func (h HashSet) IsSet(i int) bool {
    _, ok := h.members[i]
    return ok
}

func (h HashSet) Eq(s Set) bool {
    s, ok := s.(HashSet)
    if !ok {
        return false
    }
    return reflect.DeepEqual(h, s)
}

func (h HashSet) Intersect(s Set) Set {
    r := make(map[int]struct{})

    for k := range h.members {
        if s.IsSet(k) {
            r[k] = sentinel
        }
    }

    return HashSet{r}
}

func (h HashSet) Difference(s Set) Set {
    r := make(map[int]struct{})

    for k := range h.members {
        if !s.IsSet(k) {
            r[k] = sentinel
        }
    }

    return HashSet{r}
}

func (h HashSet) Union(s Set) Set {
    r := make(map[int]struct{})

    for k := range h.members {
        r[k] = sentinel
    }

    for k := range (s.(HashSet)).members {
        r[k] = sentinel
    }

    return HashSet{r}
}

type Bitset64 struct {
    buckets []uint64
}

func bucket64(i int) int {
    return i / 64
}

func bit64(i int) uint64 {
    return uint64(i) % 64
}

func (b *Bitset64) Eq(o Set) bool {
    a := o.(*Bitset64)
    hi, lo := b.buckets, a.buckets

    if len(lo) > len(hi) {
        hi, lo = lo, hi
    }

    if len(lo) < len(hi) {
        for _, bucket := range hi[len(lo):] {
            if bucket != 0 {
                return false
            }
        }
    }

    for i := range lo {
        if lo[i] != hi[i] {
            return false
        }
    }

    return true
}

func (b *Bitset64) Set(i int) {
    bucket := bucket64(i)
    if bucket >= len(b.buckets) {
        buckets := make([]uint64, bucket+1)
        copy(buckets, b.buckets)
        b.buckets = buckets
    }

    bit := bit64(i)
    b.buckets[bucket] |= (1 << bit)
}

func (b *Bitset64) Unset(i int) {
    bucket := bucket64(i)
    if bucket >= len(b.buckets) {
        return
    }

    bit := bit64(i)
    b.buckets[bucket] &= ^(1 << bit)
}

func (b *Bitset64) IsSet(i int) bool {
    bucket := bucket64(i)
    if bucket >= len(b.buckets) {
        return false
    }
    bit := bit64(i)
    mask := 1 << bit
    return mask == (mask & int(b.buckets[bucket]))
}

func (b *Bitset64) Intersect(o Set) Set {
    a := o.(*Bitset64)
    r := new(Bitset64)
    r.buckets = make([]uint64, len(b.buckets))
    copy(r.buckets, b.buckets)

    lo := min(len(a.buckets), len(b.buckets))

    for i := 0; i < lo; i++ {
        r.buckets[i] = b.buckets[i] & a.buckets[i]
    }

    return r
}

func (b *Bitset64) Difference(o Set) Set {
    a := o.(*Bitset64)
    r := new(Bitset64)
    r.buckets = make([]uint64, len(b.buckets))
    copy(r.buckets, b.buckets)
    lo := min(len(a.buckets), len(b.buckets))

    for i := range r.buckets[:lo] {
        r.buckets[i] = b.buckets[i] & ^a.buckets[i]
    }

    return r
}

func (b *Bitset64) Union(o Set) Set {
    a := o.(*Bitset64)
    r := new(Bitset64)

    lo, hi := a, b
    if len(lo.buckets) > len(hi.buckets) {
        lo, hi = hi, lo
    }

    r.buckets = make([]uint64, len(hi.buckets))
    copy(r.buckets, hi.buckets)

    for i := range lo.buckets {
        r.buckets[i] |= lo.buckets[i]
    }

    return r
}
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
package sets

import (
    "fmt"
    "math/bits"
    "testing"
)

var setSizes = []int{
    10,
    100,
    1000,
    10_000,
    100_000,
    1_000_000,
    10_000_000,
}

func pairOfHashSets() (HashSet, HashSet) {
    var a, b HashSet
    a.members = make(map[int]struct{})
    b.members = make(map[int]struct{})
    return a, b
}

func pairOfBitSets() (*Bitset64, *Bitset64) {
    return new(Bitset64), new(Bitset64)
}

func BenchmarkHashIntersection(bench *testing.B) {
    for _, size := range setSizes {
        size = size
        bench.Run(fmt.Sprintf("input_size=%d", size), func(bb *testing.B) {

            a, b := pairOfHashSets()
            for i := 0; i < size; i++ {
                a.Set(i)
                b.Set(i)
            }

            bb.ResetTimer()

            for i := 0; i < bb.N; i++ {
                c := a.Intersect(b)
                c.Unset(0)
            }
        })
    }
}

func BenchmarkHashDifference(bench *testing.B) {
    for _, size := range setSizes {
        size = size
        bench.Run(fmt.Sprintf("input_size=%d", size), func(bb *testing.B) {

            a, b := pairOfHashSets()
            for i := 0; i < size; i++ {
                if i%2 == 0 {
                    a.Set(i)
                } else {
                    b.Set(i)
                }
            }

            bb.ResetTimer()

            for i := 0; i < bb.N; i++ {
                c := a.Difference(b)
                c.Unset(0)
            }
        })
    }
}

func BenchmarkHashUnion(bench *testing.B) {
    for _, size := range setSizes {
        thisSize := size
        bench.Run(fmt.Sprintf("input_size=%d", thisSize), func(bb *testing.B) {
            a, b := pairOfHashSets()
            for i := 0; i < thisSize; i++ {
                a.Set(i)
                b.Set(i)
            }

            bb.ResetTimer()

            for i := 0; i < bb.N; i++ {
                c := a.Union(b)
                c.Unset(0)
            }
        })
    }
}

func BenchmarkBitIntersection(bench *testing.B) {
    for _, size := range setSizes {
        size = size
        bench.Run(fmt.Sprintf("input_size=%d", size), func(bb *testing.B) {

            a, b := pairOfBitSets()
            for i := 0; i < size; i++ {
                a.Set(i)
                b.Set(i)
            }

            bb.ResetTimer()

            for i := 0; i < bb.N; i++ {
                c := a.Intersect(b)
                c.Unset(0)
            }
        })
    }
}

func BenchmarkBitDifference(bench *testing.B) {
    for _, size := range setSizes {
        size = size
        bench.Run(fmt.Sprintf("input_size=%d", size), func(bb *testing.B) {

            a, b := pairOfBitSets()
            for i := 0; i < size; i++ {
                if i%2 == 0 {
                    a.Set(i)
                } else {
                    b.Set(i)
                }
            }

            bb.ResetTimer()

            for i := 0; i < bb.N; i++ {
                c := a.Difference(b)
                c.Unset(0)
            }
        })
    }
}

func BenchmarkBitUnion(bench *testing.B) {
    for _, size := range setSizes {
        thisSize := size
        bench.Run(fmt.Sprintf("input_size=%d", thisSize), func(bb *testing.B) {
            a, b := pairOfBitSets()
            for i := 0; i < thisSize; i++ {
                a.Set(i)
                b.Set(i)
            }

            bb.ResetTimer()

            for i := 0; i < bb.N; i++ {
                c := a.Union(b)
                c.Unset(0)
            }
        })
    }
}

  1. Eliding some of the spots for space. There's 64 bits here for both signed and unsigned, but most of them are going to be zero in this example. 

  2. The size of our bit field, a uint16 would be 16 and a uint32 would have 32. uint is hardware dependent and is probably 64 but it could be a smaller value, or possible larger if you're reading this in the future! 

  3. I've truncated the higher bits off the 64 to save horizontal space. 

  4. Currently ordering a copy of Hacker's Delight so I can learn more bit tricks. 

  5. Here's the general idea:

    1
    2
    3
    4
    5
    6
     n | tz
    --------
     111 | 0
     110 | 1
     100 | 2
    1000 | 3
    
     

  6. Not the result of the left shift, not that I've tricked myself into thinking that before. 

  7. The set could receive a generic type but I felt it would distract from the core point.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    type Hasher interface {
      // should be constant to provide identity
      Hash() uint
    }
    
    type Set[T Hasher] interface {
      Set(T)
      // ...
    }
    
    And then we would call T.Hash() and we're back at our implementations. A more interesting generic parameter is [T ~uint64], at least for the purposes I ended up using bitsets for. Without the first thing you end doing is uint64(i) to place the value into the set.
    1
    2
    3
    4
    5
    6
    7
    8
    type Id uint64
    type Set[T ~uint64] interface {
      Set(T)
    }
    
    var ids Set[Id] = // ...
    var i Id = 9
    ids.Set(i)
    
     

  8. As of time of writing at least 

  9. It's also possible to swap division and modulo, min/max and possibly even some of the branches with more complicated bitwise operations if we were trying to squeeze every, single drop and then some.