I’ve always been fascinated by how rsync can synchronize large files so quickly just by sending the differences. Instead of just reading about it again, I decided to actually implement the core engine behind it: the rolling checksum.
Specifically, I spent the weekend on a rolling Adler-32 implementation, following the approach from Andrew Tridgell's original rsync thesis.
The Modulo Trap
The "rolling" part of Adler-32 is mathematically brilliant because it lets you calculate a new hash in O(1) time by just subtracting the byte that left the window and adding the new byte that entered. The magic number is 65521 (the largest prime under 2^16), which keeps the sums a and b within 16 bits.
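The update rules fall out if you write the two sums over a window x1…xn (my notation, not the thesis's; n is the window length, matching windowLen in the code that follows):

```
a  = 1 + x1 + x2 + ... + xn                     (mod 65521)
b  = n + n*x1 + (n-1)*x2 + ... + 1*xn           (mod 65521)

slide the window: x1 leaves, x(n+1) enters

a' = a - x1 + x(n+1)
b' = b - n*x1 + (x2 + ... + x(n+1))
   = b - n*x1 + a' - 1
```

Every byte's coefficient in b drops by exactly one when the window slides, which is why the whole correction collapses to the new a minus one. That trailing "- 1" is where the mysterious constant in the code comes from.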
Here's the math translated into Go:

```go
const mod = 65521

func (r *RollSum) Rotate(b byte) {
	if len(r.window) == 0 {
		return
	}
	leave, enter := r.circle(b)
	r.a = (((r.a + enter - leave) % mod) + mod) % mod
	r.b = (((r.b - r.windowLen*leave - 1 + r.a) % mod) + mod) % mod
}
```

...And here is where I lost an hour to a completely avoidable bug. In Go (and C, and many other languages), the % operator isn't actually a mathematical modulo: it's a remainder operator. If the dividend is negative after subtracting the leave byte, % can return a negative value.
To get a true mathematical modulo, I had to use the ((n % mod) + mod) % mod pattern. It’s an ugly, verbose little detail that completely ruins the checksum if you forget it. I assumed the standard library hash/adler32 would have some magic helper for this, but it doesn't do rolling out of the box, so you're on your own.
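Here's the trap in miniature, with throwaway numbers standing in for the a update:

```go
package main

import "fmt"

func main() {
	const mod = 65521

	// Stand-in for a + enter - leave dipping below zero.
	n := 10 - 200

	fmt.Println(n % mod)                 // -190: the remainder keeps the dividend's sign
	fmt.Println(((n % mod) + mod) % mod) // 65331: the true mathematical modulo
}
```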
Manual Window Management
To actually roll the hash, you need to know exactly which byte is falling out of the window. I implemented this using a simple slice:

```go
// circle slides the window forward by one byte and reports both ends
// of the exchange. It returns signed ints so the modular arithmetic
// in Rotate can legitimately go negative before the fix-up.
func (r *RollSum) circle(b byte) (leave, enter int) {
	r.removed = r.window[0]
	r.window = append(r.window[1:], b)
	enter = int(b)
	leave = int(r.removed)
	return
}
```

This works, but it's honestly terrible for performance. Re-slicing and appending on every single byte means the backing array keeps creeping forward and getting reallocated, so the garbage collector is going to have a field day. For a toy weekend project, a slice is fine for readability, but if I actually wanted to process gigabytes of data, I'd have to rewrite this entire thing using a pre-allocated circular buffer.
Proving I Didn't Break It
Because the math is so finicky, I had to write a battery of tests against Go's standard (non-rolling) hash/adler32 implementation just to prove my math wasn't hallucinating values.
```go
package rollsum

import (
	"hash/adler32"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// classic computes a plain, non-rolling Adler-32 for comparison.
func classic(data []byte) uint32 {
	a := adler32.New()
	a.Write(data)
	return a.Sum32()
}

func TestRollInAndOut(t *testing.T) {
	a := []byte("Adler-32 is a checksum")
	roll := New(1 << 16)
	for _, x := range a {
		roll.In(x)
	}
	require.Equal(t, classic(a), roll.Sum32())

	// Testing the actual roll
	roll.Out()
	a = a[1:]
	assert.Equal(t, classic(a), roll.Sum32())
}
```

We spend so much of our time working with high-level abstractions like HTTP frameworks and ORMs that taking a weekend to deal with memory allocation and bitwise shifts feels almost therapeutic. It's not about building a production-ready rsync replacement; it's just nice to remember how the underlying mechanics actually work without a framework holding your hand.
