Why stack fingerprints matter for deduplication

A good error dashboard has one row per bug. A bad one has one row per occurrence, and nobody can tell which of the hundred TypeError entries on the first screen are actually the same underlying problem. The thing that separates the two is fingerprinting — and specifically, what you put into the fingerprint.

Why the message alone isn't enough

A naive fingerprint hashes the error message. It sort of works for clean errors ("Cannot read properties of undefined (reading 'id')") and breaks catastrophically for anything that includes a dynamic value. Consider these three messages captured in the wild:

  • Request to /users/a1b2c3d4 failed with 500
  • Request to /users/e5f6g7h8 failed with 500
  • Request to /users/x9y0z1a2 failed with 500

All three are the same bug, but a message-based fingerprint sees three different errors. Multiply by the number of ids, timestamps, IP addresses, and hex chunks your application emits, and the dashboard becomes unusable.
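To make the failure concrete, here is a minimal sketch of the naive approach. It assumes SHA-256 as the hash and a 12-character truncation, and the function name is made up for illustration; neither is a claim about any particular SDK.

```python
import hashlib


def naive_fingerprint(message: str) -> str:
    """Hash the raw message with no normalization (the approach that fails)."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()[:12]


messages = [
    "Request to /users/a1b2c3d4 failed with 500",
    "Request to /users/e5f6g7h8 failed with 500",
    "Request to /users/x9y0z1a2 failed with 500",
]

# One bug, but every distinct id produces a distinct fingerprint.
fingerprints = {naive_fingerprint(m) for m in messages}
print(len(fingerprints), "dashboard rows for one bug")
```

The hash function itself is not the problem. Any hash of the raw text inherits every dynamic value in it.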

Why the message alone isn't unique enough either

The opposite failure mode is worse. Two unrelated code paths can throw the same TypeError for different reasons: a missing guard in the dashboard header and a missing guard in the billing reducer produce identical messages but are entirely different bugs. If your fingerprint collapses them, you fix one place and wonder why the "same" error keeps firing.

What goes into a good fingerprint

nreactive's fingerprint is a hash over four pieces: a normalized message, the error type, the normalized file path, and a signature of the top few stack frames. Each piece handles a different failure mode:

  • Normalized message. We replace hex strings longer than eight characters, IP addresses, durations given in milliseconds or seconds, floating-point numbers, query strings, UUID-like path segments, and numeric path segments with placeholders. The three "Request to /users/..." errors above collapse to one. But "Cannot read 'id' of undefined" and "Cannot read 'name' of undefined" stay distinct because the literal property name survives.
  • Error type. Keeps TypeError and ReferenceError in different buckets even when the messages rhyme.
  • Normalized file path. Strips query params and normalizes hex chunks, so the same file seen across deploys with different bundle hashes still fingerprints identically.
  • Stack signature. The top five stack frames, each reduced to file:line with the line snapped to its nearest 10-bucket, joined with a separator. Two errors with the same top frames are almost always the same bug; the line-snapping means trivial upstream edits don't shatter the fingerprint.
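The four pieces above can be sketched roughly as follows. This is an illustration under stated assumptions, not nreactive's actual code: the regular expressions, placeholder names, "|" separator, SHA-256 hash, and (file, line) frame shape are all invented for the example. "Snapping" is read here as flooring to the start of the bucket. The last message rule, which treats mixed letter-and-digit path segments of eight or more characters as ids, is an added assumption so that the sample /users/... ids are covered.

```python
import hashlib
import re
from typing import Sequence, Tuple

Frame = Tuple[str, int]  # (file, line); a hypothetical frame shape

# Order matters: more specific patterns run before more general ones.
MESSAGE_RULES = [
    (re.compile(r"\?[\w%.\-]+=[^\s#]*"), "?<query>"),        # query strings
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),    # IPv4 addresses
    (re.compile(r"\b[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\b[0-9a-f]{9,}\b", re.I), "<hex>"),        # hex longer than eight chars
    (re.compile(r"\b\d+(?:\.\d+)?\s?(?:ms|s)\b"), "<duration>"),
    (re.compile(r"\b\d+\.\d+\b"), "<float>"),
    (re.compile(r"/\d+(?=/|\s|$)"), "/<num>"),               # numeric path segments
    # ASSUMPTION: segments of 8+ alphanumerics mixing letters and digits are ids.
    (re.compile(r"/(?=[A-Za-z0-9]*\d)(?=[A-Za-z0-9]*[A-Za-z])[A-Za-z0-9]{8,}(?=/|\s|$)"), "/<id>"),
]

# ASSUMPTION: in file paths, any hex run of 8+ characters is a bundle hash.
HEX_CHUNK = re.compile(r"\b[0-9a-f]{8,}\b", re.I)


def normalize_message(message: str) -> str:
    """Replace dynamic values in the message with fixed placeholders."""
    for pattern, placeholder in MESSAGE_RULES:
        message = pattern.sub(placeholder, message)
    return message


def normalize_path(path: str) -> str:
    """Drop the query string and replace hex chunks such as bundle hashes."""
    return HEX_CHUNK.sub("<hex>", path.split("?", 1)[0])


def stack_signature(frames: Sequence[Frame], depth: int = 5, bucket: int = 10) -> str:
    """Reduce the top frames to file:line, with each line floored to its bucket."""
    return "|".join(
        f"{normalize_path(file)}:{(line // bucket) * bucket}"
        for file, line in frames[:depth]
    )


def fingerprint(message: str, error_type: str, file: str, frames: Sequence[Frame]) -> str:
    """Hash the four normalized pieces into one short identifier."""
    parts = [
        normalize_message(message),
        error_type,
        normalize_path(file),
        stack_signature(frames),
    ]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()[:16]


print(normalize_message("Request to /users/a1b2c3d4 failed with 500"))
print(normalize_path("https://cdn.example.com/assets/app.3f9a1c2b7d.js?v=42"))
```

Rule order is the main design choice in a scheme like this. The UUID pattern has to run before the generic hex pattern, and the IP pattern before the float pattern, or the broader rule consumes part of the more specific value and leaves a half-normalized message behind.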

Why the 10-line snap matters

Every time someone adds an import or a comment line above a function, every line below the edit shifts by one. If you fingerprint on exact line numbers, the next deploy creates a brand-new fingerprint for every error raised below that point, and the dashboard fills up with duplicates overnight. Snapping to 10-buckets keeps enough resolution to distinguish different functions in a file while staying stable across most routine edits.

Snap too aggressively (100-buckets, say) and two genuinely different bugs in the same large file collapse into one. Ten is the empirically stable sweet spot for the codebases we've tested against.
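A small sketch of the trade-off, assuming "snap" means flooring the line number to the start of its bucket (rounding to the nearest multiple would behave similarly). The line numbers are made up.

```python
def snap(line: int, bucket: int = 10) -> int:
    """Floor a line number to the start of its bucket."""
    return (line // bucket) * bucket


# A throw site at line 42 moves to 43 after someone adds one import above it.
print(snap(42), snap(43))              # 40 40 -> same bucket, same fingerprint

# Snapping is not a guarantee: a site sitting on a boundary still changes bucket.
print(snap(49), snap(50))              # 40 50

# With 100-line buckets, two different functions at lines 120 and 180 merge...
print(snap(120, 100), snap(180, 100))  # 100 100

# ...while 10-line buckets keep them apart.
print(snap(120), snap(180))            # 120 180
```

The boundary case shows the limit of the technique: snapping makes most small shifts invisible, but a throw site on the last line of a bucket moves to the next bucket after a one-line insert.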

What happens when the SDK doesn't send a stack

Some errors have no stack: early boot failures, for example, or resource-load failures for images and scripts. In those cases the stack signature degrades to an empty string and the fingerprint falls back to message plus type plus file. That is enough to dedupe the common cases, at the cost of some over-collapsing in the edge cases.
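The fallback needs no special branch if a missing stack is treated as an empty signature. A minimal sketch, assuming the message and file are already normalized; the function names, the "ResourceError" type, and the SHA-256 hash are illustrative, not taken from a real SDK:

```python
import hashlib
from typing import Optional, Sequence, Tuple


def stack_signature(frames: Optional[Sequence[Tuple[str, int]]]) -> str:
    """Return "" when there is no stack, otherwise the top five snapped frames."""
    if not frames:
        return ""
    return "|".join(f"{file}:{(line // 10) * 10}" for file, line in frames[:5])


def fingerprint(message: str, error_type: str, file: str,
                frames: Optional[Sequence[Tuple[str, int]]] = None) -> str:
    """Hash message, type, file, and a possibly empty stack signature."""
    parts = [message, error_type, file, stack_signature(frames)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()[:16]


# Two stackless reports of the same failed image load share one fingerprint.
no_stack = fingerprint("Failed to load resource", "ResourceError", "/static/logo.png")
empty_stack = fingerprint("Failed to load resource", "ResourceError", "/static/logo.png", frames=[])
print(no_stack == empty_stack)

# The same triple with a real stack lands in a different bucket.
with_stack = fingerprint("Failed to load resource", "ResourceError", "/static/logo.png",
                         frames=[("/static/app.js", 120)])
print(no_stack == with_stack)
```

Because the empty signature still occupies its slot in the hashed input, a stackless error can never share a fingerprint with a stacked one that has the same message, type, and file.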

Regression detection

The fingerprint does one more job: it's how regressions are caught. When a merged PR's error fingerprint reappears within 48 hours, the PR gets flagged as regressed. And when a long-dormant fingerprint (seven days or more since it was last seen) fires again, that is recorded as a regression too, even without a prior fix. The dashboard surfaces both classes at the top of the list so they don't get lost in day-to-day noise.
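The two rules can be sketched as one classification function. The 48-hour and seven-day thresholds come from the text above. Everything else is an assumption for illustration: the function and label names are hypothetical, and the 48-hour window is measured from the PR's merge time, which the text does not specify.

```python
from datetime import datetime, timedelta
from typing import Optional

FIX_WINDOW = timedelta(hours=48)  # window after a fix PR is merged
DORMANCY = timedelta(days=7)      # silence before a fingerprint counts as dormant


def classify(seen_at: datetime,
             last_seen: Optional[datetime],
             fix_merged_at: Optional[datetime]) -> Optional[str]:
    """Label a new occurrence of a known fingerprint, or return None."""
    # ASSUMPTION: the 48-hour window starts at the merge time of the fix PR.
    if fix_merged_at is not None and timedelta(0) <= seen_at - fix_merged_at <= FIX_WINDOW:
        return "pr_regressed"
    if last_seen is not None and seen_at - last_seen >= DORMANCY:
        return "dormant_regression"
    return None


now = datetime(2024, 1, 10, 12, 0)
print(classify(now, now - timedelta(hours=6), now - timedelta(hours=5)))  # pr_regressed
print(classify(now, now - timedelta(days=8), None))                       # dormant_regression
print(classify(now, now - timedelta(days=1), None))                       # None
```

Checking the PR rule first means an occurrence that satisfies both conditions is attributed to the PR. That ordering is a choice made for this sketch, not something the text above prescribes.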

Fingerprinting is one of those pieces that feels like plumbing until you turn it off for a weekend and discover the dashboard is uninhabitable. Treat it as a first-class feature.

The minimum useful signal

One way to reason about whether a fingerprint is good enough: imagine a reviewer opening the dashboard after a holiday week. Can they tell from a glance which errors are ongoing problems, which are new, and which have regressed? If yes, the fingerprint is doing its job. If not, no amount of downstream polish will make the dashboard usable. Everything else in the pipeline — dedup windows, regression detection, occurrence trends, suppression thresholds — sits on top of that one layer. Get it right, and the rest composes cleanly.