Five invariants of a correct double-entry ledger
Five invariants every double-entry ledger has to enforce at write time, with a reference implementation in Go and the named tests that pin each one down.
Most public ledger implementations will happily double-post a payout under retry. Several of the top GitHub results store amounts as floating point. Most have no test that proves replay produces the same balances as the live projection. The question is not even asked. The boring properties are where ledgers go wrong, and they are also the ones that never make it into a README.
The categories repeat regardless of language or stack. Floating-point arithmetic, because someone wanted easy math and did not think about the third decimal place. Idempotency, because retries get bolted on at the HTTP layer instead of treated as an invariant the ledger itself enforces. Replay, because the projection reads the wall clock or iterates a map and quietly diverges from itself. Currency, because the cleanest way to handle FX in code is to convert at the entry boundary, and the cleanest way to fail an audit is the same thing.
ledgercore is around 800 lines of Go with zero third-party dependencies. It is written as a reference implementation. It is not the ledger you ship; it exists so you can name the five properties out loud and then go check whether your own ledger has them.
The rest of this post walks the five invariants. For each: a one-sentence statement, a concrete failure mode, the code that prevents it, and the named test that pins it down. Then a section on what the implementation deliberately leaves out, and why those lines were drawn where they were.
Invariant 1: balanced postings
Every posted transaction's signed entry amounts must sum to zero.
The most common way I have seen this fail in practice is a backfill script. Someone writes a one-off to import historical data: adjustments, fee corrections, the kind of work nobody wants to do twice. The script computes amounts in floating point, rounds at the last step, and produces transactions that are off by one minor unit each. Each one looks balanced if you eyeball it. Six weeks later finance cannot tie out to the bank statement and nobody knows when the drift started. The repair is forensic. You walk the event log, find every transaction that summed to ±1, and decide one at a time whether to reverse it.
func (l *Ledger) checkBalanceInvariant(tx Transaction) error {
	var sum int64
	for _, e := range tx.Entries {
		next := sum + e.Amount.Amount()
		// Detect int64 overflow in the running sum so a malicious or
		// buggy caller cannot wrap around to zero.
		if (e.Amount.Amount() > 0 && next < sum) || (e.Amount.Amount() < 0 && next > sum) {
			return ErrAmountOverflow
		}
		sum = next
	}
	if sum != 0 {
		return ErrUnbalancedTransaction
	}
	return nil
}
The overflow check is the part that matters and the part that is usually missing. A naive sum over a fixed-width signed integer will silently wrap, and a pair of MaxInt64 entries with one cancelling negative will satisfy sum == 0 by overflowing to it. The check above rejects the transaction before the sum can wrap. In a language with arbitrary-precision integers by default (Python, Ruby, Erlang), the wrap-around concern goes away, but the rest of the rule is identical: store amounts as integer minor units, sum them, and refuse to write if the sum is non-zero. It costs one comparison per entry and removes a class of edge cases you would otherwise have to think about every time you read this code.
Pinned to TestInvariant_DoubleEntryBalance_RejectsUnbalancedTransaction, which also asserts the cache is untouched on rejection.
The trade-off: I re-validate on every Post rather than trusting the caller to have pre-checked. The cost is one pass over the entries. The cost of trusting wrong is silent corruption.
Invariant 2: idempotency
Two calls to Post with the same TransactionID produce exactly one event. The second call returns the original EventID along with ErrIdempotentReplay.
The canonical failure goes like this. A payment processor's webhook hits your endpoint, your handler posts the transaction to the ledger and starts writing the response, then the pod gets OOM-killed before the 200 leaves the wire. The processor's retry policy fires. The same payment is now in the ledger twice, the customer's balance is doubled, and the next reconciliation run will surface it, but only if reconciliation is comparing on transaction ID rather than aggregating amounts. If it aggregates, the discrepancy hides as "an extra hundred bucks" until someone notices that the count of transactions does not match the upstream count.
func (l *Ledger) checkIdempotency(tx Transaction) (EventID, bool, error) {
	has, err := l.store.HasTransaction(tx.ID)
	if err != nil {
		return 0, false, err
	}
	if !has {
		return 0, false, nil
	}
	return l.seenTxn[tx.ID], true, nil
}
Two implementation choices worth naming, both language-agnostic. First, the source of truth for "has this transaction been seen" is the persistent store, not the in-memory map. On a fresh process the in-memory map is empty until replay finishes, and a duplicate that arrived mid-replay would otherwise slip through. The map exists only to recover the original EventID for the response. Second, the duplicate path returns a distinguished error rather than a silent success. Callers need to distinguish "fresh write" from "deduplicated retry" for logging, metrics, and for deciding whether to emit downstream side effects (notification emails, ledger fan-out, webhooks of their own).
Pinned to TestInvariant_Idempotency_DuplicateTransactionIDProducesOneEvent.
The trade-off: TransactionID is the caller's responsibility. The ledger does not generate it. If the caller chooses badly, picking a fresh UUID per retry instead of a stable key per business event, idempotency degrades to nothing, and the ledger has no way to know. There is more on the upstream side of this in our piece on the idempotency key.
Invariant 3: monotonic ordering and deterministic replay
EventIDs are strictly increasing, and replaying the event log against a fresh projection reproduces the live projection bit-for-bit.
A read replica disagrees with the primary on a balance because two transactions for the same account were applied in different orders. The amounts net to the same number, but BalanceAt(t) returns different values on the two replicas because the intermediate states differ. An auditor pulls a point-in-time report from the replica that happens to be wrong, and you scramble to determine which one is authoritative. The only honest answer is "both, depending on how concurrent writes happened to be ordered," which is no answer at all. You needed strict ordering at write time and you did not have it.
func (s *inMemoryStore) Append(e Event) (EventID, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	id := EventID(s.nextID)
	s.nextID++
	e.ID = id
	s.events = append(s.events, e)
	if e.Transaction != nil {
		s.txIndex[e.Transaction.ID] = struct{}{}
	}
	return id, nil
}
Monotonic IDs come from a single counter under the store's write lock. For a single-process reference this is enough. For a distributed store, a sequencer or hybrid logical clock takes its place — the contract on the EventStore interface is the same in any language: assign a strictly greater ID per Append. Postgres bigserial, a Kafka offset, a Raft log index, a synchronised counter behind a single-writer queue: any of them satisfies the property. The deeper point is that replay is deterministic by construction. The projection iterates events in ID order and applies pure functions of the event payload. Nothing in replayFromLocked reads the wall clock, pulls from a random source, or branches on map iteration order. Hand the same store to two fresh Ledgers and they cannot diverge. This is what makes the ledger an event-sourced ledger in any useful sense; without determinism the event log is just a logfile.
Pinned to TestInvariant_MonotonicOrdering_ReplayIsDeterministic, which checks ID contiguity and re-runs replay against a live ledger.
The trade-off: every write goes through one mutex on the store. That is a throughput ceiling. I accept it because the alternative (sharded stores with per-shard counters) moves correctness reasoning into the place where it most needs to stay simple.
Invariant 4: negative-balance prevention
An account whose AllowNegative flag is false cannot be driven below zero by a posting. Asset accounts default to off.
A retail POS books a refund before its matching sale because the messages arrived out of order on a queue. From the ledger's perspective both transactions balance, debits equal credits, so the refund is happily accepted and the cash drawer's projected balance goes to -$300. Bookkeeping is intact. Reality is not: there is not actually -$300 in the drawer. Without a hard floor the ledger has no way to refuse this, and the discrepancy only surfaces at end-of-day cash count, by which point the audit trail is twelve hours of intermingled transactions that all individually look fine.
func (l *Ledger) checkNegativeBalance(tx Transaction) (map[AccountID]Money, error) {
	projected := make(map[AccountID]Money, len(tx.Entries))
	for _, e := range tx.Entries {
		// ... apply signed delta into projected[e.Account] ...
	}
	for id, bal := range projected {
		if !l.accounts[id].AllowNegative && bal.Amount() < 0 {
			return nil, ErrNegativeBalance
		}
	}
	return projected, nil
}
Two design choices that survive a port to any language. First, the projection is computed in memory and only committed if the check passes. A rejection leaves the store and the cache untouched, which is a property the test asserts explicitly. The function returns projected so Post can apply it without recomputing. Second, the policy is opt-out for assets, not opt-in. Forgetting to set the flag has to fail safe. The handful of asset accounts that genuinely tolerate going negative (a credit card receivable that swings into a customer overpayment, say) are the exceptions, and they declare themselves with WithAllowNegative(true) at open time.
Pinned to TestInvariant_NegativeBalance_AssetAccountRejectsOverdraw, with a sub-test for the opt-in path.
The trade-off: liability, equity and revenue accounts default to allowing negatives, since their natural-balance side is credit. If you want a liability that cannot go positive you set the flag explicitly. The ledger does not try to model every chart-of-accounts convention; it makes the safe direction the default and lets you declare exceptions.
Invariant 5: currency isolation
A single transaction is single-currency. Every entry's currency must match every other entry's, and must match the currency of the account it touches.
A treasury team books a EUR-to-USD transfer as one transaction with a EUR debit and a USD credit. The amounts sum to zero arithmetically because the calling code applied an FX conversion before constructing the entries. Three months later a customer disputes the rate. You go back to the ledger and find no record of which rate was used, where it was sourced, when it was struck, or who approved it. Only the converted amounts. The ledger looked correct because a number was zero. The audit trail you needed lived in a function that has since been refactored away.
func (l *Ledger) checkCurrencyInvariant(tx Transaction) error {
	// An empty transaction has no currency to check; guard it here so
	// the Entries[0] read below cannot panic.
	if len(tx.Entries) == 0 {
		return nil
	}
	first := tx.Entries[0].Amount.Currency()
	for _, e := range tx.Entries {
		if e.Amount.Currency() != first {
			return ErrCurrencyMismatch
		}
		if l.accounts[e.Account].Currency != e.Amount.Currency() {
			return ErrCurrencyMismatch
		}
	}
	return nil
}
Two checks, both mandatory in any implementation. The first rejects mixed-currency entries. The second rejects entries whose currency does not match the account they target. A USD account must not receive a GBP entry, even if every other entry in the transaction is also GBP. Without that second check a typo in the calling code can post numerically balanced GBP entries against a set of USD accounts, and the ledger will accept them; the discrepancy only surfaces when someone reads the balance and gets a value in the wrong denomination. Both checks are O(n) over a small n and run on every Post.
Pinned to TestInvariant_CurrencyIsolation_RejectsMixedCurrencyTransaction, which covers both rejection paths.
The trade-off: cross-currency transfers cannot be expressed as one transaction. The caller has to construct two single-currency transactions linked by a domain-level FX record that captures rate, source, and timestamp. That is more work for the caller and it is the right amount of work. The alternative is hiding a domain decision (which rate, when, with whose markup) inside what looks like a transfer.
What the reference deliberately leaves out
There is no multi-asset support and there is no arbitrary-precision arithmetic. Amounts are fixed-width signed integers in minor units, full stop. I considered first-class assets with per-asset scale, the obvious move if you want to handle crypto, securities, or high-precision instruments. I decided against it because shifting from a fixed-width integer to an arbitrary-precision type is a load-bearing change in any statically-typed language. It propagates into every arithmetic site, every overflow check, every comparison, and it makes the simple fiat case meaningfully harder to read. A multi-asset ledger is a different library, not a configuration flag bolted onto this one.
There is no persistent store. The only EventStore implementation keeps events in a slice in memory. A real backing store needs a write-ahead log, fsync semantics, an answer for partial-write recovery, and a story for snapshot-and-resume. The EventStore interface is the seam where that work lands; the ledger above it does not change. Persistence is a serious project on its own, and writing a half-version of it inside the reference would teach the wrong lesson.
There is no multi-tenancy. A Ledger is a single tenant. Tenant isolation is a property of the storage layer and the access-control layer above it, not of the double-entry mechanics. Folding tenants into the ledger struct would multiply the test surface for every invariant without making any of them more correct. If you want per-tenant ledgers, you instantiate per-tenant ledgers.
There is no HTTP or gRPC surface. ledgercore is a Go package; the transport is the caller's problem. Wire protocols change, idempotency keys move from headers to message metadata, authentication conventions vary by stack. Bundling a transport into a correctness library couples two things that have no business being coupled, and the version skew between them is a maintenance tax I do not want to pay on someone else's behalf.
A reference implementation is useful when it is small enough to verify and specific enough to argue with. The moment it tries to be a platform, it usually stops being either.
Closing
The thread that runs through all five invariants is the same. Correctness is enforced at write time. Not at read time, not in a nightly job, not in a separate reconciliation service. The ledger refuses to accept the data that would make it wrong. Reconciliation and audit are how you discover that a write-time invariant was missing; they are not where invariants live. Once you internalise that, the design choices above stop reading like a list of features and start reading like the same decision applied in five places.
None of this is specific to Go. Translate the file to Java, Kotlin, Python, TypeScript, Rust, Elixir, OCaml. The invariants do not move. The trade-offs around int64 overflow checks become trade-offs about BigInteger cost or Decimal semantics; the trade-off about a single mutex becomes a trade-off about row-level locks or a single-writer queue. The shape is the same. If your ledger does not enforce these five things at the boundary where writes are accepted, no amount of language choice or framework polish will fix it.
This is a reference, not a production system. Your ledger needs persistence, snapshots, observability, and the operational surface this codebase deliberately omits.
The repository is at github.com/xvica-ltd/ledgercore. At XVICA we build this kind of correctness-first infrastructure for regulated firms in payments, lending, and treasury, anywhere a wrong number costs more than the engineering to prevent it. If that is the problem in front of you: hi@xvica.com.