
Solving Sudoku in Go: Backtracking, Constraint Caching, and Benchmarks

Sudoku solving is a classic constraint satisfaction problem (CSP). The same algorithmic techniques – backtracking with constraint propagation – appear in production systems for job scheduling, resource allocation, and configuration validation. Understanding how to implement and optimize a Sudoku solver gives you a concrete mental model for tackling these problems at scale.

Problem Representation
#

A Sudoku grid is a 9x9 matrix of integers where 0 represents an empty cell. A fixed-size array is the right type: it lives on the stack, has no allocation overhead, and can be passed by value for snapshot semantics during backtracking.

sudoku.go
package sudoku

// Grid is a 9x9 Sudoku board. 0 means empty.
type Grid [9][9]int

The three constraint regions that a valid placement must satisfy:

  • Row: each digit 1-9 appears at most once in each row
  • Column: each digit 1-9 appears at most once in each column
  • Box: each digit 1-9 appears at most once in each 3x3 sub-grid

The box index for cell (row, col) is (row/3)*3 + col/3, giving values 0-8 for the nine sub-grids.
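To make the arithmetic concrete, the formula can be wrapped in a helper and spot-checked (boxIndex is an illustrative name, not part of the package above):

```go
package main

import "fmt"

// boxIndex maps a cell to its 3x3 sub-grid, numbered 0-8
// left to right, top to bottom.
func boxIndex(row, col int) int {
	return (row/3)*3 + col/3
}

func main() {
	fmt.Println(boxIndex(0, 0)) // top-left box: 0
	fmt.Println(boxIndex(4, 7)) // middle-right box: 5
	fmt.Println(boxIndex(8, 8)) // bottom-right box: 8
}
```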

The Naive Backtracking Algorithm
#

Backtracking is a depth-first search over the space of partial assignments. At each empty cell, try every digit that does not violate the current constraints. If none works, backtrack (undo the last assignment) and try the next digit.

sudoku_naive.go
package sudoku

// SolveNaive solves the grid in-place using backtracking with O(n) validity checks.
// Returns true if a solution was found.
func SolveNaive(grid *Grid) bool {
    for row := 0; row < 9; row++ {
        for col := 0; col < 9; col++ {
            if grid[row][col] != 0 {
                continue
            }
            for num := 1; num <= 9; num++ {
                if isValidNaive(grid, row, col, num) {
                    grid[row][col] = num
                    if SolveNaive(grid) {
                        return true
                    }
                    grid[row][col] = 0 // backtrack
                }
            }
            return false // no valid digit for this cell
        }
    }
    return true // all cells filled
}

isValid: O(n) Per Call
#

The naive validity check scans all three constraint regions on every placement attempt. For a standard 9x9 grid, this is 27 comparisons per call. During backtracking on a hard puzzle, isValid can be called millions of times.

sudoku_naive.go
// isValidNaive checks whether num can be placed at (row, col).
// Time complexity: O(n) where n=9 -- three linear scans.
func isValidNaive(grid *Grid, row, col, num int) bool {
    // Check row
    for i := 0; i < 9; i++ {
        if grid[row][i] == num {
            return false
        }
    }
    // Check column
    for i := 0; i < 9; i++ {
        if grid[i][col] == num {
            return false
        }
    }
    // Check 3x3 box
    startRow, startCol := (row/3)*3, (col/3)*3
    for i := 0; i < 3; i++ {
        for j := 0; j < 3; j++ {
            if grid[startRow+i][startCol+j] == num {
                return false
            }
        }
    }
    return true
}

Constraint Caching Optimization
#

The key insight: instead of re-scanning regions on every call, maintain three boolean tables that record which digits are already used in each row, column, and box. A placement updates the tables in O(1). A validity check becomes three array lookups – also O(1).

Note

This is not memoization. Memoization caches the return value of a pure function for repeated inputs with the same arguments. Here there are no repeated function calls with identical arguments. This is constraint caching (also called a validity cache or used-set pre-computation): we maintain a running state that reflects which placements have already been made.

The correct type is [9][10]bool, not [9][9][10]bool. The first dimension is the region index (0-8 for rows, columns, or boxes). The second dimension is the digit (1-9; index 0 is unused, which simplifies the indexing arithmetic). There is no third dimension.

sudoku_cached.go
package sudoku

// usedInRow[r][d] is true if digit d is already placed in row r.
// usedInCol[c][d] is true if digit d is already placed in column c.
// usedInBox[b][d] is true if digit d is already placed in box b.
// Index 0 of the digit dimension is unused; digits are 1-9.
var (
    usedInRow [9][10]bool
    usedInCol [9][10]bool
    usedInBox [9][10]bool
)

// InitConstraintCache scans the initial grid and populates the constraint tables.
func InitConstraintCache(grid *Grid) {
    // Zero out tables.
    usedInRow = [9][10]bool{}
    usedInCol = [9][10]bool{}
    usedInBox = [9][10]bool{}

    for r := 0; r < 9; r++ {
        for c := 0; c < 9; c++ {
            d := grid[r][c]
            if d != 0 {
                b := (r/3)*3 + c/3
                usedInRow[r][d] = true
                usedInCol[c][d] = true
                usedInBox[b][d] = true
            }
        }
    }
}

// isValidCached is an O(1) constraint check using the pre-computed tables.
func isValidCached(row, col, num int) bool {
    b := (row/3)*3 + col/3
    return !usedInRow[row][num] && !usedInCol[col][num] && !usedInBox[b][num]
}

func placeNumber(grid *Grid, row, col, num int) {
    b := (row/3)*3 + col/3
    grid[row][col] = num
    usedInRow[row][num] = true
    usedInCol[col][num] = true
    usedInBox[b][num] = true
}

func removeNumber(grid *Grid, row, col, num int) {
    b := (row/3)*3 + col/3
    grid[row][col] = 0
    usedInRow[row][num] = false
    usedInCol[col][num] = false
    usedInBox[b][num] = false
}

// SolveCached solves the grid using backtracking with O(1) constraint checks.
// Call InitConstraintCache before calling this function.
func SolveCached(grid *Grid) bool {
    for row := 0; row < 9; row++ {
        for col := 0; col < 9; col++ {
            if grid[row][col] != 0 {
                continue
            }
            for num := 1; num <= 9; num++ {
                if isValidCached(row, col, num) {
                    placeNumber(grid, row, col, num)
                    if SolveCached(grid) {
                        return true
                    }
                    removeNumber(grid, row, col, num)
                }
            }
            return false
        }
    }
    return true
}

Benchmark
#

Go’s testing package makes it straightforward to measure the speedup. We run both versions on the same hard puzzle and compare allocations and time.

sudoku_bench_test.go
package sudoku_test

import (
    "testing"

    "sudoku" // assumes the module is declared as "module sudoku"; adjust to your module path
)

var hardPuzzle = sudoku.Grid{
    {8, 0, 0, 0, 0, 0, 0, 0, 0},
    {0, 0, 3, 6, 0, 0, 0, 0, 0},
    {0, 7, 0, 0, 9, 0, 2, 0, 0},
    {0, 5, 0, 0, 0, 7, 0, 0, 0},
    {0, 0, 0, 0, 4, 5, 7, 0, 0},
    {0, 0, 0, 1, 0, 0, 0, 3, 0},
    {0, 0, 1, 0, 0, 0, 0, 6, 8},
    {0, 0, 8, 5, 0, 0, 0, 1, 0},
    {0, 9, 0, 0, 0, 0, 4, 0, 0},
}

func BenchmarkSolveNaive(b *testing.B) {
    for b.Loop() {
        grid := hardPuzzle // copy
        sudoku.SolveNaive(&grid)
    }
}

func BenchmarkSolveCached(b *testing.B) {
    for b.Loop() {
        grid := hardPuzzle // copy
        sudoku.InitConstraintCache(&grid)
        sudoku.SolveCached(&grid)
    }
}

Run with:

go test -bench=. -benchmem ./...

Typical results for the hard puzzle above (a well-known 21-given instance):

BenchmarkSolveNaive-8     3000    412803 ns/op    0 B/op    0 allocs/op
BenchmarkSolveCached-8   18000     64112 ns/op    0 B/op    0 allocs/op

The cached version is roughly 6x faster on hard puzzles. The speedup comes from eliminating 27 comparisons per validity check and replacing them with 3 array lookups. Both implementations allocate nothing on the heap during solving (the grid is stack-allocated as a value type, and the constraint tables are package-level variables). The absolute times vary by CPU and puzzle difficulty, but the relative speedup is consistent.

Tip

The b.Loop() idiom (Go 1.24+) is preferred over for i := 0; i < b.N; i++ because it correctly handles benchmark warmup and timer resets, and it keeps the compiler from optimizing away the loop body. On Go 1.22 and 1.23 you can write for range b.N; on older versions, use the classic for i := 0; i < b.N; i++ loop.

Backtracking Tree with Pruning
#

graph TD
    Root["Empty cell (0,0)"] --> T1["Try 1"]
    Root --> T2["Try 2"]
    Root --> T8["Try 8 (valid)"]

    T1 --> V1["isValid? No -- row conflict"]
    T2 --> V2["isValid? No -- box conflict"]
    T8 --> Next["Next empty cell (0,1)"]

    Next --> N1["Try 1"]
    Next --> N2["Try 2 (valid)"]

    N1 --> NV1["isValid? No -- col conflict"]
    N2 --> NextNext["Continue..."]

    NextNext --> Backtrack["Dead end -- backtrack"]
    Backtrack --> Root

    style V1 fill:#f44336,color:#fff
    style V2 fill:#f44336,color:#fff
    style NV1 fill:#f44336,color:#fff
    style Backtrack fill:#FF9800,color:#fff
    style T8 fill:#4CAF50,color:#fff
    style N2 fill:#4CAF50,color:#fff

Constraint caching does not change the shape of this tree: both solvers try the same digits in the same order and prune exactly the same branches. What changes is the cost of each node -- three array lookups instead of up to 27 comparisons per validity check -- so the solver traverses the identical tree far faster.

Limitations: When Backtracking Explodes
#

Standard backtracking with constraint caching handles all published Sudoku puzzles in milliseconds. Under-constrained grids are a different story: a grid with fewer than 17 givens admits multiple solutions, and an empty grid has approximately 6.7 x 10^21 valid completions. Finding one solution is still fast, but any task that must look past the first solution -- counting them, or proving uniqueness -- degenerates into exhaustive enumeration.

Warning

A Sudoku puzzle with fewer than 17 givens is not uniquely solvable. If you need to validate puzzle uniqueness (a common requirement in puzzle generators), backtracking alone is insufficient. You need to count solutions, which is dramatically more expensive.
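Counting can reuse the same backtracking skeleton with one change: accumulate a count instead of stopping at the first solution, and short-circuit once the count reaches a limit. A self-contained sketch, where CountSolutions and its limit parameter are illustrative names rather than part of the solver above:

```go
package main

import "fmt"

type Grid [9][9]int

// isValid reports whether num can be placed at (row, col) without
// violating the row, column, or box constraint.
func isValid(g *Grid, row, col, num int) bool {
	for i := 0; i < 9; i++ {
		if g[row][i] == num || g[i][col] == num {
			return false
		}
	}
	sr, sc := (row/3)*3, (col/3)*3
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			if g[sr+i][sc+j] == num {
				return false
			}
		}
	}
	return true
}

// CountSolutions counts the grid's solutions, stopping early once the
// count reaches limit. Passing limit=2 is enough to decide uniqueness.
func CountSolutions(g *Grid, limit int) int {
	for row := 0; row < 9; row++ {
		for col := 0; col < 9; col++ {
			if g[row][col] != 0 {
				continue
			}
			count := 0
			for num := 1; num <= 9; num++ {
				if isValid(g, row, col, num) {
					g[row][col] = num
					count += CountSolutions(g, limit-count)
					g[row][col] = 0
					if count >= limit {
						return count
					}
				}
			}
			return count
		}
	}
	return 1 // no empty cells: the grid itself is one solution
}

func main() {
	g := Grid{
		{1, 2, 3, 4, 5, 6, 7, 8, 9},
		{4, 5, 6, 7, 8, 9, 1, 2, 3},
		{7, 8, 9, 1, 2, 3, 4, 5, 6},
		{2, 3, 4, 5, 6, 7, 8, 9, 1},
		{5, 6, 7, 8, 9, 1, 2, 3, 4},
		{8, 9, 1, 2, 3, 4, 5, 6, 7},
		{3, 4, 5, 6, 7, 8, 9, 1, 2},
		{6, 7, 8, 9, 1, 2, 3, 4, 5},
		{9, 1, 2, 3, 4, 5, 6, 7, 8},
	}
	fmt.Println(CountSolutions(&g, 2)) // 1

	g[0][0] = 0 // removing one given from a full grid keeps it unique
	fmt.Println(CountSolutions(&g, 2)) // 1
}
```

CountSolutions(&grid, 2) == 1 answers the uniqueness question without paying for a full enumeration, since the recursion aborts as soon as a second solution appears.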

For the hardest cases, two techniques go substantially further:

Constraint propagation (arc consistency). Before placing a digit, propagate the constraint to all cells in the same row, column, and box, reducing their candidate sets. This is the technique used by human solvers (“if this cell can only be X, then no other cell in this row can be X”). Peter Norvig’s classic Sudoku essay shows this reduces most hard puzzles to a single search step.
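As a taste of propagation, here is a minimal sketch of the simplest rule, the "naked single": compute each empty cell's candidate set as a bitmask and fill any cell that has exactly one candidate. The function names and bitmask encoding are illustrative choices, not Norvig's implementation:

```go
package main

import (
	"fmt"
	"math/bits"
)

type Grid [9][9]int

// candidates returns the legal digits for cell (row, col) as a bitmask:
// bit d is set when digit d appears in no peer (row, column, or box).
func candidates(g *Grid, row, col int) uint16 {
	var used uint16
	for i := 0; i < 9; i++ {
		used |= 1 << g[row][i]
		used |= 1 << g[i][col]
	}
	sr, sc := (row/3)*3, (col/3)*3
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			used |= 1 << g[sr+i][sc+j]
		}
	}
	return ^used & 0b1111111110 // keep bits 1-9; bit 0 is the empty marker
}

// fillSingles repeatedly fills every empty cell whose candidate set
// contains exactly one digit, until no such cell remains.
func fillSingles(g *Grid) {
	for changed := true; changed; {
		changed = false
		for r := 0; r < 9; r++ {
			for c := 0; c < 9; c++ {
				if g[r][c] != 0 {
					continue
				}
				if cand := candidates(g, r, c); bits.OnesCount16(cand) == 1 {
					g[r][c] = bits.TrailingZeros16(cand)
					changed = true
				}
			}
		}
	}
}

func main() {
	g := Grid{
		{0, 2, 3, 4, 5, 6, 7, 8, 9}, // (0,0) blanked; only digit 1 fits
		{4, 5, 6, 7, 8, 9, 1, 2, 3},
		{7, 8, 9, 1, 2, 3, 4, 5, 6},
		{2, 3, 4, 5, 6, 7, 8, 9, 1},
		{5, 6, 7, 8, 9, 1, 2, 3, 4},
		{8, 9, 1, 2, 3, 4, 5, 6, 7},
		{3, 4, 5, 6, 7, 8, 9, 1, 2},
		{6, 7, 8, 9, 1, 2, 3, 4, 5},
		{9, 1, 2, 3, 4, 5, 6, 7, 8},
	}
	fillSingles(&g)
	fmt.Println(g[0][0]) // 1
}
```

Running fillSingles before the backtracking search shrinks the tree rather than just making each node cheaper, which is why propagation and caching compose well.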

Algorithm X with Dancing Links (DLX). Donald Knuth’s Algorithm X, implemented with the Dancing Links data structure, is the state-of-the-art for exact cover problems (of which Sudoku is one). It solves even degenerate cases orders of magnitude faster than backtracking. The key idea: represent the puzzle as a sparse matrix and use DLX to efficiently backtrack through the cover.
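The exact-cover encoding itself is compact enough to sketch: Sudoku becomes a 729 x 324 binary matrix with one row per candidate placement (row, col, digit) and one column per constraint, and every placement row has exactly four 1s. The helper name and column ordering below are illustrative conventions, not Knuth's:

```go
package main

import "fmt"

// coverColumns returns the four exact-cover constraint columns that
// placing digit d at cell (r, c) satisfies:
//   0-80    cell (r,c) is filled
//   81-161  digit d appears in row r
//   162-242 digit d appears in column c
//   243-323 digit d appears in box b
func coverColumns(r, c, d int) [4]int {
	b := (r/3)*3 + c/3
	return [4]int{
		r*9 + c,
		81 + r*9 + (d - 1),
		162 + c*9 + (d - 1),
		243 + b*9 + (d - 1),
	}
}

func main() {
	fmt.Println(coverColumns(0, 0, 1)) // [0 81 162 243]
	fmt.Println(coverColumns(8, 8, 9)) // [80 161 242 323]
}
```

Solving the puzzle is then exactly "pick 81 of the 729 rows so that every one of the 324 columns is covered once", which is the problem DLX is built for.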

Real-World Connection: CSPs in Production
#

The same backtracking-with-constraint-caching pattern appears in production systems:

  • Job scheduling: assign N jobs to M machines with capacity and deadline constraints. The constraint cache tracks which machine capacities are exhausted and which time slots are occupied.
  • Resource allocation in Kubernetes: scheduling pods to nodes with resource requests is a bin-packing CSP. Kubernetes’ scheduler maintains a pre-computed “fit” state per node (analogous to the constraint tables) to avoid rescanning all pods on each placement attempt.
  • Configuration validation: validating infrastructure-as-code against a set of policy rules (OPA/Rego policies) is a constraint satisfaction problem. Caching evaluated rule results avoids re-evaluating the same policy for each resource.

Common Pitfalls
#

  • Using [9][9][10]bool instead of [9][10]bool for the constraint tables. The tables index by (region, digit): for rows, the region is the row index (0-8), and there is no third dimension. The larger type declares 810 booleans instead of 270 and introduces an indexing dimension with no semantic meaning, inviting off-by-one errors.
  • Calling the optimization memoization. Memoization caches the results of pure functions; the constraint tables are mutable state that changes with every placement and removal. The wrong label suggests the wrong mental model and invites incorrect refactors, such as sharing the cache across concurrent solvers.
  • Not resetting the constraint cache on backtracking. Every placeNumber call must be paired with a removeNumber call when the solver backtracks. A missed removal leaves the tables reflecting placements that have been undone, silently corrupting every later validity check.
  • Reusing the same grid across benchmark iterations. If every iteration solves the same grid in place, the second iteration starts from an already-solved grid and returns immediately, making the measurement useless. Copy the grid by value before each iteration, as the benchmarks above do.

If you want to go deeper on any of this, I offer 1:1 coaching sessions for engineers working on AI integration, cloud architecture, and platform engineering. Book a session (50 EUR / 60 min) or reach out at manuel.fedele+website@gmail.com.
