Back to work / Case study

NovaSort.

A C++20 header-only adaptive hybrid sorting library tuned for real input distributions, structured data, and benchmark-driven performance work.

RoleAlgorithms / C++ / Benchmarking
Year2026
PlatformC++20 Library
Status Latest iteration in progress
97
Fastest integer scenarios
162
Benchmark scenarios
50K
Fuzzing rounds
10M
Stress-test scale

Sorting performance depends on data shape, not just average-case complexity.

NovaSort explores a practical question: can a small embeddable C++ library adapt across random, nearly sorted, reversed, duplicate-heavy, and structured object inputs?

The library exposes a simple `nova_sort(begin, end, cmp)` API while routing internally between strategies based on the input it sees.

A routing-based hybrid sorter chooses different paths for different distributions.

  1. S1Structure scan

    The entry path checks for sorted, reversed, repeated, and naturally run-structured inputs before committing to a general algorithm.

  2. S2Hybrid core

    Random and weakly structured data use introsort-style partitioning, while structured data can route into natural merge or TimSort-style merging.

  3. S3Safety fallbacks

    Depth limits, targeted shuffle, and heapsort fallback defend against bad partitions and adversarial input.

The project trades simplicity inside the implementation for predictable behavior at the call site.

Adaptive routingover One fixed algorithm

Different distributions reward different strategies, so the implementation spends effort detecting structure early.

Header-only APIover Runtime dependency

A single-header library keeps integration simple for tools, games, and desktop software.

Benchmark suiteover Anecdotal speed claims

The project compares std::sort, pdqsort, vergesort, timsort, and NovaSort across many sizes and data patterns.

Performance work is backed by fuzzing, stress tests, and non-trivial object benchmarks.

Small-array optimization

Sorting networks for tiny ranges reduce overhead inside recursive and local subproblems.

Branchless partitioning

Cache-aware block partitioning reduces branch misprediction and memory churn in the quicksort path.

Run-aware merging

Natural run detection and galloping merge behavior improve nearly sorted and multi-run inputs.

Object-aware testing

Benchmarks include heavy structs and long-string key trees, not only plain integer arrays.

The next phase is polishing portability and publishing the latest iteration.

  • - Sync the latest local implementation back to the public repository
  • - Document benchmark methodology and supported compiler behavior
  • - Keep adding adversarial and structured-input regression cases
Next case study
l10n-expansion-data
2026 / Localization text expansion dataset