Plan 9 from Bell Labs’s /usr/web/sources/contrib/mospak/libregexp-listsize-bump/README

Copyright © 2021 Plan 9 Foundation.
Distributed under the MIT License.


libregexp-listsize-bump: bump LISTSIZE 10 -> 12

A bounded workaround for an off-by-one in libregexp's
Rune-side thread-list overflow detection.  The off-by-one lets
the engine write one byte out of bounds, corrupting heap memory
adjacent to the regex engine's stack.  It affects every
libregexp consumer that runs a regex large enough to fill the
thread list, and was observed in practice through abaco's URL
validation regex (10 protocol alternatives).

LISTSIZE caps a fixed-size array of regex threads -- one slot
per parallel-match attempt the engine spawns when matching an
alternation like (a|b|c|...).  abaco's URL regex has 10
alternatives and fills a 10-slot list exactly; the off-by-one
in the overflow check then writes one byte past the array end
and corrupts whatever heap data sits next to it.  Bumping to
12 leaves two spare slots, so the stray write lands inside
the array on memory nothing else uses.  13 would push a
related fallback array (also sized in multiples of LISTSIZE)
over the 64 KB thread stack -- 12 is the largest single-line
bump that's safe.
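
The 12-versus-13 limit can be checked with back-of-envelope
arithmetic.  The sketch below is not taken from libregexp; it
assumes 32-bit pointers, 32 subexpression slots per thread
entry, a fallback list ten times LISTSIZE long, and two such
lists on the 64 KB stack at once.  All names in it are
hypothetical.  Verify each figure against regcomp.h and
regexec.c in your tree before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumptions, not stated in this README: 32-bit pointers,
 * 32 subexpression slots per entry, fallback list = 10*LISTSIZE
 * entries, two fallback lists live on the stack together.
 */
enum {
	PTRSIZE   = 4,
	NSUBEXP   = 32,
	BIGFACTOR = 10,
	NLISTS    = 2,
	STACKSIZE = 64*1024
};

/* One thread entry: an instruction pointer plus NSUBEXP (start, end) pointer pairs. */
static size_t
relistbytes(void)
{
	return PTRSIZE + NSUBEXP * 2 * PTRSIZE;
}

/* Stack bytes taken by the fallback lists for a given LISTSIZE. */
static size_t
fallbackbytes(int listsize)
{
	assert(listsize > 0);
	return (size_t)NLISTS * BIGFACTOR * listsize * relistbytes();
}

/* Nonzero if the fallback lists alone stay under the assumed stack size. */
static int
fits(int listsize)
{
	return fallbackbytes(listsize) < STACKSIZE;
}
```

Under these assumptions fallbackbytes(12) is 62400 and
fallbackbytes(13) is 67600, against a stack of 65536 bytes, so
12 fits with roughly 3 KB to spare and 13 does not.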

CHANGES
    v1 -- 2026-04-25 (initial)
        libregexp-listsize-bump   bumps LISTSIZE 10 -> 12 in
                                  libregexp/regcomp.h.


    v2 -- 2026-05-04 (ASCII normalization)
        patch + README            Em-dash and arrow characters
                                  replaced with '--' and '->' for
                                  9legacy stable convention.
                                  Comments and prose only; no code
                                  changes.

Scope
    One-line change to sys/src/libregexp/regcomp.h.  No
    kernel, API, or ABI change.  Each binary that links
    libregexp must be rebuilt to pick up the new constant.

Files
    libregexp-listsize-bump.diff   the fix

Apply
    cd /
    ape/patch -p0 < /path/to/libregexp-listsize-bump/libregexp-listsize-bump.diff

Rebuild
    cd /sys/src/libregexp && mk install
    cd /sys/src/cmd/abaco && mk install   # repeat for each consumer of libregexp

Prerequisites
    A 9legacy tree with sys/src/libregexp present.  No other
    patches required.

    Sanity-check before:  grep LISTSIZE /sys/src/libregexp/regcomp.h
                          # the LISTSIZE definition should read 10
    Sanity-check after:   the same grep should show 12

Verification
    Before: a regex consumer that fills 10 or more thread-list
    slots may suicide on first match.  The symptom depends on
    the environment, because heap layout determines which
    adjacent memory the OOB write corrupts.
    After: the same workload runs cleanly.

Rollback
    cd /
    ape/patch -R -p0 < /path/to/libregexp-listsize-bump/libregexp-listsize-bump.diff

    Then rebuild libregexp and its consumers as above.

Proper fix
    This patch does not fix the off-by-one at its root.  A
    bound-aware _renewthread, or changing every `== nle` /
    `== tle` overflow check to `>=` with a guaranteed spare
    slot, would close it.  Either is a multi-site change across
    regaux.c, regexec.c, and rregexec.c, and is out of scope for
    this single-line bump.
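
    The idea of a bound-aware append with a reserved spare
    slot can be sketched on a toy list.  This is an
    illustration only: Slot and boundedappend are hypothetical
    names, and the code is not libregexp's _renewthread or a
    patch against it.  It assumes a zero-terminated list and
    refuses any append that would leave no room for the
    terminator.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Slot Slot;
struct Slot {
	int inst;	/* 0 terminates the list */
};

/*
 * Hypothetical bound-aware append; not libregexp's _renewthread.
 * base..end-1 are the usable slots.  Returns the new terminator
 * slot, or NULL when the entry plus its terminator would not fit.
 */
static Slot*
boundedappend(Slot *base, Slot *end, int inst)
{
	Slot *p;

	assert(base < end && inst != 0);
	/* find the current terminator without running past end */
	for(p = base; p < end && p->inst != 0; p++)
		;
	/* need two free slots: one for the entry, one for the terminator */
	if(end - p < 2)
		return NULL;
	p->inst = inst;
	(++p)->inst = 0;
	return p;
}
```

    Because the check is an inequality made before any write,
    a list that is already full is rejected instead of being
    written past its end.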
