Broken out from T30146, which started with a narrower focus and was resolved by a narrower fix.
Per notes & patches on that bug, the preg_match_all() in UtfNormal::quickIsNFCVerify uses a lot of memory for mixed ASCII/non-ASCII strings such as one finds in languages using Latin scripts with accented or other non-ASCII letters.
This results in hitting memory limits on largeish input strings much sooner than we ought to.
Rewriting the function so that it works through the string in chunks as it splits should avoid that huge memory bump, but in my initial tests a version using preg_match with an offset was too slow, and one using preg_replace_callback was still slowish.
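
For illustration only, here is a rough sketch of the chunking idea. It is not the UtfNormal code and not a proposed patch: countNonAsciiChunked() and its $chunkSize parameter are made-up names, and it only counts multi-byte sequences rather than doing any NFC checking. The point is just that running preg_match_all() on one bounded piece at a time keeps the match array small, provided the cut points never land inside a UTF-8 sequence. It assumes roughly well-formed UTF-8 and a chunk size of at least 4 bytes.

```php
<?php
/**
 * Hypothetical helper (not part of UtfNormal): count multi-byte UTF-8
 * sequences in $string by matching one chunk at a time, so the array
 * built by preg_match_all() is bounded by the chunk size instead of
 * by the whole input.
 */
function countNonAsciiChunked( $string, $chunkSize = 65536 ) {
	$len = strlen( $string );
	$pos = 0;
	$count = 0;
	while ( $pos < $len ) {
		$end = min( $pos + $chunkSize, $len );
		// If the cut would land on a continuation byte (10xxxxxx), back up
		// to the lead byte so no sequence is split across two chunks.
		// The "$end > $pos + 1" guard guarantees progress on malformed input.
		while ( $end < $len && $end > $pos + 1
			&& ( ord( $string[$end] ) & 0xC0 ) === 0x80
		) {
			$end--;
		}
		$chunk = substr( $string, $pos, $end - $pos );
		// Lead byte followed by its continuation bytes = one character.
		$n = preg_match_all( '/[\xC0-\xFF][\x80-\xBF]*/', $chunk, $m );
		if ( $n !== false ) {
			$count += $n;
		}
		unset( $m ); // release this chunk's matches before the next pass
		$pos = $end;
	}
	return $count;
}

// Mixed ASCII / non-ASCII input, deliberately tiny chunks to exercise the
// boundary handling: "a" followed by U+00E9, repeated.
$sample = str_repeat( "a\xC3\xA9", 10 );
$chunked = countNonAsciiChunked( $sample, 4 );
```

A real rewrite would have more to worry about than this: a chunk boundary must also not separate a base character from following combining characters, and as noted above, the per-chunk overhead is where the speed problem shows up.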
includes/normal/UtfNormalMemStress.php can be used to stress-test this.
Version: 1.18.x
Severity: enhancement