In the existing implementation in the C++ library, the code does a series of tests to see how many items it needs to sort and calls the dedicated sorting function for that number of items. The revised code does something much weirder. It tests if there are two items and calls out to a separate function to sort them if needed. If it’s greater than two items, the code calls out to sort the first three items. If there are three items, it returns the results of that sort.
If there are four items to sort, however, it runs specialized code that is extremely efficient at inserting a fourth item into the appropriate place within a set of three sorted items. This sounds like a weird approach, but it consistently outperformed the existing code.
I work quite close to these topics and the computational complexity of compiler optimization is harder than sorting algorithms. And while I’ve seen ML develop heuristics for combinatorial logic, pushdown automata and Turning machines are still too complex. Though we’re approaching pushdown automata already. I think ML driven compiler optimizations are at least 10 years away. Likely more.