Stayed up late testing codex53 and opus46: the difference is barely noticeable

Stayed up late testing codex53 and opus46, and neither left a strong impression. Compared with previous major releases, you can tell this round is just a minor version bump. Domestic models can now close the gap faster, but that doesn't change the fact that players in the same cohort are spending 1m for a 0.1 improvement.

One more thing: the benchmark tide should be going out by now. Any CEO who keeps bringing up "benchmarks" is telling you that whatever product the company has is truly out of ideas.
I'll grab a few HN comments at random and then go to sleep.