I’ve been processing the new Krakauer, Krakauer & Mitchell paper on emergence in LLMs alongside Michael Levin’s recent work on “ingressing minds,” and there is much to chew on. Both are grappling with where order comes from in intelligent systems, and both reject the lazy conflation of knowledge accumulation with intelligence. Krakauer’s memorable formulation - “intelligence is doing more with less” - isn’t just about efficiency. It’s about systems that discover compression schemes so powerful they reveal something fundamental about the structure of reality. When he describes true emergence as requiring “coarse-graining that produces novel, more parsimonious descriptions,” he may be pointing toward the same insight Levin articulates: that certain patterns have a reality independent of their physical instantiation.

The Krakauer paper lays out clear criteria for genuine emergence: internal reorganization, breaking of scaling laws, novel bases that allow radically simplified descriptions. The authors are rightly skeptical of LLMs, viewing them as massive knowledge-in systems that accumulate capabilities through brute force rather than discovering elegant principles. As Krakauer puts it elsewhere, they’re “really shit programming” that needs enormous amounts of natural language to achieve goals a truly intelligent system might reach through insight.

But here’s what’s fascinating: neither paper discusses the linear representation hypothesis, and I wonder whether it might actually be our best current example of emergence in machine learning.
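For readers who haven’t met it: the linear representation hypothesis is the observation that trained networks often come to encode high-level concepts as directions in activation space, so a concept can be read off with a single linear probe. Here’s a minimal toy sketch of that idea - synthetic activations standing in for a real model, with the hidden size, labels, and concept shift all made up for illustration:

```python
# Toy illustration of the linear representation hypothesis (not from either paper):
# the claim that a concept ends up encoded as a direction in activation space,
# so a single linear probe can read it off. Activations here are synthetic.
import numpy as np

rng = np.random.default_rng(0)
d = 64                                   # hypothetical hidden dimension
concept_dir = rng.normal(size=d)
concept_dir /= np.linalg.norm(concept_dir)

# Simulate hidden states for 200 "positive" and 200 "negative" inputs:
# shared random structure plus a shift along one concept direction.
base = rng.normal(size=(400, d))
labels = np.array([1] * 200 + [0] * 200)
acts = base + np.outer(3.0 * (2 * labels - 1), concept_dir)

# Difference-of-means probe: estimate the concept direction from data.
probe = acts[labels == 1].mean(axis=0) - acts[labels == 0].mean(axis=0)
probe /= np.linalg.norm(probe)

# If the representation really is linear, projecting onto the probe
# separates the two groups almost perfectly.
scores = acts @ probe
accuracy = ((scores > scores.mean()) == labels).mean()
print(f"cosine(probe, true direction) = {probe @ concept_dir:.3f}")
print(f"linear read-out accuracy      = {accuracy:.3f}")
```

The difference-of-means probe is about the simplest estimator one could use; the point is only that, when the hypothesis holds, a one-dimensional linear read-out is enough to recover the concept.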