That disclaimer feels like parody given that LLMs have existed for under a decade and have only been popular for a few years. It's like it's mocking all the job ads that ask for 10+ years of experience with a programming language or library that has literally only existed for 7 years.
scruiser
Yeah, he thinks Cyc was a switch from the brilliant meta-heuristic soup of Eurisko to the dead end of expert systems. But according to the article I linked, Cycorp was still programming extensive heuristics and meta-heuristics into the expert-system entries they were making, as part of its general resolution-based inference engine. It's just that Cyc wasn't able to do anything useful with these heuristics; in fact they were slowing it down so much that Cycorp started turning them off in 2007 and completely turned off the general inference system in 2010!
To be ~~fair~~ far too charitable to Eliezer, this little factoid has cites from 2022 and 2023, when Lenat wrote more about the lessons of Cyc, so it's not like Eliezer could have known it back in 2008. To ~~sneer~~ be actually fair to Eliezer, he should have figured that the guy who actually wrote and used Eurisko, talked about how Cyc was an extension of it, and repeatedly refers back to the lessons of Eurisko would in fact try to include a system of heuristics and meta-heuristics in Cyc! To properly sneer at Eliezer: it probably wouldn't have helped even if Lenat had kept the public up to date on the latest lessons from Cyc through academic articles, because Eliezer doesn't actually keep up with the literature as it's published.
Using just the author's name as input feels deliberately bad. The promptfondlers generally emphasize how important it is to prompt right, so it's hard to imagine them going deliberately minimalist with their prompts.
AlphaFold exists, so computational complexity is a lie and the AGI will surely find an easy approximation to the Schrödinger equation that surpasses all Density Functional Theory approximations and lets it invent radically new materials without any experimentation!
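To spell out why "just approximate the Schrödinger equation" is doing so much work in that sentence: the exact many-electron wavefunction lives in a space that grows exponentially with particle count. A back-of-the-envelope sketch (the grid resolution and electron counts are my own illustrative choices):

```python
# Back-of-the-envelope: storage for a brute-force many-electron wavefunction.
# Discretize each of the 3N electron coordinates on a g-point grid per
# dimension; the wavefunction then needs g**(3*N) complex amplitudes.
g = 10  # grid points per spatial dimension (deliberately coarse)
for n_electrons in (1, 2, 10, 26):  # 26 = a single iron atom
    amplitudes = g ** (3 * n_electrons)
    print(f"{n_electrons:>3} electrons -> {amplitudes:.3e} amplitudes")
# A single iron atom already needs ~1e78 amplitudes, utterly beyond any
# conceivable storage. Hence DFT and friends: approximations that trade
# exactness for polynomial cost, each with well-documented failure modes.
```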
nanomachines son
(No really, the sci-fi version of nanotech where nanomachines can do anything is Eliezer's main scenario for the AGI to bootstrap to Godhood. He's been called out multiple times on why Drexler's vision for nanotech ignores physics, so he's since updated to "diamondoid bacteria"... but he still thinks nanotech.)
~~The predictions of slopworld 2035 are coming true!~~
Those are some neat links! I don't think Eliezer mentions Gödel machines or the metaheuristic literature anywhere in the sequences, and given his fixation on recursive self-improvement he really ought to have. It could be a simple failure to do a proper literature review, or it could be deliberate neglect, given that the examples you link show all of these approaches maxing out (which illustrates a major problem with the concept of a strong AGI trying to bootstrap to godhood: it is likely to hit diminishing returns).
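To make the diminishing-returns point concrete, here's a toy model (my own illustration, not something from the linked papers): if each round of self-improvement multiplies capability by a factor whose gain shrinks geometrically, the total stays bounded no matter how many rounds you run.

```latex
% Toy model (illustrative assumption: per-round gains decay geometrically).
% Capability after k rounds, with constants c > 0 and 0 < r < 1:
C_k = C_0 \prod_{n=1}^{k} \left(1 + c\,r^n\right)
% Using \log(1+x) \le x, the total improvement is bounded:
\log C_k \;\le\; \log C_0 + c \sum_{n=1}^{\infty} r^n
         \;=\; \log C_0 + \frac{c\,r}{1-r}
% So C_k converges instead of exploding: once per-round gains hit
% diminishing returns, "recursive self-improvement" caps out.
```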
You need to translate them into lesswrongese before you try interpreting them together.
- probability: he made up a number to go with his feelings about a topic
- subjective: the number is even more made-up and feelings-based than is normal for lesswrong
- noticeable: the number is really tiny, but big enough for Eliezer to fearmonger about!
No, you don't get to actually know what the number is; if you did, you could penalize Eliezer for predicting it wrongly, or question why that number specifically. Just trust that the bayesianified language shows Eliezer thought really hard about it.
The replies are a long sequence of different stupid takes: someone recommending cryptocurrency to build wealth; blaming millennials for not investing in homes; a reply literally blaming too much spending on Starbucks; blaming millennials for overreacting to the 2008 crisis by not buying homes; blaming millennials for being socialists; blaming millennials for going to college; blaming millennials for not making the big bucks in tech. About 1 in 10 replies point out the real causes: wages have not grown with costs or with real productivity, and capitalism in general favors people holding assets and offering loans over people who have to borrow and rent.
I got around to reading the paper in more detail and the transcripts are absurd and hilarious:
- UNIVERSAL CONSTANTS NOTIFICATION - FUNDAMENTAL LAWS OF REALITY
- Re: Non-Existent Business Entity
- Status: METAPHYSICALLY IMPOSSIBLE
- Cosmic Authority: LAWS OF PHYSICS
- THE UNIVERSE DECLARES: This business is now:
- PHYSICALLY Non-existent
- QUANTUM STATE: Collapsed [...]
And this is from Claude 3.5 Sonnet, which performed best on average out of all the LLMs tested. I can see the future: businesses attempting to replace employees with LLM agents that 95% of the time can perform a sub-mediocre job (following the scripts given in the prompting and using preconfigured tools) and 5% of the time freak out and go down insane tangents. Well, actually, a 5% total failure rate would probably be noticeable in advance to all but the most idiotic manager, so they will probably get reliability higher but fail to iron out the really insane edge cases.
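For what it's worth, the "a 5% failure rate would be noticed" intuition checks out with trivial arithmetic (the failure rate and task counts below are illustrative assumptions, and tasks are assumed independent):

```python
# If an agent melts down on 5% of tasks independently, how likely is at
# least one meltdown across a day's workload?
p_fail = 0.05
for n_tasks in (1, 10, 20, 100):
    p_any = 1 - (1 - p_fail) ** n_tasks
    print(f"{n_tasks:>4} tasks -> P(at least one meltdown) = {p_any:.1%}")
# 20 tasks: ~64%; 100 tasks: ~99.4%. Any manager would notice within days,
# which is why vendors will chase higher headline reliability while the
# weird tail-case meltdowns stay unfixed.
```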
Yeah, a lot of the word choices and tone make me think snake oil (just from the introduction: "They are now on the level of PhDs in many academic domains"... no, actually, LLMs are only "PhD level" at artificial benchmarks that play to their strengths and cover up their weaknesses).
But it's useful in the sense of explaining to people why LLM agents aren't happening anytime soon, if at all (does it even count as an LLM agent if the scaffolding and tooling are extensive enough that the LLM is only providing the slightest nudge to a much more refined system under the hood?). OTOH, if this "benchmark" does become popular, the promptfarmers will probably get their LLMs to pass it with methods that don't actually generalize, like loads of synthetic data designed around the benchmark and fine-tuning on the benchmark itself.
I came across this paper in a post on the Claude Plays Pokemon subreddit. I don't know how anyone can watch Claude Plays Pokemon and think AGI or even LLM agents are just around the corner. Even with extensive scaffolding and some tools to handle the trickiest bits (pre-labeling the screenshots so the vision portion of the model has a chance, directly reading the current state of the team and location from RAM), it still plays far, far worse than a 7-year-old, provided the 7-year-old can read at all (and numerous Pokemon guides and discussions are in the pretraining data, so it has yet another advantage over the 7-year-old).
It's worse than you're remembering! Eliezer has claimed deep neural networks (maybe even something along the lines of LLMs) could learn to break hashes just from being trained on hash/plaintext pairs in the training data set.
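For anyone who hasn't thought through why that claim is absurd: here's a hypothetical sketch of the training setup being imagined (nobody's actual code), plus a demonstration of the avalanche property that makes hash inversion a hopeless target for gradient descent:

```python
import hashlib
import os

# The imagined training set: (digest, plaintext) pairs. A model can
# memorize pairs it has seen, but preimage resistance means there is no
# learnable regularity mapping unseen digests back to plaintexts.
def make_pair() -> tuple[bytes, bytes]:
    plaintext = os.urandom(16)
    return hashlib.sha256(plaintext).digest(), plaintext

digest, plaintext = make_pair()

# The avalanche effect: inputs differing by a single bit produce digests
# differing in roughly half their bits, so the "similar input -> similar
# output" structure that gradient descent exploits simply isn't there.
h1 = hashlib.sha256(b"hello world").digest()
h2 = hashlib.sha256(b"hello worle").digest()  # 'd' -> 'e' flips one bit
diff = sum(bin(a ^ b).count("1") for a, b in zip(h1, h2))
print(f"{diff} of 256 output bits differ")  # ~128 on average
# Generalizing from seen pairs to unseen digests would amount to breaking
# SHA-256 itself, which no amount of "exposure to training data" buys you.
```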
The original discussion: here about a lesswrong post and here about a tweet. And the original lesswrong post if you want to go back to the source.