Each December, the LessWrong community reflects on the best blogposts of yesteryear, to decide which posts have stood the test of time.
In this post, I aim to:
The most important changes are:
For the Nomination Voting Phase, the main ask I'm making is: "spend ~30 minutes casting nomination votes, and write 2...
Wait, I'm a moron and the thing I checked was actually whether it was an exponential function, sorry.
The 3 most important paragraphs, extracted to save readers the trouble of clicking on a link:
...The Anduril and OpenAI strategic partnership will focus on improving the nation’s counter-unmanned aircraft systems (CUAS) and their ability to detect, assess and respond to potentially lethal aerial threats in real-time.
[...]
The accelerating race between the United States and China to lead the world in advancing AI makes this a pivotal moment. If the United States cedes ground, we risk losing the technological edge that has underpinned our national security for decades.
ok, options.
- Review of 108 AI alignment plans
- write-up of Beyond Distribution - planned benchmark for alignment evals beyond a model's distribution; send to the quant who just joined the team and wants to make it
- get familiar with the TPUs I just got access to
- run hhh and its variants, testing the idea behind Beyond Distribution, maybe make a guide on it
- continue improving site design
- fill out the form I said I was going to fill out and send today
- make progress on crosscoders - would probably need to get familiar with those TPUs
- writeup o...
ETA: I'm not saying that MIRI thought AIs wouldn't understand human values. If there's only one thing you take away from this post, please don't take away that. Here is Linch's attempted summary of this post, which I largely agree with.
Recently, many people have talked about whether some of the main MIRI people (Eliezer Yudkowsky, Nate Soares, and Rob Bensinger[1]) should update on whether value alignment is easier than they thought given that GPT-4 seems to follow human directions and act within moral constraints pretty well (here are two specific examples of people talking about this: 1, 2). Because these conversations are often hard to follow without much context, I'll just provide a brief caricature of how I think this argument has gone in the places I've...
While I did agree that Linch's comment summarized my post reasonably accurately, I don't think a large part of my post was about the idea that we should now think human values are much simpler than Yudkowsky portrayed them to be. Instead, I believe this section from Linch's comment does a better job of conveying what I intended to be the main point:
...
- Suppose in 2000 you were told that a 100-line Python program (that doesn't abuse any of the particular complexities embedded elsewhere in Python) can provide a perfect specification of human values. Then you...
Midjourney, “the dream machine”
I recently started working at Renaissance Philanthropy. It’s a new organization, and most people I’ve met haven’t heard of it.[1] So I thought I’d explain, in my own words and speaking for myself rather than my employers, what we (and I) are trying to do here.
The “Renaissance” in Renaissance Philanthropy is a reference to the Italian Renaissance, when wealthy patrons like the Medicis commissioned great artworks and inventions.
The idea is that “modern Medicis” — philanthropists — should be funding the great scientists and innovators of our day to tackle ambitious challenges.
RenPhil’s role is to facilitate that process: when a philanthropist wants to pursue a goal, we help them turn that into a more concrete plan, incubate and/or administer new organizations to implement that plan,...
This is the third in a sequence of posts scrutinizing computational functionalism (CF). In a previous post, I defined a concrete claim that computational functionalists tend to make:
Theoretical CF: A simulation of a human brain on a computer, with physics perfectly simulated down to the atomic level, would cause the conscious experience of that brain.
I contrasted this with “practical CF”, the claim that a suitably low-fidelity simulation of a brain, like one that only captures functional properties, would be conscious. In the last post, I discussed practical CF. In this post, I’ll scrutinize theoretical CF.
To evaluate theoretical CF, I’m going to meet functionalists where they (usually) stand and adopt a materialist position about consciousness. That is to say that I’ll assume all details of a human’s conscious...
You can also disambiguate between
a) computation that actually interacts in a comprehensible way with the real world and
b) computation that has the same internal structure at least momentarily but doesn't interact meaningfully with the real world.
I expect that (a) can usually be uniquely pinned down to a specific computation (probably in both senses (1) and (2)), while (b) can't.
But I also think it's possible that the interactions, while important for establishing the disambiguated computation that we interact with, are not actually crucial to i...
[ To be clear, I know orbits can oscillate. However, most 3D shell orbits do not look like oscillating but locally stable 2D orbits. ]
I'd guess that the amount spent on image and voice is negligible for this BOTEC?
I do think that the amount spent on inference for customers should be a big deal, though. My understanding is that OpenAI has a much bigger user base than Anthropic. Shouldn't that mean that, all else equal, Anthropic has more compute to spare for training & experiments? Such that, if Anthropic has about as much compute in total, they in effect have a big compute advantage?
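A minimal sketch of the reasoning behind that question, with made-up placeholder numbers (the user-base sizes and per-user inference cost below are illustrative assumptions, not estimates of either lab's actual figures):

```python
# Toy BOTEC: with roughly equal total compute, the lab serving many more users
# spends more of it on inference, leaving less for training and experiments.

def training_share(total_compute: float, users: float, inference_per_user: float) -> float:
    """Fraction of total compute left for training/experiments after serving inference."""
    inference = users * inference_per_user
    return max(total_compute - inference, 0.0) / total_compute

TOTAL = 1.0                  # normalize each lab's total compute to 1
INFERENCE_PER_USER = 1e-9    # hypothetical inference cost per user, as a fraction of total

# Hypothetical user bases: lab A serves ~10x more users than lab B.
users_a = 400e6
users_b = 40e6

print(f"Lab A compute left for training: {training_share(TOTAL, users_a, INFERENCE_PER_USER):.0%}")
print(f"Lab B compute left for training: {training_share(TOTAL, users_b, INFERENCE_PER_USER):.0%}")
# With these placeholder numbers, A keeps ~60% and B keeps ~96% of its compute;
# that gap is the "compute advantage" the comment is asking about.
```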