Each December, the LessWrong community reflects on the best blogposts of yesteryear, to decide which posts have stood the test of time.
In this post, I aim to:
The most important changes are:
For the Nomination Voting Phase, the main ask I'm making is: "spend ~30 minutes casting nomination votes, and write 2...
Wait, I'm a moron and the thing I checked was actually whether it was an exponential function, sorry.
The 3 most important paragraphs, extracted to save readers the trouble of clicking on a link:
...The Anduril and OpenAI strategic partnership will focus on improving the nation’s counter-unmanned aircraft systems (CUAS) and their ability to detect, assess and respond to potentially lethal aerial threats in real-time.
[...]
The accelerating race between the United States and China to lead the world in advancing AI makes this a pivotal moment. If the United States cedes ground, we risk losing the technological edge that has underpinned our national security for decades.
ok, options.
- Review of 108 AI alignment plans
- write-up of Beyond Distribution - planned benchmark for alignment evals beyond a model's distribution; send to the quant who just joined the team and wants to make it
- get familiar with the TPUs I just got access to
- run hhh and its variants, testing the idea behind Beyond Distribution, maybe make a guide on it
- continue improving site design
- fill out the form I said I was going to fill out and send today
- make progress on crosscoders - would probably need to get familiar with those TPUs
- writeup o...
ETA: I'm not saying that MIRI thought AIs wouldn't understand human values. If there's only one thing you take away from this post, please don't take away that. Here is Linch's attempted summary of this post, which I largely agree with.
Recently, many people have talked about whether some of the main MIRI people (Eliezer Yudkowsky, Nate Soares, and Rob Bensinger[1]) should update on whether value alignment is easier than they thought given that GPT-4 seems to follow human directions and act within moral constraints pretty well (here are two specific examples of people talking about this: 1, 2). Because these conversations are often hard to follow without much context, I'll just provide a brief caricature of how I think this argument has gone in the places I've...
While I did agree that Linch's comment summarized my post reasonably accurately, I don't think a large part of my post was about the idea that we should now think human values are much simpler than Yudkowsky portrayed them to be. Instead, I believe this section from Linch's comment does a better job of conveying what I intended to be the main point:
...
- Suppose in 2000 you were told that a 100-line Python program (that doesn't abuse any of the particular complexities embedded elsewhere in Python) can provide a perfect specification of human values. Then you...
Midjourney, “the dream machine”
I recently started working at Renaissance Philanthropy. It’s a new organization, and most people I’ve met haven’t heard of it.[1] So I thought I’d explain, in my own words and speaking for myself rather than my employers, what we (and I) are trying to do here.
The “Renaissance” in Renaissance Philanthropy is a reference to the Italian Renaissance, when wealthy patrons like the Medicis commissioned great artworks and inventions.
The idea is that “modern Medicis” — philanthropists — should be funding the great scientists and innovators of our day to tackle ambitious challenges.
RenPhil’s role is to facilitate that process: when a philanthropist wants to pursue a goal, we help them turn that into a more concrete plan, incubate and/or administer new organizations to implement that plan,...
This is the third in a sequence of posts scrutinizing computational functionalism (CF). In a previous post, I defined a concrete claim that computational functionalists tend to make:
Theoretical CF: A simulation of a human brain on a computer, with physics perfectly simulated down to the atomic level, would cause the conscious experience of that brain.
I contrasted this with “practical CF”, the claim that a suitably low-fidelity simulation of a brain, like one that only captures functional properties, would be conscious. In the last post, I discussed practical CF. In this post, I’ll scrutinize theoretical CF.
To evaluate theoretical CF, I’m going to meet functionalists where they (usually) stand and adopt a materialist position about consciousness. That is to say that I’ll assume all details of a human’s conscious...
You can also disambiguate between
a) computation that actually interacts in a comprehensible way with the real world and
b) computation that has the same internal structure at least momentarily but doesn't interact meaningfully with the real world.
I expect that (a) can usually be uniquely pinned down to a specific computation (probably in both senses (1) and (2)), while (b) can't.
But I also think it's possible that the interactions, while important for establishing the disambiguated computation that we interact with, are not actually crucial to i...
[ To be clear, I know orbits can oscillate. However, most 3D shell orbits do not look like oscillating but locally stable 2D orbits. ]
I'd guess that the amount spent on image and voice is negligible for this BOTEC?
I do think that the amount spent on inference for customers should be a big deal, though. My understanding is that OpenAI has a much bigger user base than Anthropic. Shouldn't that mean that, all else equal, Anthropic has more compute to spare for training & experiments? Such that, if Anthropic has about as much compute in total, they in effect have a big compute advantage?
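A minimal sketch of the reasoning behind that question, with made-up placeholder numbers (the user-base sizes and per-user inference cost below are illustrative assumptions, not estimates of either lab's actual figures):

```python
# Toy BOTEC: with roughly equal total compute, the lab serving many more users
# spends more of it on inference, leaving less for training and experiments.

def training_share(total_compute: float, users: float, inference_per_user: float) -> float:
    """Fraction of total compute left for training/experiments after serving inference."""
    inference = users * inference_per_user
    return max(total_compute - inference, 0.0) / total_compute

TOTAL = 1.0                  # normalize each lab's total compute to 1
INFERENCE_PER_USER = 1e-9    # hypothetical inference cost per user, as a fraction of total

# Hypothetical user bases: lab A serves ~10x more users than lab B.
users_a = 400e6
users_b = 40e6

print(f"Lab A compute left for training: {training_share(TOTAL, users_a, INFERENCE_PER_USER):.0%}")
print(f"Lab B compute left for training: {training_share(TOTAL, users_b, INFERENCE_PER_USER):.0%}")
# With these placeholder numbers, A keeps ~60% and B keeps ~96% of its compute;
# that gap is the "compute advantage" the comment is asking about.
```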