The Ammann Hypothesis: Free Will as a Failure of Self-Prediction

A fox chases a hare. The hare evades the fox. The fox tries to predict where the hare is going; the hare tries to make itself as hard to predict as possible.

Q: Who needs the larger brain?

A: The fox.

This little animal tale is meant to illustrate the following phenomenon: generative complexity can be much smaller than predictive complexity under partial observability. In other words, when partially observing a black box, simple internal mechanisms can create complex patterns that require very large predictors to predict well.

Consider a simple 2-state HMM in which the symbol 0 is output in three different ways: A -> A, A -> B, and B -> B. This means that when we see the symbol 0 we don't know which state we are in. We can use Bayesian updating to guess where we are, but starting from the stationary distribution our belief states can become extremely complicated - in fact, the data sequence generated by this simple nonunifilar source has an optimal predictor HMM that requires infinitely many states.

This simple example illustrates the gap between generative complexity and predictive complexity: a generative-predictive gap. I note that in this case the generative-predictive gap is intrinsic. The gap appears even (especially!) in the ideal limit of perfect prediction!

Free Will as generative-predictive gap

The brain is a predictive engine. So much is accepted. Now imagine an organism/agent endowed with a brain predicting the external world. To do well, it may be helpful to predict its own actions. What if this process has a generative-predictive gap? The brain will ascribe an inherent uncertainty ['entropy'] to its own actions! An agent having a generative-predictive gap for predicting its own actions would experience a mysterious force 'choosing' its actions. It may even decide to call this irreducible uncertainty of self-prediction "Free Will".
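For the curious, here is a small simulation sketch of the belief-state dynamics described above. The exact transition probabilities (0.5 everywhere) are an assumption, since the post's figure is not reproduced here; the point is only to show the generator having two states while the Bayesian filter keeps visiting new belief states.

```python
import numpy as np

# Minimal sketch of the 2-state "simple nonunifilar source" described above.
# States: A=0, B=1. Symbol 0 is emitted on A->A, A->B and B->B;
# symbol 1 is emitted only on B->A. Transition probabilities of 0.5 are assumed.

rng = np.random.default_rng(0)

# T[s, t] = P(next state t | state s); E[s, t, x] = P(symbol x | transition s->t)
T = np.array([[0.5, 0.5],
              [0.5, 0.5]])
E = np.zeros((2, 2, 2))
E[0, 0, 0] = 1.0   # A -> A emits 0
E[0, 1, 0] = 1.0   # A -> B emits 0
E[1, 1, 0] = 1.0   # B -> B emits 0
E[1, 0, 1] = 1.0   # B -> A emits 1

def generate(n):
    """Run the generator for n steps and return the symbol sequence."""
    s, out = 0, []
    for _ in range(n):
        t = rng.choice(2, p=T[s])
        out.append(rng.choice(2, p=E[s, t]))
        s = t
    return out

def filter_beliefs(xs):
    """Bayesian filtering: sequence of beliefs P(hidden state | symbols so far)."""
    b = np.array([0.5, 0.5])                 # stationary prior
    beliefs = [tuple(np.round(b, 6))]
    for x in xs:
        b = np.einsum('s,st->t', b, T * E[:, :, x])   # predict + condition on symbol
        b /= b.sum()
        beliefs.append(tuple(np.round(b, 6)))
    return beliefs

xs = generate(2000)
print(len(set(filter_beliefs(xs))), "distinct belief states after 2000 symbols")
# The count keeps growing with sequence length (each new record-length run of 0s
# produces a new belief), even though the generator itself has only two states.
```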
abramdemski
Here's what seem like priorities to me after listening to the recent Dwarkesh podcast featuring Daniel Kokotajlo:

1. Developing the safer AI tech (in contrast to modern generative AI) so that frontier labs have an alternative technology to switch to, so that it is lower cost for them to start taking warning signs of misalignment of their current tech tree seriously. There are several possible routes here, ranging from small tweaks to modern generative AI, to scaling up infrabayesianism (existing theory, totally groundbreaking implementation), to starting totally from scratch (inventing a new theory). Of course we should be working on all routes, but prioritization depends in part on timelines.
   * I see the game here as basically: look at the various existing demos of unsafety and make a counter-demo which is safer on multiple of these metrics without having gamed the metrics.
2. De-agentify the current paradigm or the new paradigm:
   * Don't directly train on reinforcement across long chains of activity. Find other ways to get similar benefits.
   * Move away from a model where the AI is personified as a distinct entity (eg, chatbot model). It's like the old story about building robot arms to help feed disabled people -- if you mount the arm across the table, spoonfeeding the person, it's dehumanizing; if you make it a prosthetic, it's humanizing.
   * I don't want AI to write my essays for me. I want AI to help me get my thoughts out of my head. I want super-autocomplete. I think far faster than I can write or type or speak. I want AI to read my thoughts & put them on the screen.
   * There are many subtle user interface design questions associated with this, some of which are also safety issues, eg, exactly what objective do you train on?
   * Similarly with image generation, etc.
   * I don't necessarily mean brain-scanning tech here, but of course that would be the best way to achieve it.
   * Basically, use AI to overcome human information-processing bottlenecks.
The recent push for coal power in the US actually makes a lot of sense. A major trend in US power over the past few decades has been the replacement of coal power plants by cheaper gas-powered ones, fueled largely by low-cost natural gas from fracking. Much (most?) of the power for recently constructed US data centers has come from the continued operation of coal power plants that would otherwise have been decommissioned. The sheer cost (in both money and time) of building new coal plants in comparison to gas power plants still means that new coal power plants are very unlikely to be constructed. However, not shutting down a coal power plant is instant compared to the 12-36 months needed to build a gas power plant.
Snapshot of a local (= Czech) discussion detailing motivations and decision paths of GAI actors, mainly the big developers:

Contributor A, initial points:

For those not closely following AI progress, two key observations:

1. Public Models vs. True Capability: Publicly accessible AI models will become increasingly poor indicators of the actual state-of-the-art in AI. Competitive AI labs will likely prioritize using their most advanced models internally to accelerate their own research and gain a dominant position, rather than releasing these top models for potentially temporary revenue gains.
2. Recursive Self-Improvement Timeline: The onset of recursive self-improvement (leading to an "intelligence explosion," where AI significantly accelerates its own research and development) is projected by some authors to potentially begin around the end of 2025.

Analogy to Exponential Growth: The COVID-19 pandemic demonstrated how poorly humans perceive and react to exponential phenomena (e.g., ignoring low initial numbers despite a high reproduction rate). AI development is also progressing exponentially. This means it might appear that little is happening from a human perspective, until a period of rapid change occurs over just a few months, potentially causing socio-technical shifts equivalent to a century of normal development. This scenario underpins the discussion.

Contributor C:
* Raises a question regarding point 1: Since AI algorithm and hardware development are relatively narrow domains, couldn't their progress occur somewhat in parallel with the commercial release of more generally focused models?

Contributor A:
* Predicts this is unlikely. Assumes computational power ("compute") will remain the primary bottleneck.
* Believes that with sufficient investment, the incentive will be to dedicate most inference compute to AI-driven AI research (or synthetic data, etc.) once recursive self-improvement starts. Notes this might already be happening, with the deplo
Forecasting and scenario building have become quite popular and prestigious in EA-adjacent circles. I see extremely detailed scenario building and elaborate narratives.

Yes, AI will be big, and AGI is plausibly close. But how much detail can one really expect to predict? There were a few large predictions that some people got right, but once one zooms in, the details don't fit, while the correct predictions were much more widespread within the group of people who were paying attention.

I can't escape the feeling that we're quite close to the limits of the knowable and 80% of EA discourse on this is just larp. Does anybody else feel this way?

Popular Comments

Recent Discussion

Summary:

When stateless LLMs are given memories they will accumulate new beliefs and behaviors, and that may allow their effective alignment to evolve. (Here "memory" is learning during deployment that is persistent beyond a single session.)[1]

LLM agents will have memory: Humans who can't learn new things ("dense anterograde amnesia") are not highly employable for knowledge work. LLM agents that can learn during deployment seem poised to have a large economic advantage. Limited memory systems for agents already exist, so we should expect nontrivial memory abilities improving alongside other capabilities of LLM agents.

Memory changes alignment: It is highly useful to have an agent that can solve novel problems and remember the solutions. Such memory includes useful skills and beliefs like "TPS reports should be filed in the folder ./Reports/TPS"....
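As a deliberately minimal illustration of what "memory" means in this summary, here is a sketch assuming a hypothetical JSON file as the persistence layer; real agent memory systems would of course use richer stores (vector databases, skill libraries, fine-tuning).

```python
import json
from pathlib import Path

# Sketch: beliefs/skills learned during deployment that persist beyond a session.
# The file name and fact format are made up for illustration.
MEMORY_FILE = Path("agent_memory.json")

def load_memory() -> list[str]:
    return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []

def remember(fact: str) -> None:
    memory = load_memory()
    if fact not in memory:
        memory.append(fact)
        MEMORY_FILE.write_text(json.dumps(memory, indent=2))

# Session 1: the agent learns something while solving a task.
remember("TPS reports should be filed in the folder ./Reports/TPS")

# Session 2 (a later deployment): the learned belief is still there, can be fed back
# into the prompt, and so the agent's effective behavior (and alignment) has shifted.
print(load_memory())
```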

khafra

Good timing--the day after you posted this, a round of new Tom & Jerry cartoons swept through Twitter, fueled by transformer models whose layers include MLPs that can learn at test time. GitHub repo here: https://github.com/test-time-training (The videos are more eye-catching, but they've also done text models.)
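For intuition, here is a toy numpy sketch of the general "learning at test time" idea; this is my own illustration, not the linked repo's architecture: a linear layer whose weights are updated by a self-supervised reconstruction loss on each incoming input, so the layer's "state" lives in its weights rather than in a fixed-size activation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = np.zeros((d, d))   # inner model, updated during inference
lr = 0.1

def ttt_step(W, x):
    """One test-time update: nudge W to reconstruct x from a corrupted view of x."""
    x_corrupt = x + 0.1 * rng.normal(size=d)      # simple self-supervised pair
    pred = W @ x_corrupt
    grad = np.outer(pred - x, x_corrupt)           # grad of 0.5*||W x_corrupt - x||^2
    W = W - lr * grad
    return W, W @ x                                # updated weights, layer output

stream = [rng.normal(size=d) for _ in range(100)]
for x in stream:
    W, y = ttt_step(W, x)
print("inner-model weight norm after the stream:", np.linalg.norm(W))
```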

I spent a couple of weeks writing this new introduction to AI timelines. Posting here in case useful to share and for feedback. The aim is to be up-to-date, more balanced than Situational Awareness, but still relatively accessible.

In recent months, the CEOs of leading AI companies have grown increasingly confident about rapid progress:

  • OpenAI’s Sam Altman: Shifted from saying in November “the rate of progress continues” to declaring in January “we are now confident we know how to build AGI”

  • Anthropic’s Dario Amodei: Stated in January “I’m more confident than I’ve ever been that we’re close to powerful capabilities… in the next 2-3 years”

  • Google DeepMind’s Demis Hassabis: Changed from “as soon as 10 years” in autumn to “probably three to five years away” by January.

What explains the shift? Is...

Vladimir_Nesov
Traditionally, steps of the GPT series are roughly 100x in raw compute (I'm not counting effective compute, since it's not relevant to cost of training). GPT-4 is 2e25 FLOPs, which puts "GPT-6" at 2e29 FLOPs.

To train such a model in 2028, you would build an Nvidia Rubin Ultra NVL576 (Kyber) training system in 2027. Each rack holds 576 compute dies at about 3e15 BF16 FLOP/s per die[1], or 1.6e18 FLOP/s per rack. A Blackwell NVL72 datacenter costs about $4M per rack to build, possibly a non-Ultra Rubin NVL144 datacenter will cost about $5M per rack, and a Rubin Ultra NVL576 datacenter might cost about $12M per rack[2]. To get 2e29 BF16 FLOPs in 4 months at 40% utilization, you'd need 30K racks, which would cost about $360B all-in (together with the rest of the training system). That is significantly more than "tens of billions of dollars".

"GPT-8" is two steps of 100x in raw compute up from "GPT-6", at 2e33 FLOPs. You'd need 10,000x more compute than what $360B buys in 2027. Divide that by how much cheaper compute gets within a few years, let's say 8x cheaper. What we get is $450T, which is much more than merely "trillions", and also technologically impossible to produce at that time without transformative AI.

----------------------------------------

1. Chips in Blackwell GB200 systems are manufactured with a 4nm process and produce about 2.5e15 dense BF16 FLOP/s per chip, with each chip holding 2 almost reticle-sized compute dies. Rubin moves to 3nm, compared to Blackwell at 4nm, which makes each die about 2x more performant (from more transistors and higher clock speed, while the die size must remain the same), which predicts about 2.5e15 dense BF16 FLOP/s per die, or 5e15 BF16 FLOP/s per 2-die chip. (Nvidia announced that dense FP8 performance will increase 3.3x, but that's probably due to giving more transistors to FP8, which can't be done as much for BF16 since it already needs a lot.) To separately support this, today Google announced Ironwood, their 7th generation
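For readers who want to follow the arithmetic, here is a quick check using only the comment's own assumptions (rack throughput, utilization, a 4-month run, $12M per rack, ~8x cheaper compute later); the 30-day month is my simplification.

```python
# Check the ~$360B and ~$450T figures from the comment above.
SECONDS_PER_MONTH = 30 * 24 * 3600

def racks_needed(total_flops, flops_per_rack=1.6e18, utilization=0.4, months=4):
    return total_flops / (flops_per_rack * utilization * months * SECONDS_PER_MONTH)

gpt6_flops = 2e29                       # two 100x steps up from GPT-4's 2e25
racks = racks_needed(gpt6_flops)
cost_2027 = racks * 12e6                # ~$12M per Rubin Ultra NVL576 rack, all-in
print(f"~{racks:,.0f} racks, ~${cost_2027/1e9:.0f}B for 'GPT-6' in 2027")

gpt8_flops = 2e33                       # two more 100x steps
cost_gpt8 = cost_2027 * (gpt8_flops / gpt6_flops) / 8   # /8 for assumed cheaper compute
print(f"~${cost_gpt8/1e12:.0f}T for 'GPT-8' at ~8x cheaper compute")
```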

Thanks, it's useful to have these figures and independent data on these calculations.

I've been estimating it based on a 500x increase in effective FLOP per generation, rather than 100x of regular FLOP.

Rough calculations are here.

At the current trajectory, the GPT-6 training run costs $6bn in 2028, and GPT-7 costs $130bn in 2031. 

I think that makes GPT-8 a couple of trillion in 2034.

You're right that if you wanted to train GPT-8 in 2031 instead, then it would cost roughly 500x more than training GPT-7 that year.
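For concreteness, here is a rough reconstruction of how these figures fit together; this is my own back-of-the-envelope, not the linked calculations, and it assumes ~500x effective FLOP per generation while backing out the implied gain in effective compute per dollar from the quoted costs.

```python
# Reconstructing the scaling in this thread from the quoted numbers.
effective_flop_per_gen = 500
cost_gpt6_2028 = 6e9        # quoted above
cost_gpt7_2031 = 130e9      # quoted above

cost_growth_per_gen = cost_gpt7_2031 / cost_gpt6_2028                   # ~22x per 3 years
implied_cheaper_compute = effective_flop_per_gen / cost_growth_per_gen  # ~23x per 3 years

cost_gpt8_2034 = cost_gpt7_2031 * cost_growth_per_gen       # ~$2.8T ("a couple of trillion")
cost_gpt8_2031 = cost_gpt7_2031 * effective_flop_per_gen    # ~$65T if trained 3 years early

print(f"cost growth per generation: ~{cost_growth_per_gen:.0f}x")
print(f"implied compute-cost decline per generation: ~{implied_cheaper_compute:.0f}x")
print(f"GPT-8 in 2034: ~${cost_gpt8_2034/1e12:.1f}T; in 2031: ~${cost_gpt8_2031/1e12:.0f}T")
```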

This is another post in my ongoing "Exploring Cooperation" substack series, focused on something more directly related to LLMs and alignment - I am including the post in its entirety.

Throughout this series, we’ve repeatedly circled around the requirements for genuine cooperation—shared context, aligned goals, and outcomes that matter to the participants. In earlier posts, we explored the importance of preferences and identity, noting that cooperation depends not just on behavior, but on agents who care about how things turn out and persist long enough for reciprocity and coordination to make sense. We also discussed why status and power can undermine cooperation, especially when incentives diverge or when agents lack continuity across time. Now, after laying this conceptual groundwork across discussions of evolution, economics, and history, we’re finally...

Short AI takeoff timelines seem to leave no time for some lines of alignment research to become impactful. But any research rebalances the mix of currently legible research directions that could be handed off to AI-assisted alignment researchers or early autonomous AI researchers whenever they show up. So even hopelessly incomplete research agendas could still be used to prompt future capable AI to focus on them, while in the absence of such incomplete research agendas we'd need to rely on AI's judgment more completely. This doesn't crucially depend on giving significant probability to long AI takeoff timelines, or on expected value in such scenarios driving the priorities.

Potential for AI to take up the torch makes it reasonable to still prioritize things that have no hope at all...

That seems correct, but I don't think any of those stop being useful to investigate with AI, despite the relatively higher bar.

ChristianKl
The key is still to distinguish good ideas from bad ones. In the linked post, you essentially make the argument that "whole brain emulation artificial intelligence is safer than LLM-based artificial superintelligence". That's a claim that might be true or not. One aspect of spending more time with that idea would be to think more critically about whether it's true. However, even if it were true, it wouldn't help in a scenario where we already have LLM-based artificial superintelligence.
avturchin
The same is valid for life extension research. It requires decades, and many, including Brian Johnson, say that AI will solve aging and therefore human research in aging is not relevant. However, most of aging research is about collecting data about very slow processes. The more longitudinal data we collect, the easier it will be for AI to "take up the torch."

“In the loveliest town of all, where the houses were white and high and the elms trees were green and higher than the houses, where the front yards were wide and pleasant and the back yards were bushy and worth finding out about, where the streets sloped down to the stream and the stream flowed quietly under the bridge, where the lawns ended in orchards and the orchards ended in fields and the fields ended in pastures and the pastures climbed the hill and disappeared over the top toward the wonderful wide sky, in this loveliest of all towns Stuart stopped to get a drink of sarsaparilla.”
— 107-word sentence from Stuart Little (1945)

Sentence lengths have declined. The average sentence length was 49 for Chaucer (died 1400), 50...

Having studied Latin, or other such classical training, seems to be but one method of imbuing oneself with the style of writing longer, more complicated sentences. Personally I acquired the taste for such eccentricities perusing sundry works from earlier times. Romances, novels and other such frivolities from, or set in, the 18th century being the main culprits.

I suppose this sort of proves your point, in that those authors learnt to create complicated sentences from learning Latin, and the later writers copied the style, thinking either that it's fun, correct, or wanting to seem more authentic.

This is a linkpost for https://arxiv.org/abs/2504.06820

Diffractor is the first author of this paper.
Official title: "Regret Bounds for Robust Online Decision Making"

Abstract: We propose a framework which generalizes "decision making with structured observations" by allowing robust (i.e. multivalued) models. In this framework, each model associates each decision with a convex set of probability distributions over outcomes. Nature can choose distributions out of this set in an arbitrary (adversarial) manner, that can be nonoblivious and depend on past history. The resulting framework offers much greater generality than classical bandits and reinforcement learning, since the realizability assumption becomes much weaker and more realistic. We then derive a theory of regret bounds for this framework. Although our lower and upper bounds are not tight, they are sufficient to fully characterize power-law learnability. We demonstrate this theory

...
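For readers who want a concrete picture before opening the paper: below is a toy illustration (my own construction, not the paper's setting or algorithm) of the central object, a decision whose outcome distribution is only known up to a convex set, scored by its worst-case expected reward. The interval bounds, rewards, and brute-force grid search are all illustrative assumptions.

```python
import numpy as np
from itertools import product

def worst_case_value(lower, upper, rewards, grid=101, eps=1e-9):
    """Minimize expected reward over distributions p with lower <= p <= upper, sum(p) = 1.
    Brute-force over a simplex grid for clarity (fine for 3 outcomes)."""
    best = np.inf
    ticks = np.linspace(0, 1, grid)
    for p1, p2 in product(ticks, ticks):
        p = np.array([p1, p2, 1 - p1 - p2])
        if p[2] < -eps or np.any(p < lower - eps) or np.any(p > upper + eps):
            continue
        best = min(best, float(p @ rewards))
    return best

rewards = np.array([0.0, 1.0, 3.0])
# Decision A: tightly known distribution; Decision B: higher upside, far more ambiguity.
decisions = {
    "A": (np.array([0.1, 0.5, 0.2]), np.array([0.2, 0.6, 0.3])),
    "B": (np.array([0.0, 0.0, 0.3]), np.array([0.7, 0.7, 1.0])),
}
for name, (lo, hi) in decisions.items():
    print(name, "worst-case expected reward:", round(worst_case_value(lo, hi, rewards), 3))
# A wins under the worst-case criterion even though B has the higher best case:
# robustness to an adversarial choice of distribution is exactly what the framework prices in.
```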
Alexander Gietelink Oldenziel
Congratulations on this paper. It seems like a major result.  Any chance of more exposition for those of us less cognitively-inclined? =)

Thank you <3

Any chance of more exposition for those of us less cognitively-inclined? =)

Read the paper! :)

It might seem long at first glance, but all the results are explained in the first 13 pages, the rest is just proofs. If you don't care about the examples, you can stop on page 11. Naturally, I welcome any feedback on the exposition there.


Epistemic status: Amateur synthesis of medical research that is still recent but now established enough to make it into modern medical textbooks. Some specific claims vary in evidence strength. I’ve spent ~20-30 hours studying the literature and treatment approaches, which were very effective for me.

Disclaimer: I'm not a medical professional. This information is educational only, not medical advice. Consult healthcare providers for medical conditions.

Key claims

This post builds on previous discussions about the fear-pain cycle and learned chronic pain. The post adds the following claims:

  1. Neuroplastic pain - pain learned by the brain (and/or spinal cord) - is a well-evidenced phenomenon and widely accepted in modern medical research (very high confidence).
  2. It explains many forms of chronic pain previously attributed to structural causes - not just wrist pain and back pain (high confidence). Other
...

In 2019, the WHO added "nociplastic pain" (another word for neuroplastic pain) as an official new category of pain, alongside the long-established nociceptive and neuropathic pain categories.

It's worth noting that in 2019 the WHO also added various diagnoses from traditional Chinese medicine. The process the WHO uses is not about finding truth but about providing codes that allow healthcare providers to talk with each other and store diagnoses.

Rafka
Thanks for the write-up—I hadn’t looked into neuroplastic pain before, but it rang a bell. A year ago, I messed up my leg (probably sciatic nerve, not diagnosed), and the pain stuck around way longer than it should have. I couldn’t walk for more than five minutes without it flaring up, even weeks after the initial strain. It clearly should’ve healed by then—nothing was torn, broken, or visibly inflamed—but the pain stayed. What finally worked wasn’t rest, it was more walking. Slow, deliberate, painful-but-not-too-painful walking, plus stretching. It hurt, but it got better. And once I saw that, something flipped—now whenever that sensation comes back, I’m not worried. I just think, “yeah, I know this one,” and it fades. That sounds a lot like the “engage with the pain while reframing it as safe” strategy you described, and it tracks well with my experience. I’ll be experimenting to see if the same approach works on other kinds of pain, too.


Researchers used RNA sequencing to observe how cell types change during brain development. Other researchers looked at connection patterns of neurons in brains. Clear distinctions have been found between all mammals and all birds. They've concluded intelligence developed independently in birds and mammals; I agree. This is evidence for convergence of general intelligence.

Your headline overstates the results. The last common ancestor of birds and mammals probably wasn't exactly unintelligent. (In contrast to our last common ancestor with the octopus, as the article discusses.)

Jonas Hallgren
I see your point: if 95% of the given evidence is in the past, the 5% in the future only gets a marginal amount added to it. I do still like the idea of crossing off potential filters to see where the risks are, so fair enough!
Knight Lee
What I'm trying to argue is that there could easily be no Great Filter, and there could exist trillions of trillions of observers who live inside the light cone of an old alien civilization, whether directly as members of the civilization, or as observers who listen to their radio. It's just that we're not one of them. We're one of the first few observers who aren't in such a light cone. Even though the observers inside such light cones outnumber us a trillion to one, we aren't one of them. :) if you insist on scientific explanations and dismiss anthropic explanations, then why doesn't this work as an answer?
Julian Bradshaw
Oh okay. I agree it's possible there's no Great Filter.