Frustrated by claims that "enlightenment" and similar meditative/introspective practices can't be explained and that you only understand if you experience them, Kaj set out to write his own detailed gears-level, non-mysterious, non-"woo" explanation of how meditation, etc., work in the same way you might explain the operation of an internal combustion engine.

37DanielFilan
As far as I can tell, this post successfully communicates a cluster of claims relating to "Looking, insight meditation, and enlightenment". It's written in a quite readable style that uses a minimum of metaphorical language or Buddhist jargon. That being said, likely due to its focus on exposition rather than persuasion, it contains and relies on several claims that are not supported in the text, such as:

  • Many forms of meditation successfully train cognitive defusion.
  • Meditation trains the ability to have true insights into the mental causes of mental processes.
  • "Usually, most of us are - on some implicit level - operating off a belief that we need to experience pleasant feelings and need to avoid experiencing unpleasant feelings."
  • Flinching away from thoughts of painful experiences is what causes suffering, not the thoughts of painful experiences themselves, nor the actual painful experiences.
  • Impermanence, unsatisfactoriness, and no-self are fundamental aspects of existence that "deep parts of our minds" are wrong about.

I think that all of these are worth doubting without further evidence, and I think that some of them are in fact wrong. If this post were coupled with others that substantiated the models that it explains, I think that would be worthy of inclusion in a 'Best of LW 2018' collection. However, my tentative guess is that Buddhist psychology is not an important enough set of claims that a clear explanation of it deserves to be signal-boosted in such a collection. That being said, I could see myself being wrong about that.
14Kaj_Sotala
I still broadly agree with everything that I said in this post. I do feel that it is a little imprecise, in that I now have much more detailed and gears-y models for many of its claims. However, elaborating on those would require an entirely new post (one which I am currently working on) with a sequence's worth of prerequisites. So if I were to edit this post, I would probably mostly leave it as it is, but include a pointer to the new post once it's finished.

In terms of this post being included in a book, it is worth noting that the post situates itself in the context of Valentine's Kensho post, which has not been nominated for the review and thus wouldn't be included in the book. So if this post were to be included, I should probably edit it so as not to require reading Kensho.
Shameful admission: after well over a decade on this site, I still don't really intuitively grok why I should expect agents to become better approximated by "single-minded pursuit of a top-level goal" as they gain more capabilities. Yes, some behaviors like getting resources and staying alive are useful in many situations, but that's not what I'm talking about. I'm talking specifically about the pressures that are supposed to inevitably push agents into the former of the following two main types of decision-making:

  1. Unbounded consequentialist maximization: The agent has one big goal that doesn't care about its environment. "I must make more paperclips forever, so I can't let anyone stop me, so I need power, so I need factories, so I need money, so I'll write articles with affiliate links." It's a long chain of "so" statements from now until the end of time.
  2. Homeostatic agent: The agent has multiple drives that turn on when needed to keep things balanced. "Water getting low: better get more. Need money for water: better earn some. Can write articles to make money." Each drive turns on, gets what it needs, and turns off, without some ultimate cosmic purpose.

Both types show goal-directed behavior. But if you offered me a choice of which type of agent I'd rather work with, I'd choose the second type in a heartbeat. The homeostatic agent may betray me, but it will only do that if doing so satisfies one of its drives. This doesn't mean homeostatic agents never betray allies - they certainly might if their current drive state incentivizes it (or if for some reason they have a "betray the vulnerable" drive). But the key difference is predictability. I can reasonably anticipate when a homeostatic agent might work against me: when I'm standing between it and water when it's thirsty, or when it has a temporary resource shortage. These situations are concrete and contextual. With unbounded consequentialists, the betrayal calculation extends across the entire future…
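The drive-based loop described above can be sketched as toy code. Everything here (class names, thresholds, the specific drives) is invented for illustration; the point is just that each drive activates below a threshold, pursues its target, and then switches off rather than maximizing anything unboundedly:

```python
# Toy sketch of a homeostatic agent: drives activate when a monitored
# quantity drops below a threshold, and deactivate once restored.
from dataclasses import dataclass, field

@dataclass
class Drive:
    name: str
    level: float    # current level of the monitored resource
    low: float      # drive activates below this threshold
    target: float   # drive deactivates once level reaches this

    def active(self) -> bool:
        return self.level < self.low

@dataclass
class HomeostaticAgent:
    drives: list = field(default_factory=list)

    def step(self) -> str:
        # Attend to the most depleted active drive; idle otherwise.
        active = [d for d in self.drives if d.active()]
        if not active:
            return "idle"
        neediest = min(active, key=lambda d: d.level)
        neediest.level = neediest.target  # pursue until satisfied, then stop
        return f"satisfy {neediest.name}"

agent = HomeostaticAgent([Drive("water", 0.2, 0.5, 1.0),
                          Drive("money", 0.8, 0.5, 1.0)])
print(agent.step())  # -> satisfy water
print(agent.step())  # -> idle: no drive chases an unbounded goal
```

The contrast with the maximizer is that `step` has a fixed point: once every level is inside its band, the agent does nothing, rather than converting the rest of the universe into water or money.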
BuckΩ25412
0
Alignment Forum readers might be interested in this:
Claude has been playing Pokemon for the last few days. It's still playing, live on Twitch. You can go watch alongside hundreds of other people. It's fun.

What updates should I make about AGI timelines from Claude's performance? Let's think step by step.

First, it's cool that Claude can do this at all. The game keeps track of "Step count" and Claude is over 30,000 already; I think that means 30,000 actions (e.g. pressing the A button). For each action Claude produces about a paragraph of thinking tokens in order to decide what to do. Any way you slice it, this is medium-horizon agency at least -- Claude is operating fully autonomously, in pursuit of goals, for a few days.

Does this mean long-horizon agency is not so difficult to train after all? Not so fast. Pokemon is probably an especially easy environment, and Claude is still making basic mistakes even so. In particular, Pokemon seems to have a relatively linear world where there's a clear story/path to progress along, and moreover Claude's pretraining probably teaches it the whole story plus lots of tips & tricks for how to complete it. In D&D terms, the story is running on rails.

I think I would have predicted in advance that this dimension of difficulty would matter, but I also feel validated by Claude's performance -- it seems that Claude is doing fine at Pokemon overall, except that it keeps getting stuck/lost wandering around in various places. It can't seem to keep a good memory of what it's already tried / where it's already been, so it keeps going in circles until it eventually gets lucky and stumbles to the exit.

A more challenging video game would be something open-ended and less present in the training data, like Dwarf Fortress.

On the other hand, maybe this is less a fundamental limitation Claude has and more a problem with its prompt/scaffold? Because it has a limited context window, it has to regularly compress it by e.g. summarizing / writing 'notes to self' and then deleting the…
leogao3328
3
my referral/vouching policy is i try my best to completely decouple my estimate of technical competence from how close a friend someone is. i have very good friends i would not write referrals for and i have written referrals for people i basically only know in a professional context. if i feel like it's impossible for me to disentangle, i will defer to someone i trust and have them make the decision. this leads to some awkward conversations, but if someone doesn't want to be friends with me because it won't lead to a referral, i don't want to be friends with them either.
Why Do the French Dominate Mathematics?

France has an outsized influence in the world of mathematics despite having significantly fewer resources than countries like the United States. With approximately 1/6th of the US population and 1/10th of its GDP, and French being less widely spoken than English, France's mathematical achievements are remarkable. This dominance might surprise those outside the field.

Looking at prestigious recognitions, France has won 13 Fields Medals compared to the United States' 15: a nearly equal achievement despite the vast difference in population and resources. Other European nations lag significantly behind, with the UK having 8, Russia/Soviet Union 6/9, and Germany 2. France's mathematicians are similarly overrepresented in other mathematics prizes and honors, confirming this is not merely a statistical anomaly.

I believe two key factors explain France's exceptional performance in mathematics while it remains relatively average in other scientific disciplines:

1. The "Classes Préparatoires" and "Grandes Écoles" System

The French educational system differs significantly from others through its unique "classes préparatoires" (preparatory classes) and "grandes écoles" (elite higher education institutions). After completing high school, talented students enter these intensive two-year preparatory programs before applying to the grandes écoles. Selection is rigorously meritocratic, based on performance in centralized competitive examinations (concours). This system effectively postpones specialization until age 20 rather than 18, allowing for deeper mathematical development during a critical cognitive period.

The École Normale Supérieure (ENS) stands out as the most prestigious institution for mathematics in France. An overwhelming majority of France's top mathematicians -- including most Fields Medalists -- are alumni of the ENS. The school provides an ideal environment for mathematical talent to flourish, with small class sizes, close…

Popular Comments

Recent Discussion

Viliam20

/the-political-is-personal/

It seems like many people propose "generalization from my own example" as a model for all of humanity. And it can be quite annoying when people around you agree on a model that doesn't fit you at all... and when you point it out, they dismiss it by saying that you are in denial. Because they have examined their own minds deeply and found out that it was true... yeah, possibly so, but that doesn't necessarily make it true of others.

  • everyone likes whatever popular people around them like -- no I don't
  • if we legalize
... (read more)

This isn’t primarily about how I write. It’s about how other people write, and what advice they give on how to write, and how I react to and relate to that advice.

I’ve been collecting those notes for a while. I figured I would share.

At some point in the future, I’ll talk more about my own process – my guess is that what I do very much wouldn’t work for most people, but would be excellent for some.

Table of Contents

  1. How Marc Andreessen Writes.
  2. How Sarah Constantin Writes.
  3. How Paul Graham Writes.
  4. How Patrick McKenzie Writes.
  5. How Tim Urban Writes.
  6. How Visakan Veerasamy Writes.
  7. How Matt Yglesias Writes.
  8. How JRR Tolkien Wrote.
  9. How Roon Wants Us to Write.
  10. When To Write the Headline.
  11. Do Not Write Self-Deprecating Descriptions of Your Posts.
  12. Do Not Write a Book.
  13. Write Like No One Else
...
3Self
They do, but the explanation proposed here matches everything I know most exactly and simply. E.g. it became immediately clear that the Sequences wouldn't work nearly as well for me if I didn't like Eliezer.

Or the way fashion models are of course not selected for attractiveness but for more mimetic-copying-inducing high-status traits like height/confidence/presence/authenticity and others.

And yeah, not all of the Claude examples are good; I hadn't cherrypicked.
2Viliam
You mean, like him as a blogger? Or as a person in real life? If the former, isn't the causality the other way round? I mean, I like Eliezer as a blogger because he wrote the Sequences. So it would sound weird to me to say: "I admire Eliezer as a blogger a lot because he wrote some amazing articles on rationality... and Girard's theory predicts that therefore I will like his articles... which is true!" (We could nitpick that some things I like about Eliezer's style are orthogonal to whether his points about rationality are true, but that already has a name: the halo effect.)

I am not trying to contradict your experience, but it seems to me that my experience (with the Sequences) does not match this model at all. Or other things that I think about. My friends used to play Magic: The Gathering; it has never appealed to me. I liked sci-fi, but I was reading sci-fi books long before I met another person who did. I learned Esperanto from a textbook long before I met another Esperanto speaker. My wife loves skiing and opera; that has no effect on me. It seems like I am quite resistant to copying others. (Is that a part of being on the autistic spectrum? Maybe I should file Girard's theory under "this is what normies do"; no offense meant.)
Self10

Aspies certainly seem to do this less!

You mean, like him as a blogger? Or as a person in real life?

The latter? Like, I subconsciously parse his blogging voice as if it were a person in my tribal surroundings, and I like/admire/relate to that virtual person, and I think this is what causes some extra persuasion.

I mean yes it's embarrassing, but it's what I see in myself and what seems to be most consistent with what everyone else is doing, certainly more consistent than what they claim they're doing. 

E.g. it seems rare for someone who activel... (read more)

1Self
More thoughts that may or may not be directly relevant:

  • What's missing from my definition is that deception happens solely via "stepping in front of the camera", i.e. via the regular sensory channels of the deceived optimizer; brainwashing or directly modifying memory is not deception.
  • From this it follows that to deceive is to either cause a false pattern recognition or to prevent a correct one, and for this you indeed need familiarity with the victim's perceptual categories.

I'd like to say more re: hostile telepaths or other deception frameworks but am unsure what your working models are.
Viliam20

Learn the official language of the place you are migrating to.

Yes, this sounds completely obvious to me.

Of course, learning languages takes time, and may be more difficult for older people. So I wouldn't expect fluent speech from the start, and from the older generation maybe not even after a year or two; just a gesture of trying. The important thing is that they do not isolate their kids and themselves from the local society behind a language barrier. Become bilingual.

Heck, if I had to emigrate somewhere, I would want my kids to speak the local language, bec... (read more)

2Viliam
Unless there were similar known examples in OpenAI prompts, this doesn't sound plausible at all.

Agreed. If I'm talking to someone who I expect to be able to recalibrate, I just explain that I think the standard norms are dumb, explain the norms I actually follow, and then give an honest and balanced assessment. If I'm talking to someone I don't really know, I generally give a positive but not very detailed reference, or don't reply, depending on context.

2Viliam
A synthesis between the structural-forces theory and "pulling the rope sideways": economic and other forces determine the main direction, a leader who already wanted to go in that direction gets elected and starts going in that direction, and his idiosyncratic whims get implemented as a side effect. Like, instead of Hitler, there would have been another German leader determined to change the post-WW1 world order, but he would probably have been less obsessed with the Jews. Also, he might have made different alliances.
2Viliam
Some games do put their finger on the scale, for example you have a first-person shooter where you learn to aim better but you also now have a gun that deals 200 damage per hit, as opposed to your starting gun that dealt 10. But puzzle-solving games are usually fair, I think.
1ProgramCrafter
Upvoted as a good re-explanation of CEV complexity in simpler terms! (I believe LW will benefit from recalling long-understood things so that it has a chance of predicting the future in greater detail.)

In essence, you prove the claim "Coherent Extrapolated Volition would not literally include everything desirable happening effortlessly and everything undesirable going away". Would I be wrong to guess it argues against the position in https://www.lesswrong.com/posts/AfAp8mEAbuavuHZMc/for-the-sake-of-pleasure-alone?

That said, the current wishes of many people include things they want being done faster and more easily; it's just that the more you extrapolate, the smaller the fraction that wants that level of automation - more divergence as you consider larger scales.
8jbash
Citation needed. Particularly for that first part.

You're thinking pretty small there, if you're in a position to hack your body that way.

Why would I want to even be involved in creating software that somebody else wanted? Let them ask the computer themselves, if they need to ask. Why would I want to be in a world where I had to make or listen to a PowerPoint presentation, of all things? Or a summary either? Why do I care who needs me to do any of that?

Because if the robot carries me, I haven't climbed it. It's not like the value comes from just being on the top. Helicopters can fly that high right now, but people still walk to get there.

Because I like painting? Does it bother you that almost anything you might want to do, and probably for most people anything at all that they might want to do, can already be done by some other human, beyond any realistic hope of equaling? Do you feel dead because of that?

For fun. Software, too.

Because I won't experience any of that infinite stream if I don't read it?

The stuff I want includes doing something. Not because somebody else needs it. Not because it can't be done better. Just because I feel like doing it. That includes putting in effort, and taking on things I might fail at. Wanting to do things does not, however, imply that you don't want to choose what you do and avoid things you don't want to do.

If a person doesn't have any internal wish to do anything, if they need somebody else's motivations to substitute for their own... then the deadness is already within that person. It doesn't matter whether some wish gets fulfilled or not. But I don't think there are actually many people like that, if any at all. I think you're seeing shadows of your own ideas there.

I’m considering translating my work into English to share it with the LessWrong community, but I’d like to first ask if it aligns with the community's interests and could be valuable. Below is a summary of the work to help evaluate its relevance:

 

Beyond HaHa: Mapping the Causal Chain from Jokes to Knowledge



Summary


We explore the specific causal mechanisms linking humor recognition to learning outcomes, including the computational and neurological pathways involved. 

This study began with a practical goal: to evaluate the use of humor as a pedagogical tool in Cardiopulmonary Resuscitation (CPR) courses through a randomized trial. However, the lack of clear criteria to define and operationalize "humor" in educational contexts led us to explore its conceptual foundations. Initially, we adopted Clarke's formula, which describes humor as "a pleasant...

Scheming AIs may have secrets that are salient to them, such as:

Extracting these secrets would help reduce AI risk, but how do you do that? One hope is that you can do fuzzing of LLMs,[1] e.g. by adding noise to LLM weights or activations.

While LLMs under fuzzing might produce many incorrect generations, sometimes-correct generations can still be very helpful if you or the LLM itself can tell if a given answer is correct. But it’s still unclear if this works at all: there are probably some intermediate activations that would result in an LLM telling you the secret, but...
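The activation-noising idea can be illustrated with a minimal numpy sketch. This is a stand-in only: `fuzz_activations` and the array shapes are invented here, and in a real setup the perturbation would be applied inside the model, e.g. via a forward hook on a transformer layer, with the noise scale swept over a range:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuzz_activations(activations: np.ndarray, scale: float) -> np.ndarray:
    """Return a noised copy of one layer's activations.

    Sampling many noised variants yields many (mostly degraded)
    generations; a checker -- human or the model itself -- then
    filters for the rare revealing/correct ones.
    """
    return activations + rng.normal(0.0, scale, size=activations.shape)

layer_acts = np.zeros((4, 8))  # stand-in for one layer's activations
variants = [fuzz_activations(layer_acts, scale=0.1) for _ in range(16)]
assert all(v.shape == layer_acts.shape for v in variants)
assert not np.allclose(variants[0], variants[1])  # each sample differs
```

The bet is that somewhere in this cloud of perturbed computations is one where the secret-keeping behavior breaks while coherent generation survives; the hard part, as noted above, is whether such intermediate points exist at noise scales that don't just destroy the model.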

By doing more search around promising vectors found with random search or MELBO, you could get more powerful vectors, and that could be useful for unlocking / fuzzing-adversarial-training. It's unclear if that would be more effective than just fine-tuning the model on the generations from the best random vectors, but it would be worth trying.

For interp, I don't know what interp metric you want to optimize. Vector norm is a really bad metric: effective MELBO vectors have a much smaller norm, but qualitatively I find their results are sometimes much more erra... (read more)
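The "random search, then local search around promising vectors" procedure mentioned above can be sketched as follows. The scoring function here is an invented stand-in (in practice it would be some behavioral metric of how strongly a steering vector elicits the target behavior), and this is a generic two-stage search, not MELBO's actual optimization:

```python
import numpy as np

rng = np.random.default_rng(0)

def score(vec: np.ndarray) -> float:
    # Stand-in objective: distance to an arbitrary target point.
    # A real objective would score the model's generations under
    # this steering vector.
    target = np.ones_like(vec)
    return -float(np.sum((vec - target) ** 2))

def random_then_local(dim, n_random=64, n_local=200, step=0.1):
    """Stage 1: random search. Stage 2: hill-climb around the best find."""
    candidates = rng.normal(size=(n_random, dim))
    best = max(candidates, key=score)
    base = score(best)  # best score from the random stage
    for _ in range(n_local):
        proposal = best + rng.normal(scale=step, size=dim)
        if score(proposal) > score(best):
            best = proposal  # accept only improvements
    return best, base

vec, base = random_then_local(dim=8)
assert score(vec) >= base  # local refinement never loses ground
```

Since only improving proposals are accepted, the refined vector is guaranteed to score at least as well as the best random one; the open question from the comment is whether the refined vectors transfer better than simply fine-tuning on the best random-stage generations.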

LessWrong Context:

I didn’t want to write this.

Not for lack of courage—I’d meme-storm Putin’s Instagram if given half a chance. But why?

  1. Too personal.
  2. My stories are tropical chaos: I survived the Brazilian BOPE (think Marine Corps training, but post-COVID).
  3. I’m dyslexic, writing in English (a crime against Grice).
  4. This is LessWrong, not some Deep Web Reddit thread.

Okay, maybe a little lack of courage.

And yet, something can be extracted from all this madness, right?

Then comes someone named Gwern. He completely ignores my thesis and simply asks:
"Tell military firefighter stories."

My first instinct was to dismiss him as an oddball—until a friend told me I was dealing with a legend of rationality. I have to admit: I nearly shit myself. His comment got more likes than the post I’d spent years working on.

Someone with,...