You don't need to. You just need to be in good condition yourself and actually paying attention. Professional delivery drivers routinely achieve seemingly absurd mileages per incident.
TI-83 BASIC was the first programming language I really felt I had mastered. For a while in my first college CS class I was writing code in TI-BASIC and translating it to C++.
Ah, those were the days. It was incredibly exciting to download and try assembly games. Bill Nagel blew my mind with Turbo Breakout and Snake...
Does anyone remember the TI website wars? TI Files (later TI Philes) vs the lowly weak ticalc.org... but look which one is still around :-)
If you don't believe me, you can look up latency tests for it. Without DLSS, rendering is ~15 ms. With Performance-mode DLSS that rises to about 30-35 ms, Balanced is about 60 ms, and Quality is 90-100 ms of input latency.
I have seen similar issues when the linked Drive folder gets deleted or permissions change. Sometimes creating a new folder and triggering the upload flow again recreates the link. Not very obvious, but worth trying.
It genuinely is, and I'd sooner see regulation targeting it than someone's multi-leg parlay. There's a much clearer line between alcohol on demand and public misconduct or DUI injuries than between gambling and a more nebulous societal harm.
Nobody could find it back in 1994 either! That was part of the fun. You stumbled on a webring or somebody's curated oracle and found a bunch of interesting weird tiny websites.
(founder of Lossfunk, the lab behind this research.)
Esolang-Bench went viral on X and a lot of discussion ensued. Below, we address a few of the common questions that came up about the benchmark. Hope it helps.
a) Why do it? Does it measure anything useful?
It was a curiosity-driven project. We're interested in how humans exhibit sample efficiency in learning and OOD generalization. So we simply asked: if models can zero-/few-shot correct answers for simple programming problems in Python, can they do the same in esoteric languages?
The benchmark is what it is. Different people can interpret its usefulness differently, and we encourage that.
b) But humans can't write esoteric languages well either. It's an unfair comparison.
Primarily, we're interested in measuring LLM capabilities. With all the talk of ASI, it is supposed that their capabilities will soon be superhuman. So our primary motivation wasn't to compare with humans but to check what they can do on this by-construction difficult benchmark.
However, we do believe that humans are able to teach themselves a new domain by transferring their old skills. This benchmark was meant as a starting point for exploring whether AI systems can do the same (which is what we're exploring now).
c) But Claude Code crushes it. You limited models artificially.
Yes, we tested models in zero- and few-shot settings, and in the agentic loop we describe in the paper we limit the number of iterations. As we wrote above, we wanted to understand their performance from a comparative point of view (say, against highly represented languages like Python), and that's why the benchmark is designed this way.
After the paper was finalized, we experimented with agentic systems where we gave models tools like bash and allowed unlimited iterations (but limited submission attempts). They indeed perform much better.
The relevant question is what makes these models perform so well when you give them tools and iterations vs. when you don't. Are they reasoning and learning like humans, or is it something else?
d) So, are LLMs hyped? Or is our study clickbait?
The paper, code and benchmark are all open source.
We encourage whoever is interested to read it, and make up their own minds.
(We couldn't help noticing that the same set of results was interpreted wildly differently within the community; a debate between opposing camps on LLMs ensued. Perhaps that's a good thing?)
> Author's argument is those hardware improvements could have been had for free with X11 upgrades.
I do NOT miss having tearing all the time with X11. There were always kludgy workarounds. Even if you stopped and said, OK, let's not run NVIDIA, let's do Intel since they have great FOSS driver support, just look back at X11's 2D acceleration history: EXA, SNA, UXA, XAA? Oh right, all replaced with GLAMOR. OK, run the modesetting driver; right, we still need a compositor on top of our window manager because we don't get vsync without it.
Do you have monitors with different refresh rates? Do you have muxes with different cards driving different outputs? X11 sucks at all of this. OK, the turd has been polished well after decades: it doesn't need to run as root/suid anymore and doesn't listen for connections on your network, but the security model still sucks compared to Wayland, and once you mix multiple video cards all bets are off.
But yeah, clipboard works reliably, big W for X11.
I run a small homelab (Mac Mini + RPi5) and tried Cockpit too. Great for single-server monitoring, but once I had multiple nodes, I kept SSH-ing into each box anyway. Ended up wanting something CLI-first that could check all servers at once without opening a browser. The web UI is nice for a quick glance, though.
Nope, I stopped using Apple devices in early 2019. I can't accept their attitude anymore, of deciding what I'm allowed to install on my hardware. macOS is a bit more open than iOS, but every year it shifts further in the same direction.
I feel like this is absolutely not the case. Our corporate infosec guys are freaking out, as developers and general users alike are finding all new ways to poke holes in literally everything.
We're finding out quickly that enterprise endpoints are not locked down anywhere near enough, and the stuff that users are creating on the local endpoints is quickly outpacing the rate at which SOC teams can investigate what's going on.
If you're using Claude via Anthropic's SaaS service, it's nearly impossible to collect logs of what actually happened in a user's session. We happen to proxy Claude Code usage through Amazon Bedrock, and the Bedrock logs have already proven instrumental in figuring out what led a user to repeatedly attempt to install software they wouldn't otherwise have touched, all because they turned their brains off and started accepting every Claude Code prompt to install random stuff.
Sandboxing works to an extent, but it's a really difficult balance to strike between locking it down so much that you neuter the tool and having a reasonable security policy.
Apple literally published step-by-step instructions for compiling and running your own kernel on Apple silicon. Not sure how you think Asahi Linux works otherwise. Sure, the drivers range from bad to nonexistent, but that's not the same thing as being unable to run your own kernel.
Much simpler. Those in power, in every place and in every time, adopt self-serving beliefs that justify their place as the ones in power and flatter themselves. No different in any day or any time. Same quasi-messianic ideals as ever. Their beliefs don't have to pay rent or correspond with reality.