Tags: auto

Tuesday, January 21st, 2025

What I’ve learned about writing AI apps so far | Seldo.com

LLMs are good at transforming text into less text

Laurie is really onto something with this:

This is the biggest and most fundamental thing about LLMs, and a great rule of thumb for what’s going to be an effective LLM application. Is what you’re doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it’s probably going to be great at it. If you’re asking it to convert into a roughly equal amount of text it will be so-so. If you’re asking it to create more text than you gave it, forget about it.

Depending on how much of the hype around AI you’ve taken on board, the idea that they “take text and turn it into less text” might seem like a gigantic back-pedal away from previous claims of what AI can do. But taking text and turning it into less text is still an enormous field of endeavour, and a huge market. It’s still very exciting, all the more exciting because it’s got clear boundaries and isn’t hype-driven over-reaching, or dependent on LLMs overnight becoming way better than they currently are.

Monday, September 30th, 2024

Preventing automated sign-ups

The Session goes through periods of getting spammed with automated sign-ups. I’m not sure why. It’s not like they do anything with the accounts. They’re just created and then they sit there (until I delete them).

In the past I’ve dealt with them in an ad-hoc way. If the sign-ups were all coming from the same IP addresses, I could block them. If the sign-ups showed some pattern in the usernames or emails, I could use that to block them.

Recently though, there was a spate of sign-ups that didn’t have any patterns, all coming from different IP addresses.

I decided it was time to knuckle down and figure out a way to prevent automated sign-ups.

I knew what I didn’t want to do. I didn’t want to put any obstacles in the way of genuine sign-ups. There’d be no CAPTCHAs or other “prove you’re a human” shite. That’s the airport security model: inconvenience everyone to stop a tiny number of bad actors.

The first step I took was the bare minimum. I added two form fields—called “wheat” and “chaff”—that are randomly generated every time the sign-up form is loaded. There’s a connection between those two fields that I can check on the server.

Here’s how I’m generating the fields in PHP:

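// Secret salt: never sent to the client.
$saltstring = 'A string known only to me.';
// Random value that goes into the "wheat" form field.
$wheat = base64_encode(openssl_random_pseudo_bytes(16));
// Hash of salt plus wheat that goes into the "chaff" form field.
$chaff = password_hash($saltstring.$wheat, PASSWORD_BCRYPT);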

See how the fields are generated from a combination of random bytes and a string of characters never revealed on the client? To keep it from going stale, this string, the salt, includes something related to the current date.

Now when the form is submitted, I can check to see if the relationship holds true:

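// Treat missing fields as empty so that the check fails cleanly.
$wheat = $_POST['wheat'] ?? '';
$chaff = $_POST['chaff'] ?? '';
if (!password_verify($saltstring.$wheat, $chaff)) {
    // Spammer!
}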
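
The snippets above don’t show how the date gets into the salt, so here is one possible sketch of the whole wheat-and-chaff round trip. Everything beyond those snippets is an assumption on my part: the function names, using the UTC day as the date component, accepting yesterday’s salt so that a form loaded just before midnight still validates, and swapping in random_bytes() for openssl_random_pseudo_bytes(). One real constraint worth knowing: bcrypt only reads the first 72 bytes of its input, so the secret has to stay short enough for the wheat to be included.

```php
<?php
declare(strict_types=1);

// Hypothetical secret; it is never sent to the client.
const SECRET = 'A string known only to me.';

// Assumption: the date-dependent part of the salt is the UTC day.
function saltForDay(int $timestamp): string
{
    return SECRET . gmdate('Y-m-d', $timestamp);
}

// Returns [wheat, chaff], the two values to embed in the form.
// Salt (36 bytes) plus wheat (24 bytes) stays under bcrypt's 72-byte limit.
function makeWheatAndChaff(int $now): array
{
    $wheat = base64_encode(random_bytes(16));
    $chaff = password_hash(saltForDay($now) . $wheat, PASSWORD_BCRYPT);
    return [$wheat, $chaff];
}

// Accepts a pair generated today or yesterday, and nothing older.
function wheatMatchesChaff(string $wheat, string $chaff, int $now): bool
{
    foreach ([$now, $now - 86400] as $day) {
        if (password_verify(saltForDay($day) . $wheat, $chaff)) {
            return true;
        }
    }
    return false;
}
```

With this shape, the page that renders the form would call makeWheatAndChaff(time()) and the page that receives it would call wheatMatchesChaff() with the two posted values and time(). Passing the timestamp in, rather than reading the clock inside the functions, is only there to make the expiry behaviour easy to check.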

That’s just the first line of defence. After thinking about it for a while, I came to the conclusion that it wasn’t enough to just generate some random form field values; I needed to generate random form field names.

Previously, the names for the form fields were easily-guessable: “username”, “password”, “email”. What I needed to do was generate unique form field names every time the sign-up page was loaded.

First of all, I create a one-time password:

$otp = base64_encode(openssl_random_pseudo_bytes(16));

Now I generate form field names by hashing that random value with known strings (“username”, “password”, “email”) together with a salt string known only to me.

$otp_hashed_for_username = md5($saltstring.'username'.$otp);
$otp_hashed_for_password = md5($saltstring.'password'.$otp);
$otp_hashed_for_email = md5($saltstring.'email'.$otp);

Those are all used for form field names on the client, like this:

<input type="text" name="<?php echo $otp_hashed_for_username; ?>">
<input type="password" name="<?php echo $otp_hashed_for_password; ?>">
<input type="email" name="<?php echo $otp_hashed_for_email; ?>">

(Remember, the name—or the ID—of the form field makes no difference to semantics or accessibility; the accessible name is derived from the associated label element.)

The one-time password also becomes a form field on the client:

<input type="hidden" name="otp" value="<?php echo $otp; ?>">

When the form is submitted, I use the value of that form field along with the salt string to recreate the field names:

$otp_hashed_for_username = md5($saltstring.'username'.$_POST['otp']);
$otp_hashed_for_password = md5($saltstring.'password'.$_POST['otp']);
$otp_hashed_for_email = md5($saltstring.'email'.$_POST['otp']);

If those form fields don’t exist, the sign-up is rejected.

As an added extra, I leave hidden honeypot form fields named “username”, “password”, and “email”. If any of those fields are filled out, the sign-up is rejected.
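
Put together, the server-side check might look something like the sketch below. The md5() hashing is the same as in the snippets above; the function names, the LOGICAL_FIELDS constant, and the choice to return null for a rejected submission are hypothetical and only there for illustration.

```php
<?php
declare(strict_types=1);

const LOGICAL_FIELDS = ['username', 'password', 'email'];

// Maps each logical field to its per-request form field name.
function hashedFieldNames(string $salt, string $otp): array
{
    $names = [];
    foreach (LOGICAL_FIELDS as $field) {
        $names[$field] = md5($salt . $field . $otp);
    }
    return $names;
}

// Returns the submitted values keyed by logical field name,
// or null if the submission looks automated.
function extractSignup(array $post, string $salt): ?array
{
    $otp = $post['otp'] ?? null;
    if (!is_string($otp) || $otp === '') {
        return null;
    }
    // Honeypots: the fields with guessable names must stay empty.
    foreach (LOGICAL_FIELDS as $field) {
        if (($post[$field] ?? '') !== '') {
            return null;
        }
    }
    // Every hashed field name must be present in the submission.
    $values = [];
    foreach (hashedFieldNames($salt, $otp) as $field => $name) {
        if (!isset($post[$name]) || !is_string($post[$name])) {
            return null;
        }
        $values[$field] = $post[$name];
    }
    return $values;
}
```

The sign-up handler would then call extractSignup($_POST, $saltstring) and bail out on null before doing any of the usual validation.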

I put that code live and the automated sign-ups stopped straight away.

It’s not entirely foolproof. It would be possible to create an automated sign-up system that grabs the names of the form fields from the sign-up form each time. But this puts enough friction in the way to make automated sign-ups a pain.

You can view source on the sign-up page to see what the form fields are like.

I used the same technique on the contact page to prevent automated spam there too.

Thursday, September 26th, 2024

The datalist element on iOS

The datalist element is good. It was a bit bumpy there for a while, but browser implementations have improved over time. Now it’s by far the simplest and most robust way to create an autocompleting combobox widget.

Hook up an input element with a datalist element using the list and id attributes and you’re done. You can even use a bit of Ajax to dynamically update the option elements inside the datalist in response to the user’s input. The browser takes care of all the interaction. If you try to roll your own combobox implementation, it’s almost certainly going to involve a lot of JavaScript and still probably won’t account for all use cases.

Safari on iOS (and therefore all browsers on iOS) didn’t support datalist for quite a while. But once it finally shipped, it worked really nicely. The options showed up just like autocomplete suggestions above the keyboard.

But that broke a while back.

The suggestions still appeared, but if you tapped on one of them, nothing happened. The input element didn’t get updated. You had to tap on a little downward arrow inside the input in order to see the list of options.

That was really frustrating for anybody on iOS using The Session. By far the most common task on the site is searching for a tune, something that’s greatly (progressively) enhanced with a dynamically-updating datalist.

I just updated to iOS 18 specifically to see if this bug has been fixed, and it has:

Fixed updating the input value when selecting an option from a datalist element.

Hallelujah!

But now there’s some additional behaviour that’s a little weird.

As well as showing the options in the autocomplete list above the keyboard, Safari on iOS—and therefore all browsers on iOS—also pops up the options as a list (as if you had tapped on that downward arrow). If the list is more than a few options long, it completely obscures the input element you’re typing into!

I’m not sure if this is a bug or if it’s the intended behaviour. It feels like a bug, but I don’t know if I should file something.

For now, I’ve updated the datalist elements on The Session to only ever hold three option elements in order to minimise the problem. Seeing as the autosuggest list above the keyboard only ever shows a maximum of three suggestions anyway, this feels like a reasonable compromise.
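
Where that limit of three gets enforced isn’t spelled out here. One place would be wherever the option elements are generated on the server. This is a minimal sketch under that assumption, with a hypothetical renderDatalistOptions() helper that receives the matching tune names already sorted by relevance:

```php
<?php
declare(strict_types=1);

// Builds the option elements for a datalist, keeping only the first few
// matches so the pop-up list stays short on iOS.
function renderDatalistOptions(array $matches, int $limit = 3): string
{
    $html = '';
    foreach (array_slice($matches, 0, $limit) as $match) {
        // Escape each value so that quotes in a tune name can't break the markup.
        $html .= '<option value="' . htmlspecialchars((string) $match, ENT_QUOTES) . '"></option>' . "\n";
    }
    return $html;
}
```

The returned string can be dropped inside the datalist element, either when the page is first rendered or in the response to the Ajax request that updates the options.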

Sunday, September 8th, 2024

Manual ’till it hurts

I’ve been going buildless (or as Brad crudely puts it, raw-dogging websites) on a few projects recently. Not just obviously simple things like Clearleft’s Browser Support page, but sites like:

They also have 0 dependencies.

Like Max says:

Funnily enough, many build tools advertise their superior “Developer Experience” (DX). For my money, there’s no better DX than shipping code straight to the browser and not having to worry about some cryptic node_modules error in between.

Making websites without a build step is a gift to your future self. When you open that project six months or a year or two years later, there’ll be no faffing about with npm updates, installs, or vulnerabilities.

Need to edit the CSS? You edit the CSS. Need to change the markup? You change the markup.

It’s remarkably freeing. It’s also very, very performant.

If you’re thinking that your next project couldn’t possibly be made without a build step, let me tell you about a phrase I first heard in the indie web community: “Manual ‘till it hurts”. It’s basically a two-step process:

  1. Start doing what you need to do by hand.
  2. When that becomes unworkable, introduce some kind of automation.

It’s remarkable how often you never reach step two.

I’m not saying premature optimisation is the root of all evil. I’m just saying it’s premature.

Start simple. Get more complex if and when you need to.

You might never need to.

Tuesday, September 3rd, 2024

Why “AI” projects fail

“AI” is heralded (by those who claim it will replace workers as well as those who argue for it as a mere tool) as a thing to drop into your workflows to create whatever gains are promised. It’s magic in the literal sense. You learn a few spells/prompts and your problems go poof. But that was already bullshit when we talked about introducing other digital tools into our workflows.

And we’ve been doing this for decades now: with every new technology, we spend a lot of money to get a lot of bloody noses for way too little outcome. Because we keep not looking at the actual, real problems in front of us – problems that the people affected by them could probably tell you at least a significant part of the solution to. No, we want a magic tool to make the problem disappear. Which is a significantly different thing than solving it.

Monday, September 2nd, 2024

Does AI benefit the world? – Chelsea Troy

Our ethical struggle with generative models derives in part from the fact that we…sort of can’t have them ethically, right now, to be honest. We have known how to build models like this for a long time, but we did not have the necessary volume of parseable data available until recently—and even then, to get it, companies have to plunder the internet. Sitting around and waiting for consent from all the parties that wrote on the internet over the past thirty years probably didn’t even cross Sam Altman’s mind.

On the environmental front, fans of generative model technology insist that eventually we’ll possess sufficiently efficient compute power to train and run these models without the massive carbon footprint. That is not the case at the moment, and we don’t have a concrete timeline for it. Again, waiting around for a thing we don’t have yet doesn’t appeal to investors or executives.

Sunday, August 11th, 2024

Aboard Newsletter: Why So Bad, AI Ads?

The human desire to connect with others is very profound, and the desire of technology companies to interject themselves even more into that desire—either by communicating on behalf of humans, or by pretending to be human—works in the opposite direction. These technologies don’t seem to be encouraging connection as much as commoditizing it.

Tuesday, July 9th, 2024

Pop Culture

Despite all of this hype, all of this media attention, all of this incredible investment, the supposed “innovations” don’t even seem capable of replacing the jobs that they’re meant to — not that I think they should, just that I’m tired of being told that this future is inevitable.

The reality is that generative AI isn’t good at replacing jobs, but commoditizing distinct acts of labor, and, in the process, the early creative jobs that help people build portfolios to advance in their industries.

One of the fundamental misunderstandings of the bosses replacing these workers with generative AI is that you are not just asking for a thing, but outsourcing the risk and responsibility.

Generative AI costs far too much, isn’t getting cheaper, uses too much power, and doesn’t do enough to justify its existence.

Sunday, June 30th, 2024

Ideas Aren’t Worth Anything - The Biblioracle Recommends

The fact that writing can be hard is one of the things that makes it meaningful. Removing this difficulty removes that meaning.

There is significant enthusiasm for this attitude inside the companies that produce and distribute media like books, movies, and music for obvious reasons. Removing the expense of humans making art is a real savings to the bottom line.

But the idea of this being an example of democratizing creativity is absurd. Outsourcing is not democratizing. Ideas are not the most important part of creation, execution is.

Thursday, June 27th, 2024

How do we build the future with AI? – Chelsea Troy

This is the transcript of a fantastic talk called “The Tools We Still Need to Build with AI.”

Absorb every word!

Monday, June 24th, 2024

The mainstreaming of ‘AI’ scepticism – Baldur Bjarnason

  1. Tech is dominated by “true believers” and those who tag along to make money.
  2. Politicians seem to be forever gullible to the promises of tech.
  3. Management loves promises of automation and profitable layoffs.

But it seems that the sentiment might be shifting, even among those predisposed to believe in “AI”, at least in part.

Because There’s No “AI” in “Failure”

My new favourite blog on Tumblr.

Monday, June 17th, 2024

Saturday, June 15th, 2024

Rise of the Ghost Machines - The Millions

This thing that we’ve been doing collectively with our relentless blog posts and pokes and tweets and uploads and news story shares, all 30-odd years of fuck-all pointless human chatterboo, it’s their tuning fork. Like when a guitarist plays a chord on a guitar and compares the sound to a tuner, adjusts the pegs, plays the chord again; that’s what has happened here, that’s what all my words are, what all our words are, a thing to mimic, a mockingbird’s feast.

Every time you ask AI to create words, to generate an answer, it analyzes the words you input and compares those words to the trillions of relations and concepts it has already categorized and then responds with words that match the most likely response. The chatbot is not thinking, but that doesn’t matter: in the moment, it feels like it’s responding to you. It feels like you’re not alone. But you are.

Wednesday, June 5th, 2024

Fine-tuning Text Inputs

Garrett talks through some handy HTML attributes: spellcheck, autofocus, autocapitalize, autocomplete, and autocorrect:

While they feel like small details, when we set these attributes on inputs, we streamline things for visitors while also guiding the browser on when it should just get out of the way.

Wednesday, May 29th, 2024

The Danger Of Superhuman AI Is Not What You Think - NOEMA

Once you have reduced the concept of human intelligence to what the markets will pay for, then suddenly, all it takes to build an intelligent machine — even a superhuman one — is to make something that generates economically valuable outputs at a rate and average quality that exceeds your own economic output. Anything else is irrelevant.

By describing as superhuman a thing that is entirely insensible and unthinking, an object without desire or hope but relentlessly productive and adaptable to its assigned economically valuable tasks, we implicitly erase or devalue the concept of a “human” and all that a human can do and strive to become. Of course, attempts to erase and devalue the most humane parts of our existence are nothing new; AI is just a new excuse to do it.

Thursday, May 23rd, 2024

Generative AI is for the idea guys

Generative AI is like the ultimate idea guy’s idea! Imagine… if all they needed to create a business, software or art was their great idea, and a computer. No need to engage (or pay) any of those annoying makers who keep talking about limitations, scope, standards, artistic integrity etc. etc.

Thursday, May 16th, 2024

What Are We Actually Doing With A.I. Today? – Pixel Envy

The marketing of A.I. reminds me less of the cryptocurrency and Web3 boom, and more of 5G. Carriers and phone makers promised world-changing capabilities thanks to wireless speeds faster than a lot of residential broadband connections. Nothing like that has yet materialized.

Wednesday, May 15th, 2024

AI Safety for Fleshy Humans: a whirlwind tour

This is a terrifically entertaining, level-headed, in-depth explanation of AI safety. By the end of this year, all three parts will be published; right now the first part is ready for you to read and enjoy.

This 3-part series is your one-stop-shop to understand the core ideas of AI & AI Safety — explained in a friendly, accessible, and slightly opinionated way!

(Related phrases: AI Risk, AI X-Risk, AI Alignment, AI Ethics, AI Not-Kill-Everyone-ism. There is no consensus on what these phrases do & don’t mean, so I’m just using “AI Safety” as a catch-all.)

Saturday, May 4th, 2024

AI is not like you and me

AI is the most anthropomorphized technology in history, starting with the name—intelligence—and plenty of other words thrown around the field: learning, neural, vision, attention, bias, hallucination. These references only make sense to us because they are hallmarks of being human.

But ascribing human qualities to AI is not serving us well. Anthropomorphizing statistical models leads to confusion about what AI does well, what it does poorly, what form it should take, and our agency over all of the above.

There is something kind of pathological going on here. One of the most exciting advances in computer science ever achieved, with so many promising uses, and we can’t think beyond the most obvious, least useful application? What, because we want to see ourselves in this technology?

Meanwhile, we are under-investing in more precise, high-value applications of LLMs that treat generative A.I. models not as people but as tools.

Anthropomorphizing AI not only misleads, but suggests we are on equal footing with, even subservient to, this technology, and there’s nothing we can do about it.