Link tags: crawling

4

sparkline

Pulling my site from Google over AI training – Tracy Durnell

I’m not down with Google swallowing everything posted on the internet to train their generative AI models.

A principled approach to evolving choice and control for web content

This would mean a lot more if it happened before the wholesale harvesting of everyone’s work.

But I’m sure Google will put a mighty fine lock on that stable door that the horse bolted from.

Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites

1,841 instances of dark patterns on ecommerce sites, in the categories of sneaking, urgency, misdirection, social proof, scarcity, obstruction, and forced action. You can browse this overview, read the paper, or look at the raw data.

We conducted a large-scale study, analyzing ~53K product pages from ~11K shopping websites to characterize and quantify the prevalence of dark patterns.

Archiving web sites [LWN.net]

As it turns out, some sites are much harder to archive than others. This article goes through the process of archiving traditional web sites and shows how it falls short when confronted with the latest fashions in the single-page applications that are bloating the modern web.