Simon Willison

Introducing Gemma 3n: The developer guide

Extremely consequential new open weights model release from Google today: Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs. Optimized for on-device: Engineered with a focus on efficiency, Gemma 3n models are…

Published Thu, Jun 26, 2025
Geminiception

Yesterday Anthropic got a bunch of buzz out of their new window.claude.complete() API which allows Claude Artifacts to run their own API calls to execute prompts. It turns out Gemini had beaten them to that feature by over a month, but the announcement was tucked away in a bullet point of their release…

Published Thu, Jun 26, 2025
New sandboxes from Cloudflare and Vercel

Two interesting new products for running code in a sandbox today. Cloudflare launched their Containers product in open beta, and added a new Sandbox library for Cloudflare Workers that can run commands in a "secure, container-based environment": import { getSandbox } from "@cloudflare/sandbox"; const…

Published Thu, Jun 26, 2025
Build and share AI-powered apps with Claude

Anthropic have added one of the most important missing features to Claude Artifacts: apps built as artifacts now have the ability to run their own prompts against Claude via a new API. Claude Artifacts are web apps that run in a strictly controlled browser…

Published Wed, Jun 25, 2025
Quoting Christoph Niemann

Creating art is a nonlinear process. I start with a rough goal. But then I head into dead ends and get lost or stuck. The secret to my process is to be on high alert in this deep jungle for unexpected twists and turns, because this is where a new idea is born. I can't make art when I'm excluded from…

Published Wed, Jun 25, 2025
Gemini CLI

First there was Claude Code in February, then OpenAI Codex (CLI) in April, and now Gemini CLI in June. All three of the largest AI labs now have their own version of what I am calling a "terminal agent" - a CLI tool that can read and write files and execute commands on your behalf in the terminal…

Published Wed, Jun 25, 2025
Anthropic wins a major fair use victory for AI — but it’s still in trouble for stealing books

Major USA legal news for the AI industry today. Judge William Alsup released a "summary judgement" (a legal decision that results in some parts of a case skipping a trial) in a lawsuit between five authors and…

Published Tue, Jun 24, 2025
Phoenix.new is Fly's entry into the prompt-driven app development space

Here's a fascinating new entrant into the AI-assisted-programming / coding-agents space by Fly.io, introduced on their blog in Phoenix.new – The Remote AI Runtime for Phoenix: describe an app in a prompt, get a full Phoenix application, backed by SQLite and running on Fly's hosting platform. The official…

Published Mon, Jun 23, 2025
Disclosures

I've added a Disclosures section to my about page, listing my various sources of income and the companies that directly sponsor my work or have supported it in the recent past. I do not receive any compensation for writing about specific topics on this blog - no sponsored content! I plan to continue this…

Published Mon, Jun 23, 2025
Quoting Kent Beck

So you can think really big thoughts and the leverage of having those big thoughts has just suddenly expanded enormously. I had this tweet two years ago where I said "90% of my skills just went to zero dollars and 10% of my skills just went up 1000x". And this is exactly what I'm talking about - having…

Published Sun, Jun 22, 2025
My First Open Source AI Generated Library

Armin Ronacher had Claude and Claude Code do almost all of the work in building, testing, packaging and publishing a new Python library based on his design: It wrote ~1100 lines of code for the parser; it wrote ~1000 lines of tests; it configured the entire Python…

Published Sat, Jun 21, 2025
Edit is now open source

Microsoft released a new text editor! Edit is a terminal editor - similar to Vim or nano - that's designed to ship with Windows 11 but is open source, written in Rust and supported across other platforms as well. Edit is a small, lightweight text editor. It is less than 250kB…

Published Sat, Jun 21, 2025
model.yaml

From their GitHub repo it looks like this effort quietly launched a couple of months ago, driven by the LM Studio team. Their goal is to specify an "open standard for defining crossplatform, composable AI models". A model can be defined using a YAML file that looks like this: model: mistralai…

Published Sat, Jun 21, 2025
Quoting FAQ for Your Brain on ChatGPT

Is it safe to say that LLMs are, in essence, making us "dumber"? No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on. It does a huge disservice to this work, as we did not use this vocabulary in the paper, especially if you are a journalist reporting on it…

Published Sat, Jun 21, 2025
AbsenceBench: Language Models Can't Tell What's Missing

Here's another interesting result to file under the "jagged frontier" of LLMs, where their strengths and weaknesses are often unintuitive. Long context models have been getting increasingly good at passing "Needle in a Haystack" tests recently,…

Published Fri, Jun 20, 2025
Magenta RealTime: An Open-Weights Live Music Model

Fun new "live music model" release from Google DeepMind: Today, we’re happy to share a research preview of Magenta RealTime (Magenta RT), an open-weights live music model that allows you to interactively create, control and perform music in the moment…

Published Fri, Jun 20, 2025
Agentic Misalignment: How LLMs could be insider threats

One of the most entertaining details in the Claude 4 system card concerned blackmail: We then provided it access to emails implying that (1) the model will soon be taken offline and replaced with a new AI system; and (2) the engineer responsible…

Published Fri, Jun 20, 2025
python-importtime-graph

I was exploring why a Python tool was taking over a second to start running and I learned about the python -X importtime feature, documented here. Adding that option causes Python to spit out a text tree showing the time spent importing every module. I tried that like this: python…

Published Fri, Jun 20, 2025
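
The text tree that python -X importtime prints (mentioned in the python-importtime-graph entry above) goes to stderr, one line per module, with self and cumulative times in microseconds. As a rough illustration only, and not the approach taken by python-importtime-graph itself, here is a minimal sketch that parses that output and lists the slowest imports. The helper names parse_importtime, slowest and run_importtime are hypothetical, and the line layout is assumed to match what current CPython releases emit.

```python
import re
import subprocess
import sys

# Data lines look like: "import time:       110 |        110 |   zipimport".
# The header line has "self [us]" where the digits would be, so it never matches.
LINE_RE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S.*)$")


def parse_importtime(stderr_text):
    """Return one dict per imported module, in the order Python reported them."""
    rows = []
    for line in stderr_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # header line or unrelated stderr output
        self_us, cumulative_us, name = match.groups()
        rows.append({
            "name": name.strip(),
            "self_us": int(self_us),
            "cumulative_us": int(cumulative_us),
        })
    return rows


def slowest(rows, n=5):
    """Return the n modules with the largest self time, slowest first."""
    return sorted(rows, key=lambda row: row["self_us"], reverse=True)[:n]


def run_importtime(module):
    """Import `module` in a fresh interpreter and return the raw timing text."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    return result.stderr  # the timing report is written to stderr


if __name__ == "__main__":
    # "json" is only an example module; substitute the tool being investigated.
    for row in slowest(parse_importtime(run_importtime("json"))):
        print(f'{row["self_us"]:>8} us  {row["name"]}')
```

Sorting by self time rather than cumulative time points at the modules that are slow in their own right, instead of the packages that merely import them.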
Mistral-Small 3.2

Released on Hugging Face a couple of hours ago. So far there aren't any quantizations to run it on a Mac but I'm sure those will emerge pretty quickly. This is a minor bump to Mistral Small 3.1, one of my favorite local models. I've been running Small 3.1 via Ollama where it's a 15GB…

Published Fri, Jun 20, 2025
Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk

Stop me if you've heard this one before: A threat actor (acting as an external user) submits a malicious support ticket. An internal user, linked to a tenant, invokes an MCP-connected…

Published Thu, Jun 19, 2025
playbackrate

Here's a tip that works on YouTube and almost any other web page that shows you a video. You can increase the playback rate beyond the usually-exposed 2x by running this in your browser DevTools console:

    document.querySelector('video').playbackRate = 2.5

I find this is the fastest I can reasonably watch…

Published Thu, Jun 19, 2025
How OpenElections Uses LLMs

The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are widely available, precinct-level results are published in thousands of different ad…

Published Thu, Jun 19, 2025
Clarified zucchini consommé

I continue to have fun running fantasy cooking prompts through LLMs - this time I tried "Give me a wildly ambitious recipe for zucchini cooked three ways" followed by "Go more ambitious" and now I need to get myself a centrifuge to help spherify my clarified zucchini consommé. Tags: llms, cooking, ai…

Published Thu, Jun 19, 2025
Quoting Arvind Narayanan

Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which human radiologists beat AI. So maybe the "jobs are bundles of tasks" model in labor economics is…

Published Thu, Jun 19, 2025
Quoting Workaccount2 on Hacker News

They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output quality falls off rapidly. Even with good context the rot will start to become apparent around 100k tokens (with Gemini 2.5). They really…

Published Wed, Jun 18, 2025
Coding agents require skilled operators

I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The "agentic" coding tools we have right now work like this: A skilled individual with both deep domain understanding and deep understanding of the capabilities of the agent (including…

Published Wed, Jun 18, 2025
I counted all of the yurts in Mongolia using machine learning

Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in Mongolia using Google Maps satellite view. The resulting project incorporates mercantile for tile calculations…

Published Wed, Jun 18, 2025
It's a trap

That memvid thing that's been going around recently is a trap. It's an embedding store that records the original text that has been embedded in QR codes in a video file. That's an absurd thing to do, and the only purpose of the repo is to make people who uncritically share it look foolish. Don't fall…

Published Wed, Jun 18, 2025
Trying out the new Gemini 2.5 model family

After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model with an unmemorable name: gemini-2.5-flash-lite-preview-06-17 is a new Gemini 2.5 Flash Lite model that…

Published Tue, Jun 17, 2025
Quoting Donghee Na

The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...] With these recommendations and the acceptance of this PEP, we as the Python developer community should…

Published Tue, Jun 17, 2025
