Simon Willison
-
Unseeable prompt injections in screenshots: more vulnerabilities in Comet and other AI browsers
The Brave security team wrote about prompt injection against browser agents a few months ago (here are my notes on that). Here's their follow-up: What we’ve found confirms our initial concerns: indirect prompt…
Published
-
Introducing ChatGPT Atlas
Last year OpenAI hired Chrome engineer Darin Fisher, which sparked speculation they might have their own browser in the pipeline. Today it arrived. ChatGPT Atlas is a Mac-only web browser with a variety of ChatGPT-enabled features. You can bring up a chat panel next to a web…
Published
-
Quoting Phil Gyford
Since getting a modem at the start of the month, and hooking up to the Internet, I’ve spent about an hour every evening actually online (which I guess is costing me about £1 a night), and much of the days and early evenings fiddling about with things. It’s so complicated. All the hype never mentioned…
Published
-
Quoting Bruce Schneier and Barath Raghavan
Prompt injection might be unsolvable in today’s LLMs. LLMs process token sequences, but no mechanism exists to mark token privileges. Every solution proposed introduces new injection vectors: Delimiter? Attackers include delimiters. Instruction hierarchy? Attackers claim priority. Separate models? Double…
Published
-
Claude Code for web - a new asynchronous coding agent from Anthropic
Anthropic launched Claude Code for web this morning. It's an asynchronous coding agent - their answer to OpenAI's Codex Cloud and Google's Jules, and has a very similar shape. I had preview access over the weekend and I've already seen some very promising results from it. It's available online at claude.ai…
Published
-
Getting DeepSeek-OCR working on an NVIDIA Spark via brute force using Claude Code
DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA Spark by having Claude Code effectively brute force the challenge of getting it working on that particular…
Published
-
TIL: Exploring OpenAI's deep research API model o4-mini-deep-research
I landed a PR by Manuel Solorzano adding pricing information to llm-prices.com for OpenAI's o4-mini-deep-research and o3-deep-research models, which they released in June and document here. I realized I'd never tried these before,…
Published
-
The AI water issue is fake
Andy Masley (previously): All U.S. data centers (which mostly support the internet, not AI) used 200–250 million gallons of freshwater daily in 2023. The U.S. consumes approximately 132 billion gallons of freshwater daily. The U.S. circulates a lot more water day to day, but…
Published
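To put the two figures quoted in the entry above side by side, here is a small arithmetic sketch. It assumes the data center figure and the national figure are measured on the same basis, which the truncated excerpt does not confirm; the constant and function names are made up for illustration.

```python
# Figures as quoted in the excerpt (gallons of freshwater per day, 2023).
US_DAILY_FRESHWATER_GALLONS = 132e9          # "approximately 132 billion"
DATA_CENTER_DAILY_GALLONS = (200e6, 250e6)   # "200-250 million"


def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Assumption: both numbers are directly comparable.
low, high = (
    share_percent(gallons, US_DAILY_FRESHWATER_GALLONS)
    for gallons in DATA_CENTER_DAILY_GALLONS
)

print(f"Data centers: {low:.2f}% to {high:.2f}% of daily U.S. freshwater use")
```

Under that assumption, all U.S. data centers together come out at well under one percent of the quoted national total.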
-
Andrej Karpathy — AGI is still a decade away
Extremely high signal 2 hour 25 minute (!) conversation between Andrej Karpathy and Dwarkesh Patel. It starts with Andrej's claim that "the year of agents" is actually more likely to take a decade. Seeing as I accepted 2025 as the year of agents just yesterday…
Published
-
Quoting Alexander Fridriksson and Jay Miller
Using UUIDv7 is generally discouraged for security when the primary key is exposed to end users in external-facing applications or APIs. The main issue is that UUIDv7 incorporates a 48-bit Unix timestamp as its most significant part, meaning the identifier itself leaks the record's creation time. This…
Published
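The timestamp leak described in the UUIDv7 entry above can be sketched in a few lines of Python. This is an illustration only: `make_uuid7`, `leaked_timestamp_ms` and `leaked_datetime` are hypothetical helpers, and the ID is assembled by hand following the RFC 9562 bit layout because `uuid.uuid7()` is not available in older Python releases.

```python
import os
import uuid
from datetime import datetime, timezone


def make_uuid7(unix_ms: int) -> uuid.UUID:
    """Hypothetical helper: build a UUIDv7-shaped value by hand.

    Layout: 48-bit Unix ms timestamp | 4-bit version | 12 random bits |
    2-bit variant | 62 random bits.
    """
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = (rand >> 62) & 0xFFF                  # 12 bits
    rand_b = rand & ((1 << 62) - 1)                # 62 bits
    value = (unix_ms & ((1 << 48) - 1)) << 80      # timestamp in the top 48 bits
    value |= 0x7 << 76                             # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)


def leaked_timestamp_ms(u: uuid.UUID) -> int:
    """Read the creation time straight out of the most significant 48 bits."""
    return u.int >> 80


def leaked_datetime(u: uuid.UUID) -> datetime:
    """Convert the embedded millisecond timestamp to a UTC datetime."""
    return datetime.fromtimestamp(leaked_timestamp_ms(u) / 1000, tz=timezone.utc)


exposed_id = make_uuid7(1_760_000_000_000)
print(exposed_id, "was created at", leaked_datetime(exposed_id).isoformat())
```

Anyone who sees the ID can run the last two functions; no secret is needed, which is the concern the quote raises for externally exposed primary keys.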
-
Should form labels be wrapped or separate?
James Edwards notes that wrapping a form input in a label element like this has a significant downside: <label>Name <input type="text"></label> It turns out both Dragon Naturally Speaking for Windows and Voice Control for macOS and iOS fail to understand this relationship! You need to use the explicit…
Published
-
Quoting Barry Zhang
Skills actually came out of a prototype I built demonstrating that Claude Code is a general-purpose agent :-) It was a natural conclusion once we realized that bash + filesystem were all we needed — Barry Zhang, Anthropic Tags: skills, claude-code, ai-agents, generative-ai, ai, llms
Published
-
Claude Skills are awesome, maybe a bigger deal than MCP
Anthropic this morning introduced Claude Skills, a new pattern for making new abilities available to their models: Claude can now use Skills to improve how it performs specific tasks. Skills are folders that include instructions, scripts, and resources that Claude can load when needed. Claude will only…
Published
-
NVIDIA DGX Spark + Apple Mac Studio = 4x Faster LLM Inference with EXO 1.0
EXO Labs wired a 256GB M3 Ultra Mac Studio up to an NVIDIA DGX Spark and got a 2.8x performance boost serving Llama-3.1 8B (FP16) with an 8,192 token prompt. Their detailed explanation taught me a lot about LLM performance. There…
Published
-
Quoting Riana Pfefferkorn
Pro se litigants [people representing themselves in court without a lawyer] account for the majority of the cases in the United States where a party submitted a court filing containing AI hallucinations. In a country where legal representation is unaffordable for most people, it is no wonder that pro…
Published
-
Coding without typing the code
Last year the most useful exercise for getting a feel for how good LLMs were at writing code was vibe coding (before that name had even been coined) - seeing if you could create a useful small application through prompting alone. Today I think there's a new, more ambitious and significantly more intimidating…
Published
-
Quoting Catherine Wu
While Sonnet 4.5 remains the default [in Claude Code], Haiku 4.5 now powers the Explore subagent which can rapidly gather context on your codebase to build apps even faster. You can select Haiku 4.5 to be your default model in /model. When selected, you’ll automatically use Sonnet 4.5 in Plan mode and…
Published
-
Introducing Claude Haiku 4.5
Anthropic released Claude Haiku 4.5 today, the cheapest member of the Claude 4.5 family that started with Sonnet 4.5 a couple of weeks ago. It's priced at $1/million input tokens and $5/million output tokens, slightly more expensive than Haiku 3.5 ($0.80/$4) and a lot more…
Published
-
Quoting Claude Haiku 4.5 System Card
Previous system cards have reported results on an expanded version of our earlier agentic misalignment evaluation suite: three families of exotic scenarios meant to elicit the model to commit blackmail, attempt a murder, and frame someone for financial crimes. We choose not to report full results here…
Published
-
A modern approach to preventing CSRF in Go
Alex Edwards writes about the new http.CrossOriginProtection middleware that was added to the Go standard library in version 1.25 in August and asks: Have we finally reached the point where CSRF attacks can be prevented without relying on a token-based check…
Published
-
NVIDIA DGX Spark: great hardware, early days for the ecosystem
NVIDIA sent me a preview unit of their new DGX Spark desktop "AI supercomputer". I've never had hardware to review before! You can consider this my first ever sponsored post if you like, but they did not pay me any cash and aside from an embargo date they did not request (nor would I grant) any editorial…
Published
-
Just Talk To It - the no-bs Way of Agentic Engineering
Peter Steinberger's long, detailed description of his current process for using Codex CLI and GPT-5 Codex. This is information dense and full of actionable tips, plus plenty of strong opinions about the differences between Claude 4.5 and GPT-5: While…
Published
-
nanochat
Really interesting new project from Andrej Karpathy, described at length in this discussion post. It provides a full ChatGPT-style LLM, including training, inference and a web UI, that can be trained for as little as $100: This repo is a full-stack implementation of an LLM like ChatGPT in a…
Published
-
Quoting Slashdot
Slashdot: What's the reason OneDrive tells users this setting can only be turned off 3 times a year? (And are those any three times — or does that mean three specific days, like Christmas, New Year's Day, etc.) [Microsoft's publicist chose not to answer this question.] — Slashdot, asking the obvious…
Published
-
Claude Code sub-agents
Claude Code includes the ability to run sub-agents, where a separate agent loop with a fresh token context is dispatched to achieve a goal and report back when it's done. I wrote a bit about how these work in June when I traced Claude Code's activity by intercepting its API calls. I recently learned…
Published
-
Vibing a Non-Trivial Ghostty Feature
Mitchell Hashimoto provides a comprehensive answer to the frequent demand for a detailed description of shipping a non-trivial production feature to an existing project using AI-assistance. In this case it's a slick unobtrusive auto-update UI for his Ghostty terminal…
Published
-
Note on 11th October 2025
I'm beginning to suspect that a key skill in working effectively with coding agents is developing an intuition for when you don't need to closely review every line of code they produce. This feels deeply uncomfortable! Tags: vibe-coding, coding-agents, ai-assisted-programming, generative-ai, ai, llms
Published
-
An MVCC-like columnar table on S3 with constant-time deletes
S3's support for conditional writes (previously) makes it an interesting, scalable and often inexpensive platform for all kinds of database patterns. Shayon Mukherjee presents an ingenious design for a Parquet-backed database in S3 which accepts…
Published
-
simonw/claude-skills
One of the tips I picked up from Jesse Vincent's Claude Code Superpowers post (previously) was this: Skills are what give your agents Superpowers. The first time they really popped up on my radar was a few weeks ago when Anthropic rolled out improved Office document creation. When…
Published
-
Superpowers: How I'm using coding agents in October 2025
A follow-up to Jesse Vincent's post about September, but this is a really significant piece in its own right. Jesse is one of the most creative users of coding agents (Claude Code in particular) that I know. He's put a great amount of work into…
Published