Generalist LLMs are not lawyers, and evaluating them that way is a waste of time. Evaluating LLMs with useful specialized prompts (and eventually, with specialized legal harnesses) is where the work must happen.
Hosting the Conservation Evidence conference at Pembroke, recovering from the India trip, and keeping up with LLM developments.
Wrapping up 25 days of agentic coding with a Claude Code OCaml plugin marketplace to share the skills and tools developed throughout the series.
Tuatara is a feed aggregator that integrates Claude to evolve and patch its own code when encountering parsing errors, embodying the concept of self-healing software.
Introducing unpac, a tool that unifies git and package management into a single workflow where all code dependencies live in one repository as trackable branches.
Materialising opam metadata into git submodules and monorepos, enabling cross-cutting fixes and unified odoc3 documentation across dozens of OCaml libraries.
Porting the W3C's Nu HTML Validator from Java to OCaml and running it dynamically in the browser.
Porting the Nu HTML Validator's language detection to OCaml, then optimizing from 115MB to 28MB and fixing WASM array limits for browser deployment.
Building an OCaml Zulip bot framework with functional handlers, and pivoting from TOML to INI codecs for Python configparser compatibility.
Building tomlt, a pure OCaml TOML 1.1 parser with bidirectional codecs following the jsont design patterns.
Building an OCaml JMAP client that runs in browsers and CLI, then using it to build personalised email workflows for taming notification overload.
Building interactive OCaml tutorials that compile to JavaScript, using agents to generate executable documentation that teaches protocols like JSON Pointer while you review code.
Vibespiling JustHTML from Python to pure OCaml, achieving 100% pass rate on the browser html5lib test suite using agentic workflows.
Vibe coding an OCaml library for the Karakeep bookmarking service by giving an agent a live API key and letting it debug jsont codecs against the real service.
Agentically synthesising a batteries-included OCaml HTTP client by gathering recommendations from fifty open-source implementations across JavaScript, Python, Java, Rust, Swift, Haskell, Go, C++, PHP and shell.
Building a TCP/TLS connection pooling library for Eio with DNS-based load balancing, stacked error handling, and self-contained HTML visualisations for stress test results.
Synthesizing three RFC-compliant libraries (punycode, public-suffix, and cookeio) directly from Internet RFC specifications, establishing a workflow for automating standards implementation with proper cross-referencing to spec sections.
Building a simpler single-process terminal UI for Sortal using Mosaic's effects-based direct-style API, integrating Eio, and discovering multimodal image debugging for terminal layouts.
Experimenting with OxCaml's bonsai_term framework for Sortal's terminal UI, navigating Eio-Async interoperability challenges through JSON-RPC while discovering image-based debugging techniques for terminal applications.
Creating Sortal, a CLI contacts management application with Yaml storage, XDG directories, and Git-based synchronization that integrates all the previously built libraries into one cohesive tool.
Building yamlt to enable jsont codec definitions to work with both JSON and Yaml, providing data manipulation with location tracking and good error messages for both formats.
Implementing a pure OCaml Yaml 1.2 parser using bytesrw by synthesizing from the specification and existing C library behavior, passing thousands of test suite cases while being 20% faster than the C-based implementation.
Building Bytesrw-Eio adapters for composable byte stream I/O while discovering Claude Skills as a powerful way to automate opam package metadata management through reusable workflow templates.
Creating OCaml bindings for the Claude API using Eio and jsont codecs by reverse-engineering the JSON-RPC protocol from Python and Go SDKs, enabling Claude to write more Claude-powered OCaml code.
Building an XDG Base Directory Specification library with Eio capabilities and Cmdliner integration, providing sandboxed filesystem access patterns with full environment variable and CLI override support.
Implementing a JSONFeed specification library using jsont codecs, discovering how Claude can automate the construction of complex combinators from prose specifications with excellent error messages.
Building a Base32 Crockford encoding library in OCaml using Claude Code, establishing the development workflow with sandboxed Docker containers and local development environments.
Reflections on the Franco-British AI collaboration workshops exploring how AI is transforming scientific practice, plus follow-up funding for the Conservation Copilot project.
An exploration of agentic programming through building useful OCaml libraries daily using Claude Code while establishing ground rules for responsible development.
Elon Musk’s new Wikipedia clone has been criticized for nicking Wikipedia. Something else it does: aggressively cite the Tedium archive.
Presentation at Aarhus 2025 on Internet ecology, proposing AI-driven software diversity to fight protocol ossification and create more resilient networks.
Setting up self-hosted location tracking using OwnTracks and reverse engineering Life Cycle app data with Claude Code for field work in Botswana.
For an unlicensed game accessory, the Game Genie sure casts a long shadow. It reshaped the games we already owned—and had a profound effect on copyright law.
Community efforts to improve the agentic coding experience for OCaml, including MCP libraries, opam embeddings, and tooling improvements.
Nature comment on the threat that AI-generated papers pose to evidence synthesis, proposing federated living evidence databases with human-in-the-loop review.
AI gets a lot of hate these days, and it often frustrates me too, but let’s be clear about what it can realistically do. Here’s my attempt to explain by example.
Survey paper on energy-aware approaches for optimizing deep learning training and inference on embedded devices.
PLOS One publication showing that pretrained LLMs perform poorly on conservation questions but improve dramatically with Conservation Evidence database training.
The creator-economy service Gumroad decided to open-source its platform at a suspiciously convenient time. (And even “open source” might be stretching it.)
In case you were on the fence about whether OpenAI was a positive force in the world, they sort of revealed their hand this week by leaning into a meme.
On large language models, artificial intelligence, DeepSeek, and trying to find the middle lane between skepticism and surety. I mention bionic arms a lot for some reason.
The mess between Forbes and Perplexity AI highlights how soulless and extractive aggregation can be in the wrong hands. It’s the wrong direction for LLMs.