All Posts
14 min read
Jun 14, 2026
Running a Headless Claude Code Trading Agent on GCP with the Robinhood MCP Server
10 min read
Jun 6, 2026
Kokoro-82M: Running a Local Text-to-Speech Model on One GPU
13 min read
Jun 3, 2026
Running Ideogram 4 Locally: Quantized Inference and Structured JSON Captions
6 min read
Jun 1, 2026
Simple Things Scaled Up
10 min read
May 31, 2026
X-Token: Distilling Knowledge Across Tokenizers That Don't Speak the Same Language
10 min read
May 4, 2026
Building an Agentic Movie Recommender on Cloudflare Pages
4 min read
May 1, 2026
It Started as a Chatbot
22 min read
Apr 30, 2026
LLM Glossary (In Progress)
11 min read
Apr 25, 2026
DFlash: How Block Diffusion Breaks the Speculative Decoding Ceiling
1 min read
Apr 21, 2026
Laws of LLMs and Agents
5 min read
Apr 19, 2026
The Fragmented Researcher: Engineering Focus Around a 10-Month-Old
5 min read
Apr 18, 2026
Large Language Models are beautiful.
16 min read
Apr 14, 2026
27,000 Tokens Before Hello: The Agent Harness Tax
20 min read
Apr 6, 2026
Gemma 4: Everything You Need to Know About Google's Most Capable Open Model
4 min read
Mar 26, 2026
Embeddings are beautiful.
15 min read
Mar 25, 2026
TurboQuant: The cheat sheet that ate your GPU (and how Google fixed it)
5 min read
Mar 1, 2026
Doc-to-LoRA & Text-to-LoRA: How Sakana is teaching LLMs to learn instantly
8 min read
Feb 26, 2026
BlinkThink: Self-Hosted Camera Snapshots with FastAPI and Gemini
7 min read
Feb 15, 2026
Building a Browser Agent with Gemini and Playwright
5 min read
Feb 8, 2026
The 10-Million Token Paradox: Decoding the Logic of Recursive Language Models
3 min read
Feb 8, 2026
OpenClaw: 98% Plumbing, 2% Revolution
9 min read
Feb 15, 2025
VerbalVista: Talking to Your Own Data with RAG, FAISS, and a Bit of Stubbornness
14 min read
May 1, 2024
