Run Codex CLI on Mac Against gpt-oss:120b via Tailscale

Simon Willison shows how to run OpenAI's Codex CLI on a Mac while offloading inference to a gpt-oss:120b instance served by Ollama on his NVIDIA DGX Spark, connected over Tailscale. It works well enough to build a Space Invaders game, but it's more of a fun hack than a daily driver.

TL;DR

  • Simon Willison connected his Mac to a DGX Spark server using Tailscale to securely access it over the internet.
  • He configured Ollama on the Spark to serve large models like gpt-oss:120b over the network.
  • It works well enough, but he notes GPT-5 and Claude Sonnet 4.5 are still far more capable.

Simon Willison shows how he ran OpenAI's Codex CLI on his Mac against a gpt-oss:120b model hosted remotely on his NVIDIA DGX Spark box, all connected through Tailscale. In a new post, Willison explains how he wired up his Mac to securely access a GPU-powered AI model running on his home lab hardware. After setting up Tailscale on both machines, he configured Ollama on the DGX Spark to listen on all network interfaces instead of just localhost. With that done, the Mac could point its OLLAMA_HOST at the Spark's Tailscale IP and interact with large local models like gpt-oss:120b.
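The plumbing described above can be sketched in a few shell commands. This is a minimal sketch, not Willison's exact commands: it assumes Ollama's default port 11434, and `100.x.y.z` stands in for the Spark's actual Tailscale address (shown by `tailscale status`).

```shell
# On the DGX Spark: make Ollama listen on all interfaces instead of localhost.
# (Systemd-managed installs would set this via an Environment= override instead.)
OLLAMA_HOST=0.0.0.0 ollama serve

# Still on the Spark: pull the model so it is ready to serve.
ollama pull gpt-oss:120b

# On the Mac: point the Ollama client at the Spark's Tailscale IP.
# 100.x.y.z is a placeholder for the address Tailscale assigned to the Spark.
export OLLAMA_HOST=100.x.y.z:11434
ollama run gpt-oss:120b "Say hello"
```

Because Tailscale puts both machines on a private WireGuard mesh, exposing Ollama on all interfaces stays reasonably safe: only devices on the tailnet can reach port 11434.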

The real trick came when he figured out how to make OpenAI's Codex CLI talk to that remote model. By setting CODEX_OSS_BASE_URL to the Spark's Ollama API URL and specifying --model gpt-oss:120b, Codex started streaming completions from the 120B open-weight model instead of from OpenAI's servers.
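Putting that together, the invocation looks roughly like this. The IP is again a placeholder, and the `/v1` path is an assumption based on Ollama's OpenAI-compatible endpoint; check Willison's post for the exact URL he used.

```shell
# On the Mac: tell Codex CLI's open-model mode where the remote Ollama API lives,
# then select the model. 100.x.y.z is a placeholder Tailscale IP.
export CODEX_OSS_BASE_URL="http://100.x.y.z:11434/v1"
codex --oss --model gpt-oss:120b
```

From Codex's point of view nothing else changes; prompts and completions simply travel over the tailnet to the Spark instead of to OpenAI's hosted models.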

To stress-test the setup, Willison asked Codex to build a Space Invaders game from scratch, HTML, Git repo and all, which worked, albeit with slower and less capable results than modern GPT-5 or Claude Sonnet 4.5. Still, he says, it's pretty neat to have a private Codex-style workflow powered by his own hardware from anywhere in the world.

Source: Running Codex CLI against gpt-oss:120b on the DGX Spark via Tailscale
