Docker Desktop 4.44: Gordon Remembers You Now, and Model Runner Stops Freezing Your Machine

Docker Desktop 4.44 showing Gordon AI agent with persistent memory and Docker Model Runner multi-model support

Docker Desktop 4.44 ships Gordon persistent memory and multi-model support in Model Runner

Docker Desktop 4.44 shipped this week, and the headline change is not a benchmark or a security patch — it is that Gordon, Docker’s AI agent, finally has a memory. Persistent local memory means Gordon retains your preferences, your project context, and the debugging history you built up over prior sessions. If you have used Gordon before, you know the problem: spend five minutes explaining your stack, close Docker Desktop, open it again tomorrow, and Gordon greets you like a stranger. That is gone in 4.44.

What’s New in Docker Desktop 4.44

The release ships four changes that matter for daily development work:

Gordon persistent local memory — context and preferences survive session restarts
Model Runner multi-model support — run multiple models simultaneously with resource warnings before things go sideways
New docker desktop kubernetes CLI subcommand — Kubernetes cluster management without opening the GUI
Gemini CLI and Goose added to MCP Toolkit — two more AI coding tools connect to Docker’s local MCP server infrastructure

There is also a bundled fix for a critical NVIDIA Container Toolkit vulnerability that affects developers running GPU workloads. More on that at the end.

Gordon Finally Remembers Who You Are

The stateless-per-session design of Gordon was its biggest usability problem. Every conversation started from zero. Gordon had no idea you prefer multi-stage builds, that your app runs on Python 3.12 with a Postgres sidecar, or that you spent the last two sessions debugging a volume mount permission issue. You were always starting over.

4.44 changes this with persistent local memory stored in a local database on your machine. Gordon will now retain your stack preferences and Dockerfile patterns, project context from previous sessions, user preferences like base image choices and build conventions, and debugging history — problems you have already solved together.

The data stays local. Nothing is sent to Docker’s servers. Docker’s engineering team is direct about the design constraint: “a memory system that surfaces irrelevant context is worse than no memory at all.” The goal is not to dump everything into context — it is to surface the right things when they are relevant. That is a harder problem than saving chat logs, and it shows in how deliberately they have built toward this since Gordon’s GA launch.

In practice, Gordon becomes more useful the longer you use it. The first session still requires some orientation, but by the second or third session, Gordon has enough context to skip the boilerplate. That is a meaningful difference for a tool you open every day.

Model Runner: Two Models at Once, No More Silent Freeze

The other major change is in Docker Model Runner, Docker’s built-in local model inference layer. Before 4.44, loading a model too large for your available RAM or VRAM would silently freeze Docker Desktop. No warning — just a hang you eventually had to kill.

4.44 fixes this two ways. Model Runner now warns you when resources are insufficient before attempting to load a model. And you can now run multiple models concurrently. These two changes together make Model Runner viable for multi-model application development — previously, you chose one model at a time and hoped it fit.

Two new flags ship with this release: --gpu (Windows only) explicitly routes inference to the GPU, and --cors enables CORS on the Model Runner HTTP endpoint, which matters when your app calls the local model from a browser-based dev environment. There is also a new inspect mode in the Model Runner UI to view raw requests and responses — useful for confirming your app is actually sending the prompt format you think it is.

Kubernetes CLI and More MCP Clients

The Docker Desktop CLI now includes a docker desktop kubernetes subcommand. You can list Kubernetes images, enable or disable the local cluster, and check status — all from the terminal without touching the GUI. For developers who script their local dev setup or use dev containers, this removes a class of manual GUI interactions that were awkward to automate.

On the MCP side, Google’s Gemini CLI and Block’s Goose agent are now officially supported MCP clients in the Docker MCP Toolkit. The client list has grown to 14 AI coding tools, and the Docker MCP Catalog now lists over 220 pre-built servers. The pattern is clear: Docker is building its MCP Toolkit as the local AI infrastructure layer that works with whatever coding tool your team uses — Claude Code, Cursor, Gemini CLI, Goose, or otherwise.

Upgrade Now if You Run GPU Containers

Docker Desktop 4.44 bundles NVIDIA Container Toolkit v1.17.8, which patches CVE-2025-23266 (NVIDIAScape) — a critical container escape vulnerability with a CVSS score of 9.0 discovered by Wiz Research. An attacker who can get you to run a malicious image can inject code via LD_PRELOAD in CDI mode and execute it with root privileges on the host. If you use Docker with NVIDIA GPU pass-through and CDI mode is enabled, this is a mandatory upgrade.

If you are not running GPU containers, you are not affected. Either way, the upgrade path is the same: check Docker Desktop → Settings → Software Updates or download the latest installer from Docker’s release blog. Full release notes are on the Docker Desktop release notes page.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.