
xAI pushed a significant update to its open-source X recommendation algorithm on May 15. The change is subtle in the changelog and substantial in practice: a new phoenix/run_pipeline.py unifies the previously separate retrieval and ranking scripts into a single entry point, and a pre-trained mini Phoenix model ships with the repository via Git LFS. You no longer need to train anything to get inference running. You can clone the engine that powers X’s For You feed and run it with a handful of commands.
That is a different thing from what existed before. The repository has been open since January 2026, but what shipped then was inspectable, not operational. Model weights were absent. Scripts were disconnected. The community could read the code, and did, generating 1,600 GitHub stars in six hours, but could not run inference without significant work. The May 15 update closes that gap.
What Shipped on May 15
The update packages a pre-trained mini Phoenix model as a 3 GB archive distributed via Git LFS. Getting inference running looks like this:
git clone https://github.com/xai-org/x-algorithm
cd x-algorithm
git lfs pull
python phoenix/run_pipeline.py
Beyond the unified pipeline, the update ships a Grox content-understanding component (classifiers, embedders, and a task-execution engine for content workloads), an integrated ad blending system, and new candidate hydrators that enrich posts before ranking. xAI has committed to publishing updates every four weeks with developer notes.
Four Components, One Feed
Understanding what you are actually running matters before forking it. The X algorithm is four components working together:
- Home Mixer is the orchestration layer. It coordinates the other components, calls them in the right order, and assembles the final ranked feed.
- Thunder is an in-memory, Kafka-backed post store that handles sub-millisecond lookups for in-network content (posts from accounts you follow) without touching an external database.
- Phoenix is the ML subsystem: a two-tower transformer retrieves out-of-network candidates, and a Grok-derived ranking model predicts engagement probability.
- Candidate Pipeline is the modular Rust framework that wires everything together, with Sources, Hydrators, Filters, and Scorers as swappable components.
The codebase is 57.4% Rust and 42.6% Python under Apache 2.0. Rust handles the real-time systems; Python handles the ML. The pre-trained mini Phoenix model uses 256-dimensional embeddings, four attention heads, and two transformer layers: meaningfully smaller than production, but enough to run real inference.
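To make the two-tower retrieval idea concrete, here is a minimal sketch in plain Python. It is not code from the repository: the function names, the random stand-in embeddings, and the planted "post_match" entry are all hypothetical, and the per-head width assumes the usual even split of the embedding across attention heads. Only the three configuration numbers come from the description above.

```python
import math
import random

EMBED_DIM = 256   # embedding width stated for the mini Phoenix model
NUM_HEADS = 4     # attention heads stated for the mini Phoenix model
NUM_LAYERS = 2    # transformer layers stated for the mini Phoenix model

# Assumption: standard multi-head attention splits the width evenly across heads.
HEAD_DIM = EMBED_DIM // NUM_HEADS


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(user_vec, post_vecs, k):
    """Return ids of the k posts whose embeddings score highest against the user."""
    scored = sorted(
        post_vecs.items(),
        key=lambda item: dot(user_vec, item[1]),
        reverse=True,
    )
    return [post_id for post_id, _ in scored[:k]]


# Hypothetical stand-ins for the outputs of a user tower and a post tower.
rng = random.Random(0)
user_embedding = normalize([rng.gauss(0, 1) for _ in range(EMBED_DIM)])
post_embeddings = {
    f"post_{i}": normalize([rng.gauss(0, 1) for _ in range(EMBED_DIM)])
    for i in range(1000)
}
# Plant one post identical to the user vector so the top result is easy to check.
post_embeddings["post_match"] = list(user_embedding)

top = retrieve_top_k(user_embedding, post_embeddings, k=5)
print(HEAD_DIM, top[0])
```

The point of the two-tower design is that post embeddings can be computed ahead of time, so retrieval at request time reduces to a similarity search like the one above. A real system would likely use an approximate nearest-neighbor index in place of the full sort.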
The Signal Hierarchy Is the Interesting Part
The most immediately useful finding in the open-sourced Phoenix scoring code is the engagement weight hierarchy. The algorithm does not treat all engagement as equivalent:
- Two-way conversation (author replies to commenter): approximately 150 times a like
- Reposts: 20 times a like
- Replies: 13.5 times a like
- Link clicks: 11 times a like
- Bookmarks: 10 times a like
- Likes: base weight
The algorithm is explicitly built to surface discussion, not passive consumption. One two-way reply thread carries roughly as much algorithmic weight as 150 likes. If you are building content strategy on X, or building a platform where you want to adapt these weights, this is the signal hierarchy you are working with.
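The hierarchy is easier to reason about as arithmetic. The sketch below is a simplification: it weights raw engagement counts, whereas a ranking model of the kind described here would weight predicted engagement probabilities. The dictionary keys and the function name are hypothetical; only the multipliers come from the list above, and the 150 figure is approximate.

```python
# Relative weights from the list above, in units of one like.
# Key names are illustrative, not identifiers from the repository.
ENGAGEMENT_WEIGHTS = {
    "two_way_conversation": 150.0,  # approximate in the source
    "repost": 20.0,
    "reply": 13.5,
    "link_click": 11.0,
    "bookmark": 10.0,
    "like": 1.0,
}


def like_equivalents(counts):
    """Sum engagement counts weighted by the hierarchy, in units of one like.

    Raises KeyError for an action that has no listed weight.
    """
    return sum(ENGAGEMENT_WEIGHTS[action] * n for action, n in counts.items())


passive_post = {"like": 150}
discussed_post = {"like": 10, "reply": 4, "two_way_conversation": 1}

print(like_equivalents(passive_post))    # 150.0
print(like_equivalents(discussed_post))  # 214.0
```

Under these weights, a post with ten likes, four replies, and a single author reply outscores a post with 150 likes and nothing else.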
What Developers Can Build With It
The Candidate Pipeline framework is where the practical value lives for builders. The architecture uses a Strategy Pattern at system level: you can add a new data source by implementing a Source, change how candidates are enriched by writing a Hydrator, and swap ranking logic by replacing the Scorer — without touching the other components. Founders building niche social networks or editorial feed products now have a production-quality recommendation pipeline baseline that does not require training a transformer from scratch.
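The swap-one-piece idea can be sketched as follows. The framework itself is written in Rust, so this Python version only illustrates the pattern: every class, method, and field name here is hypothetical and none of it is the repository's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Candidate:
    post_id: str
    features: Dict[str, float] = field(default_factory=dict)
    score: float = 0.0


class Source(Protocol):
    def fetch(self) -> List[Candidate]: ...


class Hydrator(Protocol):
    def hydrate(self, candidate: Candidate) -> Candidate: ...


class Filter(Protocol):
    def keep(self, candidate: Candidate) -> bool: ...


class Scorer(Protocol):
    def score(self, candidate: Candidate) -> float: ...


def run_pipeline(sources, hydrators, filters, scorer):
    """Fetch, enrich, filter, then score candidates; return them best first."""
    candidates = [c for source in sources for c in source.fetch()]
    for hydrator in hydrators:
        candidates = [hydrator.hydrate(c) for c in candidates]
    candidates = [c for c in candidates if all(f.keep(c) for f in filters)]
    for c in candidates:
        c.score = scorer.score(c)
    return sorted(candidates, key=lambda c: c.score, reverse=True)


# Toy implementations backed by an in-memory table of reply counts.
REPLY_COUNTS = {"a": 3, "b": 0, "c": 7}


class StaticSource:
    def __init__(self, post_ids):
        self.post_ids = post_ids

    def fetch(self):
        return [Candidate(post_id=p) for p in self.post_ids]


class ReplyCountHydrator:
    def hydrate(self, candidate):
        candidate.features["replies"] = REPLY_COUNTS.get(candidate.post_id, 0)
        return candidate


class HasRepliesFilter:
    def keep(self, candidate):
        return candidate.features["replies"] > 0


class MostRepliesScorer:
    def score(self, candidate):
        return float(candidate.features["replies"])


class FewestRepliesScorer:
    def score(self, candidate):
        return -float(candidate.features["replies"])


source = StaticSource(["a", "b", "c"])
feed = run_pipeline([source], [ReplyCountHydrator()], [HasRepliesFilter()], MostRepliesScorer())
swapped = run_pipeline([source], [ReplyCountHydrator()], [HasRepliesFilter()], FewestRepliesScorer())

print([c.post_id for c in feed])     # ['c', 'a']
print([c.post_id for c in swapped])  # ['a', 'c']
```

Only the scorer changes between the two runs; the source, hydrator, and filter are reused untouched, which is the property the Strategy Pattern is meant to provide.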
xAI has also said it is accepting community contributions: issues and pull requests sync to the internal repository. Whether it follows through over multiple update cycles is worth watching.
The Honest Limits
The pre-trained mini Phoenix model is not the production Phoenix model. The production model operates at a scale and dimensionality that the 3 GB download does not replicate. X’s social graph training data is not included. The trust and safety pipeline is absent from the repository. This is a genuine, runnable baseline — not a drop-in production system.
This week marks a clean dividing line in the history of recommendation systems. Building a social or content feed no longer requires you to derive the architecture from first principles or academic papers. You can start from the architecture of a system that has been serving 500 million users and customize from there.









