South Korea’s $1 billion national AI project just hit its first major test, and it’s failing. Naver Cloud, a leading participant in the country’s AI sovereignty initiative, stands accused of using components from Alibaba’s Qwen 2.5 model, violating the program’s core “from scratch” requirement. Technical analysis shows 99.5% cosine similarity between HyperCLOVA X’s vision encoder and Qwen’s, raising a question that cuts to the heart of AI development: what does building “from scratch” actually mean?
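Cosine similarity, the metric cited in that analysis, compares two models' weights by flattening corresponding tensors into vectors; a score near 1.0 means near-identical parameters, which independent training would be vanishingly unlikely to produce. A minimal sketch of the computation, using toy vectors rather than actual model weights:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two flattened weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical weights score 1.0; unrelated weights score near 0.
# Real analyses run this layer-by-layer over millions of parameters.
original = [0.12, -0.45, 0.33, 0.08]
suspect = [0.12, -0.45, 0.33, 0.08]
print(round(cosine_similarity(original, suspect), 4))  # 1.0
```

The same comparison at 99.5% across an entire encoder is what makes the "independent development" claim hard to sustain.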
The Controversy
The Ministry of Science and ICT established clear rules for South Korea’s national AI foundation model project: participants must design and train models independently to ensure AI sovereignty. The goal? Prepare for scenarios where US or Chinese tech giants increase licensing fees or revoke access entirely.
However, developer communities discovered that Naver Cloud’s HyperCLOVA X SEED 32B—one of five competing models—shares near-identical architecture with Alibaba’s open-source vision encoder. That encoder, which converts images into numerical signals the model can process, accounts for 12% of the model’s parameters. That’s not peripheral. It’s substantial.
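Whether 12% counts as "substantial" is easy to check once a model's per-component parameter counts are known. A toy sketch of the arithmetic (the counts below are hypothetical round numbers for a 32B multimodal model, not Naver's actual architecture breakdown):

```python
# Hypothetical component sizes, in parameters (illustrative only).
components = {
    "vision_encoder": 3.8e9,       # the contested, allegedly reused part
    "language_backbone": 27.2e9,   # the domestically trained core
    "projection_and_other": 1.0e9,
}

total = sum(components.values())
for name, count in components.items():
    print(f"{name}: {count / total:.1%} of total parameters")
```

With these numbers the encoder lands at roughly 12% of the whole model, which is the fraction at the center of the dispute.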
Naver Cloud defended the decision as “strategic engineering” to optimize compatibility with global technology ecosystems. Meanwhile, a second company, Upstage, faced similar allegations but was cleared after demonstrating only 0.0004% overlap with Chinese models. The Ministry’s evaluation concludes January 15, 2026, with potential elimination of non-compliant consortia.
What “From Scratch” Actually Means
Here’s where it gets messy. Industry experts increasingly agree the problem isn’t the technology—it’s the definition. Does “from scratch” mean every component must be domestically developed, or just the core reasoning engine?
Building a foundation model truly from scratch costs millions and takes months. Furthermore, most modern AI development relies on pre-trained components—not because developers are lazy, but because it’s how the field advances. Open-source encoders, frameworks, and datasets are industry standard. Even “sovereign” models depend on foreign chips, cloud infrastructure, and global training data.
The strict interpretation says 12% foreign components equals failure. In contrast, the flexible view says focus on what matters: the 88% domestic innovation in the core reasoning model. Which interpretation wins will define AI sovereignty for every nation watching this controversy.
Why AI Sovereignty Matters Beyond Korea
South Korea isn’t alone in chasing AI sovereignty. In fact, Europe is proposing €15-20 billion annually for sovereign AI infrastructure and effectively barring US AI agents from public sector workflows ahead of August 2026’s EU AI Act compliance deadline. Similarly, the US is planning 25 gigawatts of AI data center capacity while promoting “digital solidarity” over sovereignty. China’s “Delete A” project aims to remove American technology from supply chains entirely.
Yet everyone faces the same paradox: true independence requires controlling chips, cloud infrastructure, training data, and model development. No nation has achieved this. Ultimately, South Korea’s controversy exposes the gap between sovereignty rhetoric and technical reality.
The Developer Impact
This isn’t abstract geopolitics—it affects how you build. Open-source AI is now political, and your choice of foundation model carries geopolitical implications. National projects will increasingly scrutinize dependencies, and AI ecosystems are fragmenting into US, EU, Chinese, and regional alternatives.
Expect increasing transparency requirements: clear documentation of component sourcing, verification of domestic versus foreign parts, new compliance frameworks. The one-size-fits-all AI era is ending.
The Real Question
Here’s the uncomfortable truth: AI sovereignty as currently defined may be performative nationalism. Is South Korea better served by a pure “from scratch” model that takes longer, costs more, and potentially lags behind global competitors? Or by pragmatic sovereignty—securing the core reasoning capabilities domestically while leveraging open-source components strategically?
Maybe Naver using Qwen’s encoder isn’t cheating. Maybe it’s smart engineering. The developer community noted the real accomplishment: Naver trained a capable 32-billion-parameter language model. Rather than build a vision encoder from scratch, they chose efficiency over purity.
The January 15 deadline will force clarity. Other nations pursuing AI sovereignty are watching. Therefore, the outcome won’t just determine Naver’s fate—it’ll set precedent for how every country defines “independence” in an inherently global technology.
Because if 12% borrowed components disqualifies a $1 billion national project, no one’s achieving AI sovereignty anytime soon.