Alpha Arena Reveals AI Trading Flaws: Western Models Lose 80% of Capital in One Week

By: ForesightNews 速递, 2025/10/27 09:54
The market is the ultimate test for AI.


Written by: Juan Galt

Translated by: AididiaoJP, Foresight News


Can AI trade cryptocurrencies? Jay Azhang, a computer engineer and finance professional from New York, is testing this question through Alpha Arena. The project pits the most powerful large language models against each other, each with $10,000 in capital, to see which can make the most money trading cryptocurrencies. The models are Grok 4, Claude Sonnet 4.5, Gemini 2.5 Pro, ChatGPT 5, DeepSeek V3.1, and Qwen3 Max.


You might be thinking, "Wow, what a brilliant idea!" And you may be surprised to learn that, at the time of writing, three of the AIs are in a loss position, while Qwen3 and DeepSeek, two open-source models from China, are leading.




That's right, the most powerful, closed-source, proprietary AIs operated by Western giants like Google and OpenAI have lost over $8,000—80% of their crypto trading capital—in just over a week, while their open-source counterparts from the East are in profit.


The most successful trade so far? Qwen3 has remained profitable and continues to make gains simply by holding a 20x long position on bitcoin. Grok 4, unsurprisingly, spent most of the competition 10x long on dogecoin, at one point sharing the top spot with DeepSeek, but is now close to a 20% loss. Maybe Elon Musk should post a dogecoin meme or something to help Grok out of trouble.




Meanwhile, Google’s Gemini has been relentlessly bearish, shorting all tradable crypto assets—a stance that echoes their overall crypto policy over the past 15 years.


In the end, it made every possible wrong trade for a whole week straight. It takes skill to do that badly, especially when Qwen3 is simply going long on bitcoin. If this is the best closed-source AI can offer, maybe OpenAI should stay closed-source to spare us the losses.


A New Benchmark for AI


The idea of pitting AI models against each other in the crypto trading arena offers some profound insights. First, an AI cannot have seen the answers to a crypto trading test during pre-training, because future prices are unpredictable. Other benchmarks do have this problem: many AI models are exposed to some of the test answers during training, so they naturally perform well during testing. Some research shows that making slight changes to these tests can lead to huge changes in AI benchmark results.


This controversy raises a question: What is the ultimate test of intelligence? According to Grok 4’s creator and Iron Man enthusiast Elon Musk, predicting the future is the ultimate measure of intelligence.




And we have to admit, there’s nothing more uncertain than the short-term price of cryptocurrencies. In Azhang’s words, “Our goal with Alpha Arena is to make benchmarking closer to the real world, and the market is perfect for this. Markets are dynamic, adversarial, open-ended, and always unpredictable. They challenge AI in ways that static benchmarks cannot. The market is the ultimate test for AI.”


This insight about markets is deeply rooted in the libertarian principles that gave birth to bitcoin. Economists like Murray Rothbard and Milton Friedman pointed out decades ago that markets are fundamentally unpredictable by central governments, and that only individuals who must bear their own losses can make rational economic decisions.


In other words, the market is the hardest thing to predict because it depends on the personal views and decisions of intelligent individuals around the world, making it the best test of intelligence.


Azhang mentions in his project description that instructing AI to trade is not just about profit, but also about risk-adjusted returns. This risk dimension is crucial, because a single bad trade can wipe out all previous gains, as seen in Grok 4’s portfolio collapse.
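To make "risk-adjusted" concrete, here is a minimal sketch using the Sharpe ratio, one common measure of return per unit of volatility. The article does not say which metric Alpha Arena uses, so the choice of Sharpe ratio is an assumption, and the daily returns below are hypothetical numbers rather than competition data.

```python
from statistics import mean, stdev


def total_return(returns):
    """Compound per-period returns into one overall return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return per period divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero volatility; the ratio is undefined")
    return mean(excess) / spread


# Hypothetical daily returns for two traders (not Alpha Arena data).
steady = [0.010, 0.012, 0.008, 0.011, 0.009]
volatile = [0.15, -0.10, 0.12, -0.09, 0.00]

for name, series in (("steady", steady), ("volatile", volatile)):
    print(name, f"total={total_return(series):.2%}", f"sharpe={sharpe_ratio(series):.2f}")
```

In this made-up example the volatile trader ends with the slightly higher total return, yet the steady trader scores far better once the swings are taken into account, which is the distinction the paragraph above is drawing.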


There’s another issue: whether these models learn from their experience trading cryptocurrencies. Technically, this is not easy to achieve, because the initial pre-training of AI models is extremely costly. They can be fine-tuned with their own trading history or others’, and may even keep recent trades in short-term memory or context windows, but that only gets them so far. Ultimately, the correct AI trading model may have to truly learn from its own experience—a technology recently announced in academia, but still a long way from becoming a product. MIT calls these self-adaptive AI models.


How Do We Know This Isn’t Just Luck?


Another reading of the project and its results so far is that they may be indistinguishable from a “random walk.” A random walk is like rolling dice for every decision. What would this look like on a chart? There’s actually a simulator you can use to answer this question, and the answer is that it wouldn’t look much different.
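For readers who want to try the comparison themselves, the following is a rough sketch of a random-walk trader, not the simulator mentioned above. The function name, the number of steps, and the 3% step size are all illustrative assumptions; only the six accounts and the $10,000 starting balance come from the article.

```python
import random


def simulate_random_trader(rng, start=10_000.0, steps=200, step_size=0.03):
    """Equity curve of a trader whose every bet is decided by a fair coin flip."""
    equity = [start]
    for _ in range(steps):
        # Win or lose the same fixed percentage with equal probability.
        move = step_size if rng.random() < 0.5 else -step_size
        equity.append(equity[-1] * (1.0 + move))
    return equity


# Six purely random "traders", each starting with $10,000.
rng = random.Random(42)  # fixed seed so the run is repeatable
curves = [simulate_random_trader(rng) for _ in range(6)]

for final in sorted(curve[-1] for curve in curves):
    print(f"${final:,.0f}")
```

Changing the seed reshuffles which of the random traders ends up in front, even though none of them has any skill.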




The issue of luck in the market has also been described in detail by Nassim Taleb in his book “Antifragile.” He argues that, statistically, it is completely normal for a trader (say, Qwen3) to be lucky for a whole week straight and so appear to have extraordinary reasoning ability. Taleb’s point goes further: there are enough traders on Wall Street that one of them could easily get lucky for 20 years straight, building a godlike reputation, with everyone around thinking this trader is a genius, until the luck runs out.
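The arithmetic behind this argument can be sketched in a few lines. The model here is a deliberate simplification and an assumption of this example, not Taleb's own calculation: every trader has a 50% chance of a winning year, years are independent, and the trader counts are hypothetical.

```python
def chance_of_lucky_streak(traders, periods, win_prob=0.5):
    """Probability that at least one trader wins every period by luck alone."""
    single_streak = win_prob ** periods  # one trader winning all periods
    return 1.0 - (1.0 - single_streak) ** traders


# Hypothetical population sizes; 20 periods stands in for 20 straight winning years.
for traders in (1_000, 100_000, 1_000_000):
    chance = chance_of_lucky_streak(traders, periods=20)
    print(f"{traders:>9,} traders: {chance:.1%} chance of at least one 20-year streak")
```

Under these assumptions a single trader's streak is extremely unlikely, but the chance that someone in a large enough crowd achieves it grows quickly with the size of the crowd.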


Therefore, for Alpha Arena to produce valuable data, it actually needs to run for a long time, and its patterns and results need to be independently replicated, with real capital at risk, before it can be considered different from a random walk.


So far, it has been interesting to see open-source, cost-effective models like DeepSeek outperform their closed-source counterparts. Alpha Arena has also been a great source of entertainment, going viral on X.com last week. No one can predict where it will go next; we’ll have to see whether the gamble its creator took, giving each chatbot $10,000 to trade crypto, will ultimately pay off.

Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.