Not-yet-profitable AI companies are constructing a vast and expensive global network of server farms to support cloud-based generative AI (genAI) services. Deeply financed by venture capitalists who will one day want to see return on their investments, these centers are consuming enough memory to drive consumer technology prices higher and higher.
Yet, for all the investment now going on, it’s inevitable that new on-device genAI models will emerge. When they do, the AI tasks for which you use cloud services today will be handled on device tomorrow. And at the speed we’re going, tomorrow is not very far away.
We already know it is possible. Just look at Siri AI. To create it, Apple worked with Google Gemini — using the latter to help build and distill Apple’s own AI models, many of which now work entirely on device.
The hidden cost of the AI buildout
This move toward edge AI comes as the tech industry puts all its eggs in the AI basket, with major memory suppliers redirecting manufacturing capacity toward higher-value memory products for AI servers, such as advanced-layer 3D NAND. They have done so without investing in additional capacity, prompting a shortage of the kind of general-purpose RAM you use in your computer, console, or smartphone.
This is having a dramatic impact. Gartner says the memory shortage will cause PC shipments to drop 10.4% in 2026 and smartphone shipments to decline 8.4%, with prices on those products rising 17% and 13%, respectively, versus 2025 levels.
The AI consumer tech tax
This leaves electronics manufacturers competing for the remaining supply, while shrinking profit margins force them to increase prices. Sony has already raised the PS5 price by $100, Microsoft has raised Xbox prices, and Nintendo has raised the price of its first-generation Switch. Samsung has quietly increased prices across Galaxy smartphones, tablets, and laptops. Analysts also warn that “shrinkflation” is escaping from the supermarket and coming to your tech: an insidious move in which manufacturers quietly reduce the features or performance of their devices to maintain familiar price points. The laptop you purchase could ship with a downgraded display, for example.
Apple is not immune. The iPhone 18 Pro is tipped for a significant price increase in 2026, potentially adding $100 to $150, even before tariffs are considered. Apple CEO Tim Cook recently warned of price increases ahead as a direct impact of demand for memory components.
A business model under pressure
Surely all this investment in AI servers and the increased cost of tech products will be worth it in the end, right? Writer and tech critic Ed Zitron disagrees, pointing out that for $200 a month, a user can burn $8,000 in Anthropic tokens or $14,000 in OpenAI tokens.
He argues that subsidy at this scale suggests AI economics are already broken, and that the actual value of AI may be inflated, forming a big bad bubble ready to burst once market opinion (and investment) catches on.
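To put Zitron’s figures in proportion, the short Python sketch below divides the token value he cites by the $200 subscription price. The dollar amounts are the ones quoted above; the function name and the reading that both figures describe a single month of heavy use are assumptions made for illustration only.

```python
def subsidy_multiple(token_value_usd: float, subscription_usd: float) -> float:
    """Return how many dollars of tokens are consumed per dollar paid."""
    if subscription_usd <= 0:
        raise ValueError("subscription price must be positive")
    return token_value_usd / subscription_usd


SUBSCRIPTION_USD = 200  # monthly plan price cited by Zitron

# Token value a heavy user can reportedly burn per month (Zitron's figures).
CLAIMED_TOKEN_VALUE_USD = {"Anthropic": 8_000, "OpenAI": 14_000}

# Hypothetical helper output: tokens consumed per subscription dollar.
multiples = {
    vendor: subsidy_multiple(value, SUBSCRIPTION_USD)
    for vendor, value in CLAIMED_TOKEN_VALUE_USD.items()
}

for vendor, multiple in multiples.items():
    print(f"{vendor}: ${multiple:.0f} of tokens per $1 of subscription revenue")
```

On those numbers, the heaviest users consume 40 to 70 times what they pay, which is the gap Zitron is describing.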
By spending billions chasing market share, the AI industry has fundamentally undermined its own value, making it harder to achieve sustainable business success. Costs might well fall in future, of course, but edge AI could be the biggest cost-reduction exercise of all.
Make tokens pay
Perhaps AI companies have finally begun turning things around? Maybe not. Less than three months into paying the actual costs of LLM-based services, both OpenAI and Anthropic are considering drastic price cuts, with one Cisco executive stating publicly that AI token costs are far higher than the actual value those tokens are generating at scale.
Even Meta has imposed strict limits on token usage after finding it was on track to spend billions on internal AI alone in 2026. The Times reports that two large banks spent an astonishing $1 billion on AI experiments without seeing any significant return.
That’s the use value, but what about the hardware investment? It is really hard to ignore the irony that billions of dollars are being poured into a server-based infrastructure that might become obsolete before turning any kind of profit. Today’s H200-based servers will need to be upgraded sooner or later, and when they are, where will the money come from?
Apple has a different approach
Consumer electronics leader Apple clearly sees this. While it has been accused of being behind in AI, perhaps it was just being realistic. After all, the reality seems to be that we’re experiencing something akin to venture-capital-backed economic socialism in the AI sector, with billions invested for no visible (or, if Zitron is right, no possible) return.
At the same time, Apple seems focused on building edge AI as a privacy-preserving, cost-saving alternative to the massive data center buildouts rivals have pursued.
Rather than squandering billions on a revenue-draining chatbot, Apple worked with others to create its own alternative. As part of its agreement with Google, Apple is using a large version of Gemini to train a smaller, distilled version capable of running locally on Apple hardware. Siri AI can hold conversations, pull context from a user’s emails, messages, and photos, answer live questions from the web, and act across apps, with much of the work taking place on the device itself.
These tools are also available to app developers, thanks to Apple’s Foundation Models framework. At WWDC, Apple showed how its devices can work together to run local LLMs using MLX Distributed, which means users can run on-premises, highly private AI models. And the company continues to make strategic acquisitions, such as the recent purchase of on-device AI startup Liquid AI.
On-device or off, the move has caused Apple to break with years of tradition and pack its systems with more and more memory, ironically feeding the same component pricing narrative.
The squeeze isn’t over
Who pays for all this? You do. Memory prices will continue to rise across the year, with TrendForce predicting increases of up to 75% on top of the 100% spike we’ve already seen in recent months. Memory suppliers seem unwilling to ramp up supply to help bring costs down, potentially because they don’t want to be left with unused capacity once the AI bubble does burst. That means existing manufacturing is being pointed at the highest-value memory components, further feeding price hikes.
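Percentage increases compound rather than add, so it is worth working out what a further rise on top of a spike would mean. The sketch below uses the two figures quoted above (a 100% spike followed by TrendForce’s predicted upper bound of 75%) and assumes the second increase applies to the already-raised price. The function name and the baseline of 1.0 are illustrative, not anything TrendForce publishes.

```python
def compounded_multiplier(increases_pct):
    """Compound a sequence of percentage increases into one price multiplier."""
    multiplier = 1.0  # baseline: the price before any increase
    for pct in increases_pct:
        multiplier *= 1 + pct / 100
    return multiplier


# 100% spike already seen, then up to 75% more (assumed to stack on top).
worst_case = compounded_multiplier([100, 75])

print(f"Price vs. pre-spike baseline: {worst_case:.2f}x")
print(f"Total increase: {(worst_case - 1) * 100:.0f}%")
```

In that worst case, memory that cost $100 before the spike would cost $350, a total increase of 250% rather than the 175% you would get by simply adding the two percentages.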
When AI leaves the cloud
What happens to investors when AI stops needing a data center to be useful? The companies that survive this shift won’t necessarily be the ones that built the biggest and most costly clouds. They are more likely to be the ones that identified cloud-based AI as the start of a transition toward more intelligent devices equipped with their own on-device AI. That is precisely what Apple is building toward.