August 26, 2025
Nvidia Brings Blackwell to Robotics
July 17, 2025
AWS Puts Agent-Focused Platform Center Stage
July 9, 2025
Samsung’s Latest Foldables Stretch Limits
June 24, 2025
HPE’s GreenLake Intelligence Brings Agentic AI to IT Operations
June 18, 2025
AWS Enhances Security Offerings
June 12, 2025
AMD Drives System Level AI Advances
June 10, 2025
Cisco Highlights Promise and Potential of On-Prem Agents and AI
June 4, 2025
Arm Brings Compute Platform Designs to Automotive Market
May 20, 2025
Dell Showcases Silicon Diversity in AI Server and PC
May 19, 2025
Microsoft Brings AI Agents to Life
May 14, 2025
Google Ups Privacy and Intelligence Ante with Latest Android Updates
April 30, 2025
Intel Pushes Foundry Business Forward
April 29, 2025
Chip Design Hits AI Crossover Point
April 24, 2025
Adobe Broadens Firefly’s Creative AI Reach
April 9, 2025
Google Sets the Stage for Hybrid AI with Cloud Next Announcements
April 1, 2025
New Intel CEO Lays out Company Vision
March 21, 2025
Nvidia Positions Itself as AI Infrastructure Provider
March 13, 2025
Enterprise AI Will Go Nowhere Without Training
February 18, 2025
The Rapid Rise of On-Device AI
February 12, 2025
Adobe Reimagines Generative Video with Latest Firefly
January 22, 2025
Samsung Cracks the AI Puzzle with Galaxy S25
January 8, 2025
Nvidia Brings GenAI to the Physical World with Cosmos
TECHnalysis Research president Bob O'Donnell publishes commentary on current tech industry trends every week in the TECHnalysis Research Insights Newsletter on LinkedIn.com, and those blog entries are reposted here as well. Those columns are also reprinted on Techspot and SeekingAlpha.
He also writes a regular column in the Tech section of USAToday.com, and those columns are posted here. Some of the USAToday columns are also published on partner sites, such as MSN.
He writes a 5G-focused column for Forbes as well, which can be found here and is archived here.
In addition, he occasionally writes guest columns for various publications, including RCR Wireless, Fast Company and engadget. Those columns are reprinted here.
Arm Lumex Platform Lifts Smartphone AI
September 10, 2025
By Bob O'Donnell
In the never-ending race to improve AI performance across every device possible, it’s always important to keep an eye on Arm. After all, their compute architectures for both CPUs and GPUs power virtually all of the world’s smartphones and an increasing percentage of PCs, wearables and other devices. From architectural licensees such as Apple and Qualcomm to direct IP customers such as MediaTek and Xiaomi, the reach of Arm across client devices is enormous. (And that doesn’t even include the company’s efforts in infrastructure via Nvidia and in automotive with multiple chip suppliers.)
So, with the unveiling of the company’s new Lumex platform and the key components that go into it, including the new C1 CPU architecture and Mali G1 GPU architecture, there’s a lot to dig into and analyze. First, it’s important to note that Lumex is primarily focused on mobile devices, but elements of its design will influence future chip designs for PCs and other devices as well.
At a high level, Lumex is interesting on several different fronts. First, it represents the company’s latest effort to move further up the design and value creation stack for mobile devices. Lumex builds on last year’s CSS (Compute Subsystem) for Client offering and adds even more capabilities to make the process of using Arm-based designs easier and faster, while also improving overall system performance. In the same way that Neoverse allowed Arm to move beyond just the IP (intellectual property) for individual computing cores in the server/datacenter/infrastructure market, Lumex is the culmination of the company’s efforts to bring more value to the mobile client market. Plus, it fills in the larger branding strategy the company embarked on earlier this year to create platform names for all the key categories in which they participate.
Inside the Lumex platform are several important innovations that set the stage for further improvements in on-device AI performance. In addition to the new C1 CPU options (including C1 Ultra, C1 Premium, C1 Pro and C1 Nano), which continue the company’s impressive multi-year run of roughly 15% IPC (Instructions Per Clock) performance increases, and the Mali G1 GPU line, with its even better multi-year run of 20% improvements, Lumex incorporates a new SI L1 system interconnect architecture and a new SMMU (System Memory Management Unit). Both of these offer advances in overall system performance and efficiency, particularly for AI workloads.
One particularly interesting data point that Arm used to describe the relevance of these changes is how different the energy usage of system memory can be when running LLM-powered AI workloads versus even challenging gaming applications. Specifically, memory accounts for about 10-15% of total system power in gaming workloads, but that share jumps roughly fivefold to over 70% for AI-focused applications. In other words, without some kind of adjustment to the overall system architecture, it’s easy to see how quickly on-device AI applications could run down the battery, even compared to high-power games.
Arm achieves these system-level changes by rethinking how the elements of the system interconnect with one another. The SI L1, for example, not only supports higher-speed connections to various components throughout the overall chip design but also offers a Network on Chip (NoC) architecture that allows it to serve as a single connection point to main memory for both the CPU and the GPU. This allows for more scalable designs while simultaneously reducing latency-induced delays and decreasing power consumption. The newly designed SMMU takes up less die space and provides similar types of system-wide benefits while also enhancing security against virtualization-focused attacks.
One of the other very interesting AI-related performance improvements that Arm included in the Lumex platform is the addition of SME2 (Scalable Matrix Extension 2) instructions and a dedicated set of logic components for those instructions in the new C1 CPU architecture. This second-generation version of the technology (the first generation of which Apple implemented in its Arm-based M4 chips for the Mac) offers a number of enhancements and extended capabilities that translate into up to 3.7X improvements across a variety of AI inferencing, computer vision and other ML (Machine Learning) benchmarks, while simultaneously reducing CPU energy consumption by 12% on inferencing workloads, a solid combination.
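As a practical aside, application code typically has to check for these CPU features at runtime before picking an execution path. Here is a minimal sketch, assuming an Arm64 Linux or Android environment where the kernel exposes SME/SME2 feature bits via getauxval; the dispatch messages are illustrative placeholders, not anything Arm or its partners prescribe.

```c
/* Minimal sketch: runtime detection of SME/SME2 on an Arm64 Linux or
 * Android system before dispatching matrix-heavy ML kernels to the CPU.
 * The hwcap bits are the ones the Linux kernel exposes for these features;
 * the printed messages are just illustrative placeholders. */
#include <stdio.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

int main(void) {
#if defined(__aarch64__) && defined(__linux__) && defined(HWCAP2_SME)
    unsigned long caps = getauxval(AT_HWCAP2);
    if (caps & HWCAP2_SME) {
        printf("SME available: use the CPU matrix-extension path\n");
#ifdef HWCAP2_SME2
        if (caps & HWCAP2_SME2)
            printf("SME2 available: enable second-generation features\n");
#endif
    } else {
        printf("No SME: fall back to NEON/SVE, GPU or other paths\n");
    }
#else
    printf("Not an Arm64 Linux build: use a generic fallback\n");
#endif
    return 0;
}
```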
Even beyond the performance improvements, the addition of the SME2 instructions (which are part of the Armv9.3-A instruction set, for those keeping track) is noteworthy because it's indicative of what looks to be a larger trend towards spreading AI-focused workloads across a broader range of compute architectures. Early on, all the focus for on-device AI was on NPUs. Because of that, it was widely anticipated that NPUs would not only become a standard part of all major SoC (System on Chip) designs, but would handle most of the AI workloads. The reality, however, has been very different.
Due primarily to the enormous diversity of NPU designs and the lack of a standardized way for programmers to access and use them in their applications, very few current AI applications leverage the NPU at all. Instead, software developers have realized that by falling back to the GPU and CPU they can get good enough performance without as big an energy usage hit as they feared. Most importantly, it allows the applications they create to run on virtually any device, instead of just the ones housing one vendor’s specific NPU.
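To make the pattern concrete, here is a minimal sketch of that portability-first dispatch logic. The backend names and probe functions are hypothetical placeholders rather than any real vendor API; the point is simply that the CPU path is the only one guaranteed to exist everywhere, which is exactly why CPU-side additions like SME2 matter.

```c
/* Hypothetical sketch of portability-first backend selection for on-device
 * AI: probe for a vendor-specific NPU path, otherwise fall back to the GPU
 * and finally the CPU so the same application runs on virtually any device.
 * None of these probe functions correspond to a real SDK. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { BACKEND_NPU, BACKEND_GPU, BACKEND_CPU } backend_t;

/* Placeholder probes; a real app would query a vendor SDK or a
 * cross-platform runtime here. */
static bool npu_available(void) { return false; } /* rarely usable today */
static bool gpu_available(void) { return true; }

static backend_t pick_backend(void) {
    if (npu_available()) return BACKEND_NPU;
    if (gpu_available()) return BACKEND_GPU;
    return BACKEND_CPU; /* always present, and faster with SME2-class cores */
}

int main(void) {
    static const char *names[] = { "NPU", "GPU", "CPU" };
    printf("Running inference on the %s\n", names[pick_backend()]);
    return 0;
}
```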
Now, over time, we will likely see more standardization (such as an NPU equivalent of DirectX, which lets GPUs from multiple vendors be used in a standardized way) and/or better abstraction tools that free programmers from worrying about specific NPU architecture differences. In the meantime, however, advancements like SME2 will likely deliver more real-world AI application performance improvements than most people expected from NPUs. This is particularly true if they’re used in conjunction with Arm’s KleidiAI and other software tools designed to leverage them. Plus, toss in developments like Apple’s recent announcement that it is incorporating AI accelerator cores in the GPU portion of its new A19 line of chips created for the iPhone 17, and it’s clear that momentum is moving from NPU-centricity to a more balanced view of how on-device AI acceleration can be done.
This point also helps explain one thing that was missing from Lumex that some people thought Arm might include—their own dedicated NPU IP and design. To be clear, Arm does have Ethos NPUs for IoT applications, but the company did not incorporate a separate block of their own creation in Lumex. According to Arm, they did this in part to avoid adding yet another choice to a confusing situation and also to let their customers differentiate with their own designs. Over time, I wouldn’t be shocked if this perspective changed. For now, however, it’s clear Arm sees SME2 as a strong way to bring AI performance improvements to mobile devices. Early evidence suggests that it’s a smart move.
As with all of Arm’s efforts, it may be a while before we can enjoy the benefits that Lumex enables. Because the company is creating designs that are then incorporated into chips by others and finally integrated into products, the process often takes time. Still, it’s clear that Arm is thinking about and enabling AI-focused performance enhancements in a variety of different and interesting ways.
Here's a link to the original column: https://www.linkedin.com/pulse/arm-lumex-platform-lifts-smartphone-ai-bob-o-donnell-9hvkf
Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on LinkedIn at Bob O’Donnell or on Twitter @bobodtech.
Podcasts
Leveraging more than 10 years of award-winning, professional radio experience, TECHnalysis Research has a video-based podcast called TECHnalysis Talks.
LEARN MORE
Research Offerings
TECHnalysis Research offers a wide range of research deliverables that you can read about here.
READ MORE