November 6, 2024
By Bob O'Donnell
When it comes to Generative AI, the common thinking is that the tools businesses need to build their own GenAI-based applications are virtually all cloud-based resources, and that organizations are perfectly fine with using them. But what if that really isn’t the case?
What if companies are starting to realize that a different approach to building and running GenAI applications, using hybrid computing architectures, is a better choice? The early signs seem to suggest that a shift in that thinking is beginning to occur.
In fact, one of the most surprising results from my recent survey of over 1,000 US-based companies currently using GenAI (see “The Intelligent Path Forward: Generative AI in the Enterprise” for more) is how much interest organizations have in running their GenAI-powered applications on premises. Despite the fact that only a small, single-digit percentage of respondents said they were currently running these kinds of applications locally, a staggering 80% said they were interested in doing so.
To be clear, most organizations will continue to run many of their GenAI applications (perhaps even a majority of them) in the cloud, but there’s no doubt that there is pent-up demand to move some of these workloads behind corporate firewalls. The implications of this interest are enormous. They also point to one of several nascent evolutions in market dynamics that are poised to dramatically impact how IT products and services are positioned, marketed, sold and deployed.
At a basic level, we’re bound to see the rapid rise of hybrid AI, where most enterprise environments end up running GenAI-powered applications both in the cloud and on-prem. In some cases, these may be separate applications that each run in their own location, but increasingly, I expect to see applications that simultaneously run across both environments.
Of course, long-time IT industry observers won’t find this terribly surprising, as it closely mirrors the move we’ve seen from pure cloud computing initiatives to the hybrid cloud computing models that have now become pervasive. However, there is one critically important difference with GenAI. Instead of this evolution taking place over the course of a decade, as the move from cloud to hybrid cloud did, the transition to Hybrid GenAI will likely only take about 10 months. Just as we’ve all witnessed an incredibly fast rate of change and evolution in GenAI-based foundation models and other tools, so too do I expect an equally rapid transition to new types of GenAI-powered computing environments and business models.
The reasons for this interest in Hybrid GenAI are, in many ways, similar to what organizations cited as they evolved their approach to cloud computing from one based solely in the cloud to one based on more shared, hybrid principles. Most companies have been reluctant—or unable, if they’re in certain regulated industries—to move some of their most precious or important data to the cloud. As a result, they’ve recognized the need to run at least some of the applications that use this critical data within their own data centers.
When it comes to GenAI, it’s this same critical data that organizations have quickly realized is essential in helping to train and/or fine-tune the foundation models they’re using as a basis for their GenAI-powered applications. Basically, to get the most useful results out of these applications, they need to feed their most important (and likely most highly guarded, and most likely still on-prem) data into these models. In other words, data gravity strikes again.
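To make the data gravity point concrete, here’s a rough sketch of what keeping that critical data in place can look like. It’s purely illustrative: the open-source Hugging Face libraries, the small gpt2 stand-in model, and the file paths and field names are all my own assumptions, not a stack any survey respondent described. The key idea is simply that both the sensitive records and the resulting model weights stay behind the firewall.

```python
# Purely illustrative sketch: fine-tuning an open model entirely on-prem,
# so sensitive records never leave the corporate network. The model name,
# file paths, and the "text" field are hypothetical placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

MODEL = "gpt2"  # stand-in; any locally hostable open model works the same way
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Confidential data is read from internal storage; nothing is uploaded.
dataset = load_dataset("json", data_files="/srv/private/records.jsonl")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(
    tokenize, batched=True, remove_columns=dataset["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="/srv/models/tuned", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the tuned weights land on-prem, right next to the data
```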
This, in turn, is making companies start to rethink how they’re approaching their GenAI application development. While they’re commonly choosing to go to the cloud for their initial proof-of-concept work, there’s growing awareness that, for full deployment, they want to be able to run the tools, platforms, and foundation models they need within their own environments.
As a result of this thinking, I believe we’ll see many IT product and service providers start to accelerate their hybrid AI-based offerings. For example, one of the biggest challenges holding back on-prem GenAI deployments is the fact that several of the most popular foundation models—notably OpenAI’s GPT family, Amazon’s Titan and Google’s Gemini—aren’t even available to run on-prem yet. You can only get to them via the cloud. I expect that situation to be very different by this time next year. In fact, companies that make their tools available more quickly on-prem and in hybrid environments will likely gain a competitive advantage. Whether that advantage proves to be long-term or short-term is certainly up for debate, but, given the momentum behind and interest in on-prem deployments, it’s bound to be a factor.
Directly related to this is increased demand for corporate infrastructure. After a very difficult decade that saw big corporate server and infrastructure companies like Dell, HPE, Lenovo, Cisco and others face a number of challenges driven by the transition to cloud-based workloads, it seems clear the pendulum is now beginning to swing the other way. Arguably, the adoption of hybrid cloud architectures already started this process, but running more GenAI-powered applications behind the firewall (or in co-location environments) is likely to accelerate it considerably. As a Cisco executive summarized at the company’s recent Partner Summit event, “data centers are cool again.”
But what makes the notion of Hybrid GenAI even more interesting—and potentially even more impactful—than Hybrid Cloud is that with GenAI there is yet another level of hybridization: running workloads directly on devices in conjunction with either on-prem or cloud-based computing resources. The tremendous improvements in on-board processing of GenAI applications on PCs and smartphones—thanks to a combination of semiconductor, system design and software enhancements—are creating this third layer of potential workload hybridization. Admittedly, these types of distributed computing architectures are not easy to build or write applications for, but their potential impact is huge. Imagine having the capability to leverage the fastest, most available, or best optimized computing resources across the entire three-layer stack of device, datacenter, and cloud to run a given application, with the option to select the best choice or combination depending on a specific application’s needs, and your head starts to swim as you ponder the possibilities.
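To give a flavor of what dispatching across that three-layer stack might look like, here’s a deliberately simplified sketch. The tiers, latency figures, context limits, and selection policy are all invented for illustration; a real system would weigh live telemetry, model availability, cost, and compliance rules.

```python
# Entirely hypothetical sketch of three-layer hybrid dispatch. The tiers,
# numbers, and policy are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str               # "device", "on_prem", or "cloud"
    latency_ms: float       # assumed typical round-trip time for this tier
    max_context: int        # largest prompt this tier's model can handle
    keeps_data_local: bool  # True if data never leaves the organization

TIERS = [
    Backend("device",  latency_ms=40,  max_context=4_096,   keeps_data_local=True),
    Backend("on_prem", latency_ms=120, max_context=32_768,  keeps_data_local=True),
    Backend("cloud",   latency_ms=300, max_context=128_000, keeps_data_local=False),
]

def route(prompt_tokens: int, sensitive: bool) -> Backend:
    """Pick the fastest tier that can hold the prompt and, if the data is
    sensitive, keeps it inside the corporate boundary."""
    candidates = [
        b for b in TIERS
        if b.max_context >= prompt_tokens
        and (b.keeps_data_local or not sensitive)
    ]
    if not candidates:
        raise RuntimeError("no tier satisfies the constraints")
    return min(candidates, key=lambda b: b.latency_ms)

print(route(2_000, sensitive=True).name)    # -> device
print(route(20_000, sensitive=True).name)   # -> on_prem
print(route(50_000, sensitive=False).name)  # -> cloud
```

Even this toy policy shows the appeal: small, sensitive requests stay on the device, guarded big-context jobs run in the corporate data center, and everything else flows to whichever tier can handle it fastest.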
Exactly how all these developments in Hybrid GenAI pan out is not at all clear. Thinking through the full implications of the three-layer hybridization stack that GenAI applications will soon have available to them, for example, is far from an easy task. Toss in potential x-factors like small language models and how they could reshape the rules on how and where GenAI applications are written and run, and things could get even more confusing. Still, it seems clear that we’re on the cusp of some major, fast-moving changes in how companies think about the computing resources they’re going to need to best leverage GenAI. That, in turn, is likely to start reshaping the landscape of IT suppliers and their offerings a lot sooner and more profoundly than many may think.
Here's a link to the original column: https://www.linkedin.com/pulse/how-hybrid-ai-change-everything-bob-o-donnell-hd7dc
Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on LinkedIn at Bob O’Donnell or on Twitter @bobodtech.