November 30, 2023
By Bob O'Donnell
And then there was Q.
That might have been how Amazon’s AWS cloud computing division wanted to announce their long-awaited entry into the generative AI application market at the recent re:Invent conference, but it wouldn’t have told the full story. (Plus, it may have been confused with the mysterious Q* project within OpenAI that supposedly triggered the dramatic management reshuffling events that happened there recently.)
In truth, Amazon Web Services unveiled a comprehensive three-layer GenAI strategy and stack of offerings at re:Invent that happened to feature the intriguing new Q digital assistant at the top of the stack. But while Q got most of the attention, there were lots of important and interconnected elements underneath it. At each of the three layers—Infrastructure, Platform/Tools and Applications—AWS debuted a combination of new offerings and important enhancements to existing products that tied together to form an impressive and complete solution in the red-hot field of GenAI.
Or, at least, that’s what they were supposed to do. Unfortunately, there were so many announcements in a field that so few really understand yet that a lot of people were a bit confused by what exactly the company had put together. (A quick skimming of multiple articles on the news from re:Invent demonstrated that various news elements were covered in such dramatically different ways in each of the stories that it was clear the company still has some work to do in explaining exactly what they have.)
Given the luxury of a day or two to think about it, as well as the ability to ask a lot of clarifying questions about some of the critical details of how the various pieces fit together, however, it’s apparent to me now that this new approach to GenAI is a comprehensive and compelling strategy—even in its admittedly early days. It’s also apparent now that what AWS has done over the course of the last several years is introduce a variety of different products and services that, at first blush, may not seem to have been related, but in fact were the starting points for a much bigger picture that’s now starting to come into focus.
The company’s latest efforts (and news) start at its core infrastructure layer, including its custom silicon work. At this year’s re:Invent, AWS built on previous offerings and debuted the second generation of its Trainium AI accelerator chip, which offers 4x improvements in AI model training workloads over its predecessor. They also discussed their Inferentia 2 chip, introduced earlier this year, which is optimized for AI inferencing efforts. Together, these two custom chips—along with the fourth generation of their Arm-based Graviton CPU, which was also introduced at re:Invent—give Amazon a complete line of unique chips that it can use to build differentiated compute offerings that can be leveraged across a wide range of different AI workloads.
At the same time, AWS CEO Adam Selipsky also had Nvidia CEO Jensen Huang join him onstage during his opening day keynote to announce further partnerships between the companies. In particular, they discussed the debut of Nvidia’s latest GH200 GPU, which also incorporates a custom Arm CPU, in several new EC2 compute instances from AWS, and the first third-party deployment of Nvidia’s DGX Cloud systems. In fact, the two even discussed a new version of Nvidia’s NVLink chip interconnect technology that allows up to 32 of these systems to function together as a giant AI computing factory (codenamed Project Ceiba) that AWS will host for Nvidia’s own AI development purposes. In addition to the impressive capabilities of these offerings, the key takeaway here is that Amazon is focused on delivering choice (a theme that showed up several times throughout the re:Invent Day 1 keynote) and not forcing customers to use only its own solutions.
Moving up to the middle Platform/Tools level, AWS announced a number of critical enhancements to its Bedrock platform, which it debuted earlier this year. Bedrock consists of a set of services that let you do everything from picking the foundation model of your choice, to deciding how to train or fine-tune the model, to determining the levels of access that different people in an organization have, to choosing what types of information are allowed and what types are blocked (something they’re calling Bedrock Guardrails), to creating actions based on what the model generates, using a new technology the company is calling Bedrock Agents. In the area of model tuning, AWS announced support for three key technologies: fine-tuning, continuous pre-training and, most critically, RAG (Retrieval Augmented Generation). All three of these have burst onto the scene relatively recently and are being actively explored by organizations that believe they can be used to integrate their own custom data more effectively into their GenAI application building efforts. These new approaches are quickly becoming incredibly important because many companies have started to realize they aren’t interested in (or, frankly, capable of) building their own foundation models from scratch.
On the foundation model side of things, the range of new options being supported within Bedrock grew quite a bit, including Meta’s Llama 2, Stable Diffusion, and more versions of Amazon’s own family of Titan models. Given AWS’s recent large investment in Anthropic, it wasn’t a surprise to see a particular focus on Anthropic’s new Claude 2.1 model as well, which apparently offers significant enhancements in some of the safety-related controls it’s already known for, as well as important reductions in hallucinations.
The final layer of the AWS GenAI story was the aforementioned Q digital assistant. Unlike most of AWS’s existing offerings, Q can be used as a high-level finished GenAI application that companies can start to deploy. At the same time, developers can customize Q for specific applications—and they do so through some of the APIs and other tools in the Bedrock layer (see the connection?).
What’s interesting, though also initially a bit confusing, about Q is that it can take many forms. The most obvious version is a chatbot-style experience similar to what other companies currently offer. Not surprisingly, most of the early news stories focused on this chatbot UI. Even in its early iteration, however, Q can do many other things as well. Specifically, AWS showed how Q can be used as part of the code-generating experience found in Amazon’s CodeWhisperer, as a call transcriber and summarizer in its Amazon Connect customer service offering, as an easy-to-use builder of comprehensive data dashboards in its Amazon QuickSight analytics tools, and as a content generator, summarization tool and knowledge management guide for regular business users. An interesting point to note is that Q can leverage different underlying foundation models (apparently a set of Amazon’s own Titan models and Anthropic’s Claude 2.1) for different applications. This is arguably a more comprehensive and more capable type of digital assistant application than we’ve seen from some of its competitors, but it’s also a lot harder for people to get their heads around.
Digging deeper into how Q works and its connections to the other parts of the AWS strategy, it turns out that Q was essentially built via a set of Bedrock Agents. What this means is that companies looking for more of an “easy button” solution for getting GenAI applications deployed can use Q as is. Companies that are interested in more customized solutions, on the other hand, can create some of their own Bedrock Agents. This concept of pre-built versus customizable capabilities also applies to Bedrock and Amazon’s existing SageMaker tool for building custom AI models. Bedrock is for those who want to leverage a range of already built foundation models, while SageMaker is for those who want to build models of their own.
In addition to all these capabilities, AWS made a number of other announcements related to this multi-level GenAI strategy, particularly with regard to connections to different types of databases, other enterprise application data sources, and more. Given the critical connection between data and GenAI applications, as well as the strong desire of companies to leverage their own data, this makes complete sense.
Taking a step back and pulling all this together, you can start to see what a comprehensive framework and vision AWS has put together here. At the same time, it’s also apparent that it’s not necessarily the most intuitive strategy to understand. Moving forward, let’s hope Amazon can refine its messaging to make this impressive GenAI story more understandable to more people so that companies can answer their big Qs and start to leverage all the capabilities that are hidden within.
Here's a link to the original column: https://www.linkedin.com/pulse/amazon-aws-genai-strategy-comes-big-q-bob-o-donnell-w2hpc Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on LinkedIn at Bob O’Donnell or on Twitter @bobodtech.