The lights have gone down on Vegas for another year with the end of AWS re:Invent 2024.
This year’s event proved exactly why the annual five-day conference is a staple of the tech calendar, shaping up as one of the most significant in its 12-year history. Unsurprisingly, generative AI dominated the agenda, with AWS showing once again why it’s the go-to stack for everyone from start-ups to industry leaders, thanks to a whole range of handy enterprise tools and other exciting unveilings.
It was certainly not one to be missed by anyone in the tech space—but don’t worry if you did! We’ve compiled this post packed with the biggest announcements, updates, and advice from across the week, so you can keep your finger on the pulse of all things AWS.
So, with the sweet sounds of Weezer still ringing in our ears following a mammoth headline set at re:Play, let’s recap what went down at re:Invent 2024—from the most exciting innovations on the horizon to the most interesting takeaways from some of the community’s biggest names, experts, and thought leaders.
Limitless new capabilities in Amazon Bedrock
Amazon Bedrock is AWS’s fully managed service for building and scaling gen AI applications with high-performing foundation models, and at re:Invent, we were introduced to an exciting range of new capabilities that will help customers prevent factual errors, complete complex tasks using multiple AI-powered agents, and even create task-specific models that offer similar performance to larger models at just a fraction of the cost and latency:
- Introduction of Automated Reasoning Checks: Even the most advanced AI models can produce misleading responses and factual errors—what we call ‘hallucinations’. These hallucinations pose significant challenges across the ecosystem, particularly when it comes to the trust organizations are willing to place in gen AI solutions. Of course, this becomes even more pertinent in industries like healthcare and financial services, where accuracy is paramount. Enter Amazon Bedrock. At re:Invent, AWS introduced new features, including Model Distillation, aimed at training smaller AI models at pace, and Automated Reasoning Checks. The latter is, per AWS, the only gen AI safeguard that helps prevent hallucinations by using logically accurate and verifiable reasoning, producing auditable outputs that show customers exactly why a model arrived at an outcome. Combined with Model Distillation, this transparent approach aims to improve response accuracy and customer trust while enabling organizations to create tailored models for specific needs (see the first sketch after this list).
- Easy building and coordination with multi-agent collaboration: With the introduction of multi-agent orchestration to Bedrock, organizations can now build collaborative AI agents to streamline and execute complex workflows. A single AI-powered agent can help a customer’s application take action using a model’s reasoning capabilities, but with multi-agent orchestration, teams of agents can take on far more complex tasks. Building systems that coordinate multiple agents, share context between them, and dynamically route tasks to the right agent traditionally required specialized tools that many organizations didn’t have access to—until now. With this in mind, this new addition to Bedrock helps democratize AWS’s generative AI features even further (see the second sketch after this list).
- Cost reductions with prompt caching and routing: With the introduction of Intelligent Prompt Routing and Prompt Caching to Bedrock, AWS customers can now benefit from cost savings of up to 30% and 90%, respectively, for running AI applications. Prompt Caching reduces the cost and latency associated with token generation by storing and reusing common or frequently used queries, significantly cutting expenses and speeding up response times. Meanwhile, Intelligent Prompt Routing ensures that prompts are handled by the most suitable model for their complexity, streamlining resource use and enhancing processing efficiency by directing simpler tasks to smaller models and reserving larger models for more intricate queries (see the third sketch after this list).
- Data automation and advanced RAG features: AWS looks set to tackle unstructured data head-on with the introduction of Bedrock Data Automation. This new feature transforms raw, unstructured content—such as audio files, videos, and documents like PDFs—into structured formats optimized for generative AI workflows. Acting as a powerful ETL tool, it will process multimodal data at scale, simplifying data preparation for enterprise AI applications and allowing organizations to extract insights and value from a broader range of content. But that’s not all! New tools were unveiled at re:Invent to optimize RAG workflows for both structured and unstructured data, including Amazon Bedrock Knowledge Bases and GraphRAG. With these tools, you can simplify intricate processes like constructing knowledge graphs and generating SQL queries, making AI more accessible by empowering organizations to develop more accurate and intelligent AI applications without needing custom code or specialized technical expertise (the final sketch after this list shows the managed RAG flow).
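Automated Reasoning Checks are delivered through Bedrock Guardrails, so validating a model’s answer happens via the ApplyGuardrail API. Here’s a minimal sketch in Python with boto3; the guardrail ID and version are placeholders, and we’re assuming the guardrail was already configured (for example, in the console) with an Automated Reasoning policy encoding your domain rules:

```python
import boto3

# The ApplyGuardrail API lives on the bedrock-runtime client.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.apply_guardrail(
    guardrailIdentifier="my-guardrail-id",  # placeholder
    guardrailVersion="1",                   # placeholder
    source="OUTPUT",                        # we're checking a model's answer
    content=[{"text": {"text": "Employees accrue two days of PTO per month."}}],
)

# "action" reports whether the guardrail intervened; "assessments" carries
# the per-policy findings you can audit.
print(response["action"])
print(response["assessments"])
```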
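Once a supervisor agent and its collaborators are set up, invoking the team looks exactly like invoking a single agent. A minimal sketch, assuming a supervisor agent created beforehand (the agent and alias IDs are placeholders):

```python
import uuid
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.invoke_agent(
    agentId="AGENT_ID",             # placeholder: a supervisor agent
    agentAliasId="AGENT_ALIAS_ID",  # placeholder
    sessionId=str(uuid.uuid4()),    # ties multi-turn requests together
    inputText="Check stock for SKU 1234 and draft a reorder email.",
)

# The response is an event stream; the supervisor routes sub-tasks to its
# collaborator agents behind the scenes and streams back the final answer.
for event in response["completion"]:
    if "chunk" in event:
        print(event["chunk"]["bytes"].decode("utf-8"), end="")
```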
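Prompt Caching surfaces in the Converse API as cache checkpoints: you mark the stable prefix of a prompt so later calls can reuse it. A hedged sketch follows; the model ID is a placeholder (check which models support caching in your Region), and for Intelligent Prompt Routing you would pass a prompt router’s ARN as the modelId instead of a specific model:

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

long_document = "..."  # a large, frequently reused context (placeholder)

response = client.converse(
    # For Intelligent Prompt Routing, supply a prompt router ARN here
    # instead of a specific model ID.
    modelId="amazon.nova-pro-v1:0",  # placeholder; may need an inference-profile ID
    messages=[{
        "role": "user",
        "content": [
            {"text": long_document},
            # Everything before this checkpoint is cached and reused on
            # subsequent calls, cutting token cost and latency.
            {"cachePoint": {"type": "default"}},
            {"text": "Summarize the key obligations in this contract."},
        ],
    }],
)

print(response["output"]["message"]["content"][0]["text"])
```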
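Knowledge Bases expose the managed RAG flow through the RetrieveAndGenerate API, which retrieves relevant chunks and generates a grounded answer in one call. A minimal sketch, with the knowledge base ID and model ARN as placeholders:

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.retrieve_and_generate(
    input={"text": "What is our refund policy for enterprise customers?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB_ID",  # placeholder
            # Placeholder model ARN used for answer generation.
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-pro-v1:0",
        },
    },
)

# The generated answer, grounded in your indexed documents.
print(response["output"]["text"])
```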
Generate text, images, and video with Nova AI model family
OK, we aren’t quite done talking about Bedrock just yet, but this exciting new launch deserved a section of its own!
At re:Invent, Amazon introduced the Nova family—a selection of generative AI models specializing in text, image, and video creation. Integrated with Bedrock, these models enable organizations to unlock creative content creation with AI that is accessible yet advanced.
The Nova family includes:
- Amazon Nova Lite: A highly cost-efficient multimodal model designed for ultra-fast processing of text, images, and videos.
- Amazon Nova Micro: A text-only model optimized for minimal latency and ultra-low operational costs.
- Amazon Nova Pro: A versatile multimodal model offering an ideal balance of accuracy, speed, and cost-effectiveness for diverse applications.
- Amazon Nova Premier: The most advanced multimodal model in the lineup, tailored for complex reasoning tasks and serving as a superior teacher for distilling custom models (expected to launch in Q1 2025).
- Amazon Nova Canvas: A cutting-edge model dedicated to generating high-quality images.
- Amazon Nova Reel: A state-of-the-art model specialized in producing video content.
Specifically designed to integrate easily with AWS customers’ systems and data, this family of fast and efficient AI models can perform a wide range of tasks across multiple modalities (and in over 200 languages!). Better still, not only are Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro at least 75% cheaper than the best-performing models in their respective intelligence classes on Bedrock—they’re also the fastest models in those classes!
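Because the Nova models sit behind Bedrock’s standard APIs, trying one takes just a few lines with the Converse API. A minimal sketch; the model ID follows the published naming, but check availability in your Region (some Regions require an inference-profile ID such as us.amazon.nova-lite-v1:0):

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-lite-v1:0",  # or an inference-profile ID in your Region
    messages=[{
        "role": "user",
        "content": [{"text": "Write a two-line product tagline for a smart kettle."}],
    }],
    inferenceConfig={"maxTokens": 256, "temperature": 0.7},
)

print(response["output"]["message"]["content"][0]["text"])
```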
SageMaker becomes a Data and AI Hub
We were introduced to the next generation of SageMaker at re:Invent, featuring a range of exciting new additions aimed at streamlining data accessibility, governance, and integration for advanced analytics and AI development:
- SageMaker Unified Studio: This central platform simplifies finding and accessing organizational data while integrating AWS’s analytics, machine learning, and AI tools. With support from Amazon Q Developer, users can address diverse data use cases using the most suitable tools.
- Governance with SageMaker Catalog: Built-in governance features ensure appropriate access to data, models, and artifacts, empowering users to leverage resources securely and effectively.
- SageMaker Lakehouse: This feature connects data from lakes, warehouses, operational databases, and enterprise apps, letting users work seamlessly with this unified data inside SageMaker Unified Studio. It supports familiar ML and AI tools and integrates with query engines compatible with Apache Iceberg (see the query sketch after this list).
- Zero-ETL Integrations: New integrations with popular SaaS applications simplify the use of third-party data in SageMaker Lakehouse and Amazon Redshift without the need for complex ETL pipelines, making it easier to analyze or apply ML to external data.
- HyperPod Task Governance: SageMaker HyperPod’s new feature, HyperPod Task Governance, sets out to optimize GPU usage and minimize idle time to cut AI infrastructure costs by up to a mammoth 40%. This smart approach to task prioritization and resource allocation tackles critical efficiency challenges head-on to enable organizations to scale AI initiatives with more intelligence and efficiency.
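Because the lakehouse surfaces data through engines compatible with Apache Iceberg, you can query it with any of them, including Amazon Athena via boto3. A hedged sketch assuming an existing lakehouse database, table, and an S3 results location you own (all names are placeholders):

```python
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Placeholders: a database/table registered in the lakehouse catalog and
# an S3 bucket you own for query results.
query = athena.start_query_execution(
    QueryString="SELECT region, SUM(revenue) FROM sales GROUP BY region",
    QueryExecutionContext={"Database": "lakehouse_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)

# Poll until the query finishes, then print the result rows.
qid = query["QueryExecutionId"]
while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    for row in athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```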
Trainium2 instances now available
AWS has introduced Trainium2-powered Amazon EC2 instances, specifically engineered to support high-performance deep learning (DL) training for generative AI models. These instances are ideal for tasks such as training large language models and latent diffusion models.
Each Trn2 instance incorporates 16 Trainium2 chips interconnected via NeuronLink, AWS’s high-bandwidth, low-latency chip-to-chip interconnect, to maximize performance and minimize latency. AWS also launched Trn2 UltraServers, a new EC2 offering that links four Trn2 servers into a single massive server using the NeuronLink interconnect. This setup allows users to scale generative AI workloads across 64 Trainium2 chips, significantly boosting the capability to handle demanding AI training tasks.
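From a developer’s perspective, Trn2 instances are driven through the AWS Neuron SDK, which plugs into PyTorch via torch-xla. The sketch below shows the general shape of a training step on a Neuron (XLA) device; it assumes the Neuron SDK is installed and uses a toy model, so treat it as illustrative rather than a tuned Trn2 training setup:

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm  # installed as part of the Neuron SDK stack

# On a Trn2 instance, the XLA device maps onto the Trainium2 chips.
device = xm.xla_device()

model = nn.Linear(512, 10).to(device)  # toy model for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):
    x = torch.randn(32, 512).to(device)        # stand-ins for real batches
    y = torch.randint(0, 10, (32,)).to(device)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # xm.optimizer_step applies the update and marks the XLA step so the
    # compiled graph actually executes on the accelerator.
    xm.optimizer_step(optimizer)
```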
Exciting new database potential with Amazon Aurora DSQL and DynamoDB
AWS unveiled significant advancements for its databases, including the introduction of Amazon Aurora DSQL, which AWS is hailing as the fastest distributed SQL database. These enhancements aim to meet the needs of organizations running high-demand workloads that require multi-region operations with strong consistency, low latency, and exceptional availability.
The updates are designed to support both SQL-based relational databases and NoSQL systems. Customers using SQL can efficiently manage structured, tabular data in relational databases, while NoSQL users benefit from non-tabular, flexible data storage formats. These innovations make AWS’s database offerings more versatile, catering to a broader range of data management needs and application requirements.
Over the years, Amazon Aurora has been a standout, offering enterprise-grade performance with the flexibility and cost-effectiveness of open-source databases. Building on this foundation, AWS has announced significant advancements, reimagining relational and NoSQL databases to achieve global availability, strong consistency, and unparalleled scalability—without sacrificing low latency or SQL compatibility.
Amazon Aurora DSQL
The newly introduced Amazon Aurora DSQL is a serverless, distributed SQL database that sets a new benchmark for performance and scalability. Key features include:
- 99.999% multi-region availability with strong consistency.
- PostgreSQL compatibility.
- Up to 4x faster reads and writes than other leading distributed SQL databases.
- Virtually unlimited scalability with zero infrastructure management.
Aurora DSQL solves long-standing challenges of distributed databases, such as achieving multi-region strong consistency with low latency and maintaining microsecond-level synchronization across servers globally. These capabilities enable customers to develop globally distributed applications at an unprecedented scale.
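Since Aurora DSQL is PostgreSQL-compatible, existing drivers work against it; the main difference is that you authenticate with a short-lived IAM token instead of a password. A hedged sketch using psycopg2; the endpoint is a placeholder, and the token-generation helper reflects the boto3 dsql client as we understand it, so double-check the method name against your SDK version:

```python
import boto3
import psycopg2

cluster_endpoint = "your-cluster.dsql.us-east-1.on.aws"  # placeholder endpoint
region = "us-east-1"

# Aurora DSQL uses short-lived IAM auth tokens in place of passwords.
# Method name per current boto3 docs; verify against your SDK version.
dsql = boto3.client("dsql", region_name=region)
token = dsql.generate_db_connect_admin_auth_token(cluster_endpoint, region)

conn = psycopg2.connect(
    host=cluster_endpoint,
    user="admin",
    password=token,
    dbname="postgres",
    sslmode="require",
)
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute(
        "CREATE TABLE IF NOT EXISTS greetings "
        "(id UUID PRIMARY KEY DEFAULT gen_random_uuid(), note TEXT)"
    )
    cur.execute("INSERT INTO greetings (note) VALUES (%s)", ("hello from DSQL",))
    cur.execute("SELECT note FROM greetings LIMIT 5")
    print(cur.fetchall())
```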
Enhancements to Amazon DynamoDB Global Tables
Amazon DynamoDB, the fully managed, serverless NoSQL database, now features significant upgrades to its global tables. Previously known for its 99.999% availability, multi-region, multi-active capabilities, and consistent millisecond performance, DynamoDB now incorporates the same underlying technology used in Aurora DSQL. This enhancement introduces strong consistency to global tables, allowing customers to ensure their multi-region applications always access the latest data. This is achieved without requiring changes to application code, maintaining the simplicity and zero infrastructure management DynamoDB is known for.
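For developers, the upgrade is pleasingly boring: once a global table is configured for multi-Region strong consistency, requesting the latest data is the familiar ConsistentRead flag. A minimal sketch with placeholder table and key names:

```python
import boto3

# Placeholder: a global table already configured for multi-Region strong
# consistency (replicas in, say, us-east-1 and eu-west-1).
table = boto3.resource("dynamodb", region_name="eu-west-1").Table("orders")

# With strong consistency enabled on the global table, ConsistentRead=True
# returns the latest committed write even if it landed in another Region.
item = table.get_item(
    Key={"order_id": "12345"},
    ConsistentRead=True,
)
print(item.get("Item"))
```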