Google Cloud has unveiled a series of major updates to its Vertex AI platform, ushering in powerful new generative AI models and enhanced capabilities for enterprise deployment.
If you are unfamiliar with Google Vertex AI, it is a unified platform that brings Google Cloud’s enterprise AI services under one roof. You can use it to build ML models tailored to your organisation’s needs and deploy them using pre-trained and custom tooling.
Now let’s take a look at everything new on this platform.
Gemini model enhancements
At the forefront is the public preview of Gemini 1.5 Pro, which gives developers a context window of 1 million tokens, billed as the world’s largest. Google says that such a large window enables native reasoning over vast amounts of data relevant to each request, often eliminating the need for techniques like fine-tuning or retrieval augmented generation.
One innovative new feature is support for processing audio streams within Gemini 1.5 Pro on Vertex AI. This cross-modal capability unlocks seamless analysis spanning text, images, video, and audio sources like earnings calls.
But Gemini 1.5 Pro isn’t the only model on offer. Google says it offers over 130 models in all. The new update also adds Anthropic’s Claude 3 family of models as well as the lightweight CodeGemma models.
Imagen’s animation and editing capabilities
Google is previewing the ability for its Imagen model to generate short, 4-second live animated images from text prompts at 24 frames per second. Output is initially limited to 360×640 resolution, and the feature is pitched at marketing and content creation teams.
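To put those preview figures in perspective, the short calculation below works out how many frames such a clip contains and how large it would be uncompressed. It uses only the numbers quoted above, plus one illustrative assumption that Google has not stated: 3 bytes per pixel (8-bit RGB).

```python
# Figures quoted in the announcement.
DURATION_S = 4
FPS = 24
WIDTH, HEIGHT = 360, 640

# Assumption for illustration only: 8-bit RGB, i.e. 3 bytes per pixel.
BYTES_PER_PIXEL = 3

frame_count = DURATION_S * FPS
pixels_per_frame = WIDTH * HEIGHT
raw_bytes = frame_count * pixels_per_frame * BYTES_PER_PIXEL

print(f"frames: {frame_count}")
print(f"pixels per frame: {pixels_per_frame}")
print(f"uncompressed size: {raw_bytes / 1_000_000:.1f} MB")
```

The actual size of a generated clip would depend on the output format and compression, which the announcement does not specify.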
Imagen 2’s image generation powers are also being upgraded with the general availability of advanced editing tools like inpainting, outpainting, and invisible digital watermarking. Inpainting removes unwanted elements from an image, while outpainting extends its borders for a wider view.
Grounding models with ‘Enterprise Truth’
A key challenge has been keeping generative AI models aligned with trustworthy, up-to-date information sources. Google is tackling this by allowing Vertex AI models to directly ground their responses with Google Search through a public preview capability.
Complementing this is retrieval augmented generation (RAG) for grounding model outputs in an enterprise’s proprietary data. Google calls the use of these grounding techniques “Enterprise Truth”, which it presents as essential for building reliable, task-oriented AI agents.
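The general idea behind RAG can be sketched in a few lines: retrieve the passages most relevant to a question, then place them in the prompt so the model answers from them. The example below is a toy illustration only and does not use the Vertex AI API. The function names, the sample documents, and the word-overlap scoring are all assumptions made for this sketch; production systems typically rely on embedding-based search.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by words shared with the query; keep up to top_k that overlap."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_grounded_prompt(query: str, documents: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from retrieved passages."""
    passages = retrieve(query, documents)
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical enterprise documents, invented for this example.
DOCS = [
    "Refund requests must be filed within 30 days of purchase.",
    "The Lisbon office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]

QUESTION = "How many days do customers have to request a refund?"
prompt = build_grounded_prompt(QUESTION, DOCS)
print(prompt)
```

The resulting prompt would then be sent to whichever model is in use; that step is omitted here because it depends on the specific client library.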
MLOps tooling for prompt management and evaluation
Recognising the unique challenges of deploying large models, Google Cloud has expanded Vertex AI’s MLOps capabilities. The new Vertex AI Prompt Management service addresses pain points like prompt experimentation, migration, and tracking.
The toolset covers prompt versioning, comparison of variations, human feedback collection, collaboration tools, and AI-generated suggestions for optimising prompts. Evaluation services help assess safety, factual accuracy, and other key performance metrics as models are iterated.
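To make prompt versioning and comparison concrete, here is a minimal sketch using only the Python standard library. It is not the Vertex AI Prompt Management API: the `PromptHistory` class, its methods, and the sample prompts are hypothetical, and they only illustrate the kind of bookkeeping such a service automates.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class PromptHistory:
    """Hypothetical in-memory store for successive versions of one prompt."""

    name: str
    versions: list[str] = field(default_factory=list)

    def save(self, text: str) -> int:
        """Append a new version and return its 1-based version number."""
        self.versions.append(text)
        return len(self.versions)

    def get(self, version: int) -> str:
        """Return the text of a version, rejecting out-of-range numbers."""
        if not 1 <= version <= len(self.versions):
            raise IndexError(f"no version {version} of prompt {self.name!r}")
        return self.versions[version - 1]

    def diff(self, old: int, new: int) -> list[str]:
        """Return a unified diff between two versions, one entry per line."""
        return list(
            difflib.unified_diff(
                self.get(old).splitlines(),
                self.get(new).splitlines(),
                fromfile=f"{self.name}@v{old}",
                tofile=f"{self.name}@v{new}",
                lineterm="",
            )
        )


history = PromptHistory("summarise-call")
v1 = history.save("Summarise the earnings call.\nUse bullet points.")
v2 = history.save("Summarise the earnings call.\nUse at most five bullet points.")
print("\n".join(history.diff(v1, v2)))
```

Keeping every version and diffing them makes it easier to trace which wording change caused a shift in evaluation results.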