We’re thrilled to announce the release of a new Cloudera Accelerator for Machine Learning (ML) Projects (AMP): “Summarization with Gemini from Vertex AI”. An AMP is a pre-built, high-quality minimum viable product (MVP) for Artificial Intelligence (AI) use cases that can be deployed with a single click from Cloudera AI (CAI). AMPs are all about helping you quickly build performant AI applications. More on AMPs can be found here.
We built this AMP for two reasons:
- To add an AI application prototype to our AMP catalog that can handle both full document summarization and raw text block summarization.
- To showcase how easy it is to build an AI application using Cloudera AI and Google’s Vertex AI Model Garden.
Summarization has consistently been the lowest-hanging fruit among Generative AI (GenAI) use cases. For example, one Cloudera customer saw a large productivity gain in their contract review process with an application that extracts and displays a short summary of essential clauses for the reviewer. Another customer, in banking, cut the time to produce a prospective client’s source-of-wealth review memo from a full day to just 15 minutes with a custom GenAI application that summarizes key details from tens to hundreds of financial documents.
This is our first AMP using the Vertex AI Model Garden, and it’s about time. A single account gives easy API access to over a hundred of the leading closed-source and open-source models, including a strong set of task-specific models. The models in the Garden are already optimized to run efficiently on Google’s Cloud infrastructure, offering cost-effective inference and enterprise-grade scaling, even for the highest-throughput apps.
This is also our first AMP using the Gemini Pro models, which work well for multimodal and text summarization applications and offer a large context window of up to one million tokens. Benchmark tests indicate that Gemini Pro processes tokens faster than competitors such as GPT-4. And compared to other high-performing models, Gemini Pro offers competitive pricing for both free and paid tiers, making it an attractive option for businesses seeking cost-effective AI solutions without compromising on quality.
How to deploy the AMP:
- Get Gemini Pro access: From the Vertex AI Marketplace, find and enable the Vertex AI API, create an API key, and enable Gemini for the same project space the API key belongs to.
- Launch the AMP: Click the AMP tile “Document Summarization with Gemini from Vertex AI” in Cloudera AI Learning, enter the configuration information (Vertex AI API key and ML runtime info), and click Launch.
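Before launching, it can be worth sanity-checking that the API key actually reaches Gemini. The sketch below builds a request for the public Gemini `generateContent` REST endpoint; the endpoint path and model name reflect the current public API and may differ for your project, and `GEMINI_API_KEY` is an assumed environment variable, not something the AMP requires.

```python
import json
import os

# Public REST endpoint for Gemini Pro (assumed current path; check the
# Gemini API docs for your project's exact endpoint and model name).
ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent"

def build_summarize_request(text: str, api_key: str) -> tuple[str, bytes]:
    """Return (url, body) for a one-shot summarization call."""
    url = f"{ENDPOINT}?key={api_key}"
    body = json.dumps({
        "contents": [{
            "parts": [{"text": f"Summarize the following text:\n\n{text}"}]
        }]
    }).encode("utf-8")
    return url, body

if __name__ == "__main__":
    api_key = os.environ.get("GEMINI_API_KEY", "YOUR_API_KEY")
    url, body = build_summarize_request("Cloudera AMPs are pre-built MVPs for AI use cases.", api_key)
    print(url)
    print(body.decode("utf-8"))
```

Sending that request (for example with `urllib.request.urlopen`) should return a JSON body whose summary text sits under `candidates[0].content.parts[0].text` in the current API shape.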
The AMP scripts will then do the following:
- Install all dependencies and requirements (including the all-MiniLM-L6-v2 embedding model, the Hugging Face Transformers library, and the LlamaIndex vector store)
- Load a sample document into the LlamaIndex vector store
- Launch the Streamlit UI
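For intuition, the ingestion step above (chunk a document, embed each chunk, store the vectors for retrieval) can be sketched in plain Python. This is a toy illustration only: a hash-based embedding stands in for all-MiniLM-L6-v2, and a plain list stands in for the LlamaIndex vector store; it shows the flow, not the AMP’s actual implementation.

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Toy stand-in for all-MiniLM-L6-v2: hash words into a fixed-size unit vector."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class ToyVectorStore:
    """Stand-in for the LlamaIndex vector store: chunks paired with embeddings."""
    def __init__(self):
        self.entries: list[tuple[str, list[float]]] = []

    def add_document(self, text: str, chunk_size: int = 40):
        # Split the document into fixed-size word chunks and embed each one.
        words = text.split()
        for i in range(0, len(words), chunk_size):
            chunk = " ".join(words[i:i + chunk_size])
            self.entries.append((chunk, embed(chunk)))

    def query(self, question: str, top_k: int = 2) -> list[str]:
        # Rank stored chunks by cosine similarity to the question embedding.
        q = embed(question)
        scored = sorted(
            self.entries,
            key=lambda e: -sum(a * b for a, b in zip(q, e[1])),
        )
        return [chunk for chunk, _ in scored[:top_k]]

store = ToyVectorStore()
store.add_document(
    "Gemini Pro offers a large context window. "
    "The AMP summarizes documents loaded into a vector store."
)
print(store.query("context window"))
```

In the real AMP, the embedding model produces dense semantic vectors, so retrieval matches meaning rather than exact words; the mechanics of chunk → embed → store → rank are the same.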
You can then use the Streamlit UI to:
- Select the Gemini Pro Model you’d like to use for summarization
- Paste in text and summarize it
- Load documents into the vector store (which generates the embeddings)
- Select a loaded document and summarize it
- Adjust response length (max output tokens) and randomness (temperature)
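The two sliders in the last step map onto generation parameters in the Gemini request. A minimal sketch of that mapping, assuming the public REST API’s `generationConfig` field names (the allowed temperature range is model-dependent; [0.0, 1.0] is assumed here):

```python
import json

def build_generation_config(max_output_tokens: int, temperature: float) -> dict:
    """Translate the two UI sliders into a Gemini generationConfig block."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature is expected in [0.0, 1.0] for this sketch")
    return {
        "generationConfig": {
            "maxOutputTokens": max_output_tokens,  # caps the length of the summary
            "temperature": temperature,            # 0.0 = near-deterministic, higher = more varied
        }
    }

print(json.dumps(build_generation_config(256, 0.2), indent=2))
```

This block would be merged into the request body alongside `contents` before the call is sent.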
And there you have it: a summarization application deployed in mere minutes. Stay tuned for future AMPs we’ll build using Cloudera AI and Vertex AI.