Support for multi-modal data and inline context?

Hi, I have a couple of questions regarding Vectara's offerings.

Firstly, does it support image data ingestion or multi-modal documents, i.e. PDF documents containing images?
Also, is there any way to use the APIs or the vectara-agentic package in a non-RAG flow? E.g., suppose I fetch data from some external API and need to use it as context for the LLM/agent. In the current documentation I couldn't find any support for inline context: all usage requires the data to be ingested first before you can query/chat over it (in both the API docs and the vectara-agentic package).

Hey @Danial_Ahmad

  1. For image data - there is a way to handle images in PDFs via our open source vectara-ingest tool (https://github.com/vectara/vectara-ingest). If you set summarize_images to true, vectara-ingest will use an external LLM (e.g. GPT-4o) to summarize each image and ingest that summary as text into Vectara; the first sketch after this list shows the pattern. Would that be a way to address your use case?
  2. For the non-RAG flow and inline context - I think vectara-agentic can help, for example by wrapping your external API in a custom tool so the agent can use its output as context (see the second sketch below). But to be sure, can you please clarify the specific use case you are working on? What is the data in Vectara, and what inline context do you want to include?
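
For context, here is a minimal Python sketch of the pattern vectara-ingest applies when summarize_images is enabled: extract each image from the PDF, summarize it with an external LLM, and index the summary as plain text. This is not vectara-ingest's actual code; the model name, corpus key, and the Vectara v2 indexing endpoint/payload are assumptions to verify against the current API reference.

```python
import base64
import os

import fitz  # PyMuPDF, used here to extract images from the PDF
import requests
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_image(png_bytes: bytes) -> str:
    """Ask an external LLM (GPT-4o here, as in vectara-ingest) to describe an image."""
    b64 = base64.b64encode(png_bytes).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the content of this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

def ingest_pdf_image_summaries(pdf_path: str, corpus_key: str) -> None:
    """Extract images from a PDF, summarize each one, and index the summaries into Vectara."""
    doc = fitz.open(pdf_path)
    parts = []
    for page in doc:
        for img in page.get_images(full=True):
            pix = fitz.Pixmap(doc, img[0])  # img[0] is the image's xref
            if pix.n - pix.alpha >= 4:  # CMYK: convert to RGB before PNG encoding
                pix = fitz.Pixmap(fitz.csRGB, pix)
            parts.append({"text": summarize_image(pix.tobytes("png"))})
    if parts:
        # Index the summaries as a "core" document via Vectara's v2 REST API
        # (endpoint and payload per the v2 docs at the time of writing -- please verify).
        requests.post(
            f"https://api.vectara.io/v2/corpora/{corpus_key}/documents",
            headers={"x-api-key": os.environ["VECTARA_API_KEY"]},
            json={"id": f"{os.path.basename(pdf_path)}-images", "type": "core",
                  "document_parts": parts},
        ).raise_for_status()
```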
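
And here is a rough sketch of one way inline context could work with vectara-agentic: wrap the external API call in a custom tool, so the agent fetches the data at query time and uses it as context without anything being ingested into Vectara. The fetch_orders function and its endpoint are hypothetical, and the ToolsFactory/Agent usage follows the vectara-agentic README at the time of writing, so please double-check against the current package docs.

```python
import requests

from vectara_agentic.agent import Agent
from vectara_agentic.tools import ToolsFactory

def fetch_orders(customer_id: str) -> str:
    """Fetch a customer's recent orders from an external API (hypothetical endpoint)."""
    resp = requests.get(f"https://api.example.com/orders/{customer_id}")
    resp.raise_for_status()
    return resp.text  # the agent receives this raw payload as tool output, i.e. inline context

# No Vectara corpus involved here: the agent's only data source is the custom tool.
agent = Agent(
    tools=[ToolsFactory().create_tool(fetch_orders)],
    topic="customer orders",
    custom_instructions="Answer questions about a customer's orders using the fetch_orders tool.",
)

# chat() may return a response object rather than a plain string in newer versions.
print(agent.chat("What did customer 42 order last week?"))
```

Since an agent accepts a list of tools, the same pattern can be combined with a regular Vectara RAG tool if you later want to mix ingested and inline data.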