Support for multi-modal data and inline context?

Hi, I have a couple of questions regarding Vectara's offerings.

Firstly, does it support image data ingestion or multi-modal documents, i.e. PDF documents containing images?
Also, is there any way to use the APIs or the vectara-agentic package in a non-RAG flow? E.g., suppose I fetch data from some external API and need to use it as context for the LLM/agent. In the current documentation I couldn't find any support for inline context: all usage requires the data to be ingested first before you can query/chat over it (in both the API docs and the vectara-agentic package).

Hey @Danial_Ahmad

  1. For image data - there is a way to handle images in PDFs via our open source vectara-ingest tool (https://github.com/vectara/vectara-ingest). If you set summarize_images to true, vectara-ingest will use an external LLM (e.g. GPT-4o) to summarize each image and ingest that summary as text into Vectara; the first sketch after this list shows the pattern. Would that be a way to address your use case?
  2. For the non-RAG flow and inline context - I think vectara-agentic can help, for example by wrapping your external API in a custom tool so the agent can use its output as context (see the second sketch below). But to be sure, can you please clarify the specific use case you are working on? What is the data in Vectara, and what inline context do you want to include?
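
For context, here is a minimal Python sketch of the pattern vectara-ingest applies when summarize_images is enabled: extract each image from the PDF, summarize it with an external LLM, and index the summary as plain text. This is not vectara-ingest's actual code; the model name, corpus key, and the Vectara v2 indexing endpoint/payload are assumptions to verify against the current API reference.

```python
import base64
import os

import fitz  # PyMuPDF, used here to extract images from the PDF
import requests
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_image(png_bytes: bytes) -> str:
    """Ask an external LLM (GPT-4o here, as in vectara-ingest) to describe an image."""
    b64 = base64.b64encode(png_bytes).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the content of this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

def ingest_pdf_image_summaries(pdf_path: str, corpus_key: str) -> None:
    """Extract images from a PDF, summarize each one, and index the summaries into Vectara."""
    doc = fitz.open(pdf_path)
    parts = []
    for page in doc:
        for img in page.get_images(full=True):
            pix = fitz.Pixmap(doc, img[0])  # img[0] is the image's xref
            if pix.n - pix.alpha >= 4:  # CMYK: convert to RGB before PNG encoding
                pix = fitz.Pixmap(fitz.csRGB, pix)
            parts.append({"text": summarize_image(pix.tobytes("png"))})
    if parts:
        # Index the summaries as a "core" document via Vectara's v2 REST API
        # (endpoint and payload per the v2 docs at the time of writing -- please verify).
        requests.post(
            f"https://api.vectara.io/v2/corpora/{corpus_key}/documents",
            headers={"x-api-key": os.environ["VECTARA_API_KEY"]},
            json={"id": f"{os.path.basename(pdf_path)}-images", "type": "core",
                  "document_parts": parts},
        ).raise_for_status()
```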
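
And here is a rough sketch of one way inline context could work with vectara-agentic: wrap the external API call in a custom tool, so the agent fetches the data at query time and uses it as context without anything being ingested into Vectara. The fetch_orders function and its endpoint are hypothetical, and the ToolsFactory/Agent usage follows the vectara-agentic README at the time of writing, so please double-check against the current package docs.

```python
import requests

from vectara_agentic.agent import Agent
from vectara_agentic.tools import ToolsFactory

def fetch_orders(customer_id: str) -> str:
    """Fetch a customer's recent orders from an external API (hypothetical endpoint)."""
    resp = requests.get(f"https://api.example.com/orders/{customer_id}")
    resp.raise_for_status()
    return resp.text  # the agent receives this raw payload as tool output, i.e. inline context

# No Vectara corpus involved here: the agent's only data source is the custom tool.
agent = Agent(
    tools=[ToolsFactory().create_tool(fetch_orders)],
    topic="customer orders",
    custom_instructions="Answer questions about a customer's orders using the fetch_orders tool.",
)

# chat() may return a response object rather than a plain string in newer versions.
print(agent.chat("What did customer 42 order last week?"))
```

Since an agent accepts a list of tools, the same pattern can be combined with a regular Vectara RAG tool if you later want to mix ingested and inline data.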