Adding metadata to a whole-file upload

zigguratt · July 3, 2023, 12:36pm

Beginner question here. I’m moving from Pinecone to Vectara because Vectara is a better fit for my company.

When using Pinecone via Langchain I could send metadata with every chunk extracted from a document, say, a PDF. This, of course, required me to do the embedding myself, which was standard practice for vector stores until I found Vectara.

Of course, Vectara has built-in embedding. Is it possible to send the metadata for a whole PDF and have Vectara attach that metadata to every chunk it creates from the PDF? If that’s not possible and I have to do the same as I was doing with Pinecone, I lose the advantage that Vectara offers with its built-in embedding.

shane · July 3, 2023, 6:15pm

It is! To add metadata through the file upload API, use the doc_metadata field. There's an example in the docs if you search for doc_metadata. This gets attached to the document-level metadata rather than the section-level metadata (a document and its sections can each carry their own).
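
For anyone who wants a starting point, here is a minimal Python sketch of a whole-file upload that carries doc_metadata. Only the doc_metadata field name comes from this thread. The endpoint URL, the `c` (customer ID) and `o` (corpus ID) query parameters, the `x-api-key` header, and the helper names `build_upload_request` and `upload_pdf` are assumptions for illustration, so check them against the current file upload docs before relying on them.

```python
import json
import os
from urllib.parse import urlencode

# Assumed v1 file upload endpoint; verify against the current docs.
UPLOAD_URL = "https://api.vectara.io/v1/upload"


def build_upload_request(customer_id, corpus_id, metadata):
    """Return (url, form_fields) for an upload with document-level metadata."""
    # Assumption: customer and corpus IDs are passed as query parameters c and o.
    url = f"{UPLOAD_URL}?{urlencode({'c': customer_id, 'o': corpus_id})}"
    # doc_metadata is sent as a JSON string in its own multipart form field.
    fields = {"doc_metadata": json.dumps(metadata)}
    return url, fields


def upload_pdf(path, customer_id, corpus_id, api_key, metadata):
    """Upload one PDF together with its metadata (hypothetical helper)."""
    # Third-party dependency, imported here so the helper above works without it.
    import requests

    url, fields = build_upload_request(customer_id, corpus_id, metadata)
    with open(path, "rb") as fh:
        response = requests.post(
            url,
            headers={"x-api-key": api_key},  # assumed auth header
            data=fields,                     # becomes the doc_metadata form field
            files={"file": (os.path.basename(path), fh, "application/pdf")},
            timeout=60,
        )
    response.raise_for_status()
    return response.json()
```

Calling something like `upload_pdf("report.pdf", 123, 4, "my-api-key", {"author": "zigguratt", "source": "intranet"})` would send the file and the metadata in a single multipart request, with no client-side chunking or embedding.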
