Uploading files & Indexing

June_76 · July 20, 2023, 11:34pm

I am working on integrating Vectara with an existing database: once a file is uploaded to the database, I want it to be uploaded to a corpus in Vectara, indexed, and ready for querying. From going over the Vectara documentation, I understand that uploading a file and indexing a file are two different tasks. What would be the best way to upload a file to the corpus I choose while also indexing it right away?

shane · July 21, 2023, 12:01am

The file upload API not only extracts the file's contents but also indexes them. Did you run into a problem where an uploaded file was not searchable?

June_76 · July 21, 2023, 12:54am

Thanks for your response!

No, I did not face any issues. So far I have only tried this on console.vectara.com.
If the upload API extracts a file and indexes the data in it, what is the difference between the upload API and the index API? Can you briefly describe the use cases for each?

shane · July 21, 2023, 2:06am

The upload API takes “raw files” (PDFs, Word documents, HTML files, etc.) and then extracts text and metadata from those documents.
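
For illustration, here is a minimal sketch of how an upload request to a specific corpus might be assembled. The endpoint URL, the `c`/`o` query parameters, and the `x-api-key` header are assumptions modeled on the v1 REST API as I understand it, and `build_upload_request` is a hypothetical helper, so check the current API reference before relying on any of it.

```python
from urllib.parse import urlencode

# Assumed endpoint (modeled on the v1 REST API); verify against current docs.
UPLOAD_ENDPOINT = "https://api.vectara.io/v1/upload"


def build_upload_request(customer_id, corpus_id, api_key):
    """Hypothetical helper: return (url, headers) for uploading a raw file
    into one corpus. The file itself is sent separately as multipart data."""
    # Assumption: "c" carries the customer ID and "o" the target corpus ID.
    query = urlencode({"c": customer_id, "o": corpus_id})
    # Assumption: the API key is passed in an "x-api-key" header.
    headers = {"x-api-key": api_key}
    return f"{UPLOAD_ENDPOINT}?{query}", headers


url, headers = build_upload_request(123, 4, "example_api_key")
print(url)
```

The raw file would then go in the request body as a multipart form field (assumed here to be named `file`), for example `requests.post(url, headers=headers, files={"file": open(path, "rb")})` if you use the `requests` library. Since the upload also triggers indexing, no second call should be needed.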

The index API is for sending semi-structured data programmatically, where you control the text/metadata extraction. For example, if you had a database with some fields that contain text and some that contain metadata, I’d generally recommend using the standard indexing API: it gives you the greatest control over how your documents are structured, and it also allows gRPC (which is lower latency) in addition to REST.

The File Upload API does allow you to send custom-formatted JSON documents, and there aren’t too many downsides to it, but I’d think the standard indexing API would generally make more sense for a database sync, unless your database holds document blobs containing PDFs and similar files.
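
As a rough sketch of the database-sync case, a row could be mapped to an index request body like this. The field names (`customerId`, `corpusId`, `documentId`, `metadataJson`, `section`) are assumptions modeled on the v1 REST indexing API, and `row_to_index_request` plus the row columns (`id`, `title`, `body`) are hypothetical, so treat this as a starting point rather than the definitive request shape.

```python
import json


def row_to_index_request(customer_id, corpus_id, row):
    """Hypothetical helper: turn one database row (a dict) into a request
    body for the standard indexing API. Field names are assumed, not verified."""
    # Every column other than the ID, title, and body text becomes metadata.
    metadata = {k: v for k, v in row.items() if k not in ("id", "title", "body")}
    return {
        "customerId": customer_id,
        "corpusId": corpus_id,
        "document": {
            "documentId": str(row["id"]),
            "title": row["title"],
            # Assumption: metadata is sent as a JSON-encoded string.
            "metadataJson": json.dumps(metadata, sort_keys=True),
            # One section holding the row's text; more sections could be added.
            "section": [{"text": row["body"]}],
        },
    }


row = {"id": 7, "title": "Release notes", "body": "Fixed the login bug.", "author": "june"}
request_body = row_to_index_request(123, 4, row)
print(json.dumps(request_body, indent=2))
```

Because you build the document yourself, you decide which columns become searchable text and which become metadata, which is the control the indexing API is meant to give you.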
