From what I understand, you ingested a document (e.g. a PDF) into your Vectara corpus, and the console shows its document ID and metadata. Do you want to see the text of the original document?
This is available via the API when you ingest the file (but not in the console); see File Upload API Definition | Vectara Docs. If you set d=True in the request, the API call returns the extracted document that was indexed.
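As a rough sketch of what that upload call might look like in Python: only the d flag comes from the description above. The endpoint URL, the c and o query parameter names, the x-api-key header, and the "file" form field are assumptions, and the helper names are made up for illustration, so please check them against the File Upload API docs before relying on this.

```python
from urllib.parse import urlencode

# Assumed upload endpoint; confirm against the File Upload API docs.
UPLOAD_ENDPOINT = "https://api.vectara.io/v1/upload"


def build_upload_url(customer_id, corpus_id, return_extracted=True):
    """Build the upload URL (hypothetical helper).

    The 'c' and 'o' parameter names are assumptions. 'd=true' is the flag
    that asks the API to return the extracted document that was indexed.
    """
    params = {"c": customer_id, "o": corpus_id}
    if return_extracted:
        params["d"] = "true"
    return f"{UPLOAD_ENDPOINT}?{urlencode(params)}"


def upload_and_get_extracted(path, customer_id, corpus_id, api_key):
    """Upload a file and return the parsed JSON response (hypothetical helper).

    The header name and multipart field name are assumptions.
    """
    import requests  # third-party: pip install requests

    url = build_upload_url(customer_id, corpus_id)
    with open(path, "rb") as fh:
        resp = requests.post(
            url,
            headers={"x-api-key": api_key},
            files={"file": fh},
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()
```

Calling upload_and_get_extracted("mission.pdf", "123", "4", "my-api-key") with your own IDs and key should then give you the response body, which is where the extracted document would appear if the flag behaves as described.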
It would be neat to see the scraped text contents in the console during testing.
I would like to call an API on your side to ‘only’ do the text extraction part of your logic - again for testing, and maybe some other uses.
Since we are planning to trust that Vectara does an excellent text extraction and does not change the meaning of the contents of a PDF or Word document, being able to see that scraped text is very valuable.
In my application, my users may want to upload files into a different section of my app to store full copies of the text content - primarily smaller files that they will want to use in their Gen AI prompts without RAG, just including the whole thing. Maybe a 1 page mission statement, or our principles, our brand, our corporate history. If I could pass those Word/PDF documents up to Vectara and get a ‘Vectara scraped text’ back, then I’d be using the same logic for all my ‘text scraping’ needs.
Also a concern: many of these threads get answered with ‘sorry, that is not possible in the API or console currently’. Are these being recorded as potential future enhancements?