List of Uploaded Documents?

Quick question - when I upload my documents to the corpus, where can I see the list of documents once they’ve finished uploading? I’ve looked around in the console but haven’t been able to find it.

Hi there! Thanks for raising this. Currently, there are two ways to see what you’ve uploaded:

  1. If you upload files in the “Data ingestion” tab, you’ll see a list of the files that were uploaded. But this only works when using this tab (not the API), and this information disappears if you refresh or navigate away.
  2. If you run a search in the “Search” tab you’ll see a list of results that reference documents, but there might be multiple results per document and this isn’t a comprehensive list of all documents.

Can I ask some questions to learn more about what you need? This will help us figure out what kind of solution will work best for you.

  • What are you trying to accomplish by looking at a list of documents?
  • Are you just trying to verify the upload was successful and the documents are stored in the corpus?
  • Are there other questions you’re trying to answer?

Thanks again! Hearing from you and other folks using Vectara means a lot to me, because it helps us make the product better.

CJ

Hey @cjcenizal!

That’s correct! Essentially my team would like to see exactly which documents have been fed into the corpus. Right now there’s no easy way aside from running a search, but that only gives a snapshot of each document.

Other platforms that we’ve trialed at least let us see the documents we’ve uploaded and make the necessary changes to them so the data stays relevant.

B

Thank you for explaining! That makes a ton of sense. Out of curiosity, what are the other platforms you trialed?

So there’s no way to view a list of ingested documents? @cjcenizal

I personally played around with IngestAI.

Currently the only options are the ones that I listed, but now we’re discussing this feature request internally.

Hi!
I am in a similar situation where I need to get the document IDs to update/delete based on the doc_id. Do you have an update for this thread?

I have the same issue. I feel like this should get top priority, because the platform is almost unusable if we can’t remove documents, and we can’t remove them when we don’t even know what any of our doc_ids are.

Hi there,
I just ran into the same question. I’m using Flowise to index documents and would like to be able to trigger the deletion in Flowise as well.

Any chance to grab a document’s ID after uploading via the API so it can be referenced for deletion later?

Thanks

Hi Jan, would it help if the API responded with a list of IDs that correspond to the documents that were indexed?
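
Until something like that exists, one possible workaround is to keep your own record of document IDs at upload time. The following is only a minimal sketch, assuming you choose (or otherwise know) each document’s ID when you send it. `UploadManifest`, `upload_and_record`, and the `upload` callable are hypothetical names made up for illustration; `upload` stands in for whatever indexing call you actually use and is not a Vectara or Flowise API.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


class UploadManifest:
    """Client-side record of the document IDs sent to a corpus (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.entries: Dict[str, str] = {}
        # Reload earlier entries so the list survives restarts.
        if path.exists():
            self.entries = json.loads(path.read_text())

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.entries, indent=2))

    def record(self, doc_id: str, source: str) -> None:
        """Remember that doc_id was uploaded from the given source file."""
        self.entries[doc_id] = source
        self._save()

    def forget(self, doc_id: str) -> None:
        """Drop an ID, for example after deleting the document from the corpus."""
        self.entries.pop(doc_id, None)
        self._save()

    def list_ids(self) -> List[str]:
        """Return every recorded document ID in sorted order."""
        return sorted(self.entries)


def upload_and_record(
    manifest: UploadManifest,
    doc_id: str,
    source: str,
    upload: Callable[[str, str], None],
) -> None:
    """Call the caller-supplied upload function, then record the ID locally.

    `upload` is an assumption: a placeholder for your real indexing call.
    The ID is recorded only if the upload does not raise.
    """
    upload(doc_id, source)
    manifest.record(doc_id, source)
```

With a manifest like this, `list_ids()` gives you the IDs to pass to a delete or update call later. It only covers documents uploaded through your own code, so anything added through the console would still be missing from the list.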

I’m actually looking for this too and this solution would help.
