API Response - time out


I am regularly running into the API not responding and I would like to know the best approach to solve this issue (see below).

For some context about this request:

  • The Corpus has circa 3GB in it
  • I am filtering the index using metadata; the matching subset should be less than 10MB of the total data
  • I am simultaneously indexing hundreds of documents into the same corpus
  • I am requesting 40 results (but the problem exists with 30 results too)


  • Is it possible to increase the timeout?
  • Why does this happen? Is it too much text per doc, too much data total, ongoing indexing causes querying issues?
  • Is it ok to simultaneously do large amounts of indexing and querying or should I do them separately?

Thank you in advance!

API Response:

{
  "responseSet": [
    {
      "response": [],
      "status": [],
      "document": [],
      "generated": [],
      "summary": [
        {
          "text": "",
          "lang": "",
          "prompt": "",
          "status": [],
          "futureId": 2
        }
      ],
      "futureId": 1
    }
  ],
  "status": [
    {
      "code": "QRY__TIMEOUT",
      "statusDetail": "Operation timed out. Results are partial.",
      "cause": null
    }
  ],
  "metrics": null
}

Hi Ed,

Thanks for using our platform, and at a relatively high data volume.

  • Yes, it’s possible to increase the timeout. If you are using our HTTP API, you would include a header like grpc-timeout: 30S.
  • There are two main reasons:
    1. We have a known issue where our lexical search is slow for large corpora. You can turn it off by setting the lambda value to 0 in the query.
    2. A large corpus can take a bit of time to optimize, and that optimization can drift if the amount and type of data in the corpus changes rapidly.
  • It should be okay, depending on what values you mean by “large”. Having seen your query and index volume, we should discuss dedicated capacity for your needs.
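
For illustration, the first two points (a longer timeout and lambda set to 0) could be combined in a single request along these lines. This is a minimal sketch, not an official client: the endpoint URL, the customer-id and x-api-key headers, and the corpusKey / lexicalInterpolationConfig field names are assumptions based on the v1 REST query API and should be checked against the current API reference. Only the grpc-timeout header and the lambda value come from the reply above.

```python
import json
import urllib.request

# Assumed endpoint; verify against the current API reference.
QUERY_URL = "https://api.vectara.io/v1/query"


def build_query_request(api_key, customer_id, corpus_id, text,
                        num_results=40, metadata_filter="", timeout_seconds=30):
    """Build (but do not send) a query with a longer timeout and lexical search off."""
    headers = {
        "Content-Type": "application/json",
        "customer-id": str(customer_id),   # assumed header name
        "x-api-key": api_key,              # assumed header name
        # From the reply above: a number followed by a unit, "S" meaning seconds.
        "grpc-timeout": f"{timeout_seconds}S",
    }
    body = {
        "query": [{
            "query": text,
            "numResults": num_results,
            "corpusKey": [{
                "customerId": customer_id,
                "corpusId": corpus_id,
                "metadataFilter": metadata_filter,
                # lambda = 0 switches the lexical component off (assumed field name).
                "lexicalInterpolationConfig": {"lambda": 0},
            }],
        }]
    }
    return urllib.request.Request(
        QUERY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req, timeout=...), using a client-side timeout slightly longer than the grpc-timeout value so the server has a chance to answer first.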

Thanks for the patience you’ve demonstrated while using the platform. It appears you have a fairly intensive use case, so understanding your exact query and ingestion rates, and potentially figuring out dedicated capacity, would let us provide a smoother platform experience.

Hey Nikhil,

Thanks for the info!

Considering that a large corpus takes some time to optimise, would the best practice be for me not to query whilst indexing, to allow the optimisation to happen?

To give you an example, in the last 7 hours I have indexed around 90,000 documents. Whilst this has been happening, almost no query API calls have responded at all (I just tried with an increased timeout and unfortunately that doesn’t work either).

I really need the ability to index millions of documents on a weekly basis (they can be deleted shortly after being indexed once I have run my analysis if this would help things).

If there is some way to sort this out I would be very happy to discuss!

Thanks a lot for your help

You can check the status of the index rebuild job using our List Jobs API (see ListJobs in the Vectara Docs). Querying will not impact the rebuild, as the rebuild is done in a separate service.

Do you need to index the new documents into one corpus, or a new one every time? Unfortunately, once a corpus gets into the 20+ million document range, we can support it but we have to manually set that up on our end.

After inserting a lot of documents you will likely need to wait for a rebuild before query performance is acceptable. Can you email your customer id and corpus id to nikhil@vectara.com, so I can take a look at the status of your corpus from our end too?
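
On the client side, one way to cope with the wait after a large ingest is to retry a query with increasing delays for as long as the response carries the QRY__TIMEOUT status shown in the original post. The sketch below is only an illustration: send_query is a hypothetical callable that performs the HTTP call and returns the parsed JSON response as a dict, and the attempt count and delays are arbitrary defaults, not recommended values.

```python
import time


def timed_out(response):
    """Return True if any top-level status entry has code QRY__TIMEOUT."""
    statuses = response.get("status") or []
    return any(s.get("code") == "QRY__TIMEOUT" for s in statuses)


def query_with_backoff(send_query, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send_query() until the response is not a timeout, doubling the wait each time.

    send_query is a hypothetical zero-argument callable returning the parsed
    JSON response. The last response is returned even if it is still a timeout,
    so the caller can inspect the partial results.
    """
    response = None
    for attempt in range(max_attempts):
        response = send_query()
        if not timed_out(response):
            return response
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return response
```

Passing sleep as a parameter keeps the helper easy to test without real waiting. It does not replace checking the rebuild job status; it only avoids hammering the API while the corpus is still being optimised.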