How to speed queries up

rbhalla · April 29, 2023, 11:47pm

Hi,

Currently I am testing indexing and retrieval with a single Wikipedia page (about 50 KB according to index stats).

It seems like retrieval alone is about 700ms for me (querying for a single word). For requests that need token refreshes, that adds another 500ms (on average).

Do you have any tips for how I might go about speeding up retrieval time?

tallat · April 30, 2023, 7:30pm

Thanks for reaching out. Our query latencies are much shorter than what you are observing. Can you please share your customer (account) ID and corpus ID? That’ll help us diagnose further. You can share this information offline with me via email: tallat (at) vectara.com.

rbhalla · May 5, 2023, 8:17pm

So after a very helpful exchange with Tallat, there are 3 things I’ve learnt:

1. Reusing the TCP connection goes a long way to speeding things up. I am using Python, so using requests.Session() did it for me (again, thanks Tallat for this tip).
2. I stopped using OAuth for requests and instead generated an API key. OAuth requests (even when not refreshing the token) were adding a large overhead to request initialisation.
3. Vectara servers are currently on the US west coast, and I am in the UK. So that may be one reason I’m not seeing the latencies I would expect.
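
For anyone wanting to try the first two points, here is a minimal sketch of what that setup might look like. The endpoint URL, the header names (x-api-key, customer-id), and the helper names make_session and timed_post are assumptions for illustration, not taken from this thread, so check them against the current API documentation before relying on them.

```python
import time

import requests

# Assumption: placeholder endpoint for illustration; confirm the real URL in the docs.
QUERY_URL = "https://api.vectara.io/v1/query"


def make_session(api_key: str, customer_id: str) -> requests.Session:
    """Build one Session to reuse for every query.

    A Session keeps the underlying TCP/TLS connection alive between calls,
    so only the first request pays the connection setup cost.
    """
    session = requests.Session()
    # Assumption: header names are illustrative. An API key is sent as a
    # static header, so no OAuth token exchange or refresh is needed.
    session.headers.update({
        "x-api-key": api_key,
        "customer-id": customer_id,
    })
    return session


def timed_post(session: requests.Session, url: str, payload: dict, timeout: float = 10.0):
    """POST a JSON payload and return (response, elapsed time in milliseconds)."""
    start = time.perf_counter()
    response = session.post(url, json=payload, timeout=timeout)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return response, elapsed_ms


# Usage (needs real credentials and a query body in the shape the API expects):
# session = make_session("YOUR_API_KEY", "YOUR_CUSTOMER_ID")
# for _ in range(5):
#     response, ms = timed_post(session, QUERY_URL, payload)
#     print(response.status_code, round(ms), "ms")
```

When timing this, the first call will usually be the slowest because it includes the connection and TLS handshake; later calls on the same Session should show the benefit of the reused connection.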

With the above changes I was able to get my average latency down to about 200ms. I think in an ideal world it would be closer to 50ms, but I will wait till Vectara supports more regions for that.
