Getting API error

I am frequently (but not consistently) getting this error on the v2 query API:

‘FAILED_PRECONDITION: Factual consistency computation skipped due to summarizer error.’

There are no differences in the summarizer configuration between the API calls; it works sometimes but not others. What does this error mean?

Hi Vivek,

This generally means the summary could not be generated. Do you see an error detailing why the summary couldn’t be generated?

Hi Tallat,

It is a 412 error:

Response Status: 412 Precondition Failed
Response Headers: {
  'access-control-allow-headers': 'Content-Type, Authorization, X-Api-Key, X-Request-Id, Accept, Customer-Id',
  'access-control-allow-methods': 'PUT, GET, POST, PATCH, DELETE, OPTIONS',
  'access-control-allow-origin': '*',
  'access-control-expose-headers': 'X-Trace-Id, X-Request-Id',
  'access-control-max-age': '1728000',
  connection: 'keep-alive',
  'content-length': '150',
  'content-type': 'application/json',
  'customer-id': ***,
  date: 'Mon, 20 Jan 2025 16:29:01 GMT',
  'strict-transport-security': 'max-age=15724800; includeSubDomains',
  'x-request-id': '6c4de54d112f5d175a3c19a58457d454',
  'x-trace-id': '8ec8c42db421ce47d583b09e08cbed72'
}

Response Body: {
  messages: [
    'FAILED_PRECONDITION: Factual consistency computation skipped due to summarizer error.'
  ],
  request_id: '6c4de54d112f5d175a3c19a58457d454'
}

The strange thing is that I am calling the API around 15 times in quick succession (since the v2 API does not support batch calls the way v1 did). Some of those calls get 412 errors, while others get an OK response back. The only difference between the API calls is the query text itself; all configurations are unchanged.
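When firing many queries in a row, it can help to record which ones hit this specific failure rather than treating every non-OK response alike. The following is a minimal sketch, not part of any official client: `isSummarizerSkipped` and `ErrorBody` are hypothetical names, and the only details taken from the thread are the 412 status and the `FAILED_PRECONDITION` message prefix shown in the response above.

```typescript
// Shape of the error body as it appears in the response above (assumed; other fields may exist).
type ErrorBody = { messages?: string[]; request_id?: string };

// Hypothetical helper: true when a response matches the 412 / FAILED_PRECONDITION case from this thread.
function isSummarizerSkipped(status: number, body: ErrorBody): boolean {
  return (
    status === 412 &&
    (body.messages ?? []).some((m) => m.startsWith("FAILED_PRECONDITION"))
  );
}

// Example using the values from the response above.
const skipped = isSummarizerSkipped(412, {
  messages: [
    "FAILED_PRECONDITION: Factual consistency computation skipped due to summarizer error.",
  ],
  request_id: "6c4de54d112f5d175a3c19a58457d454",
});
```

Logging the query text alongside this flag makes it easier to see whether the failures track particular queries.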

Thanks for the details. Someone from our team will look into this and get back.

Thanks, I appreciate it.

@tallat Checking to see if there’s any update on this. Thanks!

The issue is assigned to one of the engineers and he’s actively looking at it. I’ve nudged the engineer, and he should respond here soon with his progress.

Hey Vivek, it seems that you are using a chain reranker and the first reranker cuts off all results. This can happen as different queries will have results with different scores, so whatever cutoff you are using seems to, in some cases, remove all results. Could you share your reranking config for the request?
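The effect described above can be illustrated with a small sketch. The scores below are made up for illustration, `applyCutoff` is a hypothetical stand-in for what a reranker stage's cutoff does, and treating the cutoff as inclusive (`>=`) is an assumption rather than documented behavior.

```typescript
// Hypothetical relevance scores; real scores come from the reranker.
type Scored = { id: string; score: number };

// Keep only results whose score meets the cutoff (inclusive comparison is an assumption).
function applyCutoff(results: Scored[], cutoff: number): Scored[] {
  return results.filter((r) => r.score >= cutoff);
}

// A query that matches the corpus well.
const strongMatch: Scored[] = [
  { id: "a", score: 0.82 },
  { id: "b", score: 0.41 },
];

// A query that matches the corpus poorly, as can happen with dummy data.
const weakMatch: Scored[] = [
  { id: "c", score: 0.37 },
  { id: "d", score: 0.12 },
];

const keptStrong = applyCutoff(strongMatch, 0.5); // one result survives
const keptWeak = applyCutoff(weakMatch, 0.5); // nothing survives, so nothing is left to summarize
```

The same configuration therefore succeeds for one query and fails for another, depending only on how the results for that query score.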

Ah I see, thanks @husseinhassans. We currently only have dummy data in the corpus during development, so I see why the cutoff would be too strong. It does seem that the 412 status is a bit misleading; it would be clearer to get “No responses found” back from the summarizer.

Here is our reranking config:

const VECTARA_RERANKING = {
  SLINGSHOT: {
    type: 'customer_reranker' as const,
    reranker_name: 'Rerank_Multilingual_v1',
    cutoff: 0.5,
    limit: 10
  },
  KNEE: {
    type: 'userfn' as const,
    user_function: 'knee()',
    cutoff: 0.5
  }
} as const;

const VECTARA_DEFAULT_CHAIN_RERANKER: ChainReranker = {
  type: 'chain',
  rerankers: [
    VECTARA_RERANKING.SLINGSHOT,
    VECTARA_RERANKING.KNEE
  ]
};

Our team is new to the concept of reranking. We were following what appeared to be best practice from the docs and blog posts. Would you recommend a different approach?

Thanks for the feedback, Vivek. You’re absolutely right; we’ll be posting a fix for the error log soon.
A quick note about reranking: Rerank_Multilingual_v1 essentially ensures your results are in order of relevance, so for the most part this is what you want to use. The knee reranking function removes results it determines are not useful from your result set, and it works better on a larger result set. Therefore, you don’t want to apply any limits or cutoffs before the knee reranker runs.
For optimal performance, don’t set limits or cutoffs on Rerank_Multilingual_v1 when it is used in tandem with the knee reranker. Put your desired limits or cutoffs in the final stage instead. Our recommendation is to use just a 0.5 cutoff, but if you would like a limit of 10 for your purposes, that should also go in the final stage:
const VECTARA_RERANKING = {
  SLINGSHOT: {
    type: 'customer_reranker' as const,
    reranker_name: 'Rerank_Multilingual_v1'
  },
  KNEE: {
    type: 'userfn' as const,
    user_function: 'knee()',
    cutoff: 0.5,
    limit: 10 // this is optional
  }
} as const;
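Putting the two stages together, a sketch of the full chain under this recommendation might look like the following. The type definitions are local stand-ins written for this example (the `ChainReranker` type used earlier in the thread is not shown), and the field names are copied from the config posted above rather than verified against an SDK.

```typescript
// Local stand-in types for illustration only; not taken from an official SDK.
type CustomerReranker = {
  type: "customer_reranker";
  reranker_name: string;
  cutoff?: number;
  limit?: number;
};

type UserFnReranker = {
  type: "userfn";
  user_function: string;
  cutoff?: number;
  limit?: number;
};

type ChainReranker = {
  type: "chain";
  rerankers: Array<CustomerReranker | UserFnReranker>;
};

const recommendedChain: ChainReranker = {
  type: "chain",
  rerankers: [
    // Stage 1: order results by relevance only, with no cutoff or limit.
    { type: "customer_reranker", reranker_name: "Rerank_Multilingual_v1" },
    // Stage 2: knee filtering, with the cutoff (and the optional limit) applied last.
    { type: "userfn", user_function: "knee()", cutoff: 0.5, limit: 10 },
  ],
};
```

Keeping the first stage free of cutoffs means the knee function sees the full reranked result set before anything is discarded.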

Could you point me to the docs that recommended this type of config? We might need to fix that.

Thanks, this is very helpful!

We were looking at the docs here: Reranking | Vectara Docs. And the blog post announcing Knee Reranking: Introducing Knee Reranking: smart result filtering for better results

On reread, I think the docs are clear and the blog post laid out the recommended approach well. I likely got mixed up between the different examples for combining cutoffs and limits.
