I’m trying to index more than 6,000 documents into a corpus. To speed up the process, I use 100 concurrent connections with Python 3.10.13, but some of the requests come back with errors.
Here are all the different errors that have occurred:
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
HTTPSConnectionPool(host='api.vectara.io', port=443): Max retries exceeded with url: /v1/index (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)')))
Status Code: 429
And here is the piece of code I use for indexing:
import json
import os

import requests


def load_url(filename, index):
    # folder_path, corpus_id, jwt_token and config are expected to be defined elsewhere in the script.
    with open(os.path.join(folder_path, filename), "r") as json_file:
        document = json.load(json_file)
    payload = {
        "customer_id": config.ZIR_CUSTOMER_ID,
        "corpus_id": corpus_id,
        "document": document
    }
    headers = {
        "Authorization": jwt_token['token_type'] + ' ' + jwt_token['access_token'],
        "customer-id": f'{config.ZIR_CUSTOMER_ID}',
        "Content-Type": "application/json",
    }
    try:
        ans = requests.post(url="https://api.vectara.io/v1/index", headers=headers, data=json.dumps(payload))
        status_code = ans.status_code
        if status_code != 200:
            print(status_code)
            # Any non-200 response (including 429) is retried immediately, with no delay or retry limit.
            index, status_code = load_url(filename, index)
    except Exception as exc:
        print(filename)
        print(exc)
        # Any exception is also retried immediately, with no delay or retry limit.
        index, status_code = load_url(filename, index)
    return index, status_code
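
For reference, here is a minimal sketch of one direction that might help, not a confirmed fix. HTTP 429 means "Too Many Requests", and `load_url` above retries immediately and without limit, so a bounded retry loop with exponential backoff and jitter could reduce the pressure on the server. The names `retry_with_backoff`, `send`, `max_attempts`, `base_delay`, `max_delay` and `sleep` are hypothetical and only for illustration; nothing here is taken from the Vectara documentation.

```python
import random
import time


def retry_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns 200 or max_attempts is reached.

    send is a hypothetical zero-argument callable that performs one request
    and returns its HTTP status code (it may raise on connection errors).
    Returns the last status code seen, or None if the last attempt raised.
    """
    status_code = None
    for attempt in range(max_attempts):
        try:
            status_code = send()
            if status_code == 200:
                return status_code
        except Exception as exc:  # assumption: treat every error as retryable, as load_url does
            print(f"attempt {attempt + 1} failed: {exc}")
            status_code = None
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter: the upper bound doubles on
            # each attempt (base_delay, 2x, 4x, ...) and is capped at max_delay.
            upper = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, upper))
    return status_code
```

With this sketch, `send` could wrap the existing `requests.post(...)` call and return `ans.status_code`, replacing the recursive calls in `load_url`. Lowering the number of concurrent connections below 100 may also reduce how often 429 comes back, though the actual rate limit is not something I know.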