Error in indexing API

I’m trying to index more than 6,000 documents into a corpus. To speed up the process I use 100 concurrent connections with Python 3.10.13, but some of the requests come back with errors.
Here are all the different errors that have occurred:

('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
HTTPSConnectionPool(host='api.vectara.io', port=443): Max retries exceeded with url: /v1/index (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)')))
Status Code: 429

And here is the piece of code I use for indexing:

import json
import os

import requests

# folder_path, corpus_id, jwt_token and config are defined elsewhere in the script.
def load_url(filename, index):
    # Read the document to index from its JSON file.
    with open(os.path.join(folder_path, filename), "r") as json_file:
        document = json.load(json_file)

    payload = {
        "customer_id": config.ZIR_CUSTOMER_ID,
        "corpus_id": corpus_id,
        "document": document
    }

    headers = {
        "Authorization": jwt_token['token_type'] + ' ' + jwt_token['access_token'],
        "customer-id": f'{config.ZIR_CUSTOMER_ID}',
        "Content-Type": "application/json",
    }
    try:
        ans = requests.post(url="https://api.vectara.io/v1/index", headers=headers, data=json.dumps(payload))
        status_code = ans.status_code
        if status_code != 200:
            print(status_code)
            # Retry immediately on any non-200 response (no delay, no retry limit).
            index, status_code = load_url(filename, index)
    except Exception as exc:
        print(filename)
        print(exc)
        # Retry immediately on connection errors as well.
        index, status_code = load_url(filename, index)
    return index, status_code

I also got another error from the authentication URL while trying to get the token:

urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='zir-prod-1353344576.auth.us-west-2.amazoncognito.com', port=443): Max retries exceeded with url: /oauth2/token (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0xffff83f7ee30>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='zir-prod-1353344576.auth.us-west-2.amazoncognito.com', port=443): Max retries exceeded with url: /oauth2/token (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0xffff83f7ee30>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

Hi hossam,

Apologies for the late response.

When you are sending several requests to the same endpoint, the recommended practice is to reuse the same underlying connection. Not only is it faster (you don’t have to open a new connection for each request), it also consumes fewer resources (connections are expensive).

Can you please try that and see if it resolves your issue? In Python, you can create one with session = requests.Session() and then send all POST requests through the session object (e.g., session.post(url=...)).
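
As a rough sketch of what that could look like, the snippet below builds one shared session with a larger connection pool and a bounded retry policy with exponential backoff, so 429 responses are not retried immediately. The helper name build_session, the pool size, the retry count and the list of retried status codes are illustrative assumptions, not values recommended for this API. It also assumes urllib3 1.26 or newer (for the allowed_methods parameter) and that re-sending an index request for the same document is acceptable in your setup.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(pool_size=20, max_retries=5):
    """Create a Session that reuses connections and retries transient failures.

    build_session and its default values are hypothetical, for illustration only.
    """
    retry = Retry(
        total=max_retries,                           # upper bound on retries per request
        backoff_factor=1,                            # exponential backoff between attempts
        status_forcelist=[429, 500, 502, 503, 504],  # assumed set of retryable statuses
        allowed_methods=["POST"],                    # POST is not retried by default
    )
    adapter = HTTPAdapter(
        pool_connections=pool_size,
        pool_maxsize=pool_size,  # connections kept open per host
        max_retries=retry,
    )
    session = requests.Session()
    session.mount("https://", adapter)  # apply the adapter to all HTTPS requests
    return session
```

You would create the session once, outside load_url, and replace requests.post(...) with session.post(...) using the same arguments. Keeping the number of worker threads at or below pool_size avoids connections being discarded when the pool is full.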

Please give it a try and let us know the result.

Regards

Thanks Tallat,
I will give it a try next time and report back if I still get errors. For the authentication API, though, it was just a single request.