Hi there, I am starting to use this amazing tool to implement a robust RAG pipeline. After uploading some documents to a new corpus and examining how they were split, I realized that I need to change the chunking method. For this project, I need to define a specific character that marks where the documents should be divided (a text splitter).
Is this possible in Vectara? How should I proceed?
Thank you.
We don’t support chunking by a specific character. The chunking strategies we currently support are listed under the chunking_strategy field in the API (see, e.g., "Upload a file to the corpus" in the Vectara Docs).
One option is to chunk the documents yourself and then use our low-level API (aka Core Document) to upload the chunks directly. For details, see CoreDocument under "Add a document to a corpus" in the Vectara Docs.
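As a rough illustration of that approach, here is a minimal Python sketch that splits text on a marker character and assembles a request body with one part per chunk. The helper names (split_on_marker, build_core_document), the "§" marker, and the document ID are made up for this example. The field names "id", "type", and "document_parts" are my assumption about the CoreDocument schema, so please check them against the docs before relying on them. The sketch only builds the payload and does not send any request.

```python
import json


def split_on_marker(text: str, marker: str) -> list[str]:
    """Split text at every occurrence of marker, dropping empty chunks."""
    if not marker:
        raise ValueError("marker must be a non-empty string")
    pieces = (piece.strip() for piece in text.split(marker))
    return [piece for piece in pieces if piece]


def build_core_document(doc_id: str, chunks: list[str]) -> dict:
    """Assemble a request body with one document part per chunk.

    The field names below are an assumed CoreDocument shape; verify them
    against the "Add a document to a corpus" reference.
    """
    return {
        "id": doc_id,
        "type": "core",
        "document_parts": [
            {"text": chunk, "metadata": {"chunk_index": index}}
            for index, chunk in enumerate(chunks)
        ],
    }


if __name__ == "__main__":
    # Hypothetical input using "§" as the split marker.
    raw = "First section.§Second section.§§Third section."
    chunks = split_on_marker(raw, "§")
    payload = build_core_document("my-doc-1", chunks)
    print(json.dumps(payload, indent=2, ensure_ascii=False))
```

You would then POST this body to the document endpoint for your corpus with your API key, as described in the docs. Because each chunk is uploaded as its own part, the platform should index the chunks as you defined them rather than re-splitting the text.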