Hi there,
I am currently using Algolia search. What it lacks is intent understanding: it sometimes misses or skips the intent of a query and does not return the right results.
So I looked at Vectara and was happy to see that it uses LLMs with vector-based search and can return recommendations.
I am trying to get the desired results, but I am stuck on data preparation: I am not sure what kind of data, and in what JSON structure, needs to be uploaded to a corpus.
Here is my original JSON:
{
"first_name": "Sri",
"last_name": "Siva",
"tagline": "Award Winning Learning Consultant",
"bio": "I have over 20 years progressive experience with large fortune 500 companies. Experience includes both US and offshore teams with robust records of success in achieving complex objectives and timelines. I am highly experienced in managing and implementing end-to-end, instructional design process-based methodology (analysis, design, development, and delivery) to create outstanding curriculum and training programs (ILT and CBT/WBT). I was awarded 2nd prize in the OxTalent competition for my work in converting an ILT to CBT at the University of Oxford. The prototype was so successful that Oxford University has proposed to take this project further to supporting the wider community.",
"location": "",
"roles": [
"Course Developer",
"eLearning Developer",
"Instructional Designer",
"Learning Technologist",
"Learning Technologist - Other"
],
"languages": [
"English"
],
"skills": [
"Articulate",
"Camtasia",
"Captivate",
"Dreamweaver",
"Office 365",
"Photoshop",
"Snagit"
],
"industries": [],
"experiences": [],
"companies": [
"Amazon",
"Google",
"Ernst & Young",
"Abbott",
"American Society of Plastic Surgery",
"Home Depot",
"Walgreens",
"World Health Organization",
"Blue Cross Blue Shield",
"Twilio",
"Infor",
"Coca Cola",
"Motorola",
"BMO Harris Bank",
"US Cellular",
"Vyaire",
"Crowe Horwath",
"Aflac",
"Accenture",
"Boy Scouts of America",
"Cox Communication",
"Cox Automotive",
"Mars",
"Franklin Templeton",
"Facebook",
"Fannie Mae",
"Meta",
"AT&T",
"Uber",
"Kaiser Permanente",
"Analog Digital",
"Columbus McKinnon",
"Morton Buildings",
"Clear Connect"
]
}
I converted the above JSON into the document format that Vectara understands,
but I am not sure whether I formatted it correctly.
Here is the JSON I prepared for upload to Vectara:
{
"documentId": "talent-002",
"title": "Award Winning Learning Consultant",
"description": "I have over 20 years progressive experience with large fortune 500 companies. Experience includes both US and offshore teams with robust records of success in achieving complex objectives and timelines. I am highly experienced in managing and implementing end-to-end, instructional design process-based methodology (analysis, design, development, and delivery) to create outstanding curriculum and training programs (ILT and CBT/WBT). I was awarded 2nd prize in the OxTalent competition for my work in converting an ILT to CBT at the University of Oxford. The prototype was so successful that Oxford University has proposed to take this project further to supporting the wider community.",
"metadataJson": "{\"talent\":\"sri siva\"}",
"section": [
{
"title": "Roles",
"text": "Course Developer, eLearning Developer, Instructional Designer, Learning Technologist, Learning Technologist - Other",
"metadataJson": "{\"section\":\"roles\"}"
},
{
"title": "Languages",
"text": "English",
"metadataJson": "{\"section\":\"languages\"}"
},
{
"title": "Skills",
"text": "Articulate, Camtasia, Captivate, Dreamweaver, Office 365, Photoshop, Snagit",
"metadataJson": "{\"section\":\"skills\"}"
},
{
"title": "companies",
"text": "Amazon, Google, Ernst & Young,Abbott,American Society of Plastic Surgery,Home Depot,Walgreens,World Health Organization,Blue Cross Blue Shield,Twilio,Infor,Coca Cola,Motorola,BMO Harris Bank,US Cellular,Vyaire,Crowe Horwath,Aflac,Accenture,Boy Scouts of America,Cox Communication,Cox Automotive,Mars,Franklin Templeton,Facebook,Fannie Mae,Meta,AT&T,Uber,Kaiser Permanente,Analog Digital,Columbus McKinnon,Morton Buildings,Clear Connect",
"metadataJson": "{\"section\":\"companies\"}"
}
]
}
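For reference, the conversion from the original record to the document above could be scripted roughly as follows. This is only a minimal sketch: the function name buildTalentDocument is hypothetical, it assumes every list field contains plain strings, and it mirrors the structure shown above rather than any confirmed best practice for Vectara.

```php
<?php
// Hypothetical helper (not part of any Vectara SDK): converts one talent
// record in the original JSON shape into the document shape shown above.
function buildTalentDocument(array $talent, string $documentId): array
{
    $fullName = trim(($talent['first_name'] ?? '') . ' ' . ($talent['last_name'] ?? ''));

    $sections = [];
    $listFields = ['roles', 'languages', 'skills', 'industries', 'experiences', 'companies'];
    foreach ($listFields as $field) {
        $values = $talent[$field] ?? [];
        if (empty($values)) {
            continue; // skip empty lists such as "industries" and "experiences"
        }
        $sections[] = [
            'title' => ucfirst($field),
            // Assumes the list holds plain strings; join them into one text block.
            'text' => implode(', ', $values),
            'metadataJson' => json_encode(['section' => $field]),
        ];
    }

    return [
        'documentId' => $documentId,
        'title' => $talent['tagline'] ?? '',
        'description' => $talent['bio'] ?? '',
        'metadataJson' => json_encode(['talent' => strtolower($fullName)]),
        'section' => $sections,
    ];
}

// Usage with a shortened version of the record above.
$talent = [
    'first_name' => 'Sri',
    'last_name' => 'Siva',
    'tagline' => 'Award Winning Learning Consultant',
    'bio' => 'Shortened bio for the example.',
    'roles' => ['Course Developer', 'Instructional Designer'],
    'languages' => ['English'],
    'industries' => [],
];
$document = buildTalentDocument($talent, 'talent-002');
echo json_encode($document, JSON_PRETTY_PRINT), PHP_EOL;
```

Empty lists are skipped so that no section with empty text is produced, which matches the hand-written document above.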
The Vectara-format JSON document uploaded to the corpus successfully.
But when I query this data, for example by asking Vectara
“I am looking for instructional designer”
it gives me this result.
I am using PHP to post the request to the Vectara API.
Here is my PHP code:
$query_data = [
    'query' => $query,
    'start' => 0,
    'numResults' => 3,
    'corpusKey' => [
        [
            'customerId' => $customer_id,
            'corpusId' => $corpus_id,
            // 'metadataFilter' => 'part.is_title = true',
            'semantics' => 'RESPONSE',
            'lexicalInterpolationConfig' => [
                'lambda' => 0,
            ],
        ],
    ],
    'summary' => [
        [
            // 'summarizerPromptName' => 'vectara-summary-ext-v1.2.0',
            'responseLang' => 'en',
            'maxSummarizedResults' => 2,
        ],
    ],
];
What I want to achieve: when I ask Vectara something like “I am looking for an instructional designer who speaks Spanish”, it should return the recommended talent profiles.
The results should be displayed as cards, where each card represents one talent profile and contains the talent's name, roles, and tagline.
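To make the card requirement concrete, here is a rough sketch of the grouping I have in mind. It assumes, hypothetically, that the search results have already been flattened into a list of hits with a documentId and a score, and that the original talent records are available keyed by documentId. Neither shape is the actual Vectara response format, and buildTalentCards is just an illustrative name.

```php
<?php
// Hypothetical shapes: $hits is a flat list of matches (documentId + score),
// $profiles maps documentId to the original talent record.
function buildTalentCards(array $hits, array $profiles): array
{
    // Keep only the best score per document, so that several matching
    // sections of the same profile collapse into a single card.
    $bestScore = [];
    foreach ($hits as $hit) {
        $id = $hit['documentId'];
        if (!isset($profiles[$id])) {
            continue; // ignore hits without a known profile
        }
        if (!isset($bestScore[$id]) || $hit['score'] > $bestScore[$id]) {
            $bestScore[$id] = $hit['score'];
        }
    }
    arsort($bestScore); // highest score first, keys preserved

    $cards = [];
    foreach ($bestScore as $id => $score) {
        $profile = $profiles[$id];
        $cards[] = [
            'name' => trim($profile['first_name'] . ' ' . $profile['last_name']),
            'tagline' => $profile['tagline'],
            'roles' => $profile['roles'],
            'score' => $score,
        ];
    }
    return $cards;
}

// Usage with made-up hits; "talent-001" is a placeholder profile.
$profiles = [
    'talent-002' => ['first_name' => 'Sri', 'last_name' => 'Siva',
        'tagline' => 'Award Winning Learning Consultant',
        'roles' => ['Course Developer', 'Instructional Designer']],
    'talent-001' => ['first_name' => 'Example', 'last_name' => 'Person',
        'tagline' => 'Placeholder tagline', 'roles' => ['eLearning Developer']],
];
$hits = [
    ['documentId' => 'talent-001', 'score' => 0.58],
    ['documentId' => 'talent-002', 'score' => 0.64],
    ['documentId' => 'talent-002', 'score' => 0.71],
];
$cards = buildTalentCards($hits, $profiles);
echo json_encode($cards, JSON_PRETTY_PRINT), PHP_EOL;
```

The point of the sketch is that two hits from the same profile (for example its roles section and its description) should produce one card, not two.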
FYI: I pulled the talent data from an Algolia index, and I want to index that data in Vectara to fetch recommendations.
I am currently stuck on the data format and am looking for help with it.
Looking forward to a quick response.
Thanks