Is German content indexed properly? Shows encoding errors on the "Search" tab

I’m trying to index German content from a website and, as you can see in the attached screenshot, special characters like “ä”, “ö” or “ß” (umlauts and the Eszett) are not showing up correctly on the Search tab.

The responses from the chat seem to make sense, so I’m wondering whether this is just a display issue on the website rather than a problem with the indexing and retrieval itself?

I uploaded a PDF with German content… see my results below. I tested this in Chrome. Which browser did you use? How can we reproduce this issue? Can you share the content or PDF with us?

Jan, can you reproduce this when you run the same query in the “Search” tab in Console? I’m trying to narrow down the problem space a bit, and eliminate Vectara Answer as a potential source of this bug. Thanks!
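
One rough way to tell a stored-encoding problem from a pure display problem is to copy a result snippet and check whether the string itself is already garbled. The sketch below is only a heuristic and does not use any Vectara API; `looks_like_mojibake` is a hypothetical helper name. It assumes the typical failure mode, where UTF-8 bytes were decoded as Windows-1252 or Latin-1 at some point (so “ß” turns into “ÃŸ”).

```python
def looks_like_mojibake(text: str) -> bool:
    """Heuristic: True if text looks like UTF-8 that was decoded with a legacy codec."""
    for codec in ("cp1252", "latin-1"):
        try:
            # Undo the suspected wrong decoding, then decode the bytes as UTF-8.
            repaired = text.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            # The round trip fails for correctly encoded German text.
            continue
        if repaired != text:
            return True
    return False


if __name__ == "__main__":
    # Paste a snippet copied from the search results here.
    snippet = "Straße"
    print(looks_like_mojibake(snippet))
```

If this returns `False` for text copied from the results, the stored text is probably fine and the problem is more likely in how the browser or OS renders it.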

Ok, here’s why I saw the broken special characters:

My Mac’s language setting was English. With that setting, the special characters don’t display correctly in either Chrome or Brave.

Setting the Mac OS X language to German makes the special characters show up.


Great to hear! Thanks for figuring it out and sharing your solution.