As developers who scrape data off the web, we use Node.js along with the handy Cheerio library. When fetching markup with .html(), the Cheerio parser returns special characters as HTML-encoded entities, e.g.:
ä as &#xE4;
ß as &#xDF;
The Cheerio developers’ justification of the parser’s behavior:
(1) It’s not the job of a parser to preserve the original document.
(2) .html() returns an HTML representation of the parsed document, which doesn’t have to be equal to the original document.
source.
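
Since .html() may hand back entities instead of the original characters, one possible workaround is to decode them after the fact. The following is only a minimal sketch, assuming the entities arrive in numeric form (hexadecimal like &#xE4; or decimal like &#228;). The helper name decodeNumericEntities is hypothetical and not part of Cheerio, and the sample string is made up for illustration.

```javascript
// Hypothetical helper (not part of Cheerio): turns numeric character
// references such as &#xE4; or &#228; back into the characters they encode.
function decodeNumericEntities(html) {
  return html.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave the text untouched if the number is not a valid Unicode code point.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// Assumed sample input, shaped like the output described above.
const encoded = "Stra&#xDF;e, M&#xE4;rz, &#228;";
console.log(decodeNumericEntities(encoded)); // prints "Straße, März, ä"
```

Note that this sketch decodes every numeric reference, including ones for markup characters such as &#x3C; (the less-than sign), so it is better suited to extracted text than to HTML that will be parsed again. Named entities such as &amp; are left as they are.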