Categories
Development

Cheerio.js, get items from html table into object

Suppose there is a table like the one below (one data row only):

Blows Minute (BPM) | Speed (RPM) | Power, PSI | Flow, PSI    | Tool Sys
0-2500             | 0-250       | 1.8 HP     | 2.6-13.2 GPM | SDS Max

How to scrape it using cheerio.js as a parser?

Case 1 (1 row only)


Node.js Cheerio scraper, replace element

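// `$` is assumed to come from cheerio.load(html).
const table = $('table');
// .has() returns a Cheerio object, which is always truthy, so check .length.
if (table.has('br').length > 0) {
    // Replace each <br> inside the table with a space.
    table.find('br').replaceWith(' ');
}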

Cheerio scraper escapes special symbols with html entities when performing .html()

When we scrape data off the web, we use Node.js along with the handy Cheerio scraper. When we fetch .html(), the Cheerio parser returns special symbols as HTML-encoded entities, e.g.:
ä as &#xE4;
ß as &#xDF;

The Cheerio developers' justification of the parser's behavior

(1) It’s not the job of a parser to preserve the original document. 
(2) .html() returns an HTML representation of the parsed document, which doesn’t have to be equal to the original document.
source.