Suppose there is a table like the one below (a single data row):
Blows Minute (BPM) | Speed (RPM) | Power, PSI | Flow, PSI | Tool Sys |
---|---|---|---|---|
0-2500 | 0-250 | 1.8 HP | 2.6-13.2 GPM | SDS Max |
How can it be scraped using cheerio.js as the parser?
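// Replace <br> tags inside the table with spaces so cell text does not run together.
// Note: .has() returns a Cheerio object, which is always truthy, so check .length instead.
const table = $('table');
const lineBreaks = table.find('br');
if (lineBreaks.length) {
  lineBreaks.replaceWith(' ');
}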
When scraping data off the web, developers often use Node.js together with the handy Cheerio library. When content is fetched with .html(), the Cheerio parser returns special symbols as HTML-encoded entities, e.g.:
ä as &#xE4;
ß as &#xDF;
(1) It’s not the job of a parser to preserve the original document.
(2) .html() returns an HTML representation of the parsed document, which doesn’t have to be equal to the original document.
source.