Nested items
Extraction rules can also be nested inside the output option, so that each element matched by a selector is itself broken down into named fields. This makes it possible to build powerful extractors.
Example
Here are the rules that would extract general information and all blog post details from FoxScrape's blog:
JSON
    {
      "extract_rules": {
        "title": "h1",
        "subtitle": "#subtitle",
        "articles": {
          "selector": ".card",
          "type": "list",
          "output": {
            "title": ".post-title",
            "link": {
              "selector": ".post-title",
              "output": "@href"
            },
            "description": ".post-description"
          }
        }
      }
    }
The information extracted by the above rules on FoxScrape's blog page would be:
JSON
    {
      "title": "The FoxScrape Blog",
      "subtitle": " We help you get better at web-scraping: detailed tutorial, case studies and \n writing by industry experts",
      "articles": [
        {
          "title": " Block ressources with Puppeteer - (5min)",
          "link": "https://www.foxscrape.com/blog/block-requests-puppeteer/",
          "description": "This article will show you how to intercept and block requests with Puppeteer using the request interception API and the puppeteer extra plugin."
        },
        {
          "title": " Web Scraping vs Web Crawling: Ultimate Guide - (10min)",
          "link": "https://www.foxscrape.com/blog/scraping-vs-crawling/",
          "description": "What is the difference between web scraping and web crawling? That's exactly what we will discover in this article, and the different tools you can use."
        }
      ]
    }
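To see how the nesting in the rules maps onto the nesting in the result, the following Python sketch walks the rules from the example and prints the shape of the output they describe. It does not call FoxScrape or parse any HTML. The helper name describe is hypothetical, and the sketch assumes that a rule with no explicit output returns the element's text, which the example results suggest but this section does not state.

```python
import json

# The value of "extract_rules" from the example above.
extract_rules = {
    "title": "h1",
    "subtitle": "#subtitle",
    "articles": {
        "selector": ".card",
        "type": "list",
        "output": {
            "title": ".post-title",
            "link": {"selector": ".post-title", "output": "@href"},
            "description": ".post-description",
        },
    },
}


def describe(rule):
    """Hypothetical helper: return the result shape implied by one rule."""
    # A bare selector string is the short form of a rule.
    # Assumption: it yields the text of the matched element.
    if isinstance(rule, str):
        return "text"

    output = rule.get("output", "text")  # assumed default, see lead-in
    if isinstance(output, dict):
        # Nested rules: each matched element becomes an object
        # with one field per nested rule.
        shape = {key: describe(sub) for key, sub in output.items()}
    elif output.startswith("@"):
        # "@href" style outputs select an attribute instead of text.
        shape = "attribute " + output[1:]
    else:
        shape = output

    # "type": "list" means every match is returned, so the shape is an array.
    return [shape] if rule.get("type") == "list" else shape


shape = {key: describe(rule) for key, rule in extract_rules.items()}
print(json.dumps(shape, indent=2))
```

The printed structure has the same layout as the extracted result: title and subtitle are plain values, while articles is an array of objects whose fields come from the rules nested under output.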