CSS or XPath selectors
Extract rules accept either CSS or XPath selectors. By default, you don't need to specify which kind of selector you are using.
Any selector beginning with a / is treated as an XPath selector; everything else is treated as a CSS selector.
JSON
{
  "extract_rules": {
    "title": "#title"
  }
}
CSS selector
JSON
{
  "extract_rules": {
    "title": "//h1[@id=\"title\"]"
  }
}
XPath selector
JSON
{
  "extract_rules": {
    "title": "/html/body/h1[@id=\"title\"]"
  }
}
XPath selector
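The default detection described above can be sketched in a few lines of Python. This is only an illustration of the documented rule (a leading / means XPath, anything else means CSS); the function name guess_selector_type is hypothetical and is not part of the API.

```python
def guess_selector_type(selector: str) -> str:
    """Illustrative only: mirrors the documented default rule.

    A selector starting with "/" is treated as XPath,
    everything else is treated as CSS.
    """
    return "xpath" if selector.startswith("/") else "css"


for selector in ("#title", '//h1[@id="title"]', './html/body/h1[@id="title"]'):
    print(selector, "->", guess_selector_type(selector))
```

Note that the last selector is a valid XPath expression, but because it starts with a dot it would be classified as CSS under this rule.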
Sometimes you may want to set the selector type explicitly instead of relying on this detection, for example if:
- you use an XPath selector which doesn't begin with /
- you use a CSS selector which begins with /
- you simply want to make your code clearer
In those cases, use the selector_type property.
JSON
{
  "extract_rules": {
    "title": {
      "selector": "#title",
      "selector_type": "css"
    }
  }
}
CSS selector
JSON
{
  "extract_rules": {
    "title": {
      "selector": "./html/body/h1[@id=\"title\"]",
      "selector_type": "xpath"
    }
  }
}
XPath selector
selector_type: auto | css | xpath (default: auto)
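If you build the rules in code rather than by hand, a small helper can validate selector_type and let the JSON serializer handle quote escaping inside XPath expressions. The following is a minimal sketch using only Python's standard library; the helper name rule and the constant VALID_SELECTOR_TYPES are hypothetical, and how the resulting JSON is sent to the API is not shown here.

```python
import json

# The three values listed for selector_type in this section.
VALID_SELECTOR_TYPES = {"auto", "css", "xpath"}


def rule(selector: str, selector_type: str = "auto") -> dict:
    """Hypothetical helper: build one extract rule with an explicit selector type."""
    if selector_type not in VALID_SELECTOR_TYPES:
        raise ValueError(f"unknown selector_type: {selector_type!r}")
    return {"selector": selector, "selector_type": selector_type}


extract_rules = {
    # Starts with "." rather than "/", so the type is forced to xpath.
    "title": rule('./html/body/h1[@id="title"]', "xpath"),
}

# json.dumps escapes the inner double quotes, producing valid JSON.
payload = json.dumps({"extract_rules": extract_rules}, indent=2)
print(payload)
```

Serializing with json.dumps avoids the most common mistake with XPath rules written by hand: unescaped double quotes inside the selector string.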