Transforms an input column of URLs into HTML text.
Key | Type | Description |
---|---|---|
disableHostRestriction | bool | If True, will not restrict crawling to the same host. |
depthColumn | str | Increasing the depth explores more links and captures more content. |
crawlDepth | None | None |
inputColumn | str | Name of input column to transform. |
honourWebsiteRules | bool | If True, will respect robots.txt rules. |
inputColumnType | None | None |
userAgent | str | If provided, this user agent is used instead of a randomly selected one. |
outputColumn | str | Name of output column to store transformed data. |
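
As a rough illustration of how the keys above might be combined, the sketch below builds a configuration dictionary and checks each value against the type listed in the table. The key names come from the table; the values (`"url"`, `"html"`, `"example-crawler/1.0"`) and the `check_config` helper are hypothetical and not part of the transform itself. How the configuration is actually passed to the transform is not shown here.

```python
# Hypothetical configuration for the URL-to-HTML transform.
# Key names come from the table above; all values are illustrative assumptions.
config = {
    "inputColumn": "url",
    "outputColumn": "html",
    "honourWebsiteRules": True,
    "disableHostRestriction": False,
    "userAgent": "example-crawler/1.0",
}

# Types as documented in the table. crawlDepth and inputColumnType are left
# out because the table does not document a type for them.
DOCUMENTED_TYPES = {
    "disableHostRestriction": bool,
    "depthColumn": str,
    "inputColumn": str,
    "honourWebsiteRules": bool,
    "userAgent": str,
    "outputColumn": str,
}


def check_config(cfg):
    """Return a list of type problems found in cfg (empty if none)."""
    problems = []
    for key, value in cfg.items():
        expected = DOCUMENTED_TYPES.get(key)
        if expected is None:
            # No documented type for this key (or the key is unknown): skip it.
            continue
        if not isinstance(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


problems = check_config(config)
```

A helper like this only catches type mismatches; it does not verify that the named columns exist or that the crawler accepts the values.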