Showing 2 of 2 results
ropensci
robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. Ultimately, the package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain. A usage sketch follows this entry.
Maintained by Jordan Bradford. Last updated 4 months ago.
crawler · peer-reviewed · robotstxt · scraper · spider · webscraping
68 stars · 10.43 score · 414 scripts · 7 dependents
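For a sense of the API, here is a minimal sketch of a permission check using the package's paths_allowed() helper; the domain and paths are placeholder values, not part of this listing:

```r
# Minimal sketch, assuming the robotstxt package is installed;
# the domain and paths below are illustrative placeholders.
library(robotstxt)

# Download and parse the domain's robots.txt, then check whether a
# generic bot ("*") may access each path; returns one logical per path.
paths_allowed(
  paths  = c("/", "/private/"),
  domain = "example.com",
  bot    = "*"
)
```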
dmi3kno
polite: Be Nice on the Web
Be responsible when scraping data from websites by following polite principles: introduce yourself, ask for permission, take slowly and never ask twice. A usage sketch follows this entry.
Maintained by Dmytro Perepolkin. Last updated 2 years ago.
crawler · memoise · rate-limiter · robotstxt · rvest · scraper · webscraping
327 stars · 8.86 score · 596 scripts · 5 dependents
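For a sense of the workflow, here is a minimal sketch using the package's bow() and scrape() functions; the URL and user-agent string are placeholder values:

```r
# Minimal sketch, assuming the polite package is installed; the URL
# and user-agent string below are illustrative placeholders.
library(polite)

# Introduce yourself and ask for permission: bow() consults the
# host's robots.txt and establishes a crawl delay.
session <- bow(
  url        = "https://example.com/listings",
  user_agent = "me@example.com"
)

# Take slowly and never ask twice: scrape() rate-limits requests and
# memoises responses, so repeated calls do not hit the server again.
page <- scrape(session)
```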