Because we crawl such a large number of URLs per day, and to be a good netizen, we fully respect robots.txt directives.
We currently identify ourselves as one of two User-Agent strings:
ExtractBot/1.0 (+http://extractbot.com/docs/crawler) [text-only]
You can change the User-Agent string we identify as during your crawls; however, we will continue to adhere to robots.txt directives for both our own UA and your newly supplied UA (checked in that order).
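The dual check described above can be sketched with Python's standard urllib.robotparser. The robots.txt content, URLs, and custom UA string here are illustrative, not part of our API:

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, url, default_ua, custom_ua=None):
    # Parse the site's robots.txt and require that BOTH the crawler's
    # default UA and any user-supplied UA are permitted to fetch the URL.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(default_ua, url):
        return False  # the crawler's own UA is checked first
    if custom_ua and not parser.can_fetch(custom_ua, url):
        return False  # then the user-supplied UA
    return True

robots = """User-agent: ExtractBot
Disallow: /private/
"""

# Blocked: the site disallows /private/ for ExtractBot,
# regardless of what custom UA was supplied.
print(allowed(robots, "https://example.com/private/page",
              "ExtractBot", "MyBot/1.0"))
```

Because the default UA is evaluated first, a permissive custom UA can never unlock a page that the site has closed to our crawler.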
We will not violate robots.txt directives under any circumstances. If you need to parse a page that is restricted to us, you must fetch the HTML yourself and pass it to us in the html parameter of your request.
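A minimal sketch of that fallback in Python, using only the standard library. The endpoint URL shown is a placeholder, not the documented API; consult the API reference for the real endpoint and any authentication parameters:

```python
import urllib.parse
import urllib.request

def build_extract_request(page_url, html,
                          api_url="https://api.extractbot.com/v1/extract"):
    # Build (but do not send) a POST request that supplies the page's HTML
    # directly via the html parameter, so the service never has to fetch
    # the restricted URL itself.
    # NOTE: api_url is a hypothetical placeholder endpoint.
    data = urllib.parse.urlencode({"url": page_url, "html": html}).encode()
    return urllib.request.Request(api_url, data=data, method="POST")

# You fetch the HTML with your own client (and your own UA), then hand it over.
html = "<html><body><h1>Hello</h1></body></html>"
req = build_extract_request("https://example.com/private/page", html)
print(req.get_method(), req.full_url)
```

Sending the prepared request with urllib.request.urlopen(req) (or any HTTP client) completes the round trip; the key point is that the HTML travels in the request body rather than being crawled by us.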