It's a lot easier with wget (which comes with just about every Linux distro and is available for Windows).
Mirroring a site that way is basically just a bunch of HTTP requests, same as your browser makes, so blocking it will also block normal web traffic.
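To make the point about HTTP requests concrete, here is a minimal Python sketch. It only builds a request and never sends it. The URL, the User-Agent string, and the helper name build_browser_like_request are made up for illustration; a real copier could send whatever headers it likes.

```python
from urllib.request import Request

# Hypothetical browser-style User-Agent string, for illustration only.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"


def build_browser_like_request(url, user_agent=BROWSER_UA):
    """Build (but do not send) a GET request carrying browser-style headers."""
    return Request(url, headers={"User-Agent": user_agent, "Accept": "text/html"})


# example.com is a placeholder host; nothing is fetched here.
req = build_browser_like_request("http://example.com/")
print(req.get_method(), req.full_url)
print(req.header_items())
```

From the server's side this is a plain GET with ordinary headers, so there is nothing in the request itself that reliably marks it as a script instead of a person's browser.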
BTW, quite a few versions of Internet Explorer included the same capability.
Your best bet: don't worry about it. Use Copyscape to look for plagiarism, and focus on making a high-quality site.
Even if you defeat these tools with JavaScript links, etc., someone could just write a custom Selenium script and make the web browser do the work.
BTW, just about anything that keeps the simple tools from crawling your site will also keep search engines from crawling it.
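One example of that trade-off is a blanket robots.txt rule, which wget honors by default when downloading recursively. The sketch below uses Python's standard urllib.robotparser to check the same rule against two user agents. The rule text, the page URL, and the helper name allowed are assumptions for illustration, not anything from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that tries to keep every crawler out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder page URL; no network access happens here.
PAGE = "http://example.com/articles/1"
for agent in ("Wget", "Googlebot"):
    print(agent, "allowed" if allowed(agent, PAGE) else "blocked")
```

Both agents come back blocked: a rule broad enough to stop the simple tools also shuts out the search engine crawler. And it only stops tools that choose to obey it in the first place.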
Last edited by willcode4beer : 04-14-2008 at 05:47 PM.