It would appear that msnbot/2.0b is still having issues, as it continues (with increasing frequency) to stumble into a forbidden/honeypot directory on my server. I'm getting a bit tired of unbanning these MSN IPs. Here are a few of the latest entries:
Code:
65.55.106.209 - - [31/Aug/2009:16:53:15 -0600] "GET /robots.txt HTTP/1.1" 200 294 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.209 - - [31/Aug/2009:16:54:03 -0600] "GET /legitpage1.php HTTP/1.0" 200 14161 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.162 - - [31/Aug/2009:18:20:12 -0600] "GET /robots.txt HTTP/1.1" 200 294 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.162 - - [31/Aug/2009:18:21:03 -0600] "GET /forbidden/ HTTP/1.0" 403 3893 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.187 - - [31/Aug/2009:18:46:27 -0600] "GET /legitdir1/ HTTP/1.0" 200 9835 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.51.70 - - [02/Sep/2009:18:18:00 -0600] "GET /robots.txt HTTP/1.1" 200 179 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.207.95 - - [02/Sep/2009:19:06:32 -0600] "GET /robots.txt HTTP/1.1" 200 294 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.207.95 - - [02/Sep/2009:19:07:34 -0600] "GET /legitpage2.php HTTP/1.0" 200 6494 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.51.70 - - [02/Sep/2009:19:27:45 -0600] "GET /robots.txt HTTP/1.1" 200 179 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.138 - - [02/Sep/2009:20:24:36 -0600] "GET /forbidden/ HTTP/1.0" 403 3893 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.207.120 - - [02/Sep/2009:20:33:05 -0600] "GET /legitpage3.phpforbidden/ HTTP/1.0" 404 5514 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
For clarity, the paths in these logs have been altered as follows:
forbidden = Forbidden Location or Honeypot (Should NOT be Spidered)
legitpage# = Legitimate Web Page (Should be Spidered)
legitdir# = Legitimate Directory/Folder (Should be Spidered)
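For anyone wanting to pull these hits out of their own logs, here is a minimal sketch, assuming the Apache "combined" log format seen in the excerpt above. The function name `honeypot_hits`, its parameters, and the "direct"/"mangled" labels are hypothetical names chosen for illustration, not part of any tool I actually run:

```python
import re

# Assumed: Apache "combined" log format, matching the entries quoted above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def honeypot_hits(lines, honeypot="/forbidden/", agent_substring="msnbot"):
    """Return (ip, time, path, status, kind) for bot requests touching the honeypot.

    kind is "direct" when the path starts with the honeypot directory and
    "mangled" when the directory name merely appears somewhere in the path.
    """
    name = honeypot.strip("/")
    hits = []
    for line in lines:
        m = LOG_PATTERN.match(line.strip())
        if m is None:
            continue  # skip lines that are not in the assumed format
        if agent_substring not in m["agent"].lower():
            continue  # only look at the bot we care about
        path = m["path"]
        if path.startswith(honeypot):
            kind = "direct"
        elif name in path:
            kind = "mangled"
        else:
            continue
        hits.append((m["ip"], m["time"], path, int(m["status"]), kind))
    return hits


# Three of the entries from the excerpt above.
sample = [
    '65.55.106.162 - - [31/Aug/2009:18:21:03 -0600] "GET /forbidden/ HTTP/1.0" 403 3893 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
    '65.55.106.187 - - [31/Aug/2009:18:46:27 -0600] "GET /legitdir1/ HTTP/1.0" 200 9835 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
    '65.55.207.120 - - [02/Sep/2009:20:33:05 -0600] "GET /legitpage3.phpforbidden/ HTTP/1.0" 404 5514 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
]

for hit in honeypot_hits(sample):
    print(hit)
```

The "mangled" case is there because of requests like the last one, where the directory name gets glued onto a legitimate page name rather than requested on its own.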
If any other legit bots were having difficulty I might suspect a mistake on my part; however, thus far only MSN and the bad boys from Russia/China have made my naughty lists.
Side Note:
A tech blog (http://www.chewie.co.uk/seosem/msnbo...dex-meta-tags/) suggested that msnbot/2.0b might be confused as to which site/IP it was spidering, and merely happened across the forbidden directory while trying to spider a different site. In this case the chance of that is incredibly slim, as the directory in question was specifically named to avoid similarity with anything else (something similar to: /reallyawesomestuffhere/). That way only the incredibly dense, or naughty, would read and follow the otherwise forbidden reference in my robots.txt file.
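As a sanity check on the robots.txt side, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content below is a hypothetical stand-in (my real file is not reproduced here), assuming a single blanket rule that disallows the honeypot directory:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file differs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forbidden/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler identifying as msnbot should skip the honeypot...
honeypot_allowed = parser.can_fetch("msnbot", "/forbidden/")
# ...but remain free to fetch ordinary pages.
page_allowed = parser.can_fetch("msnbot", "/legitpage1.php")

print("honeypot allowed:", honeypot_allowed)
print("legit page allowed:", page_allowed)
```

If a check like this reports the honeypot as disallowed and the legit pages as allowed, then under these assumptions the rules themselves are sound, and the fault lies with whatever is requesting the directory anyway.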
It's a shame some people actually like MSN/Live/Bing Search; otherwise I'd just let them ban themselves for good.