I have had problems in the past with hacking attempts on websites and computers from IP addresses associated with the RIPE Network Coordination Centre. But when I have banned such an IP on a website, I have historically ended up blocking people in Europe from visiting it. I know because, a few years ago, there were a number of occasions where legitimate visitors from Europe could not reach a site until I removed the block on a RIPE Network Coordination Centre associated IP address.
I always dread the day when hackers/crooks coming through RIPE Network Coordination Centre addresses find a website I am working on. Well, now it appears they have found the website I am currently working on, the same one that most of my posts on these forums have been about.
How do I block a bot? Specifically AnotherBot 2.1?
It is asking for pages that do not exist on the website (/feed/ and /index.php?feed=rss2). I redirect 404s to /, and that is what my logs show happened: AnotherBot 2.1 tried to access /feed/, which is not a page on my website, then /index.php?feed=rss2, which is also not a page on my website, and then it ended up at /. NOTE: To clarify, there is an index.php at mywebsite.com/blog/index.php, but AnotherBot 2.1 was looking for index.php?feed=rss2 at mywebsite.com/. Nonetheless, to me AnotherBot 2.1 does not look like it means my website any good. Even more importantly, any bot that does not respect robots.txt is not a bot I want on my website. Am I right?
Also, about 1.5 hours BEFORE AnotherBot 2.1 showed up from a RIPE Network Coordination Centre associated IP address, someone(?) did a Google search, from an Asia Pacific Network Information Centre associated IP address, that returned my website in the #1 SERP spot. I think that search and AnotherBot 2.1 are related. The website is rather small now and is not yet getting significant amounts of traffic, and I am trying to keep up with potential abuse of the website "manually", but I guess I am going to need to start blocking bots automatically, which is something I do not know how to do.
The Google search for AnotherBot 2.1: http://www.google.com/search?hl=en&c....1&btnG=Search
Although I do not think it is related to AnotherBot 2.1, I am now getting regular visits from Agent: bot/1.0 (bot; http://; bot@bot.bot). It has come from 216.158.1.198 (Consult Dynamics, Inc. - Wilmington, DE). It appears to go to /robots.txt, which currently only has this in it:
User-agent: Fasterfox
Disallow: /
Incidentally, blocking AnotherBot 2.1 in /robots.txt will not help me, will it? AnotherBot 2.1 did not even visit /robots.txt, let alone respect it.
NOTE: I am going to add more details about this particular attack to this post. If you need those additional details to give me advice, check back on this post.
This is how AnotherBot 2.1 showed up in the website logs:
{Host: 77.92.88.17
/feed/
Http Code: 302 Date: Jun 10 02:19:13 Http Version: HTTP/1.1 Size in Bytes: 724
Referer: -
Agent: AnotherBot 2.1
/index.php?feed=rss2
Http Code: 302 Date: Jun 10 02:19:15 Http Version: HTTP/1.1 Size in Bytes: 724
Referer: -
Agent: AnotherBot 2.1
/
Http Code: 200 Date: Jun 10 02:19:15 Http Version: HTTP/1.1 Size in Bytes: 12183
Referer: -
Agent: AnotherBot 2.1}
I do not even understand why AnotherBot 2.1 got ANY 302 Found responses when it requested the first 2 pages. I guess the 404 "redirect" I have pointed at / is responsible for those 302 temporary redirects.
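For reference, here is a rough sketch of what I think is going on, ASSUMING the 404 handling is done with an Apache ErrorDocument directive (I have not confirmed that this is how my host sets it up; mywebsite.com is a placeholder and /notfound.html is a made-up page name). As far as I can tell from the Apache docs, pointing ErrorDocument at a full URL makes Apache send the client a redirect instead of the original 404, while a local path is served in place and keeps the 404 status:

```apache
# ASSUMPTION: the 404 handler is an ErrorDocument with a full URL.
# A full URL makes Apache answer with a redirect to that URL,
# so the client never sees the 404 status itself.
ErrorDocument 404 http://mywebsite.com/

# Alternative: a local path is served internally and the 404 status is kept.
# /notfound.html is a hypothetical page name.
# ErrorDocument 404 /notfound.html
```

If that assumption is right, it would explain why the logs show 302 for /feed/ and /index.php?feed=rss2 rather than 404.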
UPDATE: Found some instructions here http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html
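Going by those instructions, this is a minimal sketch of what I may try, ASSUMING the site runs on Apache with mod_rewrite enabled and that my host lets me use rewrite directives in .htaccess. It matches on the User-Agent string from the logs above rather than on the IP address, so it should not lock out visitors who happen to share a RIPE-associated IP range. It is untested on my site:

```apache
# Sketch for .htaccess or the virtual host config; needs mod_rewrite.
RewriteEngine On
# Match any User-Agent that starts with "AnotherBot" (NC = case-insensitive).
RewriteCond %{HTTP_USER_AGENT} ^AnotherBot [NC]
# Leave the URL unchanged ("-") and answer 403 Forbidden.
RewriteRule ^ - [F]
```

I realize this only works as long as the bot keeps sending the same User-Agent string, since that string is easy to change.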
Last edited by 052808; 06-12-2008 at 03:23 AM..