Robots.txt (allow as opposed to exclusion)
Old 06-02-2007, 01:35 PM Robots.txt (allow as opposed to exclusion)
Banned

Posts: 253
Name: Michel Samuel
I probably already know the answer to this question but what the hell...

Can I do this in my robots.txt file?

User-agent: robot-I-want-to-crawl-my-site
Allow: /

User-agent: *
Disallow: /


The objective is to give permission to the robots I like and disallow all the rest.
     
Old 06-02-2007, 08:28 PM Re: Robots.txt (allow as opposed to exclusion)
chrishirst's Avatar
Super Moderator

Posts: 11,466
Location: Blackpool. UK
Nope

robots.txt is an exclusion protocol only. It is also a voluntary protocol to follow so not all bots honour the exclusions.
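
For anyone who wants to check how a given parser reads the file from the first post, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It compares the `Allow` version with an exclusion-only version that uses an empty `Disallow:` value (which excludes nothing). "FriendlyBot", "OtherBot" and the example.com URL are made-up names for illustration, and this only shows what this one parser does; real crawlers may read the file differently or ignore it altogether.

```python
from urllib.robotparser import RobotFileParser

# "FriendlyBot" is a hypothetical crawler name, used only for illustration.
WITH_ALLOW = """\
User-agent: FriendlyBot
Allow: /

User-agent: *
Disallow: /
"""

# Exclusion-only variant: an empty Disallow value excludes nothing.
EXCLUSION_ONLY = """\
User-agent: FriendlyBot
Disallow:

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, agent: str,
              url: str = "http://www.example.com/page.html") -> bool:
    """Parse robots_txt and report whether this parser lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for label, text in (("with Allow", WITH_ALLOW), ("exclusion only", EXCLUSION_ONLY)):
    # Prints the verdict for the named bot, then for any other bot.
    print(label, may_fetch(text, "FriendlyBot"), may_fetch(text, "OtherBot"))
```

Either way the file is still only a request, so bots that don't honour it will not be stopped by it.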
__________________
Chris. ->> Links are advertising NOT optimising!! <<-
Indifference will be the downfall of mankind, but who cares?
Code Samples | People Counting System
 
Old 06-03-2007, 05:44 AM Re: Robots.txt (allow as opposed to exclusion)
Banned

Posts: 253
Name: Michel Samuel
Quote:
Originally Posted by chrishirst View Post
Nope

robots.txt is an exclusion protocol only. It is also a voluntary protocol to follow so not all bots honour the exclusions.
I had hoped that I was wrong.
C'est la vie.

OK, in this case does anyone have a master list of American search engines?
I'm going to have to do this the hard way.
 
Old 06-05-2007, 08:49 AM Re: Robots.txt (allow as opposed to exclusion)
Average Talker

Posts: 21
What is the benefit of a robots.txt file?
Lucy is offline
 
Old 06-05-2007, 09:17 AM Re: Robots.txt (allow as opposed to exclusion)
tripy's Avatar
Fetchez la vache!

Posts: 1,819
Name: Thierry
Location: In the void
[quote]
What is the benefit of a robots.txt file?
[/quote]

You can use it to give hints to search engine agents and to blacklist some parts of your site, for example.
You can see the robots.txt of this site here:
http://www.webmaster-talk.com/robots.txt
__________________
Listen to the ducky: "This is awesome!!!"

 
Old 06-08-2007, 02:55 PM Re: Robots.txt (allow as opposed to exclusion)
Learning Newbie's Avatar
Moderator

Posts: 4,585
Name: John Alexander
Quote:
Originally Posted by Michel Samuel View Post
In this case does anyone have a master list of American search engines?
I'm going to have to do this the hard way.
No. But can you use geo-IP blocking instead? You can actually ban robots (return an unauthorized message) as opposed to hanging a "do not enter" sign.
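
To make the "ban" versus "do not enter sign" difference concrete, here is a rough sketch of refusing requests on the server side. It is a tiny WSGI app that keys on the User-Agent header rather than geo-IP (geo-IP would need an IP-to-country database, which is left out here), and it answers with 403 Forbidden. The blocked names are hypothetical, and since the User-Agent header can be spoofed, treat this as an illustration and not a complete defence.

```python
from typing import Callable, Iterable

# Hypothetical User-Agent substrings to refuse; these names are not from the thread.
BLOCKED_AGENT_SUBSTRINGS = ("badbot", "scraperbot")


def is_blocked(user_agent: str) -> bool:
    """Case-insensitive substring check against the block list."""
    ua = user_agent.lower()
    return any(name in ua for name in BLOCKED_AGENT_SUBSTRINGS)


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Tiny WSGI app: 403 for blocked agents, 200 for everyone else."""
    if is_blocked(environ.get("HTTP_USER_AGENT", "")):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Welcome\n"]


def status_for(user_agent: str) -> str:
    """Call the app once with a stub start_response and return the status line."""
    seen = []
    app({"HTTP_USER_AGENT": user_agent}, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("BadBot/1.0"))
print(status_for("Mozilla/5.0"))
```

Unlike robots.txt, this does not depend on the bot choosing to cooperate; the server simply declines to serve the page.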
__________________
4 ways to improve the lives of the "bottom billion"

"HEY YOU KIDS GET OFF MY LAWN!" -John McCain
 
Old 06-20-2007, 09:40 AM Re: Robots.txt (allow as opposed to exclusion)
Thomas Schulz's Avatar
Experienced Talker

Posts: 49
Name: Thomas Schulz
Quote:
Originally Posted by chrishirst View Post
Nope

robots.txt is an exclusion protocol only. It is also a voluntary protocol to follow so not all bots honour the exclusions.

Although Google, Yahoo, Ask etc. have some extensions.

Here's my blog post about linking an XML sitemap file from robots.txt.

Basically, you can now write: Sitemap: http://www.example.com/sitemap.xml
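
As a quick illustration of the Sitemap line, the sketch below reads it back with Python's standard-library `urllib.robotparser`, whose `site_maps()` method is available from Python 3.8. The `/print/` rule is a made-up example and is not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# The /print/ rule is hypothetical; the Sitemap URL is the one from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/

Sitemap: http://www.example.com/sitemap.xml
"""


def sitemap_urls(robots_txt: str) -> list:
    """Return the Sitemap URLs listed in robots_txt (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemap_urls(ROBOTS_TXT))
```

The Sitemap line is independent of the User-agent records, so it can sit anywhere in the file.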
 
Old 06-26-2007, 12:42 AM Re: Robots.txt (allow as opposed to exclusion)
Banned

Posts: 510
Name: CHRIS
Location: I live in Google's Home State
Why would you want to exclude something on your website from showing up? Please let me know, as I would like to understand what this protocol is for.
Vasity is offline
 
Old 06-26-2007, 03:29 AM Re: Robots.txt (allow as opposed to exclusion)
chrishirst's Avatar
Super Moderator

Posts: 11,466
Location: Blackpool. UK
There are many reasons you might want to exclude pages or folders from a site.

e.g.:
Many forums have several ways of getting to the same page; using Disallow you can stop compliant bots from accessing the print versions.

If you have tracking links attached to many of your external links, you can exclude these with:
Disallow: /folder/pagename.ext?track=*

see http://www.robotstxt.org for more on the protocol and http://www.highrankings.com/forum/in...p?showforum=62 for more examples and instances.
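
A note on the `*` in that rule: wildcards were not part of the original exclusion standard, though some major crawlers document `*` (any run of characters) and a trailing `$` (end of URL) as extensions. The sketch below shows one way such matching could work under that assumption. `rule_matches` is a hypothetical helper written for this example, not any crawler's actual code.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Match a robots.txt path pattern against a URL path (including query string).

    Assumed semantics: the pattern is a prefix match, '*' matches any run of
    characters, and a trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


rule = "/folder/pagename.ext?track=*"
print(rule_matches(rule, "/folder/pagename.ext?track=abc"))  # tracked link
print(rule_matches(rule, "/folder/pagename.ext"))            # plain link
```

Because rules are prefix matches anyway, the trailing `*` should be redundant under these semantics: `/folder/pagename.ext?track=` would match the same URLs.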
 
Old 07-01-2007, 02:49 AM Re: Robots.txt (allow as opposed to exclusion)
King Justice's Avatar
King of Webmaster-Talk

Posts: 847
Name: Justice McCay
Location: New Jersey
Quote:
Originally Posted by chrishirst View Post
Nope

robots.txt is an exclusion protocol only. It is also a voluntary protocol to follow so not all bots honour the exclusions.
Ditto. You can't grant permission in robots.txt; bots treat anything and everything on your server as open unless you exclude it!
__________________
Green talkupation is always appreciated. =]
Free Online Games - Cheap Power Leveling - Pontiac Grand Am