Google forcing default encoding in cache and incorrectly indexing pages
11-13-2005, 03:16 PM
translatum.gr
Novice Talker

Posts: 14
Location: Athens Greece
Many of my pages (forum pages and pages that include forum content via SSI) are not indexed correctly by Google. Although they use Windows-1253, Google forces ISO-8859-1 encoding in both its index and its cache. I have checked other sites that use the same forum software (SMF), and this does not appear to happen to them.

To test this, run the following searches in Google's search box (make sure you include the cache: prefix, as these are not links):

cache:http://www.translatum.gr

and cache:http://www.translatum.gr/forum/index.php

and compare with

cache:http://www.cyprusgreens.org/neoi/index.php

For www.translatum.gr and http://www.translatum.gr/forum/index.php, ISO-8859-1 appears in the Google cache header; this is not the case for cache:http://www.cyprusgreens.org/neoi/index.php. The funniest bit is that for some of my forum posts, Google gets the encoding right! (For example, cache:http://www.translatum.gr/forum/index...ic,1302.0.html)

MSN does not appear to have a problem with these pages. Any ideas?
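
One thing that may be worth ruling out, purely as an assumption that nothing in this thread confirms, is a mismatch between the charset sent in the HTTP Content-Type header and the one declared in the page's meta tag. A minimal Python sketch of that comparison follows; the function name `declared_charsets` and the sample header and markup are hypothetical, not taken from the site:

```python
import re
from email.message import Message

def declared_charsets(content_type_header, html_bytes):
    """Return (header_charset, meta_charset); either may be None."""
    # Let the standard library parse the Content-Type parameters.
    msg = Message()
    msg["Content-Type"] = content_type_header
    header_charset = msg.get_content_charset()  # lowercased, or None

    # Look for charset=... inside a <meta> tag near the top of the page.
    match = re.search(rb"<meta[^>]+charset=[\"']?([\w-]+)", html_bytes[:4096], re.I)
    meta_charset = match.group(1).decode("ascii").lower() if match else None
    return header_charset, meta_charset

# Hypothetical inputs: a header with no charset, and a page declaring Windows-1253.
header = "text/html"
page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=windows-1253"></head></html>')
print(declared_charsets(header, page))
```

If the two values differ, or the header carries no charset at all, a crawler has to choose between them or guess, and different crawlers may choose differently.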
__________________
Spiros Doikas
Greek Translator & Webmaster
 
11-14-2005, 07:31 AM
chrishirst
Super Moderator

Posts: 15,343
Location: Blackpool. UK
Where's the problem?

It's only the cache view that sets the encoding. It will have NO effect on indexing, NO effect on your results and NO effect on users viewing the page.
The number of users looking at the cache view will vary between 0 and 1 (that will be you).
__________________
Chris. ->> Links are advertising NOT optimising!! <<-
Indifference will be the downfall of mankind, but who cares?
 
11-14-2005, 08:05 AM
translatum.gr
Novice Talker

Posts: 14
Location: Athens Greece
The problem is that the Greek words are not indexed properly: they are displayed as extended ASCII characters rather than Greek, and hence searches for these words return no results.

For example, they appear like this: Óõíþíõìá ãéá ôçí êáôóßêá óôç äéÜëåêôï ôçò ÊñÞôçò. To see how it affects search results, try this search:

http://www.google.com/search?num=100...22&btnG=Search

The extended ASCII characters you see are incorrectly indexed Greek characters.
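
The garbling shown above is consistent with Windows-1253 bytes being read as ISO-8859-1. As a minimal sketch (Python; the helper name `repair_mojibake` is hypothetical), reversing that misreading recovers the Greek text:

```python
def repair_mojibake(text, wrong="latin-1", right="cp1253"):
    """Undo a decode that was done with the wrong single-byte codec."""
    # Re-encode with the codec that was wrongly applied to get the original
    # bytes back, then decode those bytes with the intended codec.
    return text.encode(wrong).decode(right)

garbled = "Óõíþíõìá ãéá ôçí êáôóßêá óôç äéÜëåêôï ôçò ÊñÞôçò"
print(repair_mojibake(garbled))
# prints: Συνώνυμα για την κατσίκα στη διάλεκτο της Κρήτης
```

This only works because both codecs are single-byte and every character in the sample maps to a byte that Windows-1253 defines; it is an illustration of the mismatch, not a description of what Google does internally.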

I only mentioned the cache because it gave me a clue as to why the Greek characters were not displayed correctly. I tested this thoroughly before making this post. I apologise if I was not clear enough.

Last edited by translatum.gr : 11-14-2005 at 08:11 AM.
 
11-14-2005, 08:55 AM
chrishirst
Super Moderator

Posts: 15,343
Location: Blackpool. UK
The issue is that the results pages use UTF-8 for character encoding, so the pages are probably indexed correctly but not displayed correctly. I would guess that this is because the search is in English.

What happens when you search using the Greek characters? Does the encoding change to match the language used?
Does Google.gr have an "only in Greek" option?
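
To illustrate why the encoding of the query matters, here is a small sketch (Python, standard library only) showing that the same Greek word becomes different bytes in a search URL depending on whether it is percent-encoded as UTF-8 or as Windows-1253. The sample word (Greek for "goat") and the variable names are only illustrative; how Google interprets either form is not something this thread establishes.

```python
from urllib.parse import quote

word = "κατσίκα"  # illustrative Greek search term

# The same characters become different byte sequences in each encoding,
# so the percent-encoded query strings differ as well.
utf8_query = quote(word, encoding="utf-8")
cp1253_query = quote(word, encoding="cp1253")

print(utf8_query)    # %CE%BA%CE%B1%CF%84%CF%83%CE%AF%CE%BA%CE%B1
print(cp1253_query)  # %EA%E1%F4%F3%DF%EA%E1
```

If the search form and the indexed page disagree about which of these byte sequences represents the word, a search typed in Greek would not match the page text.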