Any lessons from the AOL data?
08-08-2006, 08:28 PM
|
|
Posts: 12
|
Has anyone done any exploring and found anything interesting?
|
|
|
|
08-08-2006, 08:59 PM
|
|
Posts: 240
|
One thing that struck me is the number of people who search for a URL, e.g. they go and type "www.google.com" into the AOL search engine! Maybe that says more about AOL users than anything.
It's quite hard to work with due to the sheer size of the dataset, so I've not done that much so far. What I have done, though, is find out how important the position in the SERP really is for click-through: is being number 1 really that much better than being number 2, and so on?
OK, so here is what I found with the section of the data I am working with (about 3.5 million searches). ItemRank is the result position, and Clicks is the number of times a result in that position was clicked.
Total Searches: 3,558,411
Total click throughs: 1,890,568
ItemRank Clicks
1 802590
2 226728
3 160498
4 113746
5 91763
6 75910
7 63848
8 56487
10 55549
9 52926
Pretty conclusive really. It confirms what everyone has been saying, although anywhere in the top 3 would be acceptable IMO. Interestingly, the 10th result gets marginally more clicks than the 9th! I suspect that is because it is the last real result you see before the pagination bits, so if you can't be in the top results, aim for 10th!
Also fairly interestingly, some people are prepared to go through a LOT of results. The lowest position that I got a click-through for was the 449th result, which was for ... "beastailty". Guess you've got to go a long way for your kicks!
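For anyone who wants to turn the raw counts above into percentages, here is a minimal sketch. The numbers are copied from the table in this post; the names `clicks_by_rank` and `click_share` are just made up for the example.

```python
# Click counts per result position, copied from the table in this post.
clicks_by_rank = {
    1: 802590, 2: 226728, 3: 160498, 4: 113746, 5: 91763,
    6: 75910, 7: 63848, 8: 56487, 9: 52926, 10: 55549,
}
total_clicks = 1890568  # all click-throughs in the sample, at any position


def click_share(clicks, total):
    """Return each position's share of all clicks, as a percentage."""
    return {rank: 100.0 * n / total for rank, n in clicks.items()}


shares = click_share(clicks_by_rank, total_clicks)
for rank in sorted(shares):
    print(f"{rank:>2}  {clicks_by_rank[rank]:>7}  {shares[rank]:5.1f}%")
```

The shares are relative to all click-throughs in the sample, including clicks beyond position 10, so the ten rows will not add up to 100%.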
|
|
|
|
08-08-2006, 11:42 PM
|
|
Posts: 66
|
Is there a mirror for this somewhere?
Or should I not be asking for this here?
|
|
|
|
08-09-2006, 01:00 AM
|
|
Posts: 12
|
Good points mattd,
There are several mirrors out there, but they are all over the BitTorrent networks, so that's probably the easiest place for you to find one.
I got all of the data and am putting it into a MySQL database (thank God for the generous hosting companies out there), but it's around 20 million records, so I'm not done yet, although I started this morning. (You also have to convert from tab-delimited to SQL.)
Would anybody be interested in a web interface to the data?
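The tab-delimited-to-SQL step mentioned above could look roughly like the sketch below. This is not the import actually used here: it swaps MySQL for Python's built-in SQLite so it runs anywhere, and the column names (AnonID, Query, QueryTime, ItemRank, ClickURL), the table name `searches`, and the helper `load_tsv` are all assumptions for illustration. The sample rows are made up.

```python
import csv
import io
import sqlite3

# Assumed column layout of the tab-delimited files (not stated in this thread).
COLUMNS = ("AnonID", "Query", "QueryTime", "ItemRank", "ClickURL")


def load_tsv(conn, lines):
    """Insert tab-delimited rows (header line first) into a 'searches' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS searches ("
        "AnonID INTEGER, Query TEXT, QueryTime TEXT, "
        "ItemRank INTEGER, ClickURL TEXT)"
    )
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    rows = []
    for fields in reader:
        # Rows without a click have no rank/URL; pad them and store NULLs.
        fields = (fields + [""] * len(COLUMNS))[:len(COLUMNS)]
        rows.append([f if f != "" else None for f in fields])
    conn.executemany("INSERT INTO searches VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Tiny made-up sample standing in for one of the real files.
sample = io.StringIO(
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\twww.google.com\t2006-03-01 10:00:00\t1\thttp://www.google.com\n"
    "1\tcheap flights\t2006-03-01 10:05:00\t\t\n"
    "2\tcheap flights\t2006-03-02 09:00:00\t3\thttp://www.southwest.com\n"
)
conn = sqlite3.connect(":memory:")
inserted = load_tsv(conn, sample)
clicks_per_rank = conn.execute(
    "SELECT ItemRank, COUNT(*) FROM searches "
    "WHERE ItemRank IS NOT NULL GROUP BY ItemRank ORDER BY ItemRank"
).fetchall()
print(inserted, clicks_per_rank)
```

For a file with millions of rows you would probably want to insert in batches rather than build one big list in memory, and add an index on ItemRank before running counts.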
|
|
|
|
08-09-2006, 01:38 AM
|
|
Posts: 12
|
Here's some of my info:
Total Searches: 9,038,794
Total Clicks: 4,926,623
Click Rank1: 2,075,765
Click Rank2: 586,100
Click Rank3: 418,643
Click Rank4: 298,532
Click Rank5: 242,169
Click Rank6: 199,541
Click Rank7: 168,080
Click Rank8: 148,489
Click Rank9: 140,356
Click Rank10: 147,551
So this points to the fact that being number 1 really does pay.
More to come.
|
|
|
|
08-09-2006, 03:05 AM
|
|
Posts: 26
|
Be careful what you search for, 'cause Big Brother is always watching!
lol,
x
|
|
|
|
08-09-2006, 03:23 AM
|
|
Posts: 328
|
What surprised me a bit is how many searches don't result in a click, and how many times people search for the exact same phrase.
Thanks for the numbers, august and mattd. I will do some analysis myself on a sample of the first 4 data files (I can't seem to import more into Access, lol).
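Both observations are easy to measure once the rows are loaded. mattd's sample (1,890,568 click-throughs out of 3,558,411 searches) already suggests roughly 47% of logged searches have no click. Below is a rough sketch of both counts on made-up rows; the tuple layout (user id, query, clicked rank or None) and the function names are assumptions, and how the real log records a search with several clicks is not checked here.

```python
from collections import Counter

# Made-up rows: (user id, query, clicked rank or None when nothing was clicked).
rows = [
    (1, "weather", None),
    (1, "weather", 1),
    (2, "weather", None),
    (2, "mapquest", 1),
    (3, "cheap flights", None),
]


def no_click_rate(rows):
    """Fraction of logged searches that did not lead to a click."""
    return sum(1 for _, _, rank in rows if rank is None) / len(rows)


def repeated_queries(rows):
    """Queries that the same user issued more than once, with their counts."""
    per_user = Counter((user, query) for user, query, _ in rows)
    return {key: n for key, n in per_user.items() if n > 1}


rate = no_click_rate(rows)
repeats = repeated_queries(rows)
print(rate)
print(repeats)
```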
|
|
|
|
08-09-2006, 04:10 AM
|
|
Posts: 226
|
From August:
Quote:
Total Searches: 9,038,794
Total Clicks: 4,926,623
Click Rank1: 2,075,765
Click Rank2: 586,100
Click Rank3: 418,643
Click Rank4: 298,532
Click Rank5: 242,169
Click Rank6: 199,541
Click Rank7: 168,080
Click Rank8: 148,489
Click Rank9: 140,356
Click Rank10: 147,551
|
Results in:
Total Searches: 9,038,794
Total Clicks: 4,926,623
Each rank compared with rank 1:
Click Rank1: 2,075,765
Click Rank2: 586,100 = 3.5x less
Click Rank3: 418,643 = 4.9x less
Click Rank4: 298,532 = 6.9x less
Click Rank5: 242,169 = 8.5x less
Click Rank6: 199,541 = 10.4x less
Click Rank7: 168,080 = 12.3x less
Click Rank8: 148,489 = 14.0x less
Click Rank9: 140,356 = 14.8x less
Click Rank10: 147,551 = 14.1x less
Each rank compared with the rank directly above it (^):
Click Rank1: 2,075,765
Click Rank2: 586,100 = 3.5x less than ^
Click Rank3: 418,643 = 1.4x less than ^
Click Rank4: 298,532 = 1.4x less than ^
Click Rank5: 242,169 = 1.2x less than ^
Click Rank6: 199,541 = 1.2x less than ^
Click Rank7: 168,080 = 1.2x less than ^
Click Rank8: 148,489 = 1.1x less than ^
Click Rank9: 140,356 = 1.05x less than ^
Click Rank10: 147,551 = 1.05x more than ^
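If anyone wants to redo these ratios on their own sample, a small sketch follows. The counts are august's figures quoted above; the function names are made up for the example. Values are rounded when printed, so the last digit can differ slightly from hand-calculated figures.

```python
# Clicks for ranks 1..10, taken from the quoted figures.
clicks = [2075765, 586100, 418643, 298532, 242169,
          199541, 168080, 148489, 140356, 147551]


def ratio_to_first(counts):
    """How many times more clicks rank 1 gets than each rank."""
    return [counts[0] / c for c in counts]


def ratio_to_previous(counts):
    """How many times more clicks each rank's predecessor gets than it does."""
    return [prev / cur for prev, cur in zip(counts, counts[1:])]


to_first = ratio_to_first(clicks)
to_prev = ratio_to_previous(clicks)
for rank, r in enumerate(to_first, start=1):
    print(f"Rank {rank}: rank 1 gets {r:.1f}x the clicks")
for rank, r in enumerate(to_prev, start=2):
    print(f"Rank {rank}: rank {rank - 1} gets {r:.2f}x the clicks")
```

A ratio below 1 in the second list means that rank got more clicks than the one above it, which is what happens at rank 10.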
|
|
|
|
08-09-2006, 04:36 AM
|
|
Posts: 12
|
Breakpoint, nice calculations. One idea this disproves is that the last result, position 10, gets a lot of hits. I thought it would be on the level of position 2 or position 3, but there's only a slight change from position 9 to position 10.
If somebody thinks of other queries we should run against the data, or other things that might be interesting to find out, let me know.
I'm up to 11 million rows inserted into the MySQL database, and I'm surprised at how well the DB is handling it too. Maybe the hosting company just has some really big and fast computers.
|
|
|
|
08-09-2006, 06:29 AM
|
|
Posts: 336
|
Nice results. It would be good to run queries.
How many searches were X-rated compared to safe search terms?
I'm thinking of releasing an adult site and would love feedback on search terms.
|
|
|
|
08-09-2006, 06:34 AM
|
|
Posts: 12
|
PM me a list of keywords to search against.
|
|
|
|
08-09-2006, 07:59 AM
|
|
Posts: 328
|
Here is some of my data:
Total clicks: 7.752.953
Rank Hits Percentage of total
1 3275637 42,25%
2 925507 11,94%
3 656290 8,47%
4 468703 6,05%
5 377877 4,87%
6 309669 3,99%
7 262140 3,38%
8 230981 2,98%
9 218748 2,82%
10 229909 2,97%
Rank Hits Percentage of total
1-10 6955461 89,71%
11-20 338558 4,37%
21-30 187744 2,42%
31-40 82751 1,07%
41-50 44500 0,57%
51-60 33590 0,43%
61-70 23354 0,30%
71-80 15960 0,21%
81-90 13430 0,17%
91-100 11178 0,14%
100 + 67383 0,87%
AOL Data Analysis - I. Clicks on Search Engine Results
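The second table groups ranks into blocks of ten. A minimal sketch of that bucketing, assuming you already have a list of clicked ranks; the sample ranks are made up, and `bucket_label` and `clicks_by_bucket` are hypothetical names.

```python
from collections import Counter


def bucket_label(rank, width=10, cap=100):
    """Map a result position to a label like '1-10', '11-20', ... or '100+'."""
    if rank > cap:
        return f"{cap}+"
    start = ((rank - 1) // width) * width + 1
    return f"{start}-{start + width - 1}"


def clicks_by_bucket(ranks):
    """Count clicks per block of result positions."""
    return Counter(bucket_label(r) for r in ranks)


# Made-up clicked positions standing in for the real ItemRank column.
sample_ranks = [1, 1, 2, 10, 11, 20, 21, 100, 101, 449]
buckets = clicks_by_bucket(sample_ranks)
total = sum(buckets.values())
for label, n in buckets.items():
    print(f"{label:>7}  {n:>3}  {100.0 * n / total:5.2f}%")
```

In this sketch rank 100 is counted in the 91-100 bucket only, and everything above it goes into 100+.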
|
|
|
|
08-09-2006, 09:28 AM
|
|
Posts: 328
|
Here is a list of the top URLs people clicked through to, from the same sample as above (7.752.953 clicks):
Code:
http://www.google.com 143666 1,85%
http://www.myspace.com 65014 0,84%
http://www.yahoo.com 60996 0,79%
http://en.wikipedia.org 49940 0,64%
http://www.amazon.com 42755 0,55%
http://www.imdb.com 40220 0,52%
http://www.mapquest.com 37885 0,49%
http://www.ebay.com 31348 0,40%
http://mail.yahoo.com 21675 0,28%
http://www.bankofamerica.com 19378 0,25%
http://www.geocities.com 16121 0,21%
http://www.ask.com 15738 0,20%
http://www.hotmail.com 13959 0,18%
http://www.bizrate.com 13091 0,17%
http://www.tripadvisor.com 12613 0,16%
http://profile.myspace.com 12589 0,16%
http://www.msn.com 11994 0,15%
http://www.nextag.com 11066 0,14%
http://cgi.ebay.com 11059 0,14%
http://www.answers.com 10886 0,14%
http://disney.go.com 10730 0,14%
http://www.craigslist.org 10572 0,14%
http://www.southwest.com 10452 0,13%
http://www.superpages.com 10245 0,13%
http://www.azlyrics.com 9896 0,13%
http://www.tv.com 9584 0,12%
http://shopping.msn.com 9472 0,12%
http://www.irs.gov 9157 0,12%
http://music.myspace.com 8933 0,12%
http://www.angelfire.com 8530 0,11%
http://www.walmart.com 8511 0,11%
http://dir.yahoo.com 8218 0,11%
http://travel.yahoo.com 7890 0,10%
http://www.nlm.nih.gov 7296 0,09%
http://www.city-data.com 7057 0,09%
http://www.sing365.com 6894 0,09%
http://www.epinions.com 6864 0,09%
http://www.pogo.com 6783 0,09%
http://www.findarticles.com 6682 0,09%
http://www.ncbi.nlm.nih.gov 6468 0,08%
http://www.cooks.com 6327 0,08%
http://www.target.com 6295 0,08%
http://www.cnn.com 6245 0,08%
http://www.switchboard.com 6225 0,08%
http://finance.yahoo.com 6165 0,08%
http://shopping.yahoo.com 5857 0,08%
http://www.msnbc.msn.com 5813 0,07%
http://www.gamespot.com 5673 0,07%
http://www.microsoft.com 5623 0,07%
http://www.usps.com 5496 0,07%
http://www.weather.com 5492 0,07%
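A list like this can be built by counting the host part of each clicked URL. The sketch below is one way to do it, not necessarily how the list above was produced; the URLs are made up and `top_hosts` is a hypothetical helper.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_hosts(click_urls, n=10):
    """Return (host, clicks, percent of all clicks) for the n most-clicked hosts."""
    hosts = Counter()
    for url in click_urls:
        parts = urlsplit(url)
        if parts.netloc:  # skip empty or malformed entries
            hosts[f"{parts.scheme}://{parts.netloc}"] += 1
    total = sum(hosts.values())
    return [(host, count, 100.0 * count / total)
            for host, count in hosts.most_common(n)]


# Made-up clicked URLs standing in for the real click column.
sample_urls = [
    "http://www.google.com",
    "http://www.google.com/search",
    "http://www.myspace.com",
    "http://en.wikipedia.org/wiki/AOL",
    "",
]
ranking = top_hosts(sample_urls)
for host, count, pct in ranking:
    print(f"{host} {count} {pct:.2f}%")
```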
|
|
|
|
08-09-2006, 09:36 AM
|
|
Posts: 155
|
So the tail lives, despite the comments I have recently read in the blogosphere.
|
|
|
|
08-10-2006, 10:35 AM
|
|
Posts: 328
|
Is nobody else working on this data?
|
|
|
|
08-10-2006, 01:54 PM
|
|
Posts: 12
|
I'm still putting it up on my webhost. Once all the files are there, I'm going to do further testing for my own purposes, and if I come up with anything interesting I'll post it here.
|
|
|
|
08-11-2006, 02:53 AM
|
|
Posts: 328
|
Quote:
Originally Posted by august
I'm still putting it up on my webhost. Once all the files are there, I'm going to do further testing for my own purposes, and if I come up with anything interesting I'll post it here.
|
I'm trying to figure out if I can query a big enough sample fast enough to see how many words people use in their searches.
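Counting words per query does not need the database at all if you can stream the query column. A rough sketch, assuming words are simply separated by whitespace; the queries are made up and `words_per_query` is a hypothetical name.

```python
from collections import Counter


def words_per_query(queries):
    """Count how many queries have 1 word, 2 words, and so on."""
    return Counter(len(q.split()) for q in queries if q.strip())


# Made-up queries standing in for the real query column.
sample_queries = [
    "www.google.com",
    "cheap flights",
    "cheap flights to boston",
    "  ",
    "weather",
]
dist = words_per_query(sample_queries)
for n_words in sorted(dist):
    print(f"{n_words} word(s): {dist[n_words]}")
```

Note that a typed URL such as "www.google.com" counts as one word here, which may or may not be what you want.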
|
|
|
|
08-11-2006, 03:45 AM
|
|
Posts: 12
|
My queries seem to take 1 to 3 seconds max, on over 15 million entries.
|
|
|
|
08-11-2006, 03:49 AM
|
|
Posts: 328
|
Quote:
Originally Posted by august
My queries seem to take 1 to 3 seconds max, on over 15 million entries.
|
What kind of queries? Just simple SELECTs where you search for keywords, or also COUNTs, GROUP BYs, etc. combined?
|
|
|
|