As some of you may remember, AOL accidentally let a load of search data loose onto the internet ( http://news.bbc.co.uk/1/hi/technology/5255732.stm ) and some SEOers have made good use of it – including me!
I would love to say that I crunched the original data myself, but that credit goes to http://www.jimboykin.com/click-rate-for-top-10-search-results/
However, I have done a little data manipulation of my own. The reason is that I use a programme called Web CEO, which has a 'ranking score' formula that gives me a clean metric for how one website is doing against another for a standard set of keywords. In many cases this isn't important, but when you are putting business cases together, or trying to make things simple for decision makers, having a single number makes things far easier.
The other point about this data is that it's 'winner takes all' – as you can see from the pie chart below!
Formula adjustment for the top 20 rankings: each position's importance is based on its share of click-throughs relative to position 1 – so if position 1 = 100%, then position 2 = 28%, because it gets just over a quarter of position 1's traffic.
Where I find this data extremely useful is when I'm building a business case: I use Overture to pick my keywords, multiply the volumes by 3.5 to estimate the complete UK search volume per month, and then run those numbers through my percentages to work out how much traffic we can expect at each ranking position.
Total searches: 9,038,794
Total clicks: 4,926,623

Position 1: 2,075,765 clicks (42.13%)
Position 2: 586,100 clicks (11.90%)
Position 3: 418,643 clicks (8.50%)
Position 4: 298,532 clicks (6.06%)
Position 5: 242,169 clicks (4.92%)
Position 6: 199,541 clicks (4.05%)
Position 7: 168,080 clicks (3.41%)
Position 8: 148,489 clicks (3.01%)
Position 9: 140,356 clicks (2.85%)
Position 10: 147,551 clicks (2.99%)

1st page total: 4,425,226 (89.82%)
2nd page total: 501,397 (10.18%)
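To make the business-case arithmetic concrete, here is a minimal sketch in Python. The CTR percentages come from the table above; the Overture volume and the 3.5 UK multiplier are illustrative assumptions, and the function name is my own:

```python
# Estimate monthly traffic per ranking position from the AOL click-through data.
# CTR percentages are taken from the table above.

CTR_BY_POSITION = {
    1: 0.4213, 2: 0.1190, 3: 0.0850, 4: 0.0606, 5: 0.0492,
    6: 0.0405, 7: 0.0341, 8: 0.0301, 9: 0.0285, 10: 0.0299,
}

def estimated_traffic(overture_volume, position, uk_multiplier=3.5):
    """Expected monthly clicks for a keyword ranked at a given position."""
    total_searches = overture_volume * uk_multiplier
    return round(total_searches * CTR_BY_POSITION[position])

# e.g. a keyword showing 10,000 searches/month on Overture, ranked 3rd:
print(estimated_traffic(10_000, 3))  # 10,000 * 3.5 * 8.5% = 2975
```

The same table also shows why climbing from position 3 to position 1 is worth roughly five times the traffic of climbing from 10 to 5.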
Below is a pie chart showing the relative click-through rate per position on a typical search engine results page.
I have long advocated the idea of keyword clustering, simply because it's a clever way for search engines to understand 'context'. On a basic level this means building a broad theme across a site and then building more focused themes on specific pages.
These themes originate from core keywords or keyword phrases which are prominent on the page or site (this means a density of about 3% of the total word volume). The actual density isn't that important – the main thing is that these words are the primary words on the page. Their importance is highlighted by appearing in alt tags, title tags and H1 tags, and of course in the body content.
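To show what that density figure actually measures, here is a bare-bones sketch (the function name and sample text are mine, and real tools also weight title, alt and H1 text, which this ignores):

```python
import re

def keyword_density(text, phrase):
    """Occurrences of a keyword phrase as a percentage of total word count."""
    words = re.findall(r"[a-z']+", text.lower())
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    hits = sum(
        1 for i in range(len(words) - n + 1)
        if words[i:i + n] == phrase_words
    )
    return 100.0 * hits * n / len(words) if words else 0.0

page = "Fresh pears delivered daily. Our pears are hand picked. Buy pears online."
print(f"{keyword_density(page, 'pears'):.1f}%")  # 3 of 12 words = 25.0%
```

A real page would sit nearer the 3% mark; the tiny sample here is only to show the calculation.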
So when I was researching the 'mechanics' of search engines (PageRank), I took some time out to watch a very interesting lecture from Jeff Dean, a Google Fellow (a worthy and important person at Google).
Apart from material about their information architecture, he gave a demonstration of the back-end user interface of one of the modules they use in their ranking. You can clearly see that they have used 'intelligent' analysis to work out which words cluster with which, so they can get a good idea of the context of words and keyphrases. The better the contextual matching, generally the more relevant the search result.
You can see the movie HERE (images on the right are screengrabs from the movie)
So assuming you have got the idea, the next question is how to work out what those clusters might be.
Well, we are in luck, because there is a small search engine called Clusty and it appears to do the work for you!
Try it yourself: http://cloud.clusty.com/
As you can see it's interesting – but is it any good?
Well, I have done a few small tests. My main methodology is to pick a search term, e.g. 'SEO', go into the top six sites in Google and, using a tool called a keyword density cloud (see HERE – it shows the more important keywords in a larger font), do a basic review of which keywords are prominent. This way you get a pretty good idea of which words associate with which.
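The core of what a keyword density cloud does can be sketched in a few lines of Python (the stop-word list is abbreviated and mine, and the sample text is made up):

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real tools use much longer ones.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
              "for", "on", "that", "with", "good"}

def top_keywords(text, n=5):
    """Rank the most prominent words on a page, ignoring common stop words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS]
    return Counter(words).most_common(n)

page = ("seo tips for search engine optimisation: good seo starts with "
        "keywords, and search rankings reward relevant keywords")
print(top_keywords(page, 3))
```

A cloud tool simply renders this ranking visually, sizing each word's font by its count.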
So after doing this random set of tests, it seems there is a high degree of correlation between Clusty and my method.
So in summary, here is a tool (Clusty) which helps you build web pages around certain keyword clusters, taking search engine optimisation to another level.
As a final pointer: when you have published a page, run it through the Google AdSense tester HERE – if the adverts follow the theme of the page, then you know how Google sees the theme of your page.
I have found some really interesting material which augments my piece on keyword clustering. In essence, Google uses a technology to work out the relative importance of a keyword based on its proximity to other words. I would love to say I understand this properly – because I don't. But I'm learning.
Where do I go from here? Well, I need to understand the importance of NOT keyword stuffing, and instead work out the best balance between traditional content hierarchy – meta tagging, H1 tagging, link structure and so on – and LSI, where tricks like keyword stuffing don't work.
I have seen people on forums moan about sites coming top of the rankings even though they hardly use keyword 'X' – this is LSI in action... the thing is to understand its behaviour and build accordingly.
The good news is that this pattern is simply a reasonably close match to natural writing. But as an SEO guy, I need to understand this comprehensively so I can carefully organise a site properly....
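Real LSI rests on singular value decomposition, which is well beyond a blog snippet, but the co-occurrence intuition behind it – pages sharing related vocabulary rank as similar even when the exact keyword differs – can be sketched with plain term vectors and cosine similarity (the documents here are made up for illustration):

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

docs = {
    "pears":  "pears are a sweet fruit and pears grow on trees",
    "apples": "apples are a crisp fruit that grow on trees",
    "cars":   "cars need fuel and cars drive on roads",
}
vectors = {name: Counter(text.split()) for name, text in docs.items()}

# The pages sharing vocabulary ("fruit", "grow", "trees") score far higher
# than the unrelated page, even though neither mentions the other's keyword.
print(cosine_similarity(vectors["pears"], vectors["apples"]))
print(cosine_similarity(vectors["pears"], vectors["cars"]))
```

LSI goes a step further by compressing these vectors so that 'pears' and 'apples' end up near each other even with no shared words at all – but the ranking behaviour people notice on forums follows the same principle.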
Further reading on the subject:
Writing for Google http://www.seobook.com/archives/001668.shtml
Brief guide on how LSI works: http://www.seobook.com/archives/000657.shtml
Full non mathematical explanation: http://www.seobook.com/lsi/cover_page.htm
A step-by-step tutorial on deconstructing a piece of text to a search-engine-friendly level: http://www.seobook.com/lsi/tutorial.htm
Hi, I'm adding this link because it's a superb article on LSI (latent semantic indexing). It turns a pretty inaccessible subject into something most of us can probably grasp....
My next push is to find a proper LSI tool which allows me to get more 'predictive' with this. In other words, I write, and from that I get the most important keywords not only from a keyword density perspective, but also from an LSI perspective.
So if I write an article about pears, and the big associative word is 'fruit', at what point does 'fruit' become the 'seen' keyword on Google?!
I will let you know when I find it!
I've been following the action at PubCon – it's the big momma of SEO/SEM internet conferences – and I came across this video (see below) with Rand Fishkin, Lee Odden and Todd Malicoat, three really on-the-money SEO/SEMers.
Watch the video below or click on the picture.
November 30th, 2006
Occasionally I come across stuff which I don't really understand, but that I know is important... this website is one example.
They are academics who specialise in “Information Retrieval – Artificial Intelligence”. I have read a few of their papers and it has given me a more intuitive sense of how search engines function.
As you'd expect, we humans tend to anthropomorphize (we like to think machines have feelings), and with Google there is a tendency to think it's got brains, when in fact it's just a bunch of calculations giving the appearance of intelligence. This is why spammers exist and do very well.
Anyway, to conclude, this site helps you get a better handle on the workings of Google. A great supplement to this is this video presentation from Google on semantic indexing: