The effect of inbound links
It’s common knowledge that Google evaluates many factors when working out which page to rank where in response to a search query. They claim to incorporate around 100 different ranking factors. It’s also widely accepted that the most powerful of these ranking factors is link text. Link text, also known as anchor text, is the text that you click on when clicking a link. Here’s an example of link text: miserable failure. The words “miserable failure” are the link text.
I used that particular example because it shows the power of link text – the link text effect. If you click on it, it searches Google for “miserable failure”, and you may be surprised to see which page is ranked at #1. If you click on the “Cached” link for that #1 ranked listing, you will see Google’s cache for the page, and you will see each word of the phrase “miserable failure” highlighted in yellow in the page – or that’s what you would see if the page actually contained either of those words, but it doesn’t.
So how come the George Bush page is ranked at #1 for a phrase that isn’t anywhere to be found in the page? The cache page itself tells us. In the header that Google adds to the cached page are the words, “These terms only appear in links pointing to this page: miserable failure”. The link texts of links that point to that page contain the words “miserable failure”, and it’s the power of those link texts that got the page to #1.
That demonstrates the power of link text in Google. Some people decided to get the George Bush page ranked #1 for “miserable failure”, and they did it by linking to the page using the link text “miserable failure”. It’s known as “Googlebombing”.
Why are inbound links so powerful?
It’s because of the way that Google stores a page’s data, and the way that they process a search query.
Google’s Regular index consists of two indexes – the short index and the long index. They are also known as the short barrels and the long barrels. The short index is also known as the “fancy hits” index. Google also has a Supplemental index, but that’s not part of the Regular index, and it’s not relevant to this topic.
The short index is used to store the words in link texts that point to a page, the words in a page’s title, and one or two other special things. But when they store the link text words in the short index, they are attributed to the target page, and not to the page that the link is on. In other words, if my page links to your page, using the link text “Miami hotels”, then the words “Miami” and “hotels” are stored in the short index as though they appeared in your page – they belong to your page. If 100 pages link to your page, using those same words as link text, then your page will have a lot of entries in the short index for those particular words.
The long index is used to store all the other words on a page – its actual content.
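The split between the two indexes, as described above, can be sketched in a few lines of Python. This is only a toy model of the description in this article, not Google's actual data structures: the function name build_indexes, the page fields (title, body, links), the example URLs, and the idea of counting hits per word are all assumptions made for illustration.

```python
from collections import defaultdict

def build_indexes(pages):
    """Build a toy short index and long index.

    pages maps url -> {"title": str, "body": str,
                       "links": [(target_url, link_text), ...]}
    Each index maps word -> {url: number of hits} (hit counting is an assumption).
    """
    short_index = defaultdict(lambda: defaultdict(int))
    long_index = defaultdict(lambda: defaultdict(int))
    for url, page in pages.items():
        # Title words go in the short index, credited to the page itself.
        for word in page["title"].lower().split():
            short_index[word][url] += 1
        # Ordinary content goes in the long index.
        for word in page["body"].lower().split():
            long_index[word][url] += 1
        # Link text words go in the short index, credited to the TARGET page,
        # not to the page that the link is on.
        for target, text in page["links"]:
            for word in text.lower().split():
                short_index[word][target] += 1
    return short_index, long_index

# Hypothetical pages: two of them link to a third using "Miami hotels".
pages = {
    "a.example/guide": {
        "title": "Florida travel guide",
        "body": "where to stay in florida",
        "links": [("yours.example/hotel", "Miami hotels")],
    },
    "b.example/blog": {
        "title": "Trip notes",
        "body": "we flew south for a week",
        "links": [("yours.example/hotel", "Miami hotels")],
    },
    "yours.example/hotel": {
        "title": "Sunrise Inn",
        "body": "rooms near the beach",
        "links": [],
    },
}

short_index, long_index = build_indexes(pages)
print(dict(short_index["miami"]))  # {'yours.example/hotel': 2}
```

The target page never uses the word “Miami” itself, yet it ends up with two short index entries for it, one per inbound link. Its title can only ever contribute one set of words.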
And here’s the point…
When Google processes a search query, they first try to get enough results from the short index. If they can’t get enough results from there, they use the long index to add to what they have. It means that, if they can get enough results from the short index – that’s the index that contains words in link texts and page titles – then they don’t even look in the long index where the actual contents of pages are stored. Page content isn’t even considered if they can get enough results from the link texts and titles index – the short index.
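Here is a rough sketch of that lookup order in Python. It assumes things the description above doesn't specify: that “enough results” is a simple count (the hypothetical parameter wanted), that a page must match every query word, and that results are ordered by total hit count. The function name search, the index layout, and the example URLs are likewise invented for illustration.

```python
def search(query, short_index, long_index, wanted=10):
    """Toy query processing: short index first, long index only if needed.

    Each index maps word -> {url: number of hits}.
    """
    words = query.lower().split()

    def matches(index):
        # Pages that have every query word in this index,
        # ordered by total hits (the scoring is an assumption).
        sets = [set(index.get(w, {})) for w in words]
        urls = set.intersection(*sets) if sets else set()
        scored = {u: sum(index[w][u] for w in words) for u in urls}
        return sorted(scored, key=lambda u: (-scored[u], u))

    results = matches(short_index)
    # The long index is only consulted when the short index falls short.
    if len(results) < wanted:
        for url in matches(long_index):
            if url not in results:
                results.append(url)
    return results[:wanted]

# Hypothetical data: one page has the phrase only in inbound link text,
# another has it only in its own content.
short_index = {
    "miserable": {"bio.example/president": 3},
    "failure": {"bio.example/president": 3},
}
long_index = {
    "miserable": {"blog.example/post": 1},
    "failure": {"blog.example/post": 1, "other.example/page": 2},
}

print(search("miserable failure", short_index, long_index, wanted=1))
# ['bio.example/president']
print(search("miserable failure", short_index, long_index, wanted=2))
# ['bio.example/president', 'blog.example/post']
```

With wanted=1 the short index supplies enough results on its own, so the page that actually contains the words is never looked at.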
That is the reason why link texts are so powerful for Google rankings. They are much more powerful than page titles, because a page can have the words from only one title in the short index, but it can have the words from a great many link texts in there. That is the reason why the George Bush page ranks #1 for “miserable failure”. All the link texts from all the pages that link to the George Bush page using the “miserable failure” link text are in the short index, and they are all attributed to the George Bush page.
Page titles are the second most powerful ranking factor, because they are stored in the short index.
We sometimes see a page listed in the rankings, but its URL is shown and linked instead of its title, and there is no description snippet for it. These are known as URL-only listings. Google says that they are “partially indexed pages”. I’ll explain what that means, since it’s relevant to this topic.
When Google spiders a page and finds a link to another page on it, but they don’t yet have the other page in the index, they find themselves with some link text that they want to attribute to the other page, so that it can be used in the normal search query processing. They treat it as normal, and place it in the short index, attributing it to the other page which they haven’t got. Sometimes they will store the words from more than one link to the other page before they have spidered and indexed the page itself.
Sometimes that link text data in the short index will cause the other page to be ranked for a search query before the page has been spidered and indexed. But they don’t have the page itself, so they don’t have its title, or anything from the page that can be used for the description snippet. So they simply display and link its URL.
That’s what is meant by “partially indexed”, and it’s why we sometimes see those URL-only listings. Google will later spider the other page, its data will be stored as normal, and its listings in the search results will be displayed normally.
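The display side of this can be illustrated with a small Python sketch. It is only a toy model of the behaviour described here: the function name render_listing, the crawled_pages dictionary, the 160-character snippet length, and the example URLs are all assumptions.

```python
def render_listing(url, crawled_pages):
    """Return what a toy results page would show for a ranked URL.

    crawled_pages maps url -> {"title": str, "body": str} and only
    contains pages that have actually been spidered.
    """
    page = crawled_pages.get(url)
    if page is None:
        # Ranked purely on link text; no title or content is available,
        # so only the URL can be shown (a URL-only listing).
        return {"headline": url, "snippet": None}
    # Fully indexed page: show its title and a snippet of its content.
    return {"headline": page["title"], "snippet": page["body"][:160]}

# Hypothetical data: one spidered page, one known only through inbound links.
crawled_pages = {
    "old.example/page": {"title": "Miami hotels", "body": "Rooms near the beach."},
}

print(render_listing("old.example/page", crawled_pages))
# {'headline': 'Miami hotels', 'snippet': 'Rooms near the beach.'}
print(render_listing("new.example/page", crawled_pages))
# {'headline': 'new.example/page', 'snippet': None}
```

Once the second page has been spidered and added to crawled_pages, the same call returns its title and a snippet, which matches the way these listings later turn into normal ones.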
When a page is indexed, not only is its content indexed, but also link texts that point to it are indexed as part of the page itself. So when links that point to a page are indexed, the page itself is partially indexed, even though it hasn’t yet been spidered.