In December 2005, Google began to roll out what they called the “Big Daddy” update, and by the end of March 2006 it had been fully deployed in all of their datacenters. It wasn’t a normal update; normal updates are usually algorithm changes. Big Daddy was a software/infrastructure change, largely to the way that they crawl and index websites.
As the update spread across the datacenters, people started to notice that many pages from their sites had disappeared from the regular index. Matt Cutts, a senior software engineer at Google, put it down to
“sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling.”
That statement pretty much sums up the way that the Big Daddy update affects websites. Links into and out of a site are being used to determine how many of the site’s pages to have in the index. Matt then went on to give a few examples of sites that had been hit, and what he thought might be their problems…
About a real estate site, he said,
“Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled. As these indexing changes have rolled out, we’ve been improving how we handle reciprocal link exchanges and link buying/selling.”
About another real estate site, he said,
“This time, I’m seeing links to mortgages sites, credit card sites, and exercise equipment. I think this is covered by the same guidance as above; if you were getting crawled more before and you’re trading a bunch of reciprocal links, don’t be surprised if the new crawler has different crawl priorities and doesn’t crawl as much.”
And about a health care directory site, he said,
“your site also has very few links pointing to you. A few more relevant links would help us know to crawl more pages from your site.”
The Big Daddy update is mainly a new crawl/index function that evaluates the trustability of links into and out of a site, to determine how many of the site’s pages to have in the index. Not only does it evaluate the trustability of the links, but it takes account of the quantity of trustable links. As the health care site shows, if a site doesn’t score well enough, it doesn’t get all of its pages indexed, and if the site already had all of its pages indexed, many or most of them are removed.
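The mechanism described above can be sketched as a toy model. To be clear, everything here is invented for illustration: Google has never published how Big Daddy scores links, and the topics, weights, and thresholds below are all assumptions. The sketch only captures the two ideas in the paragraph: link quality gates indexing, and the quantity of trusted inbound links scales it.

```python
# Toy model of the indexing behaviour described in the article.
# All names, weights, and thresholds are hypothetical; this is not
# Google's actual algorithm.

SPAM_TOPICS = {"ringtones", "mortgages", "credit cards"}

def link_trust(links):
    """Average trust of a list of links: reciprocal or off-topic links
    count for nothing; ordinary links count fully. No links is neutral."""
    if not links:
        return 1.0
    scores = [
        0.0 if link["reciprocal"] or link["topic"] in SPAM_TOPICS else 1.0
        for link in links
    ]
    return sum(scores) / len(scores)

def pages_to_index(site):
    """Cap the number of indexed pages using both the quality of
    inbound and outbound links and the quantity of trusted inbound links."""
    inbound_trust = link_trust(site["inbound"])
    outbound_trust = link_trust(site["outbound"])
    trusted_inbound = sum(
        1 for l in site["inbound"]
        if not l["reciprocal"] and l["topic"] not in SPAM_TOPICS
    )
    # Quality gates the crawl; quantity of trusted inbound links scales it,
    # saturating at an arbitrary 10 links for this illustration.
    score = inbound_trust * outbound_trust * min(trusted_inbound, 10) / 10
    return int(site["pages"] * score)
```

In this model a site with plenty of trusted inbound links stays fully indexed, a site linking out to spammy topics is cut to nothing, and a clean site with only a couple of inbound links (like the health care directory above) keeps only a fraction of its pages in the index.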
I’ve written about the gross unfairness of evaluating links for that purpose here, so I won’t go over it again, but I want to suggest a reason why Google has done it.
Since Google came on the scene with their links-based rankings system, people have increasingly arranged links solely for ranking purposes. For instance, link exchange schemes are all over the Web, and link exchange requests plague our email inboxes. Over the years, such links have increased and, because of them, the quality of Google’s index/rankings has deteriorated. Google’s system relies on the natural linking of the Web, but in implementing the system, they ruined natural linking, which in turn has eroded the quality of Google’s index and rankings. It’s my belief that Big Daddy is Google’s way of addressing the problem. They are evaluating the trustability of both inbound and outbound links to try and prevent unnatural links from benefiting websites.
They had to address the problem but, in my opinion, they’ve done it in the wrong way. Nevertheless, it’s done, and we have to live with it. We still have a lot to learn about the Big Daddy update, but the way I see it is that reciprocal and off-topic links are not dead, but they won’t help a site as they did before. Perhaps those links won’t count against a site, but they won’t count for it, and links are now needed that count for a site, if it is to be fully indexed.
The best links to have are one-way on-topic links into the site, but because of what Google did to natural linking, they aren’t easily found. Google caused people to not link naturally, and most sites don’t naturally attract links, but the links must be found. The most obvious places to get them are directories. DMOZ can take a very long time to review a site, and even then the site may not be included, but it’s a very good directory to be listed in, so it’s always worth submitting to it (read this before submitting to DMOZ).
Other directories are well worth submitting to, and a good sized list of decent ones can be found at VileSilencer. Google may not credit all the links from all of them, but that doesn’t matter as long as some of them are credited – and all of them may send some traffic.
Google isn’t against link-building, and their own people suggest doing it. But it is ludicrous that we now have a situation where Google first destroyed the natural linking of the Web, and then turned around to suggest ways of unnaturally getting natural links, just so that a website can be treated fairly by them. It’s a ludicrous situation, but that’s the way it is. Some of the unnatural ways that Google suggests are writing articles that people will link to, writing a blog that people will link to, and creating a buzz. But most people don’t want to write articles and blogs, and would have nothing to write or blog about, and very few sites can create a buzz, so for most people, a buzz is a complete non-starter.
Since writing this article, it occurred to me that I may have jumped to the wrong conclusion as to what Google is actually doing with the Big Daddy update. What I haven’t been able to understand is the reason for attacking certain types of links at the point of indexing pages, instead of attacking them in the index itself, where they boost rankings. But attacking certain types of links may not be Big Daddy’s primary purpose.
The growth of the Web continues at a great pace, and no search engine can possibly keep up with it. Index space has to be an issue for them sooner or later, and it may be that Big Daddy is Google’s way of addressing the issue now. Search engines have normally tried to index as much of the Web as possible, but, since they can’t keep pace with it, it may be that Google has made a fundamental change to the way they intend to index the Web. Instead of trying to index all pages from as many websites as possible, they may have decided to allow all sites to be represented in the index, but not necessarily to be fully indexed. In that way, they can index pages from more sites, and their index could be said to be more comprehensive.
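The trade-off described above can also be illustrated with a small sketch. This is purely a hypothetical model of the idea, not anything Google has described: given a fixed index budget, guarantee every site a small slice so all sites are represented, then spread whatever remains in proportion to site size.

```python
# Hypothetical illustration of the trade-off described in the article:
# with a fixed index budget, guaranteeing every site partial representation
# covers more sites than fully indexing sites until the budget runs out.
# The floor value and allocation rule are invented for illustration.

def allocate_index(site_sizes, budget, floor=5):
    """Give every site a guaranteed slice (up to `floor` pages), then
    spread the remaining budget proportionally to each site's remaining
    pages. Returns the number of pages indexed per site."""
    guaranteed = [min(size, floor) for size in site_sizes]
    remaining = budget - sum(guaranteed)
    if remaining <= 0:
        return guaranteed  # budget exhausted by the guaranteed slices
    extra_wanted = [size - g for size, g in zip(site_sizes, guaranteed)]
    total_wanted = sum(extra_wanted) or 1
    return [
        g + min(w, remaining * w // total_wanted)
        for g, w in zip(guaranteed, extra_wanted)
    ]
```

With a budget of 60 pages across sites of 100, 50, and 10 pages, every site ends up represented in the index even though none of the larger ones is fully indexed, which matches the claim that the index becomes more comprehensive in terms of sites while individual sites lose pages.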
Matt Cutts has stated that, with Big Daddy, they are now indexing more sites than before, and also that the index is now more comprehensive than before.
If that’s what Big Daddy is about, then I can’t find fault with it. But it doesn’t make any difference to webmasters. We still need to find more of those one-way on-topic inbound links to get more of our pages in the index.