A quick return to the headsmacking series this week (as it's late here in London and I've been up for my usual 40+ hours of traveling). We've been noticing that a number of sites seeking to block bot access to pages on their domain are using robots.txt to do so. While that's certainly a good practice, the questions we've been getting suggest there are some misunderstandings about what blocking Google, Yahoo!, MSN, and the other search bots with robots.txt actually does.
Here's a quick breakdown:

- Block with Robots.txt - don't crawl the URL, but feel free to keep it in the index & display it in the SERPs (see below if this confuses you)
- Block with Meta NoIndex - feel free to crawl, but don't put the URL in the index or display it in the results
- Block by Nofollowing Links - not a smart move, as other, followed links can still put the page in the index (it's fine if you don't want to "waste juice" on the page, but don't think nofollow will keep bots away or prevent the page from appearing in the SERPs)
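To make those distinctions concrete, here's a deliberately simplified sketch in Python (my own illustration, not any engine's real pipeline; every name, path, and page below is made up) of where each mechanism applies when a crawler processes a discovered URL:

```python
def process_url(url, disallowed_by_robots, fetch_page):
    """Hypothetical, highly simplified handling of one discovered URL."""
    if disallowed_by_robots(url):
        # Robots.txt block: the page is never fetched, but the bare URL can
        # still be indexed (and shown in the SERPs) via links from elsewhere.
        return {"fetched": False, "eligible_for_index": True, "links_followed": []}

    page = fetch_page(url)

    # Meta noindex: the page is fetched, but kept out of the results.
    eligible = "noindex" not in page["meta_robots"]

    # Rel=nofollow: only that individual link goes uncounted; other, followed
    # links pointing at the same target can still get it crawled and indexed.
    followed = [link["href"] for link in page["links"]
                if "nofollow" not in link.get("rel", "")]

    return {"fetched": True, "eligible_for_index": eligible, "links_followed": followed}


# Tiny demo with a fake page that uses meta "noindex, follow".
demo_page = {
    "meta_robots": "noindex, follow",
    "links": [{"href": "/deep-page", "rel": ""},
              {"href": "/sponsor", "rel": "nofollow"}],
}

def is_disallowed(url):
    return url.startswith("/private/")

print(process_url("/private/report.html", is_disallowed, lambda u: demo_page))
print(process_url("/blog/post.html", is_disallowed, lambda u: demo_page))
```

Note what each mechanism does not do: the robots.txt branch never fetches the page but leaves the bare URL eligible for indexing, meta noindex keeps the page out of the results only after a fetch, and nofollow affects just the single link.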
Here's a quick example of a page that's blocked via robots.txt but appears in Google's index (note that this robots.txt is the same across about.com's other subdomains, too). You can see that about.com is clearly disallowing the /library/nosearch/ folder. Yet here's what happens when we search Google for URLs in that folder: notice that Google has 2,760 pages from that "disallowed" directory. They haven't crawled these URLs, so they appear as mere address strings (no title, description, etc., since Google can't see the pages' content).
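If you're curious what the crawl-side decision looks like, Python's standard library includes a robots.txt parser. Here's a minimal sketch (the example.com URLs and file paths are hypothetical; the Disallow rule simply mirrors the one described above) showing that a well-behaved bot won't even request a blocked URL, while nothing in robots.txt prevents the bare URL from being indexed via links from elsewhere:

```python
from urllib.robotparser import RobotFileParser

# A robots.txt along the lines of about.com's (rule copied from the example above).
robots_txt = """User-agent: *
Disallow: /library/nosearch/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# The bot checks before fetching: a disallowed URL is simply never requested,
# so the engine never sees its title, description, or outbound links.
print(rp.can_fetch("Googlebot", "http://www.example.com/library/nosearch/some-page.htm"))  # False
print(rp.can_fetch("Googlebot", "http://www.example.com/library/weekly/some-page.htm"))    # True
```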
Now think one step further: if you've got any number of pages you're blocking from the search engines' eyes, those URLs can still accumulate links, link juice, and other query-independent ranking factors, but they have no way to "pass it along," since their own links out will never be seen. I'll illustrate the situation with a rough sketch after the takeaways below.

There are two real takeaways here:

1. Conserve link juice by using nofollow when linking to a URL that is robots.txt-disallowed.
2. If you know that disallowed pages have acquired link juice (particularly from external links), consider using meta noindex, follow instead, so they can pass their link juice on to places on your site that need it.
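Here's that illustration as a toy simulation (my own sketch in Python, using a crude PageRank-style model with made-up page names; it's nothing like what Google actually computes, but it captures the juice-trapping effect): the same page is blocked two different ways, and we watch where the juice flowing into it ends up.

```python
def simulate(pages, crawlable_links, iterations=50, damping=0.85):
    """Very rough PageRank-style iteration over the links a crawler can actually see."""
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in crawlable_links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A robots.txt-blocked page is a dead end for the crawler; treat
                # its juice as spread thinly everywhere (a common dangling-node
                # convention) rather than flowing anywhere useful.
                for target in pages:
                    new_rank[target] += damping * rank[page] / len(pages)
        rank = new_rank
    return {p: round(r, 3) for p, r in rank.items()}


pages = ["home", "category", "blocked-page", "deep-page"]

# Case 1: "blocked-page" is robots.txt-disallowed, so its outbound link to
# "deep-page" is never seen by the crawler.
robots_blocked = {
    "home": ["category", "blocked-page"],
    "category": ["deep-page"],
    "blocked-page": [],          # outbound links invisible to the crawler
    "deep-page": ["home"],
}

# Case 2: the same page uses meta noindex, follow instead; it stays out of
# the SERPs, but its outbound link is crawled and counted.
noindex_follow = dict(robots_blocked)
noindex_follow["blocked-page"] = ["deep-page"]

print("robots.txt disallow:", simulate(pages, robots_blocked))
print("noindex, follow:    ", simulate(pages, noindex_follow))
```

In this toy model, the robots.txt-disallowed version is a dead end, so the juice pointing at it never reaches deep-page; with meta noindex, follow, the page stays out of the results but its outbound link is still crawled, so deep-page ends up with noticeably more of the juice.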
Looking forward to seeing folks at SMX London tomorrow (and to Will's and my big showdown on Tuesday, too)!

p.s. Andy Beard covered this topic previously in a solid post - SEO Linking Gotchas Even the Pros Make.