Using Robots.txt to tell search engines what you want them to index

Lately, I’ve been checking my blog on Google a lot. To do that, I use this search query:

site:silkenhut.com/blog

I use this to check whether the All in One SEO plug-in has taken effect and to see which pages are indexed. When I looked at my results, I noticed several pages that Google has indexed even though I don’t want them to be. Aside from my individual post pages, these are the other things Google has indexed from the blog:

  • My stats page was indexed, along with the links for every person who has commented. (http://silkenhut/stats/?stats_author=Elizar)
  • My feeds and comments are being indexed. (http://silkenhut/back-up-wordpress-using-a-plug-in/feed/)
  • My archives and pages (http://silkenhut/page/3/) are also being indexed.

This is not a major problem, but personally I want to make sure that the pages being indexed are the useful ones, such as the blog articles, and not the feeds, comments, archives, etc.
I already have a robots.txt file, but there must be something wrong with it if these problems are still present. So this morning I spent my time researching robots.txt files and made an improved one that should address these problems.

I used Google Webmaster Tools to test my robots.txt file. (I’ll write a blog post about Google Webmaster Tools soon, but if you are curious, feel free to visit it. It’s very helpful!)

Now behold my latest robots.txt file, which I think has solved the problems presented above!

sitemap: http://silkenhut.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/author/
Disallow: */page/
Disallow: /blog/category/
Disallow: /blog/archives/
Disallow: */trackback/
Disallow: */feed/
Disallow: /blog/stats/

You are free to copy this robots.txt file if you want, since I just copied the rules from various sources and then customized them for my blog. I hope this helps. Google, index my pages faster!!! hehe
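
If you copy the rules above and want a rough local sanity check before uploading, here is a minimal Python sketch. It is only an approximation of how a wildcard-aware crawler such as Googlebot might read the Disallow lines: it assumes `*` matches any run of characters and that every rule is matched from the start of the URL path. The helper names (`rule_to_regex`, `is_blocked`) and the sample URLs are my own illustrations, and it ignores Allow lines and per-crawler sections. Python's built-in `urllib.robotparser` is not used here because, as far as I know, it does not expand `*` inside paths.

```python
import re
from urllib.parse import urlsplit

# Disallow values copied from the robots.txt above (User-agent: * section).
RULES = [
    "/cgi-bin/",
    "/blog/wp-admin/",
    "/blog/wp-includes/",
    "/blog/author/",
    "*/page/",
    "/blog/category/",
    "/blog/archives/",
    "*/trackback/",
    "*/feed/",
    "/blog/stats/",
]


def rule_to_regex(rule):
    """Turn one Disallow value into a regex (assumed Google-style matching)."""
    # A trailing '$' is treated as an end-of-URL anchor.
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url, rules=RULES):
    """Return True if any Disallow rule matches the URL's path and query."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match only matches at the start of the string, like a path prefix.
    return any(rule_to_regex(rule).match(target) for rule in rules)


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    samples = [
        "http://silkenhut.com/blog/back-up-wordpress-using-a-plug-in/feed/",
        "http://silkenhut.com/blog/page/3/",
        "http://silkenhut.com/blog/stats/?stats_author=Elizar",
        "http://silkenhut.com/blog/back-up-wordpress-using-a-plug-in/",
    ]
    for url in samples:
        print("blocked" if is_blocked(url) else "allowed", url)
```

This is no replacement for the tester in Google Webmaster Tools, since each crawler decides for itself how to interpret the file, but it is a quick way to catch a rule that blocks more (or less) than you meant.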

Thanks to Elizar for his comment on my open problem last week. He taught me the idea that you can use robots.txt to disable indexing of specific post pages. ^_^

6 thoughts on “Using Robots.txt to tell search engines what you want them to index”

  1. Karlo.PinoyBlogero says:

    Ooo.. didn’t know you could use robots.txt for SEO optimization. Never really focused on optimizing my blog, but I think I should in order for me to be more visible to search engines.

    Thanks for this tip! ^_^ (Copying the robots.txt)

  2. Hey, thanks for the mention.. and you’re welcome.. I could use that too.. 😀

  3. i’ll try this one as well. cheers mate.

  4. Thank you for this detailed look at a specific use for the robots.txt file. I’m learning little by little what extra things can be done to streamline my SEO and this is one more tip I’ll add to my bag.
