Using Robots.txt to tell search engines what you want them to index
Lately, I’ve been checking how my blog shows up on Google a lot. To do that, I use this search query:
site:silkenhut.com/blog
I use this to check whether the All in One SEO Plug-in has taken effect and also to see which pages are indexed. When I looked at my results, I noticed that Google had indexed several pages that I don’t want indexed. Aside from my individual post pages, these are the other things that Google has indexed from the blog.
- My stats page was indexed, along with the links for every person who has commented. (http://silkenhut/stats/?stats_author=Elizar)
- My feeds and comments are being indexed. (http://silkenhut/back-up-wordpress-using-a-plug-in/feed/)
- My archives and pages (http://silkenhut/page/3/) are also being indexed.
This is not a major problem, but personally, I want to make sure that the pages being indexed are the useful ones, such as the blog articles, and not the feeds, comments, archives, and so on.
I already had a robots.txt file, but there must have been something wrong with it if these pages were still getting indexed. So this morning, I spent my time researching robots.txt files and made an improved one that addresses these problems.
I used Google Webmaster Tools to test my robots.txt file. (I’ll write a blog post about Google Webmaster Tools soon, but if you are curious, feel free to visit it. It’s very helpful!)
Now behold my latest robots.txt file, which I think has solved the problems presented above!
sitemap: http://silkenhut.com/sitemap.xml
User-agent: *
Disallow: /cgi-bin/
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/author/
Disallow: */page/
Disallow: /blog/category/
Disallow: /blog/archives/
Disallow: */trackback/
Disallow: */feed/
Disallow: /blog/stats/
You are free to copy this robots.txt file if you want, since I just copied the rules from various sources and then customized them for my blog. I hope this helps. Google, index my pages faster!!! hehe
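If you want to sanity-check rules like these on your own computer before uploading them, here is a minimal Python sketch. It is not how Google Webmaster Tools works internally; it just assumes Google-style pattern matching, where `*` matches any run of characters and a trailing `$` anchors the end of the path. The function names `rule_to_regex` and `is_blocked` are my own, and the sample paths assume the blog lives under /blog/. As far as I can tell, Python’s built-in urllib.robotparser only does plain prefix matching and ignores the `*` wildcard, which is why this sketch does its own matching.

```python
import re

# Disallow rules copied from the "User-agent: *" group of the robots.txt above.
DISALLOW_RULES = [
    "/cgi-bin/",
    "/blog/wp-admin/",
    "/blog/wp-includes/",
    "/blog/author/",
    "*/page/",
    "/blog/category/",
    "/blog/archives/",
    "*/trackback/",
    "*/feed/",
    "/blog/stats/",
]


def rule_to_regex(rule):
    """Translate one Disallow pattern into a compiled regex.

    Assumption: Google-style matching. '*' matches any run of characters,
    a trailing '$' anchors the end of the path, and everything else is a
    literal match starting at the beginning of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path, rules=DISALLOW_RULES):
    """Return True if any Disallow rule matches the URL path.

    There are no Allow lines in the file, so no precedence logic is needed.
    """
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    # Hypothetical sample paths, assuming the blog is served under /blog/.
    samples = [
        "/blog/page/3/",
        "/blog/back-up-wordpress-using-a-plug-in/feed/",
        "/blog/stats/?stats_author=Elizar",
        "/blog/back-up-wordpress-using-a-plug-in/",
    ]
    for path in samples:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

A rule only hides what it actually matches, so it is worth feeding in a few real URLs from your own site: query results to confirm that the article pages stay allowed while the feeds, paged archives, and stats pages are blocked.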
Thanks to Elizar for his comment on my open problem last week. He gave me the idea that you can use robots.txt to keep specific post pages from being indexed. ^_^




Ooo.. didn’t know you could use robots.txt for SEO optimization. Never really focused on optimizing my blog, but I think I should in order for me to be more visible to search engines.
Thanks for this tip! ^_^ (Copying the robots.txt)
Allen Reply:
August 30th, 2007 at 10:46 pm
*files a case of plagiarism against Karlo!* haha just kidding! You are welcome.
Hey, thanks for the mention.. and you’re welcome.. I could use that too..
Allen Reply:
August 30th, 2007 at 10:46 pm
Thank you too.
i’ll try this one as well. cheers mate.
Thank you for this detailed look at a specific use for the robots.txt file. I’m learning little by little what extra things can be done to streamline my SEO and this is one more tip I’ll add to my bag.