Must Read: How to Update the WordPress robots.txt
Using Google Webmaster Tools or Bing Webmaster Tools you can check whether the Google bot is able to read your website or is being disallowed. For some reason I was having issues with my site a few days ago: my sitemap was not being loaded properly by the crawler. It sounds stupid, I know, but I was getting errors about it. So I needed to update the WordPress robots.txt to point to the correct sitemap location, and by the next day everything was fine. I could have altered robots.txt manually (by the way, if robots.txt is missing, you will need to create it in the root of your website, alongside index.html or index.php), but I remembered some online tools that generate the file really quickly, so I used one of those instead.
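For reference, the sitemap location is advertised in robots.txt with a single Sitemap directive; the URL below is a placeholder for your own sitemap, not a fixed value:

```
Sitemap: https://example.com/sitemap.xml
```

Crawlers that support the directive will pick up the sitemap from there even if you never submit it through a webmaster tools dashboard.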
Example WordPress robots.txt
WordPress provides a sample robots.txt on their documentation page that you can use if you wish, but I don't really recommend it; it is only an example, nothing else. It shows how you can set conditions for each spider specific to your requirements, with separate sections for the Google Image bot, the Google AdSense crawler, and the old Digg mirror bot.
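A sketch of that style of per-spider example (the user-agent names below are illustrative of the old documentation sample; the Digg mirror bot, for instance, no longer exists):

```
# Google Image: allow the image crawler everywhere
User-agent: Googlebot-Image
Disallow:

# Google AdSense: let the AdSense crawler read pages that serve ads
User-agent: Mediapartners-Google
Disallow:

# digg mirror: block this bot entirely
User-agent: duggmirror
Disallow: /
```

An empty Disallow line means "nothing is disallowed" for that user agent, while Disallow: / blocks the whole site.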
I am currently using the following:
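As a rough sketch (the domain and paths below are placeholders, not necessarily the exact file), a typical WordPress robots.txt along these lines looks like:

```
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: https://example.com/sitemap.xml
```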
But do keep in mind that WordPress only serves a virtual robots.txt by default (i.e. no robots.txt file physically exists on disk), so you will need to create the file manually and add your rules there.
What can you do with robots.txt?
Using robots.txt you can allow only specific spiders, or restrict spiders from specific sections of your website, using the Disallow directive. This is really useful when you do not want certain areas, such as a cache directory (which should be used by the script only), to be crawled and indexed.
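For example, to keep all crawlers out of a cache directory (the path below assumes a caching plugin that writes to wp-content/cache, which is common but not universal):

```
User-agent: *
Disallow: /wp-content/cache/
```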
You can also use it to apply conditions only to specific spiders, such as Yahoo or Bing.
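A sketch of per-spider conditions, using Yahoo's Slurp and Microsoft's bingbot as examples (Crawl-delay is a non-standard directive, but both of these crawlers have honored it; the /private/ path is just an illustration):

```
# Yahoo: ask its crawler to wait between requests
User-agent: Slurp
Crawl-delay: 10

# Bing: keep it out of one section only
User-agent: bingbot
Disallow: /private/
```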