As a WordPress user, you have probably already seen other sites steal your content and republish it elsewhere. This is known as duplicate content. To fight this practice, Google applies filters that detect similar content. When the crawler finds two (or more) similar pages, only one of them is shown in the search engine results pages. In addition, sites serving duplicate content can suffer a loss in ranking.
Unfortunately, WordPress itself makes it quite easy to create duplicate content: feeds contain the same content as posts, trackback URLs do the same, and so on.
To avoid any kind of duplicate content, we have to use a file dedicated to that task. It is a simple text file named robots.txt, located at the root of your WordPress blog. This file tells search engine crawlers which pages or directories they should not crawl or index.
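Here is a minimal sketch of what such a file could look like for a typical WordPress install; the directories below assume the default WordPress layout, so adjust them to match your own setup:

    # Apply these rules to all crawlers
    User-agent: *
    # Keep core WordPress directories out of search results
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    # Block the duplicate views mentioned above
    Disallow: /trackback/
    Disallow: /feed/

One caveat: wildcard patterns such as Disallow: /*/feed/ are understood by Googlebot but not by every crawler, so plain path prefixes like the ones above are the safest choice.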