Joomla! 1.5 SEO

Overview of this book

Some sites always appear at the top of the search results while others fail to even make it into the top ten. Wouldn't you want to see your site on the first page of any search result? This is not easy to achieve if you depend solely on the marketing people you hire for SEO. Joomla! SEO will help you attract more visitors and improve your ranking in search engines by giving you the techniques and knowledge to work your site into higher visitor numbers. It will help you to create and improve your site in an easy way. Joomla! is great, and you can make it perform even better by using the guidelines and ideas in this book.

Search Engine Optimization is becoming a must for every web site. As the competition on the Internet grows, you need to make sure your site is among the top results on the major search engines. More and more people use search engines to find the information they are looking for, so you need to show up in those search result pages to get those visitors to your web site.

Joomla! SEO will provide you with a lot of information, ranging from keyword strategies through technical improvements to content creation. All this information and the tutorials provided are intended to give you the best base for gaining higher rankings. In the book, you will learn how to build a keyword strategy and create a better site structure for SEO. You will read about technical improvements that will give you better options for SEO. There is a separate chapter that helps you create search-engine friendly and keyword-rich URLs. In the end, you will have a web site that is ready to outperform your competitors and a manual to refer to for improving every step you take.

Appendix B. Joomla! robots.txt and .htaccess

The robots.txt and .htaccess files are important to help you gain more traffic from search engines. The robots.txt file opens up or restricts access to files on your server for Search Engine Robots. The .htaccess file takes care of creating great looking, search engine friendly, and easy to remember URLs for your web site.

However, they can also create havoc and dismay if used the wrong way, leaving Search Engine Robots locked out of your web site. A mistake can also result in those nice looking 404 pages showing up under every link you touch on your web site. So, how do you know if the files are okay? Testing is the keyword here!

Making sense of robots.txt

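The Googlebot and other Search Engine Robots will crawl your web site based on the rules you provide in your robots.txt file. This file needs to be in the root of your domain, because robots only look for it at /robots.txt; if Joomla! is installed in the domain root, that is also your Joomla! installation directory.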

Setting your rules for robots

There are just a few rules that robots will take into account if they visit your web site. Some of the rules are in the robots.txt file and you can add another set of rules, either on a page-by-page basis or on a link in your web site.

In the robots.txt file you will see commands such as:

Allow: /folder1/myfile.html
Disallow: /folder1/
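
One way to test rules like these before you upload them is a short script. The following is a minimal sketch using Python's standard urllib.robotparser module. The User-agent line, the www.example.com host, and the helper name is_allowed are assumptions added for illustration; the User-agent line is needed because rules only take effect inside a User-agent group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the Allow/Disallow lines from the text, placed in a
# User-agent group (an assumption; rules outside a group are ignored).
RULES = """\
User-agent: *
Allow: /folder1/myfile.html
Disallow: /folder1/
"""


def is_allowed(rules: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/folder1/myfile.html", "/folder1/other.html", "/index.html"):
        print(path, is_allowed(RULES, path))
```

Python's parser applies the first rule that matches, which is why the Allow line is listed before the Disallow line here. Individual search engines may resolve conflicting rules differently, so treat this as a sanity check rather than a guarantee of how any particular robot behaves.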

You can also have a link to the sitemap of your web site:

Sitemap: http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml

This gives the robots the link to your XML sitemap, or to your .html sitemap if you don't have an XML file.

A small difference in syntax can have a large effect. The following two rules look almost the same, but they do opposite things:

User-agent: *
Disallow: /

The "/" in the second line tells the robots not to visit your site's pages. In the following example, the robots are allowed to visit all pages.

User-agent: *
Disallow:

These two examples show that you really need to use the right syntax in your robots.txt file.
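
To see the difference between the two rules for yourself, you can feed both versions to a parser. This is a minimal sketch using Python's standard urllib.robotparser module; the constant names, the can_visit helper, the www.example.com host, and the /index.php path are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets from the text; they differ only by a single "/".
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def can_visit(rules: str, path: str) -> bool:
    """Return True if a generic robot may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch("*", "http://www.example.com" + path)


if __name__ == "__main__":
    print("Disallow: /  ->", can_visit(BLOCK_ALL, "/index.php"))  # False
    print("Disallow:    ->", can_visit(ALLOW_ALL, "/index.php"))  # True
```

With "Disallow: /" every path is blocked, while the empty "Disallow:" blocks nothing.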

Standard Joomla! robots.txt

Joomla! comes with a standard robots.txt file:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/

As you can see, most special directories are blocked from the Search Engine Robots. There is no need to let them visit and index these special pages that hold the core of the system.

Improving the standard for image searchers

In the standard Joomla! robots.txt file, the images directory is blocked by the following line:

Disallow: /images/

However, this is one line that you need to remove. The images directory holds all the images that you so carefully named so that they can be included in the image search pages of the major search engines.

Make sure that the robots get access to this directory by removing that line from your robots.txt file. This can bring in additional visitors through image search. If you installed the SEF patch from the JoomlAtWork.com site, this is already done for you.
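
Since testing is the keyword, it is worth checking that the change does what you expect. The following is a minimal sketch, again using Python's standard urllib.robotparser module, that removes the images line from the standard file and confirms that an image becomes reachable while the administrator directory stays blocked. The helper names drop_rule and can_visit, the www.example.com host, and the image path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The standard Joomla! robots.txt as listed in the text.
STANDARD = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
"""


def drop_rule(rules: str, directory: str) -> str:
    """Return `rules` without the Disallow line for `directory`."""
    unwanted = f"Disallow: {directory}"
    kept = [line for line in rules.splitlines() if line.strip() != unwanted]
    return "\n".join(kept) + "\n"


def can_visit(rules: str, path: str) -> bool:
    """Return True if a generic robot may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch("*", "http://www.example.com" + path)


if __name__ == "__main__":
    image = "/images/stories/garden.jpg"  # hypothetical image path
    improved = drop_rule(STANDARD, "/images/")
    print("before:", can_visit(STANDARD, image))
    print("after: ", can_visit(improved, image))
    print("admin: ", can_visit(improved, "/administrator/index.php"))
```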

A complete example

The following is the complete robots.txt file of the site www.cblandscapegardening.com. Notice the long sitemap: line; it must be on one line in your robots.txt file.

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
sitemap: http://www.cblandscapegardening.com/component/option,com_xmap/lang,en/no_html,1/sitemap,1/view,xml/

Robots now have full access to the images and stories directories, and a sitemap link is provided for all Search Engine Robots. The way in which pages and links are handled by the robots is a part of your content creation, and that explanation is covered in Chapter 4, How to write keyword-rich articles.
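
A finished file like this one can be checked in one go. The sketch below parses the complete example with Python's standard urllib.robotparser module and reports the sitemap link and a few sample paths. It assumes Python 3.8 or later, where site_maps() is available; the load helper and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The complete robots.txt from the text; the sitemap entry is one line.
COMPLETE = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
sitemap: http://www.cblandscapegardening.com/component/option,com_xmap/lang,en/no_html,1/sitemap,1/view,xml/
"""

SITE = "http://www.cblandscapegardening.com"


def load(rules: str) -> RobotFileParser:
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


if __name__ == "__main__":
    robots = load(COMPLETE)
    # site_maps() returns the list of sitemap URLs, or None if there are none.
    print("sitemaps:", robots.site_maps())
    # Hypothetical sample paths, used only to exercise the rules.
    for path in ("/images/stories/photo.jpg", "/administrator/", "/"):
        print(path, robots.can_fetch("*", SITE + path))
```

If the sitemap line were accidentally wrapped onto a second line, the URL reported here would come back truncated, which makes this a quick way to catch that mistake.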