
Dynamic Websites: How Booking Forms Create Duplicate Content

[Stock photo: cloning]

Dynamic websites can be a problem, not because the search engines cannot index the content, but because of the way your website is set up, which can generate an indefinite amount of duplicate content.

This problem was addressed in a post I wrote just a few days ago entitled Dynamic Websites: How to Avoid Indexing Duplicate Content. That post focused on website architecture and the problems that can arise when programmers work on the design and development alone.

But even if your website is well thought out, you may still need to filter out duplicate content. Take, for example, contact forms, or the forms used to book hotel rooms:

[Screenshots: hotel and car rental booking forms]

This can be another duplicate content nightmare. When the search engine spiders crawl your website they’ll go from page to page and may (partially) fill in the forms, generating an indefinite number of pages that get indexed and inflate the apparent size of your site.
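
To get a feel for how fast this spirals, here’s a minimal sketch in Python. The form parameters are hypothetical, purely for illustration: with just four fields and a handful of values each, a single form already produces dozens of distinct crawlable URLs carrying essentially the same content.

    from itertools import product

    # Hypothetical form parameters -- the real fields on any given
    # booking page will differ.
    params = {
        "checkin": ["2008-10-01", "2008-10-02", "2008-10-03"],
        "nights":  ["1", "2", "3", "7"],
        "rooms":   ["1", "2"],
        "adults":  ["1", "2", "3"],
    }

    base = "http://www.yourwebsite.com/Booking/hotel-booking.php"

    # Every combination of values yields a distinct URL, even though
    # the underlying page content is essentially the same.
    urls = [
        base + "?" + "&".join(f"{k}={v}" for k, v in zip(params, combo))
        for combo in product(*params.values())
    ]

    print(len(urls))  # 3 * 4 * 2 * 3 = 72 URLs from one form
    print(urls[0])    # ...hotel-booking.php?checkin=2008-10-01&nights=1&rooms=1&adults=1

Add one more field and the count multiplies again, which is exactly how a modest site balloons into thousands of indexed pages.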

In cases like this, blocking the pages out is relatively easy: disallow the form in your robots.txt file by indicating the exact page (or folder) that generates the duplicate content. Forms shouldn’t really be indexed in the first place, as they carry no useful information, which is all the more reason to keep them out of the search engine index.

So let’s say that you have booking forms on your website located at:

http://www.yourwebsite.com/Booking/hotel-booking.php
http://www.yourwebsite.com/Booking/car-rental-booking.php

This is a fortunate case because all the booking forms have been placed in a single folder, so all you need to do is block out the entire folder in your robots.txt file with these two lines (the User-agent line applies the rule to all crawlers):

User-agent: *
Disallow: /Booking/

That’s pretty simple, isn’t it? The website where I spotted this problem had ballooned to 10,000 indexed pages from the actual 346.
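
If you want to verify the rule behaves as intended before deploying it, here’s a minimal sketch using Python’s standard-library robots.txt parser, fed the exact lines from above (the URLs are the example ones from this post):

    from urllib.robotparser import RobotFileParser

    # Parse the rules directly, without fetching them from a live site.
    rp = RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /Booking/",
    ])

    for url in [
        "http://www.yourwebsite.com/Booking/hotel-booking.php",
        "http://www.yourwebsite.com/Booking/car-rental-booking.php",
        "http://www.yourwebsite.com/index.php",
    ]:
        verdict = "blocked" if not rp.can_fetch("*", url) else "allowed"
        print(verdict, url)

The two booking forms come back blocked while the rest of the site stays crawlable, confirming that the single Disallow line covers the whole folder.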

It will take the search engines some time to drop the duplicate content pages, so don’t worry about that: it takes the time that it takes, and you can’t rush the process.

In Google Webmaster Tools you can actually see progress as the number of blocked pages increases day by day (depending on how often your site is crawled and how many pages are crawled on a daily basis).


10 Replies

  1. Johnnie Debt

    Didn’t ever think of this problem. Thanks for posting

  2. Forms as a problem? This is something new for me and obviously well thought out. I hate messing around with the robots.txt file since I am still somewhat of a newbie, but this is always helpful info so we don’t get slapped by the big G!

  3. Apply Card Online

    I didn’t really think of this problem either. It’s something many webmasters must be aware of. Thank you for the heads up.

    – John.

  4. This is a great post. You hear so much misinformation about “duplicate content” that it’s good to see someone talk about what really is duplicate content and what you can do about it.

  5. I never paid attention to cases like this either. But you’re right that this can be a problem too. Thanks for flagging this matter for me before I ran into the problem myself.

  6. Duplicate content really is such a mystery. I always like to hear someone else’s point of view; it kind of helps me with my own theories.

  7. I stopped using contact forms altogether on my websites; I said the hell with it. I mostly got weird emails or spam, so I just don’t bother anymore.

  8. The nice thing about many blogs is that the permalinks in the title help reduce the duplicate content problem. I can see how it could be a problem on dynamic sites, though.

    On a different note, there might be a problem with the styling of the comment input boxes for name, email, and website.

  9. Nice picture you have there:))