How Booking Forms Create Duplicate Content
Dynamic websites can be a problem, not because search engines cannot index the content, but because of the way your website is set up, which can generate a virtually unlimited amount of duplicate content.
This problem was addressed in a post I wrote just a few days ago entitled Dynamic Websites: How to Avoid Indexing Duplicate Content. That post focused on website architecture and the problems that can arise when programmers work alone on the design and development.
But even if your website is well thought out, you may still need to filter out duplicate content. Take, for example, contact forms, or the forms used to book hotel rooms.
Forms can be another duplicate content nightmare. When search engine spiders crawl your website they go from page to page and may (partially) fill in the forms, generating an indefinite number of pages that get indexed and inflate the apparent size of your site.
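To see how quickly this multiplies, here is a minimal sketch (the form fields, values, and URL are hypothetical, not taken from any real site) of how a crawler partially filling in a GET form can mint many distinct indexable URLs that all show essentially the same page:

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical booking-form fields and a few values a crawler might submit.
fields = {
    "checkin": ["2024-06-01", "2024-06-02", "2024-06-03"],
    "nights": ["1", "2", "3"],
    "guests": ["1", "2"],
}

# Every combination of values becomes a distinct URL, yet the underlying
# page content is the same booking form every time.
urls = [
    "https://example.com/booking?" + urlencode(dict(zip(fields, combo)))
    for combo in product(*fields.values())
]

print(len(urls))  # 3 * 3 * 2 = 18 URLs for one small three-field form
```

Add a couple more fields or date options and the URL count explodes into the thousands, which is exactly how a 346-page site can balloon in the index.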
In cases like this, blocking out the offending pages can be relatively easy: block the form with your robots.txt file by indicating the exact page (or folder) that generates the duplicate content. Forms shouldn't really be indexed in the first place, as they carry no useful information, all the more reason to keep them out of the search engine index.
So let's say that the booking forms on your website are all located in a single folder. This is a fortunate case: all you need to do is block out that entire folder with a single instruction line in your robots.txt file.

That's pretty simple, isn't it? The website where I spotted this problem had ballooned to 10,000 indexed pages from an actual 346.
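The original example folder isn't specified here, so assume for illustration that the forms live under a hypothetical `/booking/` directory. The robots.txt entry would then look like this:

```
User-agent: *
Disallow: /booking/
```

The `Disallow` path is matched as a prefix, so one line covers every form page inside that folder.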
It will take the search engines some time to eliminate the duplicate content pages, so don't worry about that: it takes the time that it takes, and there is no way to rush the process.
In Google Webmaster Tools you can actually see progress as the number of blocked pages increases day by day (depending on how often your site is crawled and how many pages are crawled on a daily basis).