Saturday 14 July 2007

Canonical URLs

Due to the filters applied to battle spam and scraper web sites, duplicate content has become a major issue. The duplicate-content filter is applied to URLs that serve up the same web page content under different addresses, thereby filtering out potential cases of plagiarism or repetitive pages. See more information on Duplicate content.

In order for a web page not only to be, but also to be perceived as, the only copy of its content, the proper server settings, internal navigation and inbound links are necessary. The Canonical URL is the URL that is thus set as the only address able to serve that particular web page; in other words, it is the preferred URL for a single web page. Choosing a single Canonical URL for each web page also helps concentrate all incoming references and accumulate parameters such as PageRank more effectively.

Known issues


Sometimes a single web page, with no additional copy of it existing on its server, can still be perceived by the algorithm as the duplicate of another. This may be the result of not choosing or not setting the www. subdomain preference on the server or in the Google Webmaster Tools panel, of leaving the same web page displayed for more than one set of parameters in dynamic queries, or of having directory index files linked to both by their full, file-level URLs and by their shortened, directory-level URLs that default to the index files. (For example, in some cases the very same web page could be accessed through the following URLs: www.example.com/index.html , example.com/ , example.com/index.html , www.example.com/ ; or, in another example: www.example.com/product.php?item=10&action=review , www.example.com/product.php?item=10 , www.example.com/product.php?action=review&item=10 , etc.)
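To illustrate how such URL variants can be collapsed into one, here is a minimal sketch in Python; the preferred host name, the list of index files and the parameter handling are assumptions for illustration, not a universal rule set:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

def canonicalize(url, preferred_host="www.example.com",
                 index_files=("index.html", "index.php")):
    """Sketch: reduce URL variants that serve the same page to one form."""
    scheme, host, path, query, _ = urlsplit(url)
    # Normalize the host to the preferred (www.) form.
    host = preferred_host
    # Strip directory index files so /index.html and / collapse to one URL.
    head, _, tail = path.rpartition("/")
    if tail in index_files:
        path = head + "/"
    if path == "":
        path = "/"
    # Sort query parameters so ?a=1&b=2 and ?b=2&a=1 collapse to one URL.
    params = sorted(parse_qsl(query))
    return urlunsplit((scheme or "http", host, path, urlencode(params), ""))
```

With this, `http://example.com/index.html` and `http://www.example.com/` map to the same string, as do the two parameter orderings of the product page.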

+ Resolution: Make sure that a single web page can only be accessed through a single valid URL. Correct the navigation of the web site so that a given page is always linked to in the same manner, and use the URL parameters consistently, so that the same content (for example, the result of a database request) cannot be accessed and served under more than one query string; this excludes variations such as a different order of the parameters in the URL. Also check whether you are relying on the server setup for pages that are served as the default for directory-level URLs: make sure such pages are referred to in the same way throughout the site navigation, and that no inbound links point to the other version either. For the cases you cannot avoid, see to it that your server is set up properly and use permanent redirects to correct the problem. 301 (Permanent) redirects in a .htaccess file should both correct the already existing duplicate URL entries and prevent Google from indexing the same page under a different address. Also keep the SSL protocol in mind: an http and an https version of the same page are also seen as duplicates.
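As a sketch of the .htaccess approach described above (assuming an Apache server with mod_rewrite enabled; `example.com` and `index.html` are placeholders for your own domain and index file):

```apache
RewriteEngine On

# Redirect the bare domain to the preferred www. host (301 = permanent).
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Collapse direct requests for the index file onto the directory URL.
# Matching THE_REQUEST avoids redirect loops when Apache serves the
# index file internally for directory-level URLs.
RewriteCond %{THE_REQUEST} /index\.html [NC]
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]
```

If you serve both http and https, the same pattern can be mirrored to redirect one protocol to the other, so that only one version of each page remains indexable.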
