Saturday, 14 July, 2007

Anchor text link

The calculation of link relevance is an important factor in identifying the theme of a web site, and both navigation and incoming references play a major role in it. The broad relevancy communicated to the Google Index consists of the themes of individual pages, which are connected by their relevancy network and shape the overall image of the web site. While the topic is of course set by the actual content of the page, its level of relevance also builds on off-site and off-page references. If a web page is often linked to with anchor texts that successfully emphasize its theme, the words and derivations used to refer to it become the phrases it is seen as most relevant to. Thus both website navigation and inbound links carry votes, not only for the expression used but also for the entire theme, even for words with similar meaning that are not directly targeted.

This holds, of course, only if the web page itself is relevant to these phrases. If the terms used to point to the web page cannot be found in its title and/or content and have no relation to its theme, the votes carry much less weight or are sometimes discarded altogether. Misuse of this practice is thus easy to identify. Also, the number of references may not add as much weight to the theme as the quality of the links pointing to the resource. Hundreds or thousands of links from less trusted, less important sources are unlikely to match the weight of a single referring link from a well trusted, quality web site.

Known issues

Case 1,
If the anchor text is irrelevant to the source and/or the target page, the link will most likely be ignored altogether, or only pass a single vote for the exact phrase. If a page does not rank at all for phrases that are otherwise relevant to its topic, even though the web site is well referenced from other sources, then the anchor text of either the incoming links or the internal navigation is failing to carry the theme throughout the web site, and the page will rank significantly lower than other URLs.

+ Resolution: The anchor text used to point to a page is matched against the title and the content of both the source and the target during relevancy calculations. While there is little chance of a penalty if either is off-topic, a proper anchor text and title are among the most important initial signals to users who have yet to arrive at the page or web site. The theme should be clearly defined by the wording and be consistent across all three ( anchor text, title and content ), allowing people to determine whether it is the resource they need, whether they encounter a link on another web site or the page title and description in Google Search results. Should any of the three exclude the keyphrase(s) of the resource, it becomes hard for both users and bots to determine the exact theme, and thus the URLs will show much lower positions for the given queries even without a penalty. Also, while it is evidently natural for referring sources to use diverse wording in their anchor text, people often reuse the title and the anchor text of the website navigation when creating a link. Thus, again, choosing the proper, most descriptive phrases that match the content may be necessary to avoid lower than predicted positions on the results pages.
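The consistency check described above can be roughly approximated on your own pages. The following is only a minimal sketch of a self-audit, not a description of how Google matches anchor text: it splits text into plain words and reports which anchor words appear in the title and in the content. The function names and the sample strings are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def anchor_coverage(anchor_text, title, content):
    """Report which anchor words occur in the page title and content.

    Hypothetical helper for a manual audit; real relevancy calculations
    are far more involved than exact word matching.
    """
    anchor_words = tokenize(anchor_text)
    title_words = tokenize(title)
    content_words = tokenize(content)
    return {
        "in_title": sorted(anchor_words & title_words),
        "in_content": sorted(anchor_words & content_words),
        "missing": sorted(anchor_words - title_words - content_words),
    }

# Made-up example page and anchor text
report = anchor_coverage(
    "handmade oak furniture",
    "Handmade Oak Furniture | Example Workshop",
    "We build oak tables and chairs by hand in our workshop.",
)
print(report)
```

Words listed under "missing" are candidates for a closer look: either the anchor text or the page wording may need to be brought in line with the theme.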

Case 2,
Receiving links, or asking others to reference pages, with always the same anchor text raises the question of how much control the web site had over the wording of its own "votes". If the profile shows the same pattern as that of sites trying to manipulate their rankings, this may trigger a penalty once the number of links with the same anchor text passes the natural threshold. Repeated misuse or overuse of such methods can lead to page-based penalties as opposed to phrase-based ones, or even to being banned from the Google index. Years of studies have shown a highly predictable pattern of "natural" linking with regard to the anchor texts used. Sometimes, however, a page accumulates a lot of references with the exact same anchor text by chance, enough to outweigh anything else in its linking profile.

+ Resolution: The reasons may be very simple, from not having a long enough title ( which people often use for anchor text ) to not having a description that others could paraphrase. While it is important to define the exact topic and role of a page ( and it is recommended to use consistent anchor text in the main navigation of the web site itself ), repetitive anchor text in inbound links sends a signal that the site is receiving manipulated "votes" to boost its relevancy. This is especially true when the branding targets a highly competitive term, which is often used by spammers and may be seen as manipulative. If a page has passed the natural threshold, and is now considered to have been excessively linked to with the same phrase from other domains, a penalty for this term is applied, forcing the URL to a lower position on the results pages. While the system can tell with high probability when such practices are in place, sometimes even natural linking patterns will show these signals. However, this penalty is automatic, and may only affect the given query and URL. If the page also gets references from other domains with other phrases in the anchor text, the penalty may then be lifted. To encourage this, you may revisit and extend page titles and descriptions ( should they have been too short ), or give visitors some indirect ideas on how to describe / define the theme of the site.
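To see how one-sided an inbound link profile is, the share of each anchor text can be counted from a list of known backlinks. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the 50% cut-off is an arbitrary illustration, since the actual "natural threshold" is not published.

```python
from collections import Counter

def anchor_profile(anchor_texts):
    """Return (anchor text, share of all links) pairs, most common first."""
    normalized = [a.strip().lower() for a in anchor_texts if a.strip()]
    counts = Counter(normalized)
    total = len(normalized)
    return [(text, count / total) for text, count in counts.most_common()]

def dominant_anchor(anchor_texts, max_share=0.5):
    """Return (text, share) if a single anchor exceeds max_share, else None.

    max_share is an assumed value for illustration only, not a known limit.
    """
    profile = anchor_profile(anchor_texts)
    if profile and profile[0][1] > max_share:
        return profile[0]
    return None

# Made-up backlink anchors: 7 of 10 use the same phrase
anchors = ["cheap widgets"] * 7 + ["Example Widgets Ltd", "widget shop", "this page"]
print(anchor_profile(anchors))
print(dominant_anchor(anchors))
```

A profile in which one phrase dominates is the situation described above, where extending titles and descriptions may help other sites vary their wording.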

Case 3,
Keyword stuffing, while an unofficial term, clearly describes a past spamming method for which Google's system now has a proper countermeasure. Using anchor text of improper length, or irrelevant phrases, when pointing to an internal page may trigger a filter and lower the rankings of the URL for searches that include the words used. Continued misuse of anchor text may also lead to the exclusion of the URL(s) from the Index, including both the source and target pages. Recent additions to spam filtering now examine the relevancy of the target page closely, and in certain cases highly competitive commercial terms in the anchor text of a link to a page that is not relevant to them may be seen as manipulative.

+ Resolution: Accidental overuse of anchor text can easily be avoided by judging a text link, or text link navigation, by its aesthetics. Two or even three word links are not at all uncommon, while an entire paragraph of words used as the text for a link is obviously not meant for better user experience. Avoid stuffing too many keywords into a link, both in your internal navigation and in incoming links from other web sites. Again, any pattern that could be identified as not "natural" is easy for anyone to spot, so you should assume that Googlebot and the Google algorithms can judge these cases just as easily and with very good accuracy.
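For a larger navigation, the same judgment can be given a quick first pass by listing links whose anchor text is unusually long. The sketch below is an assumption-laden illustration: the function name is hypothetical and the limit of five words is an arbitrary choice, picked only because two or three word links are described above as common.

```python
def long_anchors(links, max_words=5):
    """Return the (url, anchor) pairs whose anchor has more than max_words words.

    max_words is an assumed limit for illustration, not a documented rule.
    """
    return [(url, anchor) for url, anchor in links if len(anchor.split()) > max_words]

# Made-up internal navigation links
links = [
    ("/chairs", "oak chairs"),
    ("/tables", "oak tables"),
    ("/sale", "cheap oak chairs tables furniture discount sale buy online now"),
]
for url, anchor in long_anchors(links):
    print(url, "->", anchor)
```

Anything flagged this way still needs a human look, since a long but readable link is not necessarily a stuffed one.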

Case 4,
A newly issued spam detection system, created to combat scraper sites, links purchased for their parameters ( PageRank ), spam and other manipulative attempts, now examines the relevancy of any given page with a complex, phrase-matching method. This patent involves predicting the number of only marginally related, competitive phrases present in a document for any given theme. Its effects, in combination with other closely examined factors, may affect websites that have been ranking well for certain phrases so far. The Google algorithm also looks for attempts to artificially create relevance from semantic correlation where the topic of the page would not indicate the presence of certain references ( yet the page includes them ). Should a page, by accident, pass the threshold of a natural number of related, highly competitive phrases that are not supported by its off-page signals ( inbound links, relevant internal pages referencing it with equally relevant anchor text ), or should a page use an excessive amount of thematically unrelated but semantically similar terms, it may receive a very distinct penalty ( dubbed by the webmaster community the last page or -950 penalty ) for the exact queries it was assumed to be targeting. The pages stop ranking for the phrase or, if they have a distinctly high TrustRank, take the very last positions shown for the given query, while they may still hold good positions for others. Also, as fluctuations within the system can indicate borderline cases of mixed problems, these URLs may be shown at their original, or better than original, rankings for a period of time. One example is a page that has strong relevance signals for a two word keyphrase and is seen attempting to create relevance ( through links or content for topics that by themselves are seen as a separate theme ) by using other two word keyphrases that share a single word with the one it ranks for.
A different case is when a page unknowingly passes the threshold of a "natural" number of references to highly competitive terms, and while human editorial opinion may conclude the topics to be related, automated examinations show similarities with manipulative attempts.

+ Resolution: This penalty is tied to relevancy, and is thus often an indication of a lack of proper signals. It is applied automatically, so any legitimate page can overcome its effects by gaining new outside references that justify the theme, or by using clearly relevant wording in the title and in the anchor text pointing to the page. Also, this filter is likely to be adjusted in the upcoming period to be more accurate in detecting spam documents. You may want to examine the theme hierarchy of your website, making sure that the given page is referenced from already relevant pages within the website and that the navigation uses relevant anchor text as well. Keep in mind that keyphrases that are too broad or, on the other hand, too specific may send the signal of targeting a different theme than the one the page is a match for. Single word anchor text may be too generic in certain cases ( and alongside other single word anchor text with different themes ), and uncommon derivations are not always recognized by the Google algorithm as a match for the topic. Read more on Website Navigation.
