Tuesday 24 July, 2007

11 Techniques to Increase Page Views on Your Blog

Yesterday my brief study into page view statistics revealed that the average blog reader views around 1.7 pages every time they visit a blog.

I finished the post by indicating that I’d post more on how to increase your blog’s page views.

Of course, more page views may or may not be what you want from your blog. At least one commenter on the previous post noted that they are happy with a low page view count because it could mean people are leaving their blog by clicking on an advertisement and thereby earning them money. While there could be some truth in this observation, and I'm not averse to it happening on my blogs, I'm also interested in building blogs that people find interesting and useful, and one of the many measures of this can be page views. To get back to the money thing again - those of you running impression-based ads will be interested in increased page views too.

Having said that - IF you’re interested in increasing the number of pages that your average reader reads, here are a few suggestions that might help:

1. Highlight Related Posts - one of the more common ways bloggers encourage readers to read multiple pages on their blogs is to highlight related posts at the end of each article. You'll notice that I presently have a list of 5 posts at the end of each individual page that suggests other posts readers might find useful. This list is generated by a WordPress plugin (for the curious, there's a rough sketch of how such a feature can work after this list). Those of you using other blog platforms might find similar plugins for your own system or might like to manually suggest related articles at the end of your posts.

2. Interlink within Posts - a similar but perhaps more effective technique is to highlight relevant posts within the content of your posts. If you’re writing a post that mentions something similar to what you’ve written before simply link to your previous post from within your article. For example I’ve written about this technique previously in a post on increasing the longevity of key posts.

3. Highlight Key Posts and Categories in your Blog's Hotspots - I've often mentioned that the hottest posts on this blog are those highlighted in my top three menus. Specifically, it is those in the top left-hand box at the top of this page that are always at the top of my most-read post statistics. Depending upon the goals of your blog, you may wish to fill your blog's hotspots with ads or affiliate programs, or you may want to highlight key posts that are central to your blog and which will hook readers into what your blog is about (thereby increasing page views). Highlighting your category pages is another similarly useful technique to encourage your readers to find more posts on the same topic. Explicitly naming the category can also be useful - i.e. rather than just having the category name at the end of the post, try something like 'read more posts like this in our ((insert category name)) category' or 'filed under ((insert category name))'.

4. Compilation Pages - Extending the previous idea about highlighting key posts, you may wish to use posts in these positions that sneeze readers not just to one post on your blog but to many. The best example of this on ProBlogger is my Top 20 Posts at ProBlogger post, which is in my top left-hand menu. This post, as the name suggests, lists 20 posts on my blog that readers might like to read. I know that this is a post with immense power on this blog and that many first-time readers use it to bounce into all corners of my blog. One or two new readers have fed back to me that this page and the pages it linked to were the reason they became hooked on ProBlogger. Every post they read added to the chances that they would become loyal readers.

5. Series - While you need to be a bit careful with writing series of posts over periods of time, they are a great way to keep readers coming back and, once they are complete, to have readers surf through multiple pages on your blog. The most popular series on this blog is my Adsense for Bloggers series, which leads readers through 8 posts. I know many readers progress through this series because I occasionally get a series of comments from a reader who is obviously working through it - 8 comments over 30 minutes or so as they comment on each post. Don't just do a series for the sake of increasing page views, of course - this can really frustrate readers. Use them for longer posts or when you genuinely want to engage with a larger topic over time.

6. Excerpts on Front Pages - I know there is a segment of ProBlogger readers who detest seeing excerpts (the extended entry feature) on blog front pages and are very cynical that it's just a ploy to get more page views. While I personally like using excerpts on front pages, it is not about page views for me (although I guess that is a side benefit). For me, using excerpts this way is more about keeping my front page manageable and highlighting multiple posts on it - i.e. if readers come to my blog and see not only the latest post but the title of the second and maybe even the third post, they are more likely to explore more than just the last thing I've written. I tend to only use the extended entry feature on longer articles and allow shorter ones of a paragraph or two to go up in full on the main page - unless I either forget or see the post as an important one.

7. Excerpts in RSS - Once again, there is always debate over the topic of full versus partial RSS feeds. I know some bloggers' main purpose in using partial feeds is to get readers directly onto their blog, thereby increasing their impression/page view count. While this is certainly a benefit of partial feeds, it is not my own reason for using them. Rather, I use them for copyright protection and to stop people scraping my full content onto their sites via RSS. Whatever your reason for choosing partial/excerpt feeds, you should also realize that doing so will cause some readers to unsubscribe from your blog completely. I know that in going only with partial feeds there are some other bloggers who refuse to visit my blog - this is a cost/benefit scenario that individual bloggers need to weigh up.

8. Enable links in RSS Feeds - Another way that I know a couple of bloggers use to get RSS readers to actually surf to their blogs is to enable html/links in their RSS feeds and then use links to previous posts in their blog, especially in the first paragraph or two of their posts. This is not a technique I've tried, but I know of one blogger who swears by it and says it significantly increased the number of visitors to his blog from RSS as well as the number of pages that they viewed.

9. Search Function - most blog platforms have the ability to add a search feature to your blog, which enables users to search your blog for keywords. This feature obviously helps your readers to locate other posts on your site and as a result increases the potential for a multiple page view visit.

10. Build an Interactive Blog - one way to get readers coming back to your blog many times over a day is to have a blog that people want to interact with. I know some ProBlogger readers visit this site at least 10 times per day just so that they can engage in the conversation that happens in comments. Since I added the 'subscribe to comments' feature on this blog, I've noticed some readers visiting even more often than normal - this can only be increasing page view levels as people return throughout the day. I've written (some time ago now) a few ideas on interactive blogging here and here.

11. Quality Content - This should go without saying but needs to be reinforced. Obviously, if you write quality content your readers will want more of the same. Useful, original and interesting content should leave your readers hungering for more. Work on the quality of your blog and you'll find that things like traffic levels and the number of pages being read should look after themselves and be on the rise.
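Coming back to the related posts idea in point 1 - for the curious, here's a rough, hypothetical sketch (plain Python, not the actual WordPress plugin I use) of how a related-posts list can be picked simply by counting shared tags. The post data is made up for illustration.

```python
# Sketch: rank other posts by how many tags they share with the current post.
# Post data here is invented for illustration; a real blog platform would
# pull titles and tags from its own database.

def related_posts(current, all_posts, limit=5):
    current_tags = set(current["tags"])
    scored = []
    for post in all_posts:
        if post["title"] == current["title"]:
            continue  # skip the post being viewed
        overlap = len(current_tags & set(post["tags"]))
        if overlap:
            scored.append((overlap, post["title"]))
    # Most shared tags first, then alphabetical for a stable order
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:limit]]

posts = [
    {"title": "Choosing a blog platform", "tags": ["tools", "beginners"]},
    {"title": "Writing series posts", "tags": ["writing", "page views"]},
    {"title": "Using excerpts well", "tags": ["writing", "front page"]},
]
current = {"title": "Increase page views", "tags": ["page views", "writing"]}

print(related_posts(current, posts))  # ['Writing series posts', 'Using excerpts well']
```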

There are no doubt other techniques for increasing page views. I've heard of bloggers who swear by writing loads of posts per day to encourage readers to come back numerous times a day as one such technique - but I'd love to hear your experiences in comments below.

Sunday 22 July, 2007

10 Tips on How to Increase Your Website Traffic

Not so long ago I read an article about a woman who dramatically increased her page ranking through article submissions on her new website. That was a fact I already knew, just like many other Internet marketers. However, that article was the inspiration I needed to make some changes to my daily online business habits.

Since I started studying the changes in my online traffic, the effects of adding content to my website and the changes in search engine rankings, I've learned the importance of writing what you know and sharing it with others out there on the Internet.

I never took submitting and writing articles on a consistent basis too seriously when I first started my home based business. Part of the reason was that I was not confident enough to share what I knew with others. Sure, I posted a few articles, but I never submitted those articles to as many directories and ezines as would accept them.

Now before you go about thinking it might NOT be necessary to write an article periodically, take a look at some of the facts I have learned and now take more seriously as a result of posting articles online.

10 Benefits Of Submitting Articles Online:

1. Posting an article online has a more dramatic effect on your home based business than one might think. Posting articles online builds the needed momentum of website traffic in the form of increased popularity and other sites linking to your site. All of those sites linking to you are one-way links!

2. As you begin to submit more articles to more directories, your list of places for submitting articles will keep expanding.

3. Submitting to websites regularly will add more popularity to your website. This is the easiest way to improve your link popularity as well as increase your website traffic.

4. Most directories will let you include a resource box with your name and your website link with your article. Your website address will be found every time someone reads your article, creating more one-way links to your site.

5. Most article directory websites store submitted articles in their archives. Since the search engine spiders love content, these sites are crawled often. This means YOU will benefit from the traffic and hyperlinks from each of your archived articles on their website.

6. The more articles you consistently write and distribute, the more you will become known as an expert in your home based business. The more informative articles you provide, the more credibility your business will gain, which will help you compete against your competition.

7. The more exposure your articles get for you, the more you will see your content showing up all over the Internet. Increasing the level of awareness and popularity builds credibility. Increased credibility may lead to requests to write other articles and improved sales of your product or service.

8. The more articles that you submit to major article directories, the more likely your content will be used for wider publication. The main reason for this is that many newsletter and ezine publishers use article directories to search for good quality content for their publications.

9. Adding content that is useful to both readers and search engines will increase your website's popularity. By posting your articles in popular directories, you are more likely to be chosen for a featured article in a newsletter and have several webmasters republish your article on their websites.

10. If you have a blog or RSS syndication, you should submit your articles written in your blog to article directories. The most overlooked source of traffic for a blog is through article submission. You can turn your longer posts into articles and submit them to ezines or directories.

In summary, make sure each article is written to inform and appeal to the reader, so that they are getting useful information from you. Once you start writing articles, be more systematic about submitting new ones on a regular basis so they build momentum.

You will find that the articles already posted have been hard at work for me because they are free advertising that other people post on their web sites. You will start getting the traffic and attention you'd been wishing for in the months to come.

Saturday 21 July, 2007

How To Increase Your Website Traffic With Zero Cost

'How to increase your website traffic with zero cost'. It's a bold statement, don't you think? But believe me, it's true. You can increase your traffic by 1000% with no cost involved if you do it the right way. Continue reading if you want to know how.

I've outlined 5 ways to reach your target. But please keep in mind that these are not the only things you can do to increase your traffic. There are hundreds of techniques to increase traffic, but these are proven ones - I've used them personally. More importantly, these techniques can get you FREE traffic. Your money stays in your pocket. Let's go to the first one.

Technique #1: Linking strategy

Linking strategy is the easiest way to get free traffic. When I say "the easiest way" it does not mean that you can ask everybody to link to your site and do nothing after that. But compared to the other techniques you'll discover, this one takes less time to do.

Here's how to do it. First, select sites in your niche market. Be selective. Choose ones that have high traffic. Usually a high-traffic site is pretty stingy about linking to your site, so the key here is to be persistent. Ask them how many visitors they receive per month and whether they could link to your site. If they don't answer your request, email them a second time.

Be persistent. If they don't want to link to your site, ask them to trade links instead (a reciprocal link). This should be your last resort.

Word of warning: Don't crowd your site with too many links. Only accept link trading if it's really worth it.

Technique #2: Offer Free eBooks or articles

You'll fall in love with this technique when you see what it can do for your site. This technique can create an excellent 'viral marketing' effect. It can multiply the number of visitors to your site in a matter of days. The most important thing about this technique is to offer something that is really useful to your visitors - so useful that they can only get that information from you!

You need to know the 'wants' in your niche market. What problems do they encounter? Solve these problems and you have a killer article or e-book that you can give away for free. Remember, don't sell it. Give it away for free. If you feel really reluctant to give your article or e-book away for free, you can give your visitors part of it. But make sure it's really useful. Don't forget to put your name and your contact information in the article or e-book. Usually, if you write an article, you need to include your resource box at the very bottom of it.

The most important task in this technique is to offer reprint rights to your visitors. What this means is that your visitors can publish your article or e-book to anyone in any medium: email, Ezine, website or anything else. But please state your condition: they must include your contact information or resource box. This will create a viral effect among your visitors.

Before I forget, there is one particular e-book compiler that is good at this kind of task. The name of this e-book compiler is 'E-book Edit Pro'. With this compiler, you can offer your visitors a customizable e-book. This is a great incentive for them to distribute your article or e-book since they can put their own name and information in it. If you'd like to know more about this excellent compiler, please visit: http://www.ebookedit.com/

Technique #3: Classified Ad

This is the most time consuming technique of the 5. While it is time consuming, it is really worth it.

Tips - This technique should be used together with the above technique. Let me explain:

First, you need to write an e-book or article that you can give away for free. Then, you need to have an autoresponder. If you don't have an autoresponder (your hosting company should provide this service for free), you can get one for free. Just type 'free autoresponder' into your search engine and you'll get hundreds of sites that provide free autoresponders. If you're searching for an excellent autoresponder, I'd like to suggest these:

1. http://www.getresponse.com - Most online marketers use this service. You pay them a monthly fee for providing the service. The most important feature is that you can personalize your autoresponder with your visitor's name. They also provide a free service. The only catch is that there will be an ad in your email.

2. http://www.aweber.com - One of the first companies to provide autoresponder service. Try to visit their site if you want to learn more. Basically, it offers the same kind of service as getresponse.com.

3. http://www.autoresponseplus.com - You only need to pay a one-time fee for this service since the software will be installed on your own server. It's great if you don't want to pay high monthly fees. The only drawback of Autoresponseplus is that you need a little bit of skill in installing CGI (Common Gateway Interface) scripts on your server.

Enough talking. Let's continue.

After you have your own autoresponder, place your free article in it. Now, you need to advertise your autoresponder address on classified ad websites. Don't put your email address, but your autoresponder address. The best part of this technique is that you can capture your visitors' email addresses. You can contact them again and again if you have any offer in the future.

Technique #4: Deliver an information-packed Ezine/newsletter

People surf the net to look for information. Out of 100, only 3 people surf the net to buy something. The others are doing research or trying to find something informational.

With this in mind, you can attract people to come to your site if you can deliver timely information to them. By producing timely information, you glue these visitors to your site, preventing them from going elsewhere. This can be done by giving them a free newsletter or Ezine.

This is not an easy task because there is an abundance of free information on the net. You need to give them something different from this 'free' stuff. Try to provide something unique in your Ezine. For example, if you're publishing a music Ezine, try to make a deal with a music label so that you can give a special price to your subscribers. Make sure your subscribers cannot get this kind of deal anywhere else. If you can create this unique proposition, you're already on top of the world. Your Ezine will spread like fire. More people will come to your site to subscribe to your unique newsletter.

Technique #5: Offer affiliate program

This is the greatest FREE traffic generator technique out there. With this technique both parties win: you and your affiliate program participants. You get more traffic and sales; they get more money from referral commissions.

This is really a large topic. I could write a whole e-book about how to create a successful affiliate program. But I'll discuss the basics of affiliate programs here.

Basically, to create an effective affiliate program, you need to create an incentive for your visitors to join it. You can do this by giving them high referral fees and marketing tools to use. Above all, you need to make it easy for them to promote your product or service. Don't make them do all the hard work. It is your job.

The next thing you need to do is to motivate them to spread the word about you. Contact them in a timely manner. Don't forget them after they've joined your program. Make them feel special. In fact, they are special, since they are the ones who will do the promotion and advertising. A well designed affiliate program can increase your website traffic and sales by an unimaginable amount. But again, you need to devote all your effort to this technique if you want to have a successful affiliate program. Don't do it half way. Even if you have to work 18 hours a day to create your own affiliate program, it's really worth it in the future. The payoff is going to be a thousand times your initial effort.

All of these techniques are free. You don't have to spend a dime on them. Try it on your site. I've tried all these techniques. And they work!

Nas Romli runs a site that helps people to start their own home based business within 24 hours. Many have benefited from this service. Drop by his site for more information about this amazing service: http://www.cashflowsecret.net

Saturday 14 July, 2007

The Google "sandbox"

The Google "sandbox"


An unofficial term that refers to a filter / penalty that is in fact not in use.

A phrase used by some webmaster and SEO communities, the "sandbox" refers to the phenomenon of a newly indexed or recently updated web site not appearing on, or virtually disappearing from, competitive and/or generic search result pages for an indefinite time, while still being indexed, cached and shown for obscure queries. The consensus among those who use this phrase is that the time it takes for web sites to be "clear of the sandbox" is used to evaluate whether a web page was created with a valid long-term business plan, as opposed to being spam or any other short-term venture, one that would be breaching Google policies in its intentions.

Web sites that are, or have been, near the borderline of triggering quality filters on Google may have seen major yet periodic shifts in their rankings, and their owners thus came to the wrong conclusions. An example of such false assumptions was that the pages affected were seen as valid, but too recent to be included in the index. Others mentioned manual penalties based on business model evaluation. The term "sandbox" has been used in so many instances that all it means is "you have a problem", one that may or may not resolve by itself. Instead of waiting, however, investigating possible errors and examining your web site with a critical eye should be your priority. And in case you see no problems at all, you may need to refer to the information on PageRank and especially on TrustRank.

Known issues


A general hint for web sites that are being indexed but have yet to show up in the results for broad, competitive or generic queries may be a relatively low TrustRank compared to other relevant pages. Recently launched web sites, and established web sites using certain practices, may see an overall lack or drop of trust that is either yet to be established or needs to be handled with more care. The TrustRank patent does in fact implement the history of URLs and referral link age as two of its factors; however, the age of the links alone will not keep a page from ranking, nor help it rank significantly higher than others. The phenomenon the "sandbox effect" terminology refers to is a false assumption that groups a wide range of possible causes under a single imaginary filter in the index. The actual causes that may result in the discussed effects may be one or more filters reacting to the web site, a low number of quality incoming links, or sometimes the database not yet carrying all information on the newest pages.

+ Resolution: Read up on all of the possible causes, as the term "sandbox" does not refer to either the problem or its resolution. Newly launched web sites may find the information on TrustRank, Duplicate content, PageRank, Website Navigation, Anchor text links and Bulk updates the most important, while established web sites should definitely be checked for Canonical Issues, Duplicate content, again their Internal navigation and Anchor text links, and Bad neighborhood linking. A thorough examination, however, may reveal that the web site in question has yet to accumulate enough off-site quality factors ( PageRank, TrustRank and even relevance ), or that there is at least one problem with on- or off-site parameters.

Reinclusion requests

Web sites and domains that have a recent history of being banned from the index may not automatically be reincluded, not even after the offending pages have been removed or reworked. In certain cases - where a web site is using a domain that was banned because of the actions of a previous owner, or when a manual penalty was issued - a notification from the webmaster may be needed even for web sites that have been corrected to comply with all policies. This request will send the signal to Google that the problems have been fixed, and that it can now re-evaluate the pages. File a Reinclusion request only after you have made entirely sure that the web site in question now fully respects all policies and guidelines. Reinclusion requests can be made through the Google Webmaster Tools panel or the Google Webmaster Help Center.

Known issues


The reinclusion request is sometimes mistaken for a tool to re-evaluate changes made to a web site, even though the domain in question is not banned or penalized manually. In such cases filing a Reinclusion request will have no effect on the ranking of the web site. It is not a tool to hasten the crawling of pages that are thought to have been penalized by a filter, but a way to notify Google of a banned or manually penalized web site that is now ready to be re-evaluated. For the lifting of automated penalties, and/or exclusion from under the effect of a filter, you may rely on subsequent crawls recognizing the changes and adjusting the ranking of the site accordingly. If you do not know why your page is penalized or banned, not even after going through all possible on-page and linking pattern factors, you may need to refer to the article on Historic Domain Penalty.

Google Sitemaps

Google Webmaster Tools features an interface to upload a sitemap feed with the actual URLs that your web site is currently serving publicly. The list may be as large as 50,000 entries, and you may upload more than one sitemap feed to cover different areas and different levels of the web site. The sitemap is often used to map the entire list of pages, match them up with the URLs already indexed by Google, and send out Googlebot to check the ones that it may have missed thus far. Note that validating a website in Google Webmaster Tools has countless other benefits, such as having direct access to link, relevance and crawl data on your website, even if no sitemap is uploaded.
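To make the format concrete, here is a minimal sketch (the URLs are placeholders, not real pages) of a script that writes a basic XML sitemap file of the kind that can then be submitted through Google Webmaster Tools.

```python
# Sketch: build a minimal XML sitemap for a list of URLs.
# The URLs below are placeholders; a real site would list its actual pages.
from xml.sax.saxutils import escape

def build_sitemap(urls):
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    for url in urls[:50000]:  # the format allows up to 50,000 entries per file
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)

pages = [
    "http://www.example.com/",
    "http://www.example.com/about",
    "http://www.example.com/archives/2007/07/",
]

with open("sitemap.xml", "w") as f:
    f.write(build_sitemap(pages))
```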

Known issues


Note that Googlebot missing a page on your web site is most likely due to a navigational error that has kept it from crawling the URL(s) in question. Thus even if these resources are included in the index, chances are high that they will not have proper links pointing to them. Pages that have such accessibility issues will most likely lack certain parameters, such as PageRank, and may end up as supplemental results for queries they are relevant to.

+ Resolution: If you perceive such pages to be more important or relevant than others on your web site, you may need to re-examine your internal navigation in order to reflect this. Read more about Supplemental Results and Website Navigation.

Historic Domain Penalty

With the free market of domain names, one may acquire a preowned domain from its previous holder. While domain names - and URLs in general - do not hold the power to drastically strengthen the relevance signals of a web site, the ease of remembering a well marketed, short and/or on-topic brand name leads many to negotiate for preowned domain names ( in the hope of using them for a new web site or migrating another set of pages to new URLs ). However, some of these domain names may have had a history in Search Engines, and not necessarily a clean one. This should not pose any problems to the new owner, for communicating the change of ownership is usually enough to clear the records and let the domain start anew. However, the actual need to check for a problem on these fronts may not even occur to those who purchase a used domain.

Known issues

Sometimes a latent thematic penalty or ban may not be evident to the website owner when purchasing a domain name, and may only show its effects when the new web site starts to extend its relevance to the given topic and tries to rank for themes it is penalized for. Also, the previous owner may not have mentioned the presence of a domain history on record. When a web site cannot compete in a certain area, or is penalized or banned in general without actually breaching any policies or guidelines, the history of the domain may need to be checked, and a Reinclusion Request sent to Google through the Webmaster Tools pages. In these cases the penalty or ban may be the remains of a URL-based record in the Index, a penalty that was raised because of ill-natured methods used on the site that was previously hosted on the domain name.

+ Resolution: Without jumping to conclusions, checking the record of a previously owned domain is always a good idea. An immediate measure for to-be-purchased or only recently bought domains could be to visit the actual web site, or to check the cached version of the pages ( historic supplemental results ) in the Google index and optionally in other Search Engines as well. A quick check on domain information available in Google may also reveal any existing problems. Should the question arise at a later time, or should these pages have been removed from the database already, other methods may carry some hints on previously hosted pages: tracking the WHOIS record, checking any still active but off-topic historic inbound links and their sources, and most importantly, using the Wayback Machine of the Internet Archive. It is fairly easy to identify a malicious or MFA web site, and should you see such a picture when looking at the previously recorded states of the domain, you may need to file a Reinclusion Request through Google Webmaster Tools, explaining the situation and the fact of the ownership change. Read more on Banned from Google and Reinclusion Requests.
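As a rough illustration of the Wayback Machine check mentioned above, the sketch below asks the Internet Archive's availability endpoint whether any snapshot exists for a domain. The domain is a placeholder and the endpoint's JSON layout is assumed from the archive.org documentation, so treat it as a starting point for a manual review rather than a definitive tool.

```python
# Sketch: ask the Wayback Machine whether archived snapshots exist for a domain,
# so the previous use of a preowned domain can be reviewed by hand.
# The domain is a placeholder; the JSON structure is assumed from the
# archive.org availability API and may change.
import json
import urllib.request

def wayback_snapshot(domain):
    url = "http://archive.org/wayback/available?url=" + domain
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest else None

snapshot = wayback_snapshot("example.com")
print(snapshot or "No archived copy found - check WHOIS and inbound links instead.")
```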

Hijacking Google results

Hijacking is either an accidental or deliberate abuse of known vulnerabilities in Search Engine indexing, where a web page poses as the original source of another URL while being hosted and owned by a completely different entity. The phenomenon takes place when, during a crawl, Search Engine bots discover a URL that - while on a different domain than the one that has the original content - gives off false signals of being the new location of the same page. If the hijacker goes unnoticed and no action is taken, the original web page may be replaced in the Index by the falsely assumed new location, resulting in a complete takeover of a website's rankings by a 3rd party.

Search Engines sometimes mistake certain server messages for legitimate requests to index the source URL ( the hijacker ) instead of or ahead of its target ( the hijacked page ). In other cases online plagiarism can cause the original source to be dropped in favor of the hijacker, if the abusing party owns a domain name that has the proper parameters set to outrank the target and copies its content. Such factors are always technical in nature: the misinterpretation of a server redirect, or the false assumption of a web page being migrated from an old URL to a new one. In either case, the act in fact violates international copyright laws, and is thus not only unethical but in some cases illegal. A deliberate act of web page hijacking may see legal action by the owner of the original ( hijacked ) website. Taking the proper precautions, monitoring your content on the web, and taking swift and firm action at the first sign of a hijack is of high importance. Hijacking, while a widespread problem until 2006, seems to be less of an issue since a fix was issued for the temporary server redirect ( 302 redirect ) exploit in Google. However plagiarism and proxy hijacking may still pose a problem, to which the final resolution from the Search Engine technicians is still in the works. Precautions that can be taken include setting up access control to ban known proxies, as well as any request that is disguised as a Search Engine bot but does not arrive from an IP associated with the domain of the given crawler. Also, a properly set up Google alert may give out hints in time if the URL or unique content found on your pages is being used elsewhere on the Internet. To do this, go to the Google alerts setup page and request reports on specific content ( adding the queries in between quotes ) and on the domain name itself as well.

Known issues


Case 1,
The infamous 302 redirect hijacking ( which used a temporary redirect server message or the META refresh attribute with the to-be-hijacked page as its target ) was an issue for years. Even though it occurred in very low numbers, the possibility of this exploit had to be dealt with. And so, with the help of the webmaster community, constant reports and data analysis, this vulnerability in Google has been fixed.

+ Resolution: In the past the resolution to this problem was to contact the webmaster of the offending domain to disable the redirect or remove the page(s) in question, and in case the Index had already taken note of the new URL, to also file a spam report at Google, explaining the situation. If the webmaster did not respond, the proper action was to contact the hosting company, the server park, the registrar or any other entity that could take action against the hijack by making the offending page or domain inaccessible from the web.



Case 2,
The remaining exploits make use of the parameters of the hijacker's domain being relatively higher than its target in certain aspects; thus when the plagiarized content appears on the new URLs, the Google Index may identify it as the migration of the original website to a new location, delist the original, and keep the hijacker's pages. In these cases the new pages will either outrank the original pages, and thus force them into slowly being filtered out as duplicate content, or replace the original site's rankings a page at a time. Typical examples are a Proxy Hijack or hacked websites, in other words, computers being abused by a 3rd party. It is important to note that finding the domain that is hijacking the website does not mean it was deliberate, and even if it can be concluded to be so, the domain ownership may rarely reveal the culprit. In general, hijacks are extremely rare, as the parameters needed to outrank the original URLs are very unlikely to be high for domains that attempt a deliberate hijack or have low security. However the abusers may be using other unethical and complex methods that can't be tracked by online means, such as hacking a highly trusted website or gaining access to trusted domains, pages or systems by other, sometimes illegal means such as spyware and trojan viruses. But even in the case of the hijacking domain having a higher PageRank, or even TrustRank, the owner of the original content can easily prove to Google that the new URLs are in fact another entity plagiarizing the original pages, and thus get the original website's rankings back and the offending pages excluded from the Index.

+ Resolution: Block access to the pages of the website from the domain, IP or IP range that is copying the content. Make sure not to seal off access from other visitors, but do everything you can to keep the unwanted bots out of your server. File a spam report at Google, explaining the situation, and should the hijack be deliberate, you may seek legal advice as to whether to file a DMCA complaint. Most importantly, you'll need to communicate the issue to the Search Engines that have misinterpreted the content appearing on another set of URLs, and also block this and any further attempts by automated scrapers to copy your content, but without denying access to legitimate requests, such as crawling by Googlebot. Keep in mind that the proxy bots copying your pages may also identify themselves as another entity to bypass security. For this reason, you may need to match up the IP address of the requests to the domain they resolve to, and should an attempt at cloaking be evident ( a bot identifying itself as being from a Search Engine, while its IP address shows no relation to the domain of the bot in question, e.g. googlebot.com, crawl.yahoo.net ) you should deny access. Most often hijacking only poses a problem when it invokes a filter that wrongly accuses the original URLs of being the duplicates. Read more on Duplicate content.
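A minimal sketch of the IP check described above, assuming the request claims to come from Googlebot: reverse-resolve the IP, confirm the hostname belongs to googlebot.com or google.com, then forward-resolve that hostname and make sure it maps back to the same IP. The sample IP and helper name are for illustration only.

```python
# Sketch: verify that a request claiming to be Googlebot really comes from
# Google by doing a reverse DNS lookup followed by a forward confirmation.
# The sample IP below is illustrative only.
import socket

def is_genuine_googlebot(ip):
    try:
        host = socket.gethostbyaddr(ip)[0]               # reverse lookup
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]   # forward confirmation
    except socket.gaierror:
        return False
    return ip in forward_ips

print(is_genuine_googlebot("66.249.66.1"))  # should be True for a genuine Googlebot address
```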

Scraper sites

Ever since the introduction of affiliate and Pay Per Click advertising, there have been web sites on the Internet that try to produce an income from displaying commercials without adding real value for the user or to the Internet in general. Web sites that have no unique content, or in some cases no content at all, have been on the rise for the last couple of years. Some of these attempts at creating a virtually unique advertising surface come from web sites that automatically produce pages on a certain topic by crawling the web, copying and blending information on the theme to make it less easy to identify as plagiarism, and organizing the data for web crawling bots rather than for visitors. Some of these web sites have gone so far in accumulating unique-looking but actually scraped and then combined content that they simply query different search engines and use the titles, links and descriptions of the pages from the results as their content. Scraper pages may be disguised as many things: most give off the feel of a search result page, some pose as blogs or news feeds, and some are exact copies of content found elsewhere on the web. The common feature is the pay per click advertising links featured on the pages, for which the content was scraped. Scraping itself is not a negative thing; however, when automatically gathered information is reorganized without the knowledge, consent or benefit of the original authors/publishers, and the web site thus created is compiled in a way that is clearly of no use to visitors either, that web site is to be considered spam: a surface created to rank high in search engines, putting an unnecessary step for users between queries and the actual web sites, and benefiting from the disguised Pay Per Click advertisements.

Known issues


Since these pages are created automatically, and some can only be manually evaluated as spam, Google will eventually index some of the many. Thus links and content on such pages may sometimes point to, or be taken from, another legitimate and valid web site. In such cases the better established page with the longer history is almost never affected in any way; the links from such scraper sites are rarely taken into account when judging trust or relevance. Also, most such sites are soon filtered out or reported to the Google Web Spam team, and if not automatically, then manually removed from the index. In certain cases scraper sites may however cause a lower importance page within a web site to be considered for examination as duplicate content. Another rare issue is when a massive amount of 3rd party scraper site pages link to a web site, and thus generate an incoming link pattern for it that is similar to massive link scheming methods.

+ Resolution: Should you become aware of your content being used on such a web site, or being linked to from such pages, you should report the URL in question to the Google Web Spam team through the Google Webmaster tools panel. For some technical precautions and security tips read more on Hijacking as well.

Anchor text link

The calculation of link relevance is an important factor when identifying the theme of a web site, in which both navigation and incoming references play a huge role. The broad relevancy communicated towards the Google Index consists of the themes of individual pages, which are connected by their relevancy network and shape the overall image of the web site. While the topic is of course set by the actual content of the page, the level of its relevance builds with off-site and off-page references as well. If any web page is often linked to with anchor texts that successfully emphasize its theme, the words and derivations with which it is referred to become the phrases it is seen most relevant to. Thus both website navigation and inbound links carry votes, and not only for the used expression, but also for the entire theme, even for words with similar meaning that are not directly targeted. This is of course if the web page itself is relevant to these phrases. In case the terms used to point to the web page can not be found in its title and/or content, nor do they have any relation to its theme, the votes carry much less weight or are sometimes even discarded. Misuse of this practice is thus easy to identify. Also the number of references may not put as much weight into emphasizing the theme as the quality of the links pointing to the resource. Hundreds or thousands of links from less trusted, less important sources aren't likely to match up to the weight of a single referring link from a well trusted, quality web site.

Known issues

Case 1,
If the anchor text is irrelevant to the source and/or the target page, the link will most likely be ignored altogether, or only pass a single vote for the exact phrase. If a page does not rank at all for phrases that are otherwise relevant to its topic, even though the web site is well referenced from other sources, then either the incoming links or the internal navigation anchor text is flawed in its attempt to carry the theme throughout the web site, and the page will rank significantly lower than other URLs.

+ Resolution: The anchor text used to point to a page is matched against the title and the content of both the source and the target during relevancy calculations. While there's little chance of a penalty if either is off-topic, a proper anchor text and title is one of the most important initial signals to users who are yet to arrive at the page or web site. The theme should be clearly defined by the wording and be consistent in all three, allowing people to determine whether it is the resource they need, whether they encounter a link on another web site or the page title and description on Google Search results. Should any of the three exclude the keyphrase(s) of the resource, it will become hard for both users and bots to determine the exact theme, and the URLs will thus show much lower positions for the given queries even without a penalty. Also, while it is natural - and evidence of organic linking - to have diverse wording of anchor text from referring sources, the title and the anchor text of the website navigation are both often used by people when creating a link. Thus again, choosing the proper, most descriptive phrases that match the content may be necessary to avoid lower than predicted positions on the results pages.

Case 2,
Receiving links, or requesting others to reference pages, with always the same anchor text will raise the question of how much control the web site had over the wording of its own "votes". If the profile shows a pattern that is the same as that of sites trying to manipulate their rankings, this may raise a penalty once the same-anchor-text link instances pass the natural threshold. Repeated misuse or overuse of such methods can lead to page-based penalties as opposed to phrase-based ones, or even to being banned from the Google index. Years of studies have shown a highly predictable pattern of "natural" linking in regard to the anchor text used. Sometimes, however, a page will accumulate a lot of references with the exact same anchor text by chance, enough to outweigh anything else in its linking profile.

+ Resolution: The reasons may be very simple, from not having a long enough title ( which people often use for anchor text ) to not having a description that others could paraphrase. While it is important to define the exact topic and role of a page ( and it is recommended to use consistent anchor text in the main navigation of the web site itself ), in cases of repetitive anchor text in inbound links a signal is sent that the site is receiving manipulated "votes" to boost its relevancy. This is especially the case when the branding targets a highly competitive term, which is often used by spammers and may be seen as manipulative. If a page has passed the natural threshold, and is now considered to have been excessively linked to with the same phrase from other domains, a penalty for this term is applied, forcing the URL to a lower position on the results pages. While the system can tell with very good accuracy when such practices are in place, sometimes even natural linking patterns will show these signals. However, this penalty is automatic, and may only affect the given query and URL. If the page gets references from other domains with other phrases in the anchor text as well, the penalty may then be lifted. For this you may revisit and extend page titles and descriptions ( should they have been too short ), or provide some indirect ideas to visitors on how to describe / define the theme of the site.

Case 3,
Keyword stuffing, while an unofficial term, clearly describes a past spamming method which now has a proper countermeasure in Google's systems. Using improperly long or irrelevant phrases in anchor text when pointing to an internal page may trigger a filter and lower the rankings of the URL for searches that include the words used. Continued misuse of anchor text may also lead to the exclusion of the URL(s) from the Index, including both the source and target pages. Recent additions to spam filtering now examine the relevancy of the target page closely, and in certain cases highly competitive commercial terms included in the anchor text, but pointing to a page that is not relevant to them, may be seen as manipulative.

+ Resolution: Accidental overuse of anchor text can easily be avoided by judging a text link, or text link navigation, by its aesthetics. Two or even three word links are not at all uncommon, while an entire paragraph of words being used as the text for a link is obviously not meant for a better user experience. Avoid stuffing too many keywords into a link, both in your internal navigation and in incoming links from other web sites. Again, any pattern that could be identified as not "natural" is easy for anyone to spot, so you should assume that Googlebot and the Google algorithms can just as easily judge these cases with very good accuracy.

Case 4,
A newly issued spam detection system, created to battle scraper sites, links purchased for their parameters ( PageRank ), spam and other manipulative attempts, now examines the relevancy of any given page with a complex, phrase-matching method. This patent involves predicting the number of only marginally related, competitive phrases present in a document for any given theme. Its effects, in combination with other closely examined factors, may affect websites that have been ranking well for certain phrases so far. The Google algorithm also looks for attempts to artificially create relevance from semantic correlation if the topic of the page would not indicate the presence of certain references ( yet includes them ). Should a page, by accident, pass the threshold of a natural number of related highly competitive phrases that are not supported by its off-page signals ( inbound links, relevant internal pages referencing it with just as relevant anchor text ), or should a page use an excessive amount of thematically unrelated but semantically similar terms, it may receive a very distinct penalty ( dubbed by the webmaster community the last page or -950 penalty ) for the exact queries it was assumed to be targeting. The pages would stop ranking for a phrase; in case they have a distinctly high TrustRank, they may take the very last positions shown for the given query, but may still have good positions for others. Also, as fluctuations within the system can indicate borderline cases of mixed problems, these URLs may be shown in their original, or better than original, rankings for a period of time. Examples include a page that has strong relevance signals for a two word keyphrase and is seen attempting to create relevance ( links or content for topics that by themselves are seen as a separate theme ) by using other two word keyphrases that reuse a single part of the one it ranks for. A different case is when a page unknowingly passes the threshold of a "natural" number of references to highly competitive terms, and while human editorial opinion might conclude the topics to be related, automated examination shows similarities with manipulative attempts.

+ Resolution: This penalty is tied to relevancy, and thus is often an indication of a lack of proper signals. It is applied automatically, so any legitimate page can overcome its effects by gaining new outside references to justify the theme, or by using clearly relevant wording in the title and anchor text pointing to the page. Also, this filter is likely to be adjusted in the upcoming period to be more accurate in detecting spam documents. You may want to examine the theme hierarchy of your website by making sure the given page is referenced from already relevant pages within the website, and that the navigation uses relevant anchor text as well. Keep in mind that too broad or, on the other hand, too specific keyphrases may send signals of targeting a different theme than the page would be a match for. Single word anchor text may be too generic in certain cases ( especially alongside other single word anchor text with different themes ), and uncommon derivations are not always recognized by the Google algorithm as a match for the topic. Read more on Website Navigation.

Accessibility and Usability issues

Since Googlebot is actually designed to simulate the behavior of an average Internet user, it will check web pages for accessibility and usability issues as well. In general, the algorithms identifying major problems of a web site are highly refined, so errors, misused code or hard to comprehend layouts all play a role in deciding the ranking of the pages. While a few errors will most likely be ignored, major problems, site-wide navigational inconsistencies, and especially intentional misuse or even overuse of certain elements may very well lead to a decline in rankings.

Known issues


Accessibility and usability checks rely heavily on browser compatibility, which is in fact an ever changing factor. Some practices may now be more widespread than they were a year ago, yet still be viewed as a hindrance because a minority of web browsing software still cannot display them correctly. Google is updating its algorithms and Googlebot constantly, and is thus expanding the methods a web site may utilize in its design to get its content properly indexed. The results try to stay on par with the majority of users and with technical advancements. Shockwave Flash content is analyzed for its textual content, javascript based links are followed the same as anchor text links ( although they don't pass any parameters ), and image maps, information in the NOFRAMES tag, and other advancements in standards are evaluated in the same manner for relevance and trust. However, the broader the range of browsers a web site can serve, the more importance it will be given. There is still a hierarchy in judging usability issues, rendering the most accessible sites above specialized designs. For example, text link references will weigh more than image based links, and references buried in heavy code will likely be followed at a slower rate than easy to access navigation.

+ Resolution: The W3C standard for web pages is a good hint as to whether web sites are ready to be evaluated by Googlebot, based on the simulated user experience. While a page does not need to comply with all standards, major errors, and problems that are not merely browser-specific differences, are less likely to be ignored. Asking yourself whether your web site is easy to use, and whether it is accessible with the most common web browsers, is also a hint. A simple checklist might be to watch out for: broken links, orphaned pages, loading time, the number of links within the navigation, an overall navigation communicating a consistent and coherent page hierarchy, images being labeled with ALT tags, the use of unique TITLE and META description tags, proper page encoding settings, language settings, text of readable size and color, no hidden text, no overuse of anchor text in links, no cloaking or off-screen content, no invisible layers, no redirect chains, no overuse of keywords to an extent where the content becomes meaningless, use of all necessary HTML tags and the closing of all of them, use of a proper layout emphasizing the parts unique to a page, and code not relying on yet-to-become-standard practices. While the list of things to keep an eye out for could seem long, once thought over, a knowledge of web page coding and some common sense applied will save most pages from becoming a burden to your web site, or to the visitor trying to decipher them. The most common errors are still the most obvious ones, with misused or vital-but-forgotten HTML code leading the list of problems and causing many of the instances of a drop in rankings.
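Parts of that checklist can be automated. As a hedged illustration only, the short sketch below scans a page's HTML (fetched however you like) for images missing ALT text and for an empty or missing TITLE tag, using nothing beyond the standard library; it is not a substitute for a full validator.

```python
# Sketch: flag two of the checklist items automatically - images without ALT
# text and a missing/empty TITLE tag - using only Python's standard library.
from html.parser import HTMLParser

class QuickAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.images_without_alt = 0
        self.in_title = False
        self.title_text = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and not attrs.get("alt"):
            self.images_without_alt += 1
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_text += data

html = '<html><head><title></title></head><body><img src="logo.gif"></body></html>'
audit = QuickAudit()
audit.feed(html)
print("Images missing ALT text:", audit.images_without_alt)
print("Page title present:", bool(audit.title_text.strip()))
```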

Bad neighborhood

The broad definition of a Bad neighborhood is a network of web sites that has an irregular number of penalized or banned participants for various reasons. Such networks will at some point in the chain link to web sites that are or were involved in link scheming, hosting spyware, malware or offensive material, or have been involved in illegal activities such as phishing.

Known issues


Linking to, or in special cases being linked to excessively from, a Bad neighborhood may categorize a web site as a part of the network; however, accidental or 3rd party linking is usually either discounted or averaged into the overall linking pattern of a domain. In rare cases a minor penalty may be issued instead of a direct and full ban to bring the problem to the attention of the webmaster.

+ Resolution: Avoid linking to, or being linked to excessively from, a Bad neighborhood. Identifying such pages can be done by checking whether they've been banned from Google, by doing a search on their domain name, or by using the site: or info: operators. No information on the domain name, and/or a nonexistent PageRank ( especially for otherwise accessible internal pages ), are good indications of such web sites. If you find that you're being referred to by such a web site against your will, and that the page in question is still indexed by Google, make sure to report it through the Google Webmaster Tools' spam report utility. However, while linking to such domains will most likely raise a penalty and lower the overall trust of your own web site, being linked to from a Bad neighborhood against your will almost never has any effect on your ranking. Exceptions arise only when there's an indication ( even if it is false ) that the source and the target are affiliated in any way.

Link Schemes

Link schemes are seen as irregular, unnatural and controlled linking patterns: less direct cross-linking methods with the sole purpose of letting participating web sites accumulate PageRank at a higher rate than they would do "naturally". Some systems utilize a network of affiliated web sites linking to each other to hide this tactic, but in the end show up in the records for constantly cross-referencing one another. The algorithms were designed to map such link networks and evaluate them on the basis of how well established the participating web sites are. Should a network consist of an irregular number of web sites with no actual unique content, no visitors or outside references, redirect chains to obtain traffic and/or PageRank, or "thin" sites that give every reason to assume they were created only to support others with references ( and for no other purpose ), these networks will sooner or later be flagged as Bad Neighborhoods. Such links are often discarded or the participants penalized or banned from the index. Read more about Bad Neighborhoods.

Known issues


Sometimes a web site may link to, or be linked from, such link scheming networks without the webmaster knowingly participating. At the very least, some attention is needed when link requests are made to or by webmasters, so that the penalties that may arise from being part of a link scheme can be avoided.

+ Resolution: You should avoid such linking tactics, on the assumption that the algorithms can map networks of such web sites and not only discount their referring links, but also penalize or ban the web sites they were meant to support. On another note, if you see a sudden rise in the number of links pointing to your site which you believe to be from scraper sites, you should report these URLs to the Google Spam team. In case of accidental site-wide or excessive 3rd party linking to your pages, you should contact the webmaster of the site(s) showing the links, the ISP and in some cases the registrar of the domain name, and request that these references be normalized or given a nofollow attribute. If the source pages do not change their behavior, you should contact the Google Spam team to report the problem. Read more about Scraper sites and the rel="nofollow" attribute.

Buying links



You may purchase advertisements outside of AdWords to further promote your web site, and thus your link may appear on other web sites and provide potential visibility. Any advertising system that is meant to do just that, letting people know of your web site and providing a link to visit it, is viewed as everyday practice by the Google algorithm. Many affiliate programs and other advertising solutions will provide you with such services. However, you should be cautious about which programs you enroll in, and only advertise on trusted resources. Sometimes you will not be able to check each and every page for its validity, relevance or trust, and will need an overall understanding of the methods such a media agency applies.

Known issues


The policy of Google regarding linking behavior applies to any and all links Googlebot finds and indexes, and thus is of course applied to links that were purchased for advertising. If an advertisement cannot be matched to the pattern of "natural" links ( it shows one or more irregularities, for example indications that its sole purpose is to pass on PageRank from the page it is on ), the links may be discarded, or the URLs on the recipient's end, or on both ends, penalized or banned from the index for the given phrase, or for all phrases altogether. Note that the algorithm was designed to average linking behavior and does not penalize otherwise legitimate and well established web sites. The number of links that are off-topic, links offered site-wide from another domain with occurrences in the thousands, and links from a web site of another language that are off-topic, are matched against the linking history of the web site, which is evaluated based on the complete picture of its linking profile.

+ Resolution: You should never purchase links that serve, or could be perceived as serving, the sole purpose of raising PageRank or boosting other parameters associated with the URL in an unnatural way. Ways to evaluate this include making sure that these references are made for users, and not just Search Engine bots: they should be accessible to people, on-topic, and relevant to the content and purpose of the page they reside on. Even so, if you have arranged links that you feel are legitimate advertisements but could be seen as manipulative towards the Google ranking system, you may simply request that a rel="nofollow" attribute be added to each of the outgoing references. This attribute signals to the system that these links are not meant to pass any parameters, and will thus lift any suspicion of trying to boost rankings by force.

( Example text link: Advertising text ).
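
If you keep track of the domains you link to under a paid arrangement, a check like the sketch below ( the AD_DOMAINS set and the markup are hypothetical examples, not a Google-provided tool ) can flag paid links that are still missing the rel="nofollow" attribute discussed above. Run against the example text link, it would print a warning, since that anchor carries no rel="nofollow" attribute.

from html.parser import HTMLParser
from urllib.parse import urlparse

AD_DOMAINS = {"sponsor-example.com"}  # hypothetical list of paid-link targets

class PaidLinkChecker(HTMLParser):
    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        host = urlparse(href).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        rel = (attrs.get("rel") or "").split()
        # a paid link to a known advertiser domain should carry rel="nofollow"
        if host in AD_DOMAINS and "nofollow" not in rel:
            print('Paid link without rel="nofollow":', href)

if __name__ == "__main__":
    PaidLinkChecker().feed('<a href="http://sponsor-example.com/">Advertising text</a>')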

For other linking profile related issues read more on Link Schemes and Bad Neighborhood.

Website Navigation



The Website Navigation or Internal Navigation is the way pages interact with each other within the same domain: the main and category-specific options presented to users on the pages, and the links used to create a menu of items within the web site. Like cross-domain references ( inbound links ), these too carry characteristics that the Google index translates into parameters when judging the relevancy, importance and trust of a given page. The consistency of site navigation is the most important aspect of any domain next to the actual content presented, in that it allows users to browse through the pages and find the resources, services or products they are looking for. The system by which Google calculates the rankings on search results pages is closely tied to simulating user experience; thus, while a properly set up list of menu items may not boost a web site's position as much as many off-site references to it, a faulty, inconsistent, irrelevant or inaccessible navigation will definitely hinder its efforts, both in providing a good user experience and in being found in the Google Index.

Known issues

Case 1,
The weight, or importance, of a page ( even within the same domain ) is mainly calculated from the number of links that reference it, and from the importance of the pages it receives these references from. Sometimes pages that the designers and webmasters would like to be an important part of the web site do not gain the kind of parameters that would indicate this in the Google Index, and in cases when more than a single page would be relevant for a query ( within the same domain ), the one presented on the search result pages is not the most relevant. In most of these cases PageRank does not "flow" through the navigation in the way intended, the pages' weight is unbalanced, and thus the positions of the URLs presented through the Google index suffer from a wide variety of inconsistencies.

+ Resolution: Make sure that the intended importance of resources is communicated to visitors, and thus to Google as well, by making the most prominent pages accessible from all sections. In case there are too many items in each subsection or category, you may limit the navigation to the most important main areas and to the references within the actual area of interest, to keep the pages comprehensive. Make sure to bring pages that you believe to be of the same level of importance to the same level of the link hierarchy, by allowing access in an equal number of "clicks" from the home page, and referencing them an equal or nearly equal number of times from the relevant pages. ( But also, do not link to the same resource from a page more than once, unless it is necessary. ) Lay out a plan of tier 1, 2 and 3 pages to predict the PageRank and weight that would be acquired by the subsections from the home page, and simulate navigational funnels to test whether the pages are as "far" from or "close" to the home page as their importance would indicate, as in the sketch below. Some great tools are available to check the levels and linking hierarchy ( an example is XENU Link Sleuth ), and for established sites, registering and validating a domain in Google Webmaster Tools will allow access to internal link data, which shows the number of references to the pages within the same domain.
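
One way to simulate navigational funnels as suggested above is to model the internal links as a small graph and compute how many clicks each page is from the home page. The sketch below uses made-up page names; tools such as XENU Link Sleuth or the internal link data in Google Webmaster Tools give the same kind of overview for a real site.

from collections import deque

internal_links = {  # hypothetical site structure: page -> pages it links to
    "/": ["/products/", "/about/", "/contact/"],
    "/products/": ["/products/item-1/", "/products/item-2/"],
    "/about/": [],
    "/contact/": [],
    "/products/item-1/": [],
    "/products/item-2/": [],
}

def click_depths(links, home="/"):
    # breadth-first walk from the home page: the depth is the number of clicks needed
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

if __name__ == "__main__":
    for page, clicks in sorted(click_depths(internal_links).items(), key=lambda item: item[1]):
        print(clicks, page)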

Case 2,
While the internal navigation may be planned well on a site, in case there are accessibility problems with the display of the menu and links, users and Googlebot may have some problems following the otherwise properly laid out structure. Such issues include the use of flash, javascript, and other non anchor text link navigation. These methods, while widespread and accessible for most users, still pose difficulties for certain browsers, computers and people with special needs when browsing the internet. Even if Googlebot follows the items of the navigation menu, in some cases the system may not be able to determine the amount and kind of parameters to pass with these links, and thus the consistency of the navigation will be virtually seen as broken.

+ Resolution: To address accessibility issues, you may need to create a navigation that makes use of anchor text links, either by replacing the current or as an additional set of menu items. This way all browsers, computers, and special programs will be able to comprehend and follow the navigation ( and occasionally translate the references using special programs, for example, to speech, other languages, or even relevancy signals towards Google ). Image links need to have proper ALT attributes set describing the resource that they point to. Javascript and flash based navigation will in most cases not pass any parameters to the target pages, neither PageRank, nor TrustRank nor Relevancy, rendering most of the internal sections virtually "unimportant" and "less relevant". Image links pass a certain amount of all parameters, but less than an anchor text link. In the case of internal pages not being able to receive any "votes" from the home page, you should check not only the layout but also the accessibility of the navigation links. A significantly lower or nonexistent PageRank even for high level pages may indicate the problem of using a technology that is not yet the standard.
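
A quick, illustrative way to audit a navigation block for the accessibility issues mentioned above is to flag anchors that rely on javascript: URLs instead of plain links, and image links that are missing an ALT attribute. The sample markup is hypothetical, and flash based menus would still need a separate, manual review.

from html.parser import HTMLParser

class NavAudit(HTMLParser):
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = (attrs.get("href") or "").strip().lower()
            # anchors without a crawlable href will not pass parameters to their targets
            if not href or href.startswith("javascript:"):
                print("Non-crawlable navigation link:", attrs)
        elif tag == "img" and not attrs.get("alt"):
            print("Image without an ALT attribute:", attrs.get("src"))

if __name__ == "__main__":
    NavAudit().feed(
        '<a href="javascript:openMenu(1)">Products</a>'
        '<a href="/about/"><img src="about.gif"></a>'
    )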

Case 3,
Structures that work well for visitors may in some cases send different signals towards the ranking system; also, structures that are the translation of other media ( brochures, slide shows, presentations ) can end up defeating the purpose of web sites at the very base of their structure. A useful online resource is always to the point, and gives visitors options to follow up on its references and topic. The Google algorithm, too, is meant to simulate user experience when deciding on the importance of a page ( or web site ).

+ Resolution: Keep in mind that PageRank is a parameter that is not passed on in whole, and with every step in the navigation the votes carry less and less significance. A page that has been voted a certain level of importance may not pass the same amount on with its links. In a controlled environment, a home page with a PageRank of 3 may pass on its importance to ten subpages and render them all PageRank 2, if these resources are linked to in the same manner and amount. These subpages may then link to other ( in this example "innermost" ) sections, passing them the parameters to have a PageRank score of 1. ( In this example there are already dozens of pages with a visible PageRank present on the web site. ) If the home page linked to but a single page, the passed parameter would still only allow the target ( the tier 2 page ) to have a PageRank score of 2. And from then on, every subsequent link would carry even less importance, so a redundant step in the navigation ( a "splash page", "intro" or language selection page ) effectively sets the entire site one level lower in significance. Also, in the case of linear navigation ( home page links to second page, second page links to third page, third page links to fourth, and so on ... ) the PageRank parameter erodes with every step, and in the end only 3 pages will carry any weight on the domain altogether, with the rest probably marked as Supplemental for having no weight and no links from higher level pages. Make sure to plan a well laid out link hierarchy to avoid such problems, as it is in your users' interest as well not to have to click through redundant pages and linear "tunnels" of subsequent steps where there is no option but to advance forward.
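
The worked example above uses a simplified "one level lost per step" model, not the real PageRank formula; the toy sketch below only illustrates that simplification, showing how a shallow, flat layout keeps far more pages at a meaningful weight than a linear "tunnel" starting from the same home page score.

def flat_layout(home_score=3, fanout=10, depth=2):
    # home page -> `fanout` subpages -> `fanout` innermost pages each; one level lost per step
    pages = []
    for level in range(depth + 1):
        pages += [max(home_score - level, 0)] * (fanout ** level)
    return pages

def linear_chain(home_score=3, length=6):
    # home page -> second page -> third page -> ...; one level lost per step
    return [max(home_score - step, 0) for step in range(length)]

if __name__ == "__main__":
    flat, chain = flat_layout(), linear_chain()
    print("Flat layout:", sum(1 for score in flat if score > 0), "of", len(flat), "pages keep any weight")
    print("Linear chain:", sum(1 for score in chain if score > 0), "of", len(chain), "pages keep any weight")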

Case 4,
Relevancy calculations rely on many factors. One of the main parameters is the set of themes and exact phrases a page is referenced with through anchor text links. Relevance may be narrowed down from a broad theme to subsections, and be used to create categories and subcategories of topics until the user arrives at the page that is of interest. ( Much like with PageRank, a web site may use its home page to define its theme in the broadest possible meaning, and describe the different areas on subsequent sections and pages. ) However, the improper use of anchor text relevance within a domain may end up breaking this chain, and even if a home page or main section carries a certain broad theme, the references would not indicate that the subpages are on topic as well. Also, recent additions to the Google ranking system examine not only the relevancy of an anchor text link, but also whether it is misused in a way that tries to artificially create a new or broader category, mass up different topics on a page that is not perceived as a relevant source, or mass up too many "conflicting" themes that - by current standards and the site's history - cannot legitimately be related to each other within the given section. This system was meant to battle off scraper web sites that gather "near relevant" phrases and terms within a single page to create artificial relevance, but it may also filter out otherwise well established pages if they are seen as presenting too many widely searched, popular and/or competitive phrases - while not having the status or history to indicate they would be a popular resource for them.

+ Resolution: Relevance calculations are not necessarily one-way: a page about a given specific topic may well reference another page with a much more generic anchor text link. The web site navigation may, however, need to reflect how broadly a given topic is discussed by an individual page, and use wording that clearly indicates the subject and purpose of the target of the link. It is advised to use the broadest possible description for the main pages, and narrow down the relevance with more and more specific anchor text as the navigation expands. Using the same phrases in an extensive way, or using words in the navigation that are highly competitive, may break the balance of which pages to show for a given query, may trigger an anchor text related filter, or simply make the otherwise very specific, lower PageRank resources compete with much broader themes and other, much more significant pages. Use the internal navigation to specify the main topics, narrowing down to specific areas of interest. Read more on Anchor text links.
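
As a rough illustration of the "same phrases used in an extensive way" warning above, the sketch below counts how often identical anchor text is repeated within a navigation block; the markup and the threshold of 5 are arbitrary examples, not values used by Google.

from collections import Counter
from html.parser import HTMLParser

class AnchorTextCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_link = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link and self.texts:
            self.texts[-1] += data

def repeated_anchor_texts(html, threshold=5):
    collector = AnchorTextCollector()
    collector.feed(html)
    counts = Counter(text.strip().lower() for text in collector.texts if text.strip())
    return {text: count for text, count in counts.items() if count >= threshold}

if __name__ == "__main__":
    navigation = '<a href="/1">cheap widgets</a>' * 6 + '<a href="/2">contact us</a>'
    print(repeated_anchor_texts(navigation))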

TrustRank



TrustRank is a major factor that now replaces PageRank as the flagship of parameter groups in the Google algorithm. ( Note that a similar system is used by Yahoo! Search as well. ) It is of key importance for calculating ranking positions and the crawling frequency of web sites. It may also extend the grace period before penalties for accessibility and usability problems take effect. There is no public information on the TrustRank score of a web site or URL; the only indications of trust are what you can perceive from a user's point of view. TrustRank technology was based on the identification of web sites that have been an important resource on the Internet for a long time: web sites that are seen as "authorities" in a certain area of interest, that are historically spam and error free, that are provided by a long trusted source, or that are promoted or maintained by a service seen as unlikely or nearly completely unlikely to show offensive material. The technology uses the most reliable network of references and - like PageRank - relies on links for many of its calculations, but it also includes a lot of other factors that are next to impossible to manipulate. Major portals, reference pages, national and international communities, long-standing conventional media publishers, governmental and educational pages, official pages of institutes, organizations and nonprofit organizations, and the longest standing commercial web sites are among those that have been set or "voted" into the position of being a TrustRank hub.

The algorithm for calculating TrustRank takes many parameters into consideration, of which the ones that are relatively new to Google ( not covered previously by PageRank and relevance calculations ) examine linking patterns, the link profile, referral link age and history, and web page history. Trust can be accumulated by the means common sense would indicate, including references from people at other trusted sources, refraining from questionable and borderline business or optimization methods, having no offensive content or major usability/accessibility problems, and, overall, "natural" growth and a clean web site history. Note that the TrustRank network evaluates all web pages, thus any and all inbound links carry this parameter ( with the sole exception of sources that are banned or penalized in the Google Index ). The system is currently the closest thing to an automatic quality factor, thus one can more or less deduce the impact of a new referrer from the source web site's popularity, status and quality. The algorithms were based on examining both user and web site behavior, translating common indications of trust and distrust into parameters, and are being fine-tuned constantly for an even higher accuracy in identifying them. Note that TrustRank is a technology that has been patented to utilize human reviews, and that both positive and negative trust, while they can be accumulated automatically over time, can be and sometimes are overruled by manual evaluation. Trust can be lost if abused, and distrust can be reversed by correcting the problems and, optionally, a goodwill request for re-evaluation.

Note: The term "TrustRank" is not used officially. You may address the parameter group simply as "trust".

As for battling the few manipulative methods aiming to artificially boost the TrustRank of a URL, Google has made many changes to its calculation rendering it virtually impossible. Trust is now closely tied to relevance calculations, and thus an off-topic link from a trusted source will not pass / negate the effects of this parameter ( and relevancy may be tracked back to as far as the home page of the source web site, thus eliminating references that are not within its original areas of interest ). All hubs and popular queries are being constantly monitored for any irregularities. From a diagnostic perspective, two of the changes are important. One is that in order to keep the rankings on search results pages a fair competition, TrustRank is being passed from page to page within any domain that is not a "set" hub. The other ( that also applies to "set" hubs ) is the effect of trust being tied to thematic relevancy. Read more on Anchor Text and Website Navigation.

Known issues


Case 1,
The effect of a certain threshold of TrustRank being required to appear on the search results for generic searches led some to believe that their web sites had been held back, penalized or banned from Google. While these pages were shown for less generic and more obscure searches, single- or two-word phrases would not list them in the results. For queries that are broad enough to assume the user is looking for the most general, most official information, only web sites with a certain minimum amount of TrustRank are shown in the results. See the unofficial term, the "sandbox".

+ Resolution: As your web site accumulates more references to it from other web sites, its TrustRank will build gradually. Once it is perceived as a trusted resource for the given query, it will most likely appear on the search results for generic queries as well. Most phrases require only a minimum amount of this parameter to pass the threshold, while some closely monitored, highly competitive queries show only the most trusted results.


Case 2,
While TrustRank is a parameter applied to pages in general, an entire domain, IP range or even network may be affected by certain penalties connected to its calculation. A web site may lose trust across all of its pages for hosting or linking to ( or in special cases being excessively linked from ) other domains which have been breaching Google policies, hosting spam, malware, spyware or offensive material. Web sites with low or nonexistent TrustRank that are virtually unable to accumulate this parameter are usually under an automatic or manual penalty / ban for practices that are not only unethical, but sometimes even illegal. Becoming part of link networks associated with such web sites may also lower or negate trust entirely. After checking many parameters, a follow-up evaluation may conclude that the source and the target domains are affiliated. ( The detection of such affiliations is done with very good accuracy. ) In such cases, losing trust and thus rankings would only be the first sign of a coming complete ban of all pages found on the domain.

+ Resolution: Make sure your web site does not host or link to pages that contain malware, spyware, offensive material or spam, or that are already penalized or banned for being marked as a phishing site, or are or were at some point involved in illegal activities. If your site is being excessively linked to by a 3rd party, you should contact the Google Web Spam team, or file a spam report through Google Webmaster Tools. Such 3rd party inbound links will not get the target pages banned, but may lower their trust ( and get a warning penalty issued ) if the linking pattern, ownership or IP range indicates that the 3rd party may be affiliated with the targeted web site. Read more on Link Schemes and Historic Domain Penalty.

Google PageRank™



Please note that PageRank is currently being exported (starting late April, 2007). During this time you may see fluctuations between the old and the new PageRank score on your Google Toolbar, and occasionally PageRank may not appear at all. These shifts will eventually settle down as the new data replaces the old on all datacenters during the coming weeks.
The foundation of Google's system and its trademark patent, PageRank is one of the hundred or so factors that define the position of a web page for a given query. It is related to the number and importance of links pointing to a certain URL from other web sites, and is also an indication of how the internal navigation is laid out within the domain. Both Googlebot and users expect the page hierarchy of a web site to be indicated by this metric. PageRank may have become less important over the years of subsequent system updates, but it is still a major factor. The best way to describe its use would be that it is an indication of page importance, or page weight, within a set of relevant and trusted results. A page that is more important than another, with all other parameters being the same, will naturally be shown first. However, with TrustRank playing a major role in providing a better user experience, PageRank alone is not an indication of a page or a web site ranking higher than others. PageRank is based only on the number of incoming links, references from important ( other high PageRank ) web sites, and the level of a page as indicated by the internal navigation of a web site. Relevance and TrustRank are not indicated by this number.

Known issues


Case 1,
The PageRank that is published from the internal databases of Google usually has a 90 day refresh rate. It is a snapshot of what the PageRank of a single URL was at a certain time, recorded within roughly the previous three months, and thus is not real-time information. While PageRank is calculated to several decimals and updated constantly, the PageRank shown on the Google Toolbar will not show the current score. Thus a page showing PageRank 0 may very well have a higher score than the toolbar indicates, and the opposite may happen as well: a high PageRank shown for certain pages may hide an already downgraded score, even if the downgrade has been decided in the background ( and has already been calculated into the displayed search results ). Nonetheless it is still a good general indication of both page importance and in-site navigation layout, and may be referred to as a factor when evaluating the weight of a URL.

+ Resolution: The only known area where public PageRank statistics are considered ahead of the PageRank in practice ( the unpublicized score ) is link trading, which is heavily advised against. Link exchanges that take place for the sole purpose of gaining PageRank will be seen as an attempt to manipulate search results, and will thus be devalued, either automatically or during a manual evaluation. Domains that show irregular linking patterns or profiles, that reference and are referenced by off-topic pages in excessive amounts, that are linked in great numbers from otherwise unvisited pages created to support their placement in Search Engines, or that are listed in directories that don't use editorial evaluation when considering their listings, are more often than not seen as engaging in manipulative practice.


Case 2,
Recently, with the introduction of the reworked Supplemental Index, PageRank again became important in analyzing a web site's navigation funnels. If, due to the internal linking of the pages, some are perceived as less important than others ( are too many links away from the home page, are being linked to with irrelevant anchor text or no anchor text at all, or are not being linked to often enough to raise their in-site importance ) and thus gain less PageRank, they may become tagged as Supplemental Results. A low PageRank score ( 0 ) may indicate that the source of the problem is in the way these pages are referenced within the web site ( unless of course the URL in question was created after the last export of PageRank data, or the page has accumulated quality links since the currently showing score was calculated ). Read more on Supplemental Results and Website Navigation.

+ Resolution: To achieve a medium or high PageRank, your web site needs to accumulate "natural", quality inbound links from other web sites. If the pages are popular and supported by a lot of references from other locations, and especially if important ( high PageRank ) web sites link to them, their score will increase naturally. Note again that despite its name, PageRank is not the only factor that decides how the page will be positioned in the search results. Also note that Google analysts have recorded and calculated many patterns of irregular linking behavior, and the algorithms can, with good accuracy, detect which references are not "natural". Such links are typically the ones that have been purchased or negotiated for the sole purpose of acquiring PageRank. A good indication of what to avoid: links that entirely disregard usability or user experience, are off-topic and would probably not be clicked on, and in general appear to be aimed less at visitors and more at hastening the accumulation of PageRank. Such links are often discarded, and in certain cases, or if the volume of such links exceeds what could be seen as accidental, penalties could follow for either or both the source and the target URLs. See more in Link Schemes.


Case 3,
For pages within the same web site, PageRank is a good indication of linking hierarchy. A certain amount of PageRank is passed on from the home page to level one pages, for example categories or sub-pages, and these pages in turn pass some of their importance to the resources they link to. Same level pages, in other words pages that are an equal number of links away from the home page and/or are linked to in equal numbers within the domain, will share the same score. Thus planning the website navigation is of key importance in indicating what information you consider the most unique and useful, and which pages you believe offer enough choice, but also enough information, to your visitors as the first or second screen they see.

+ Resolution: Cross-linking same level pages is important; however, unrealistic navigation, such as hundreds or even thousands of links on every page, is strongly advised against. Trying to communicate the message that every page is tier one will most likely be held up for further examination by anti-spam filters. Stay reasonable with such cross-linking. Read more on Website Navigation.
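
As a rough aid to the advice above, the sketch below simply counts the links on each page and flags anything with an unrealistic number of them; the threshold of 100 is an arbitrary illustration, not an official limit.

from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.count += 1

def flag_link_heavy_pages(pages, threshold=100):
    for url, html in pages.items():
        counter = LinkCounter()
        counter.feed(html)
        if counter.count > threshold:
            print(url, "has", counter.count, "links - consider trimming the cross-linking")

if __name__ == "__main__":
    flag_link_heavy_pages({"/sitemap-like-page": '<a href="/p1">item</a>' * 250})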


Case 4,
PageRank history and "natural" links: The system of PageRank was designed to evaluate a web site's importance during a period of the Internet when links served the sole purpose ( and were the single means ) of allowing a visitor to follow up on a certain topic. Web sites that were often referred to were, with good probability, the most valued resources for the information, and thus were seen as the best choice and ranked highest in the search results. It was a revolutionary idea at the time, which then became the target of spammers as soon as marketing through organic search results became a widespread practice. PageRank by itself was no longer a viable source of information on the value of a web page. For years, the cross-examination of user and web site behavior, link analysis and linking behavior patterns has been used to fine-tune search engines, including Google, which can now tell whether a link was meant for users ( and at the same time increases the importance of a page ), or was added for the sole reason of casting a vote for a web site through the link popularity system. Too much of the latter will trigger closer examination, the discarding of these votes, and sometimes a penalty or ban.

Bulk update



During the past year or so, many web sites that used spamming tactics to push their results into the Google database implemented ways to automatically generate thousands, in certain cases even millions, of pages. These almost always carried scraped content, or to put it another way, information copied off of other web sites, databases and directories, sometimes combined in a way that made it hard to detect by the algorithms of that time. Often, using the same methods, unique subdomains were created along the way in an attempt to evade the anti-spam filters.

As a counter-measure, Google has implemented new ways of identifying irregular site expansion and page generation behaviour, resulting in a new filter meant to take a closer look at bulk updates of previously nonexistent URLs, both on new and on well established web sites. Should a domain show the symptoms of being used to create massive amounts of unoriginal or spam content, the algorithms now attempt not only to filter such pages out of the index, but to take preemptive action and block their entry into the index altogether.

Known issues


Any web site that launches a number of pages that is irregular in the history of its domain is likely to be closely examined. Web sites that were producing new content at a certain pace and suddenly expand much more rapidly, new web sites that launch with several thousands of pages, and web sites that are re-designed and thus show content on thousands of new URLs seem to be affected by this new practice as well. In the end, however, all valid URLs that are not seen as an attempt at spam are usually accepted into the index.

+ Resolution: If you would like to be exempt from such examinations, you should avoid bulk updates of thousands of new pages, and update your web site gradually instead. This practice is not a penalty, however, but a simple precaution from Google, so that the quality of the search results may remain at its optimal level, excluding spam pages. The period for which the pages are examined rarely seems to take longer than reasonable, and well established web sites will most likely not see an overall re-evaluation of all content because of such updates. The examination itself is meant to check whether such pages are a hurried attempt to artificially build relevance, boost PageRank or provide non-unique context for advertisements, or whether they are valid resources meant to serve visitors - content that is there to stay on the web site. Adding a massive number of new URLs meant to do anything but the last may temporarily lower domain related parameters and cause a visible drop in rankings for a period of time. As the content and history of the newly added pages build, they will gradually allow the web site to regain this trust partially or completely.

Canonical URLs



Due to the filters applied to battle off spam and scraper web sites, duplicate content has become a major issue. The filter for duplicate content is applied to URLs that serve up the same web page content under different addresses, thus filtering out potential cases of plagiarism or repetitive pages. See more information on Duplicate content.

In order for a web page not only to be, but also to be perceived as, the only copy of its content, the proper server settings, internal navigation and inbound links are necessary. The Canonical URL is the URL that is thus set as the only URL able to serve that particular web page. In other words, it is the preferred URL for a single web page. Choosing a single Canonical URL for each web page will also help concentrate all incoming references, and accumulate all parameters such as PageRank in a more effective way.

Known issues


Sometimes a single web page, with no additional copy of it existing on its server, can still be perceived by the algorithm as the duplicate of another. This may be the result of not choosing or not setting up the www. subdomain preference on the server or in the Google Webmaster Tools panel, of leaving the same web page displayed for more than one set of parameters with dynamic queries, or of having directory index files linked to both by their full, file level URLs and by the shortened, directory level URLs that default to the index files. ( For example, in some cases the very same web page could be accessed through the following URLs: www.example.com/index.html , example.com/ , example.com/index.html , www.example.com/ ; or in another example: www.example.com/product.php?item=10&action=review , www.example.com/product.php?item=10 , www.example.com/product.php?action=review&item=10 ... etc. )

+ Resolution: Make sure that a single web page can only be accessed through a single valid URL. Correct the navigation of the web site so that a single page is always linked to in the same manner, using the correct parameters in the URLs, so that the same content ( for example database requests ) cannot be accessed and served with more than one set of add-on strings, excluding variations such as a different order in which the parameters are included in the URL. Also check whether you are relying on the server setup for web pages that are shown as the default for directory level URLs. Make sure that such pages are referred to in the same way throughout the web site navigation, and that no inbound links are pointing to the other version either. You should also see to it that your server is set up properly for cases where you can't avoid any of the above, and set up permanent redirects to correct the problem. Using 301, Permanent redirects in a .htaccess file should allow the correction of already existing duplicate URL entries and also prevent Google from indexing the same page at a different address. Also keep the SSL protocol in mind: an http and an https version of the same page are also seen as duplicates.
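
The sketch below illustrates how the duplicate variations listed above collapse once a single Canonical URL policy is chosen. It assumes, purely as an example policy, that the preferred form is http, with the www. prefix, no directory index file name, and query parameters in alphabetical order; on a live site the actual fix is still done with consistent navigation and 301 redirects as described.

from urllib.parse import urlparse, urlencode, parse_qsl, urlunparse

def canonical(url, index_files=("index.html", "index.htm", "index.php")):
    parts = urlparse(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host          # example preference: always use the www. subdomain
    path = parts.path or "/"
    for name in index_files:
        if path.endswith("/" + name):
            path = path[:-len(name)]  # directory level URL instead of the index file
    query = urlencode(sorted(parse_qsl(parts.query)))  # one fixed order for dynamic parameters
    return urlunparse(("http", host, path, "", query, ""))  # example preference: plain http

if __name__ == "__main__":
    variants = [
        "http://example.com/index.html",
        "http://www.example.com/",
        "https://www.example.com/index.html",
        "http://www.example.com/product.php?action=review&item=10",
        "http://www.example.com/product.php?item=10&action=review",
    ]
    for url in variants:
        print(url, "->", canonical(url))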

Omitted results



The link to the Omitted results, at the end of the last search results page, shows the URLs that were judged to be very similar in their content to the ones already on the list, and thus excluded in the first run. You may click on this link to see the full list of every matching URL for a certain query, and will find that it is a useful way of grouping multiple similar results from the same domain so that they occupy less space on the result pages, thus providing more options and variety.

Known issues


The algorithm judges similarity by relevance of the pages. If there are more than two relevant, important pages that match the query on the same web site, the rest will be shown only if the Search Omitted Results link is clicked.

Recently the evaluation for relevance has been extended with some additional parameters, and now includes the examination of the description of a web page and the repetitive use of entire blocks of content ( boilerplate text ). If the query matches the description or the boilerplate text, and by this pattern multiple pages are found to carry the exact same relevance, they will be grouped under the URL that has the highest values for its other parameters. An improperly written or missing description may result in more relevant pages being grouped under such links, and in the most relevant sub-page of a web site not being displayed, since in such cases the relevancy score will not be supported by this important factor. A page may be displayed with a snippet extracted from its content if the query matches a certain area in it. If this area is repeated on many other pages, and is not featured elsewhere in the content, the URLs would again be grouped under the Omitted Results link.

+ Resolution: Make sure that all of your web pages have a descriptive, on-topic title and META description tag available. These description tags will also serve as the snippet appearing under the title of the page on the Search Result pages whenever they include matching strings for the query made. You can check whether your pages improperly share the same description by examining your web site in Google with the site: operator; pages sharing the same description will be grouped under the "Omitted Results" link. You may also want to avoid using entire blocks of repetitive or "boilerplate" text that describe the documents with the same words over and over again. If the section intended for natural branding has more weight or is more prominent on a page than any other descriptive content, and is repeated word for word on many other pages as well, the same effect applies, grouping all but one or two URLs under the "Omitted Results" link.
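
One illustrative way to spot pages that improperly share the same META description, which is one of the reasons similar URLs end up grouped under the Omitted Results link, is sketched below. The pages dictionary stands in for crawled documents; on a live site the same check can be done by eye with the site: operator.

from collections import defaultdict
from html.parser import HTMLParser

class DescriptionReader(HTMLParser):
    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

def duplicate_descriptions(pages):
    seen = defaultdict(list)
    for url, html in pages.items():
        reader = DescriptionReader()
        reader.feed(html)
        seen[reader.description].append(url)
    return {text: urls for text, urls in seen.items() if text and len(urls) > 1}

if __name__ == "__main__":
    pages = {  # hypothetical crawled pages
        "/a": '<meta name="description" content="Widgets and more">',
        "/b": '<meta name="description" content="Widgets and more">',
        "/c": '<meta name="description" content="About our company">',
    }
    print(duplicate_descriptions(pages))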

Supplemental Results



Supplemental Results in the Google index display URLs of web pages identified as relevant for the search query, but judged less important than others. The index of Google has been designed to sort results in its index based on many factors, which not only indicate whether a web page is on topic, but also its importance ( or weight ), based on referring links it gains over time.

Supplemental results are not indications of a penalty, and are not necessarily symptoms of any kind of problem with your web site. These pages are cached, sorted and ranked based on relevance just as normal results are, and are displayed on search result pages.

Supplemental results may also indicate a previous version of a resource at a URL that is otherwise featured in the index as non-supplemental, may show a now deleted or redirected URL ( in both cases with a copy of the recorded state of the web page kept in the cache for up to a year ), or may simply be caused by the URL being identified as redundant, but nonetheless relevant, information for a certain query. Supplemental results may rank best for obscure search queries.

A URL being listed in the supplemental instead of the primary Index is not a final and irreversible decision, and may change over time, as the page the URL refers to gets more and more referring links on the web from more important pages. Also, should a page lose its significance and thus its inbound links, it may be moved from the primary Index to the supplemental.

Known issues


Case 1,
Supplemental results are perceived by many as a penalty, due to the fact that these results display below the normal results. These URLs are not penalized, only judged to be of lower importance by the algorithm, based on the same factors that sort all results.

+ Resolution: Supplemental pages in most cases have a low PageRank ( 0 ), indicating the most common issue, which is that they do not have enough quality, on-topic inbound links. In other words, there are too few or no references to the page from other trusted and important web sites. If you feel your pages carry unique, nowhere-else-to-be-found information, or information presented in a unique way, and that they should be recognized by the Google index as such, it should be your priority - and should pose no problem - to inform people of their existence. Also, visitors to other parts of your web site will sometimes find a certain page interesting enough to link to directly, or to mention it on other pages on the Internet, and thus add to its importance. A page that is currently Supplemental will be featured as a normal result once it is perceived as trusted and important. Please note that this in no way means that you should participate in link schemes, purchase links to trick the algorithm or apply spam-like tactics, for they are quite unlikely to work due to the filters identifying such patterns. The emphasis should be on natural linking.


Case 2,
If your information is unique, or presented in a unique way, and your page(s) are marked as Supplemental, the web site may be using a navigation that causes the URLs in question to be perceived as being of too low importance. While the web site itself may be showing normal results in the index, if pages not too far from the domain root are already considered unimportant, the navigation is most likely directing attention away from them, especially by the logic of Googlebot.

+ Resolution: Lay out the site navigation in a way that same level pages are in fact perceived by Googlebot as same level. Be reasonable, as the truly unique information is likely to be less than you'd like to think, and patterns of several thousands of pages with very low importance are sometimes examined closely for being spam or not. Try to set up navigation funnels that are easy to follow, categorize the information in a comprehensive way, and provide a navigation that emphasizes the importance of a page through its reachability. Again, be reasonable: having too many links on a single page for the sake of bringing all of them to the same level will result in the opposite of the intended effect. Read more on Website Navigation and PageRank.


Case 3,
The Supplemental index is seen as incomplete, and is in the same process of constant updating as the normal index. Certain pages of your competitors may not yet have been evaluated for being supplemental or not. In other words, a page on your site being marked as supplemental, while a similar page on another site is not, does not mean it will remain so. The same parameters are judged for each URL, thus once most of the relevant pages are checked, the supplemental index will most likely be perceived as its name would indicate: supplemental results from related, less significant pages.

+ Resolution: The supplemental index is crawled and updated regularly, although not quite as often as the normal index. However, should a web page in either index be found more suited for the other, it will be moved accordingly.
