Update: Rand has provided this link to a copy of his presentation.
Moderated by Search Engine Watch editor Chris Sherman, the first session (out of 5 to pick from) of the conference dealt with patents filed by the major search engines and how they “might” affect the way web sites can be optimized for better rankings. Ranging from fairly general observations to breakdowns of information retrieval mathematical formulas, there was enough content in this session to fill 3 or more posts, so that is what I’ll do.
First up was Rand Fishkin of SEOMoz who presented in absentia via audio files embedded in his PowerPoint presentation. Rand’s presentation focused on several aspects of Google’s “Information Retrieval Based on Historical Data” patent which he has dissected on his web site. Of particular focus were the topics of temporal analysis of links, pages and web sites along with domain data and ranking history.
It is important to note that while many of the observations made from analysis of this Google patent are compelling, not all of them are necessarily in place now, nor will they all necessarily be in the future. However, Google has taken the time to document these specific methods.
The first concept introduced was the value of the “document inception date” – the date Google records a particular document (like a web page) as first indexed. Most major search engines record this information about a domain name. Methods that search engines use to establish a document inception date include: spidering, discovery via a hypertext link, or registration data from Whois – when the domain was purchased and registered.
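To make the concept concrete, here is a minimal sketch of recording an inception date – keeping the earliest date a document was seen, whichever discovery method surfaces it first. This is purely illustrative; the patent doesn’t describe implementation details.

```python
from datetime import date

# Keep the earliest date a URL was seen, regardless of how it was
# discovered (crawl, a hypertext link, or Whois registration data).
inception = {}

def record_seen(url, seen_on):
    # The earliest sighting defines the document inception date.
    if url not in inception or seen_on < inception[url]:
        inception[url] = seen_on

record_seen("http://example.com/page", date(2005, 3, 1))
record_seen("http://example.com/page", date(2004, 7, 15))  # earlier sighting wins
print(inception["http://example.com/page"])  # → 2004-07-15
```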
Additional Google Patent observations:
How do content changes affect rankings?
1. Google may positively weight pages that change often
2. Cosmetic changes are ignored to avoid automatic manipulation
3. Links that remain after pages are updated may be considered more valuable
How does the temporal analysis of links affect rankings?
1. Measuring when new links appear to identify trends. This method compares the current rate of link gain to the overall average rate of link gain. An upward trend may indicate freshness and a downward trend may mean the site is stale.
2. The percentage of links that are new
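The trend comparison in point 1 above can be sketched in a few lines. The four-week window and the 25% thresholds here are assumptions for illustration – the patent doesn’t specify any numbers:

```python
from statistics import mean

def link_trend(weekly_new_links):
    """Compare the recent rate of link gain to the overall average rate.
    Window size and thresholds are assumed, not from the patent."""
    overall = mean(weekly_new_links)
    recent = mean(weekly_new_links[-4:])  # last four weeks (assumed window)
    if recent > overall * 1.25:
        return "fresh"   # upward trend in new links
    if recent < overall * 0.75:
        return "stale"   # downward trend may mean the site is going stale
    return "steady"

print(link_trend([10, 12, 11, 10, 30, 35, 40, 38]))  # → fresh
```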
Weighting of links
1. Date the link appeared in the Google index
2. Date of changes in the anchor text
3. Date of change of the page the link is on
Additional link weight considerations include the trustworthiness of inbound links, the pages and sites the links are on as well as the freshness of the link page.
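One way the three dates above could blend into a single link weight – with the half-lives and coefficients entirely made up for the example, since the patent only lists the signals:

```python
from datetime import date

def link_weight(first_indexed, anchor_changed, page_changed, today):
    """Blend three dated signals: when the link entered the index, when its
    anchor text last changed, and when the hosting page last changed.
    All coefficients and half-lives here are assumptions."""
    def freshness(d, half_life_days=180):
        # Decays by half every 180 days since the last change (assumed).
        return 0.5 ** ((today - d).days / half_life_days)
    # A link that has been in the index longer (and survived page updates)
    # counts more; a recently refreshed hosting page suggests upkeep.
    age_bonus = min((today - first_indexed).days / 365, 2.0)  # capped at two years
    return age_bonus * 0.5 + freshness(anchor_changed) * 0.2 + freshness(page_changed) * 0.3

w = link_weight(date(2003, 12, 1), date(2005, 6, 1), date(2005, 11, 1), date(2005, 12, 1))
```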
Spam detection through temporal link analysis
Considerations for what can flag a site as spam:
1. Speed of link gain
2. Source of link gain
3. Topical or spam? The Tsunami blog is an example of a site that gained a very large number of links quickly due to a topical event, not a mass linking campaign.
Domain-related considerations. Note: Google is now a registrar and has access to Whois data for all domain names.
1. Duration of domain name registration – “disposable” domain names are rarely registered for more than a year
2. DNS (domain name server) information and hosting company information – are they on any blacklists?
3. May use a list of known bad contact information, blacklisted name servers, etc.
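A sketch of what those Whois-based checks might look like. The blacklist inputs are hypothetical – the patent only says such lists may be used:

```python
def domain_risk_flags(registration_years, name_server, registrant_email,
                      blacklisted_ns=frozenset(), bad_contacts=frozenset()):
    """Flag 'disposable'-looking domains: short registrations, blacklisted
    name servers, known bad contact information. Illustrative only."""
    flags = []
    if registration_years <= 1:
        flags.append("short registration")  # disposable domains rarely register longer
    if name_server in blacklisted_ns:
        flags.append("blacklisted name server")
    if registrant_email in bad_contacts:
        flags.append("known bad contact")
    return flags

print(domain_risk_flags(1, "ns1.cheaphost.example", "owner@example.com",
                        blacklisted_ns={"ns1.cheaphost.example"}))
# → ['short registration', 'blacklisted name server']
```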
Preventative measures Google may take against spam
1. Limit the maximum rank increase in a certain space of time
2. Consider references of the site or web document in news articles and discussion groups
3. Watch for falling rates of growth, links and citations
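Point 1 above – capping how fast rank can rise – might look like this. The 50% per-recomputation cap is an assumed figure; the patent names the mechanism, not a number:

```python
def damped_rank(previous_score, computed_score, max_step=0.5):
    """Limit how much a document's ranking score can rise in a single
    recomputation (higher score = better rank). max_step is assumed."""
    ceiling = previous_score * (1 + max_step)
    return min(computed_score, ceiling)

print(damped_rank(100, 300))  # a tripled score is clipped → 150.0
print(damped_rank(100, 120))  # a modest gain passes through → 120
```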
Traffic analysis issues
1. Large reduction of traffic
2. Seasonality may be used to help relevance
3. Measure advertising traffic to determine relevance, including the click-through rate of traffic referrals from the pages the ads are on and the quality of advertisers that appear on a site
User data – Google can measure user search activity using data compiled from the Google Toolbar, Gmail, Google Desktop Search and personalized search
1. Number of times a page is clicked in the search engine results pages (SERPs)
2. Amount of time spent on a site compared to what the average is
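The two user-data signals above can be sketched together; the 5% click-through threshold and the rest are invented for the example, since the patent doesn’t quantify them:

```python
from statistics import mean

def engagement_signal(clicks, impressions, dwell_times, average_dwell):
    """SERP click-through rate plus time-on-site relative to the average.
    Thresholds here are assumptions, not values from the patent."""
    ctr = clicks / impressions if impressions else 0.0
    dwell_ratio = mean(dwell_times) / average_dwell if average_dwell else 0.0
    return {"ctr": ctr, "dwell_ratio": dwell_ratio,
            "engaging": ctr > 0.05 and dwell_ratio > 1.0}

print(engagement_signal(120, 1000, [95, 110, 130], 80)["engaging"])  # → True
```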
What comes out of this analysis of the Google Historical Data patent is that there is no advantage to “quick” solutions. Regardless of whether Google has implemented any or all of the considerations of this patent, all the major search engines are continuously improving quality measures for search results. Search engines have matured from using on-page content to adding link popularity and now may start to use user data to improve search results quality – making current methods of “gaming” the search engines even more difficult and creating a better user experience.