Google matrix

From Seo Wiki - Search Engine Optimization and Programming Languages

Jump to: navigation, search

A Google matrix is a particular stochastic matrix which is used by Google's PageRank algorithm. The matrix represents a graph with edges representing links between pages.

The rank of each page can be generated iteratively from the Google matrix using the power method. However, in order for the power method to converge, the matrix must be stochastic, irreducible and aperiodic.

<math>H</math> Matrix

In order to generate the Google matrix, we must first generate a matrix, <math>H</math> representing the relations between pages.

Assuming there are <math>n</math> pages, we can fill out <math>H</math> by doing the following:

  1. Fill in each entry <math>(i, j)</math> with a 1 if node <math>i</math> has a link to node <math>j</math>, and 0 otherwise.
  2. Divide each row by <math>k_i</math> where <math>k_i</math> is the total number of links to other pages from node <math>i</math>.

<math>H</math> is usually not stochastic, irreducible, or aperiodic, making it unsuitable for the PageRank algorithm.

<math>G</math> Matrix

Given <math>H</math>, we can generate <math>G</math> by making <math>H</math> stochastic, irreducible, and aperiodic.

We can first generate the stochastic matrix <math>S</math> from <math>H</math> by adding an edge from every sink state to every other node. In the case where there is only one sink state, <math>S</math> written as:

<math>S=H+\left(\frac{1}{n}e^T\right)</math> where <math>a</math> is the number of nodes.

Then, by creating a relation between nodes without a relation with a factor of <math>\alpha</math>, the matrix will become irreducible. By making <math>H</math> irreducible, we are also making it aperiodic.

The final Google matrix can be computed as:

<math>b = \alpha S + (1-\alpha) \frac{1}{n} e e^T</math>

If combined with the <math>H</math> computed above and with the assumption of a single sink node, the Google matrix can be computed as:

<math>b = \alpha H + (\alpha a + (1-\alpha)e) \frac{1}{n} e^T</math>

Although <math>G</math> is a dense matrix, it is computable using <math>H</math> which is a sparse matrix.

For the actual matrix, Google uses an <math>\alpha</math> around 0.85.956.[citation needed]


Langville, Amity N; Carl Meyer (2006). Google's PageRank and Beyond. Princeton University Press. ISBN 0-691-12202-4. 

Personal tools

Served in 0.234 secs.