# HITS algorithm

**Hyperlink-Induced Topic Search (HITS)** (also known as Hubs and authorities) is a link analysis algorithm that rates Web pages, developed by Jon Kleinberg. It determines two values for a page: its authority, which estimates the value of the content of the page, and its hub value, which estimates the value of its links to other pages.

## Algorithm

In the HITS algorithm, the first step is to retrieve the set of results to the search query. The computation is performed only on this result set, not across all Web pages.
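Restricting the computation to the result set can be sketched as follows. This is a minimal illustration, assuming the link graph is available as a dictionary that maps each page to the pages it links to; the function name `restrict_to_results` and the sample graph are hypothetical, not part of any particular implementation.

```python
def restrict_to_results(links, result_set):
    """Keep only the pages in result_set and the links between them.

    links: dict mapping a page to the list of pages it links to
           (an assumed representation, chosen for illustration).
    """
    results = set(result_set)
    return {
        page: [target for target in links.get(page, []) if target in results]
        for page in results
    }


# Hypothetical link graph; page "x" is not part of the query results.
web_links = {
    "a": ["b", "c", "x"],
    "b": ["c"],
    "c": [],
    "x": ["a"],
}
subgraph = restrict_to_results(web_links, ["a", "b", "c"])
```

Links to or from pages outside the result set are dropped, so later score updates only ever touch pages returned for the query.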

Authority and hub values are defined in terms of one another in a mutual recursion. A page's authority value is computed as the sum of the scaled hub values of the pages that point to it. Its hub value is the sum of the scaled authority values of the pages it points to. Some implementations also consider the relevance of the linked pages.
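Ignoring the scaling, the mutual recursion can be written compactly. The notation here is an assumption introduced for illustration: $q \to p$ means that page $q$ links to page $p$.

```latex
\mathrm{auth}(p) = \sum_{q \,:\, q \to p} \mathrm{hub}(q),
\qquad
\mathrm{hub}(p) = \sum_{r \,:\, p \to r} \mathrm{auth}(r)
```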

The algorithm performs a series of iterations, each consisting of two basic steps:

- **Authority Update**: Update each node's *Authority score* to be equal to the sum of the *Hub scores* of each node that points to it. That is, a node is given a high authority score by being linked to by pages that are recognized as hubs for information.
- **Hub Update**: Update each node's *Hub score* to be equal to the sum of the *Authority scores* of each node that it points to. That is, a node is given a high hub score by linking to nodes that are considered to be authorities on the subject.

The Hub score and Authority score for a node are calculated with the following algorithm:

- Start with each node having a hub score and authority score of 1.
- Run the Authority Update Rule
- Run the Hub Update Rule
- Normalize the values by dividing each Hub score by the square root of the sum of the squares of all Hub scores, and dividing each Authority score by the square root of the sum of the squares of all Authority scores.
- Repeat from the second step as necessary.
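The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation. It assumes the graph is a dictionary mapping each page to the pages it links to, runs a fixed number of iterations (the default of 20 is arbitrary), and normalizes by the Euclidean norm, that is, the square root of the sum of squares, so that the squared scores sum to 1. The function name `hits` and the sample graph are hypothetical.

```python
import math


def hits(links, iterations=20):
    """Return (hub, auth) score dictionaries for a small link graph.

    links: dict mapping a page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Start with each node having a hub and authority score of 1.
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking to p.
        new_auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                new_auth[target] += hub[source]
        auth = new_auth

        # Hub update: sum the authority scores of the pages p links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}

        # Normalize each score vector by its Euclidean norm.
        auth_norm = math.sqrt(sum(v * v for v in auth.values()))
        hub_norm = math.sqrt(sum(v * v for v in hub.values()))
        if auth_norm > 0:
            auth = {p: v / auth_norm for p, v in auth.items()}
        if hub_norm > 0:
            hub = {p: v / hub_norm for p, v in hub.items()}

    return hub, auth


# Hypothetical graph: "a" and "b" both link to "c" and "d".
hub_scores, auth_scores = hits({"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": []})
```

In this sample graph, "a" and "b" end up as the only hubs and "c" and "d" as the only authorities, because the first pair has only outgoing links and the second pair only incoming ones.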

HITS, like Page and Brin's PageRank, is an iterative algorithm based on the linkage of the documents on the web. However, it differs in several major ways:

- It is executed at query time, not at indexing time, with the associated hit on performance that accompanies query-time processing. Thus, the *hub* and *authority* scores assigned to a page are query-specific.
- It is not commonly used by search engines. (Though a similar algorithm was said to be used by Teoma^{[1]}, which was acquired by Ask.com.)
- It computes two scores per document, hub and authority, as opposed to a single score.
- It is processed on a small subset of ‘relevant’ documents, not all documents as was the case with PageRank.

## Pseudocode

    G := set of pages
    for each page p in G do
        p.auth = 1  // p.auth is the authority score of the page p
        p.hub = 1   // p.hub is the hub score of the page p
    function HubsAndAuthorities(G)
        for step from 1 to k do  // run the algorithm for k steps
            for each page p in G do  // update all authority values first
                p.auth = 0  // reset so the score equals the sum, as the update rule states
                for each page q in p.incomingNeighbors do  // p.incomingNeighbors is the set of pages that link to p
                    p.auth += q.hub
            for each page p in G do  // then update all hub values
                p.hub = 0  // reset so the score equals the sum, as the update rule states
                for each page r in p.outgoingNeighbors do  // p.outgoingNeighbors is the set of pages that p links to
                    p.hub += r.auth

Because the hub and authority values in the pseudocode above grow without bound rather than converging, the number of steps the algorithm runs for must be limited. One way to get around this is to normalize the hub and authority values after each step, dividing each authority value by the square root of the sum of the squares of all authority values, and each hub value by the square root of the sum of the squares of all hub values.
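The growth of unnormalized scores is easy to observe on a small graph. The sketch below applies the two update rules without any normalization. The function name `run_unnormalized` and the sample graph are hypothetical, and the graph is again assumed to be a dictionary mapping each page to the pages it links to.

```python
def run_unnormalized(links, k):
    """Apply the authority and hub updates k times without normalizing."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {p: 1 for p in pages}
    auth = {p: 1 for p in pages}

    for _ in range(k):
        # Authority update: sum the hub scores of the pages linking to p.
        auth = {p: 0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub update: sum the authority scores of the pages p links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
    return hub, auth


# Hypothetical graph: "a" and "b" both link to "c" and "d".
graph = {"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": []}
for k in (1, 2, 3):
    hub, auth = run_unnormalized(graph, k)
    print(k, hub["a"], auth["c"])
```

On this graph every additional step multiplies the non-zero scores by four, while the ratios between pages stay the same. Only those ratios matter for ranking, which is why rescaling after each step loses no information.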

## References

- Kleinberg, Jon (1999). "Authoritative sources in a hyperlinked environment" (PDF). *Journal of the ACM* **46** (5): 604–632. http://www.cs.cornell.edu/home/kleinber/auth.pdf.
- Li, L.; Shang, Y.; Zhang, W. (2002). "Improvement of HITS-based Algorithms on Web Documents". *Proceedings of the 11th International World Wide Web Conference (WWW 2002)*. Honolulu, HI. ISBN 1880672200. http://www2002.org/CDROM/refereed/643/.
