From Seo Wiki - Search Engine Optimization and Programming Languages
|Type of site||Search Engine|
|Available language(s)||Multilingual (124)|
|Created by||Sergey Brin and Larry Page|
|Launched||September 15, 1997|
Google search is a web search engine owned by Google Inc. and is the most-used search engine on the Web. Google receives several hundred million queries each day through its various services. Google search was originally developed by Larry Page and Sergey Brin in 1997.
Google Search provides more than 22 special features beyond the original word-search capability. These include synonyms, weather forecasts, time zones, stock quotes, maps, earthquake data, movie showtimes, airports, home listings, and sports scores (see below: Special features). There are special features for numbers, including prices, temperatures, money/unit conversions ("10.5 cm in inches"), calculations ("3*4+sqrt(6)-pi/2"), package tracking, patents, area codes, and rudimentary language translation of displayed pages.
A Google search-results page is ordered by a priority rank based in part on "PageRank"; the exact ranking criteria are kept secret to prevent spammers from forcing their pages to the top. Google Search provides many options for customized search (see below: Search options), such as exclusion ("-xx"), inclusion ("+xx"), alternatives ("xx OR yy"), and wildcard ("x * x").
The search engine
Google's algorithm uses a patented system called PageRank to help rank web pages that match a given search string. The PageRank algorithm computes a recursive score for web pages, based on the weighted sum of the PageRanks of the pages linking to them. PageRank derives from human-generated links, and is thought to correlate well with human concepts of importance. The exact percentage of all web pages that Google indexes is not known, as it is very hard to calculate. Earlier keyword-based methods of ranking search results, used by many search engines that were once more popular than Google, would rank pages by how often the search terms occurred in the page, or how strongly associated the search terms were within each resulting page. In addition to PageRank, Google also uses other secret criteria for determining the ranking of pages on result lists, reported to be over 200 different indicators.
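The recursive scoring described above can be illustrated with a minimal sketch. It is not Google's implementation: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outbound links, and the four-page toy web are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages from a dict mapping each page to the list of pages it links to.

    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outbound links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {}
        for page in pages:
            # Each linking page passes on its own rank divided by its number of links.
            incoming = sum(
                rank[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_rank[page] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


# Hypothetical toy web: C is linked to by three pages, D by none.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web, page C ends up with the highest score because it collects links from the most pages, and page A comes second because it receives all of C's outgoing weight.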
Google not only indexes and caches web pages but also takes "snapshots" of other file types, which include PDF, Word documents, Excel spreadsheets, Flash SWF, plain text files, and so on. Except in the case of text and SWF files, the cached version is a conversion to (X)HTML, allowing those without the corresponding viewer application to read the file.
Users can customize the search engine by setting a default language, using the "SafeSearch" filtering technology, and setting the number of results shown on each page. Google has been criticized for placing long-term cookies on users' machines to store these preferences, a tactic which also enables it to track a user's search terms and retain the data for more than a year. For any query, up to the first 1000 results can be shown, with a maximum of 100 displayed per page.
Despite Google's immense index, a considerable amount of data is available in online databases that are accessible by means of queries but not by links. This so-called invisible or deep Web is minimally covered by Google and other search engines. The deep Web contains library catalogs, official legislative documents of governments, phone books, and other content which is dynamically prepared to respond to a query.
Privacy law in some countries forbids the showing of some links. For instance, in Switzerland every private person can force Google Inc. to delete a link that contains his or her name.
Since Google is the most popular search engine, many webmasters have become eager to influence their website's Google rankings. An industry of consultants has arisen to help websites increase their rankings on Google and on other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings, and then develop a methodology for improving rankings to draw more searchers to their clients' sites.
Search engine optimization encompasses both "on page" factors (like body copy, title elements, H1 heading elements and image alt attribute values) and "off page" factors (like anchor text and PageRank). The general idea is to affect Google's relevance algorithm by incorporating the targeted keywords in various places "on page", in particular the title element and the body copy (the higher up in the page a keyword appears, presumably the better its prominence and thus the ranking). Too many occurrences of the keyword, however, cause the page to look suspect to Google's spam-checking algorithms.
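As a rough illustration of the "on page" idea, the sketch below measures how large a share of a text's words a targeted keyword makes up. The function name, the sample title and body copy, and the word-splitting rule are hypothetical; the source does not say how Google weighs keyword frequency or at what point a page starts to look suspect.

```python
import re
from collections import Counter


def keyword_density(text, keyword):
    """Return the share of words in `text` equal to `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


# Hypothetical page content for a site targeting the keyword "oak".
title = "Handmade oak tables"
body = "Our oak tables are built by hand. Each oak table is finished with natural oil."

print(round(keyword_density(title, "oak"), 3))
print(round(keyword_density(body, "oak"), 3))
```

A consultant might compare such figures across the title element and body copy; a very high share in the body would be one sign of the keyword overuse mentioned above.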
Google has published guidelines for website owners who would like to raise their rankings when using legitimate optimization consultants.
The Google search engine has many features that make it more functional. Google search consists of a series of localized websites. The largest of those, the google.com site, is the most-visited website today. Its features include a definition link for most searches that include dictionary words, a count of the results found for a search, links to other searches (e.g. for words that Google believes to be misspelled, it provides a link to the search results using its proposed spelling), and many more.
Google's search engine normally accepts queries as simple text, and breaks up the user's text into a sequence of search terms. These are usually words that are to occur in the results, but may also be phrases delimited by quotation marks ("), or qualified terms with a prefix such as "+" or "-" or one of several advanced operators such as "site:". The webpages of "Google Search Basics" describe each of these additional queries and options (see below: Search options).
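The breakdown of a query into plain words, quoted phrases, prefixed terms and operator terms can be sketched as follows. The regular expression and the dictionary layout are assumptions made for illustration, not Google's actual parser.

```python
import re

# Optional "+"/"-" prefix, optional "operator:" qualifier, then a quoted phrase or a bare word.
TOKEN = re.compile(r'([+-]?)(?:(\w+):)?(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query string into a list of search-term dictionaries."""
    terms = []
    for prefix, operator, phrase, word in TOKEN.findall(query):
        terms.append({
            "text": phrase if phrase else word,
            "phrase": bool(phrase),   # True when the term was quoted
            "prefix": prefix,         # "+", "-" or ""
            "operator": operator,     # e.g. "site", or "" when absent
        })
    return terms


for term in parse_query('apple -tree "fruit salad" site:www.acmeacme.com'):
    print(term)
```

Each term keeps its prefix and operator separately from its text, so a later stage could treat exclusions, required words and site restrictions differently.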
Google's Advanced Search web form gives several additional fields which may be used to qualify searches by such criteria as date of first retrieval. All advanced queries transform to regular queries, usually with additional qualified terms.
Google often suggests completions for a query while the user is still typing it. Autocomplete sometimes gives remarkably memorable results, often enough that a website has compiled the most hilarious autocomplete suggestions its submitters could find.
Google applies query expansion to the submitted search query, transforming it into the query that will actually be used to retrieve results. As with page ranking, the exact details of the algorithm Google uses are deliberately obscure, but certainly the following transformations are among those that occur:
- Term reordering: in information retrieval this is a standard technique to reduce the work involved in retrieving results. This transformation is invisible to the user, since the results ordering uses the original query order to determine relevance.
- Stemming is used to increase search quality by also matching small syntactic variants of search terms.
- There is a limited facility to fix possible misspellings in queries.
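Two of the transformations listed above, stemming and misspelling repair, can be sketched in a few lines. Since Google's algorithm is deliberately obscure, everything here is an assumption: the three-suffix rule set is a crude stand-in for a real stemmer, the vocabulary is hypothetical, and the spelling fix simply uses the standard library's `difflib.get_close_matches`.

```python
import difflib

SUFFIXES = ("ing", "ed", "s")  # hypothetical, deliberately tiny rule set


def naive_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(query, vocabulary):
    """Fix near-miss spellings against `vocabulary`, then add a stemmed variant."""
    expanded = []
    for term in query.lower().split():
        if term not in vocabulary:
            matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
            if matches:
                term = matches[0]
        # Keep the term itself alongside its stem, so both variants can match.
        expanded.append(sorted({term, naive_stem(term)}))
    return expanded


vocabulary = ["search", "engine", "engines", "ranking", "rank"]
print(expand_query("serch engines ranking", vocabulary))
# [['search'], ['engine', 'engines'], ['rank', 'ranking']]
```

The misspelled "serch" is replaced by its closest vocabulary entry, while "engines" and "ranking" each gain a shorter variant.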
"I'm Feeling Lucky"
Google's homepage includes a button labeled "I'm Feeling Lucky". When a user clicks on the button the user will be taken directly to the first search result, bypassing the search engine results page. The thought is that if a user is "feeling lucky", the search engine will return the perfect match the first time without having to page through the search results. According to a study by Tom Chavez of "Rapt", this feature costs Google $110 million a year as 1% of all searches use this feature and bypass all advertising.
On October 30, 2009, for some users, the "I'm Feeling Lucky" button was removed from Google's main page, along with the regular search button. Both buttons were replaced with a field that read, "This space intentionally left blank." This text faded out after a few moments, and normal search functionality was achieved by filling in the search field with the desired terms and pressing enter. A Google spokesperson explained, "This is just a test, and a way for us to gauge whether our users will like an even simpler search interface." Personalized Google homepages retained both buttons and their normal functions.
Special features
- weather – The weather conditions, temperature, wind, humidity, and forecast for many cities can be viewed by typing "weather" along with a city name (for larger cities) or a city and state, U.S. zip code, or city and country (for smaller cities), such as: weather Lawrence, Kansas; weather Paris; weather Bremen, Germany.
- stock quotes – The market data for a specific company or fund can be viewed by typing the ticker symbol (optionally followed by "stock"), such as: CSCO; MSFT; IBM stock; F stock (lists Ford Motor Co.); or AIVSX (fund). Results show inter-day changes, a 5-year graph, etc.
- time – The current time in many cities worldwide can be viewed by typing "time" and the name of the city (such as: time Cairo; time Pratt, KS).
- sports scores – The scores and schedules for sports teams can be displayed by typing the team name or league name into the search box.
- calculator – Calculation results can be computed live by entering a formula in numbers or words, such as: 6*77 +pi +sqrt(e^3)/888 plus 0.45. After the calculation, the user is given the option to search for the formula.
- unit conversion – Measurements can be converted by entering a phrase such as: 10.5 cm in inches; or 90 km in miles.
- currency conversion – Amounts of money can be converted by typing the currency names or currency codes (listed by ISO 4217), such as: 6789 Euro in USD; 150 GBP in USD; 5000 Yen in USD; 5000 Yuan in lira (the U.S. dollar can be USD or "US$" or "$", while Canadian is CAD, etc.).
- dictionary lookup – A definition for a word or phrase can be found by entering "define" plus the word(s) to look up (such as: Define philosophy).
- maps – Some related maps can be displayed by typing in the name or U.S. ZIP code of a location and the word "map" (such as: New York map; Kansas map; or Paris map).
- movie showtimes – Reviews or film showtimes can be listed for any movies playing nearby by typing "movies" or the name of any current film into the search box. If a specific location was saved on a previous search, the top search result will display showtimes for nearby theaters for that movie. These listings, however, are sometimes incorrect, and there is no way to ask Google to correct them; for example, on 25 July, Google showtimes listed Up for the El Capitan Theatre, but according to the El Capitan website the only movie playing that day was G-Force.
- public data – Trends for population (or unemployment rates) can be found for U.S. states & counties, by typing "population" or "unemployment rate" followed by a state or county name.
- real estate and housing – Home listings in a given area can be displayed, using the trigger words "housing", "home", or "real estate" followed by the name of a city or U.S. zip code.
- travel data/airports – The flight status for arriving or departing U.S. flights can be displayed by typing the name of the airline and the flight number into the search box (such as: American airlines 18). Delays at a specific airport can also be viewed (by typing the name of the city or three-letter airport code plus the word "airport").
- package tracking – Package mail can be tracked by typing the tracking number of a Royal Mail, UPS, FedEx or USPS package directly into the search box. Results will include quick links to track the status of each shipment.
- patent numbers – U.S. patents can be searched by entering the word "patent" followed by the patent number into the search box (such as: Patent 5123123).
- area code – The geographical location (for any U.S. telephone area code) can be displayed by typing a 3-digit area code (such as: 650).
- synonym search – A search can match words similar to those specified, by placing the tilde sign (~) immediately in front of a search term, such as: ~fast food.
- U.S. Government search – U.S. government websites can be searched from the webpage www.google.com/ig/usgov.
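The unit-conversion and calculator features in the list above can be mimicked with a short sketch. It only shows the arithmetic behind the example queries; the `convert` helper, its four-unit table and its strict "number unit in unit" phrase format are assumptions, not a description of how Google parses such queries.

```python
import math

# Length of each unit in metres; the inch and mile values are exact by definition.
LENGTH_IN_METRES = {"cm": 0.01, "km": 1000.0, "inches": 0.0254, "miles": 1609.344}


def convert(phrase):
    """Convert a phrase of the form '<number> <unit> in <unit>'."""
    value, source, _, target = phrase.split()
    return float(value) * LENGTH_IN_METRES[source] / LENGTH_IN_METRES[target]


print(round(convert("10.5 cm in inches"), 4))
print(round(convert("90 km in miles"), 4))

# The calculator example from the list, written as a Python expression
# ("plus" becomes "+", "^" becomes "**").
calculated = 6 * 77 + math.pi + math.sqrt(math.e ** 3) / 888 + 0.45
print(calculated)
```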
Search options
The webpages maintained by the Google Help Center describe more than 15 search options. The Google operators are:
- OR – Search for either one; for example, "price high OR low" searches for "price" with "high" or "low".
- "-" – Search while excluding a word; for example, "apple -tree" finds pages where the word "tree" is not used.
- "+" – Force inclusion of a word; for example, "Name +of +the Game" requires the words "of" and "the" to appear on a matching page.
- "*" – Wildcard operator to match any words between other specific words.
Some of the query options are as follows:
- define: – The query prefix "define:" will provide a definition of the words listed after it.
- stocks: – After "stocks:" the query terms are treated as stock ticker symbols for lookup.
- site: – Restrict the results to websites in the given domain, such as site:www.acmeacme.com. The option "site:com" will search all domain URLs ending in ".com" (no space after "site:").
- allintitle: – Only the page titles are searched (not the remaining text on each webpage).
- intitle: – Prefix to search in a webpage title; for example, "intitle:google search" lists pages with the word "google" in the title and the word "search" anywhere (no space after "intitle:").
- allinurl: – Only the page URL address lines are searched (not the text inside each webpage).
- inurl: – Prefix for each word to be found in the URL, while other words are matched anywhere; for example, "inurl:acme search" matches "acme" in a URL, but matches "search" anywhere (no space after "inurl:").
The page-display options (or query types) are:
- cache: – Highlights the search words within the cached document; for example, "cache:www.google.com xxx" shows cached content with the word "xxx" highlighted.
- link: – The prefix "link:" will list webpages that have links to the specified webpage; for example, "link:www.google.com" lists webpages linking to the Google homepage.
- related: – The prefix "related:" will list webpages that are "similar" to a specified web page.
- info: – The prefix "info:" will display some background information about one specified webpage, such as info:www.google.com. Typically, the info is the first text (160 bytes, about 23 words) contained in the page, displayed in the style of a results entry (for just the one page matching the search).
- filetype: – Results will show only files of the desired type; for example, "filetype:pdf" returns PDF files.
Note that Google searches the HTML coding inside a webpage, not the screen appearance: the words displayed on a screen might not be listed in the same order in the HTML coding.
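The operators and query options listed above combine into a single query string. The sketch below assembles one and URL-encodes it; the `build_query` helper and its parameter names are hypothetical, and passing the string as a `q` parameter is an assumption about how the query reaches the search page.

```python
from urllib.parse import urlencode


def build_query(words=(), phrase=None, exclude=(), intitle=None, site=None, filetype=None):
    """Join plain words, a quoted phrase, exclusions and operator terms into one query."""
    parts = list(words)
    if phrase:
        parts.append(f'"{phrase}"')
    parts.extend(f"-{word}" for word in exclude)
    # Operators take their value immediately after the colon, with no space.
    if intitle:
        parts.append(f"intitle:{intitle}")
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


query = build_query(words=["apple"], exclude=["tree"], site="www.acmeacme.com", filetype="pdf")
print(query)                    # apple -tree site:www.acmeacme.com filetype:pdf
print(urlencode({"q": query}))  # q=apple+-tree+site%3Awww.acmeacme.com+filetype%3Apdf
```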
Some searches return a 403 Forbidden error with the text:
"We're sorry... ... but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can't process your request right now. We'll restore your access as quickly as possible, so try again soon. In the meantime, if you suspect that your computer or network has been infected, you might want to run a virus checker or spyware remover to make sure that your systems are free of viruses and other spurious software. We apologize for the inconvenience, and hope we'll see you again on Google."
The screen was first reported in 2005, and was a response to the heavy use of Google by search engine optimization companies to check on ranks of sites they were optimizing. The message is triggered by high volumes of requests from a single IP address. Google apparently uses the Google cookie as part of deciding whether to refuse service.
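Refusing service after a high volume of requests from one IP address is commonly done with a sliding-window counter, sketched below. This is a generic technique rather than Google's actual mechanism: the class name, the limit of three requests per ten seconds, and the example address are all hypothetical.

```python
from collections import defaultdict, deque


class RequestThrottle:
    """Allow at most `max_requests` per `window_seconds` for each IP address."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # IP address -> timestamps of allowed requests

    def allow(self, ip_address, now):
        recent = self._history[ip_address]
        # Forget requests that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


throttle = RequestThrottle(max_requests=3, window_seconds=10)
statuses = [200 if throttle.allow("192.0.2.1", now=t) else 403 for t in (0, 1, 2, 3, 12)]
print(statuses)  # [200, 200, 200, 403, 200]
```

The fourth request arrives while three earlier ones are still inside the window and is refused; by the fifth, those have expired and access is restored.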
In June 2009, after the death of pop superstar Michael Jackson, this message appeared to many internet users who were searching Google for news stories related to the singer. Google assumed the surge to be a DDoS attack, although many of the queries came from legitimate searchers. This phenomenon quickly became known as the Jackson Effect.
January 2009 malware bug
Google flags search results with the message "This site may harm your computer" if the site is known to install malicious software in the background or otherwise surreptitiously. Google does this to protect users against visiting sites that could harm their computers. For approximately 40 minutes on January 31, 2009, all search results were mistakenly classified as malware and could therefore not be clicked; instead a warning message was displayed and the user was required to enter the requested URL manually. The bug was caused by human error: the URL of "/" (which expands to all URLs) was mistakenly checked in as a value to the file of flagged sites.
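Why a single "/" entry could flag every result can be shown with a toy example. The source does not describe how entries in the file are matched against URLs, so the substring rule below, the function name, and the sample entries are purely hypothetical.

```python
def is_flagged(url, flagged_patterns):
    """Flag a URL when any listed pattern occurs inside it (hypothetical matching rule)."""
    return any(pattern in url for pattern in flagged_patterns)


urls = ["http://www.google.com/", "http://example.org/page"]
flagged = ["badsite.example/downloads"]

before = [is_flagged(url, flagged) for url in urls]
flagged.append("/")  # the mistaken entry
after = [is_flagged(url, flagged) for url in urls]

print(before)  # [False, False]
print(after)   # [True, True]
```

Because every URL contains a slash, the one extra entry matches all of them under this rule.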
On certain occasions, the logo on Google's webpage will change to a special version, known as a "Google Doodle". Clicking on the Doodle links to a string of Google search results about the topic. The first was a reference to the Burning Man Festival in 1997, and others have been produced for the birthdays of notable people like Albert Einstein, historical events like the interlocking Lego block's 50th anniversary and holidays like Valentine's Day.
In August 2009, Google announced the rollout of a new search architecture, codenamed "Caffeine". The new architecture was designed to return results faster and to better deal with rapidly updated information from services including Facebook and Twitter. Google developers noted that most users would notice little immediate change, but invited developers to test the new search in its sandbox. One noticeable change was the search return time: in numerous tests it returned results in nearly half the time. Other differences noted for their impact upon search engine optimization included heavier keyword weighting and the importance of the domain's age. The move was interpreted in some quarters as a response to Microsoft's recent release of an upgraded version of its own search service, renamed Bing.
Google is available in many languages and has been localized for many countries.
The interface has also been made available in some languages for humorous purposes.
In addition to the main URL Google.com, Google owns 160 domain names, one for each of the countries/regions in which it has been localized. As Google is an American company, the main domain name can be considered the U.S. one.
Some domain names unregistered by Google are currently squatted:
- Google.ua (Ukraine), the correct URL is google.com.ua
In addition to its tool for searching webpages, Google also provides services for searching images, Usenet newsgroups, news websites, videos, searching by locality, maps, and items for sale online. By 2006, Google had indexed over 25 billion web pages, 1.3 billion images, and over one billion Usenet messages, and was handling 400 million queries per day. It also caches much of the content that it indexes. Google operates other tools and services including Google News, Google Suggest, Google Product Search, Google Maps, Google Co-op, Google Earth, Google Blog Search and Google Desktop Search.
There are also products available from Google that are not directly search-related. Gmail, for example, is a webmail application, but still includes search features; Google Browser Sync does not offer any search facilities, although it aims to organize your browsing time.
- Google Hacks from O'Reilly is a book containing tips about using Google effectively. Now in its third edition. ISBN 0-596-52706-3.
- Google: The Missing Manual by Sarah Milstein and Rael Dornfest (O'Reilly, 2004). ISBN 0-596-00613-6
- How to Do Everything with Google by Fritz Schneider, Nancy Blachman, and Eric Fredricksen (McGraw-Hill Osborne Media, 2003). ISBN 0-07-223174-2
- Google Power by Chris Sherman (McGraw-Hill Osborne Media, 2005). ISBN 0-07-225787-3
- Barroso, Luiz Andre; Dean, Jeffrey; Hölzle, Urs (2003). "Web Search for a Planet: The Google Cluster Architecture". IEEE Micro 23 (2): 22–28. doi:10.1109/MM.2003.1196112.
- Blogpost.com, Evolution of Google Home Page from 1998 to 2008
- Web.Archive.org, A cached page of Google from 1998