Alexa Technology
How and Why We Crawl the Web
Alexa is continually crawling all publicly available web sites to create a series of snapshots of the Web. We use the data we collect to create features and services:
- Site Information: Traffic rankings, pictures of sites, links pointing to sites and more
- Related Links: Sites that are similar to the one you are currently viewing
To programmatically access Alexa's vast information about the Web, please visit Alexa Web Information Service. To keep Alexa from crawling your site, please visit this page.
Gathering Web Usage Information
In addition to the Alexa Crawl, which can tell us what is on the Web, Alexa utilizes web usage information, which tells us what is being seen on the web. This information comes from the community of Alexa Toolbar users. Each member of the community, in addition to getting a useful tool, is giving back. Simply by using the toolbar each member contributes valuable information about the web, how it is used, what is important and what is not. This information is returned to the community with improved Related Links, Traffic Rankings and more.
Finding Patterns in Data
The Alexa services are derived from our uniquely powerful combination of Web content and usage information.
Site Stats
Alexa gathers Site Stats from a variety of sources to provide key statistics about each site on the Web. These include Traffic Rank and Speed, which are derived from Web usage information, and Other sites that link to this site and Online Since, both of which come from Web content. For an example of Site Stats, see the Alexa Overview page for Schwab.com.
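As a rough illustration of how a statistic such as "Other sites that link to this site" could be derived from crawled content, the sketch below inverts a map of outgoing links into a map of inbound links. The `crawl` dictionary, the `inbound_links` function and the example domains are hypothetical and are not taken from Alexa's actual system.

```python
from collections import defaultdict

def inbound_links(crawl):
    """Invert {source_site: [linked_site, ...]} into {site: set of sites linking to it}."""
    inbound = defaultdict(set)
    for source, targets in crawl.items():
        for target in targets:
            if target != source:  # a site linking to itself is not counted
                inbound[target].add(source)
    return dict(inbound)

# Hypothetical crawl result: each site mapped to the sites it links to.
crawl = {
    "a.example": ["schwab.example", "b.example"],
    "b.example": ["schwab.example"],
    "schwab.example": ["schwab.example", "a.example"],
}
links = inbound_links(crawl)
print(len(links["schwab.example"]))  # number of distinct sites linking in
```

Sets are used so that several links from the same source site count only once.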
Contact Info
Alexa provides contact information for Web sites by mining the Web content gathered in the crawl. This information includes Site Owner, Address, Phone Number and contact e-mail address. See Contact Info for Schwab.com.
Traffic Details
Web usage information is used to estimate the number of page views and the number of users that Web sites receive. This data is also the basis for the Alexa traffic rank and traffic history graphs. See Traffic Details for Schwab.com.
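To make the link between usage data and a traffic rank concrete, here is a minimal sketch that orders sites by a score combining users and page views. The text above does not give the formula, so the geometric mean used here is an assumption, and `traffic_rank`, `stats` and the example domains are hypothetical.

```python
import math

def traffic_rank(stats):
    """Map {site: (unique_users, page_views)} to {site: rank}, where 1 is the busiest."""
    # Assumed scoring rule: geometric mean of users and page views.
    scores = {site: math.sqrt(users * views) for site, (users, views) in stats.items()}
    # Sort by descending score; break ties alphabetically so the result is stable.
    ordered = sorted(scores, key=lambda site: (-scores[site], site))
    return {site: position for position, site in enumerate(ordered, start=1)}

# Hypothetical usage figures for three sites.
stats = {
    "a.example": (1000, 5000),
    "b.example": (4000, 4000),
    "c.example": (10, 20),
}
ranks = traffic_rank(stats)
print(ranks)
```

A combined score keeps a site with few users but many page views from outranking a site that reaches far more people.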
Our goal for these features is to help people navigate the Web more efficiently by giving them all the information they need to make informed decisions about the sites they visit.
Related Links
Whenever an Alexa Toolbar user visits a web page, the Alexa Toolbar retrieves information from the Alexa servers to suggest other pages that might be of interest to the user. To generate Related Links, we use several techniques, including:
- The usage paths of the collective Alexa community: this is our most important source of information, since these paths show us which web sites our users believe are important and interesting.
- Clustering - the hundreds of millions of links on the Web can be used to find clusters of sites that are similar and relevant to one another. We mine this data by using custom databases to find and identify these clusters.
- Users' suggestions - we consider our users' suggestions to augment our Related Links recommendations.
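The usage-path technique listed above can be sketched as a simple co-visitation count: sites that appear together in many browsing paths are treated as related. This is only an illustration under that assumption, not Alexa's actual algorithm; `build_covisits`, `related_links` and the example paths are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_covisits(paths):
    """Count, for every pair of sites, how many usage paths contain both."""
    covisits = defaultdict(Counter)
    for path in paths:
        # Deduplicate within a path so repeat visits do not inflate the count.
        for a, b in combinations(sorted(set(path)), 2):
            covisits[a][b] += 1
            covisits[b][a] += 1
    return covisits

def related_links(covisits, site, k=3):
    """Return up to k sites most often co-visited with `site`."""
    ranked = sorted(covisits[site].items(), key=lambda item: (-item[1], item[0]))
    return [other for other, _ in ranked[:k]]

# Hypothetical usage paths, one list of visited sites per browsing session.
paths = [
    ["news.example", "sports.example", "weather.example"],
    ["news.example", "sports.example"],
    ["news.example", "finance.example"],
]
covisits = build_covisits(paths)
print(related_links(covisits, "news.example", k=2))
```

The same counting could be applied to links between sites instead of usage paths, which is one simple way to approach the clustering idea in the list.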
The Alexa Toolbar
The Alexa Toolbar is a program written by Alexa Internet that users install in their browser. Every time the user changes pages, the Alexa Toolbar communicates with Alexa servers to retrieve information, which is then displayed in the toolbar.