    Gazing at the Web from the Back Porch
    By Michael Martinez | September 8th 2011 01:56 PM
    About Michael

    Michael Martinez has a Bachelor of Science degree in Computer Science, an Associate of Science degree in Data Processing Technology, and a few certifications...

    The universe I see when I sit on my porch and look up into the sky is very different from the universe a professional astronomer sees with all of today's available advanced technology.  On a clear night I may be able to see a few galaxies.  On any night astronomers around the world may be counting billions of galaxies.

    There are similar differences in the scales by which we measure the Web.  Most Websites use some sort of routine analytics reporting tool.  Even if you create a blog on WordPress.com you can get some basic statistics, including lists of Websites that send traffic to your blog through their links.  Professional Web marketers have developed numerous tools to measure "backlink profiles" both from within analytics and through third-party tools.

    The average Web marketer, using these tools, is equivalent to your local astronomy club's chief amateur astronomer.  He is the go-to guy for all the newer members of the club; he has the fanciest and biggest home telescope in town; and he can quote a lot of astronomy books and tell you not only the names of obscure stars but also how to find them at just about any time of the year.

    The people who measure links between Websites at a level approaching scientific quality are the engineers and analysts who work for the metrics and search companies.  They form a small, elite group that seldom emerges from the cluttered background processes of documenting and indexing the Web into the limelight of mass audience communication.  There is also a small academic community that studies Webometrics and related topics.  Some of their work has contributed to a small body of patents that document some of the limitations of Webometrics.

    That leaves the Web marketers to represent the majority of discussion on Web(o)metrics, and in my opinion and experience Web marketers don't do a very good job of actually measuring the Web.  Their measurements tend to follow the money, and the money doesn't follow the Web.  Social media metrics companies have attempted to compensate for the deficiencies in Web marketing metrics, but their data tends to look like coffee table books on astronomy.  After all, someone has to pay for all these metrics, and that someone is usually a marketing VP with a limited budget.  He wants to see pretty pictures, charts, circles connected by lines, and, most importantly, interpretations of the circles and lines.

    This situation forces everyone to fall back on observing the Web from within their own Websites.  If you so much as look at your "Referrers" data you're miles ahead of the next guy, because most people don't even do that.  The links that other Websites use to send traffic to your site are the strands of the Web that you can see.  If your analytics report several thousand referrers a month you have an average depth of sight into the Web.

    That is, your telescope can see about as far as the edge of the planetary Solar system, where the last official planet orbits.  You might have an occasional glimpse of odd objects from the Oort cloud.

    Websites that receive referral traffic from 50,000 or more other sites can see their local galaxies.  These sites have "brand" power on a low scale.

    Websites that receive traffic from 250,000 or more sites per month see their local galactic clusters.  And here is where I'll stop the analogy because I've already outpaced the majority of analytics reports.
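
    As a rough illustration of the scale described above, the sketch below counts the distinct sites in one month of referrer URLs and maps that count onto the analogy.  The function names are hypothetical, treating "www.example.com" and "example.com" as one site is an assumption, and everything below 50,000 is collapsed into a single tier because "several thousand" is not a precise cutoff.

```python
from urllib.parse import urlsplit


def referring_sites(referrer_urls):
    """Return the set of distinct hostnames found in a list of referrer URLs."""
    sites = set()
    for url in referrer_urls:
        host = urlsplit(url).hostname  # None when the URL has no host part
        if not host:
            continue
        # Assumption: a leading "www." does not make a different site.
        if host.startswith("www."):
            host = host[4:]
        sites.add(host)
    return sites


def depth_of_sight(monthly_site_count):
    """Map a monthly count of referring sites onto the article's analogy."""
    if monthly_site_count >= 250_000:
        return "local galactic clusters"
    if monthly_site_count >= 50_000:
        return "local galaxies"
    # Assumption: all smaller counts are lumped into one tier.
    return "Solar system or nearer"
```

    Calling depth_of_sight(len(referring_sites(urls))) on a month of raw referrer URLs gives the tier for that month.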

    When comparing the referrer profiles for multiple Websites in the same "vertical", one can often see substantial differences in traffic patterns.  You can even see these differences when comparing similar sites on free tools (of debatable value) like Alexa and Compete.

    Websites tend to favor specific Websites with their links.  These biases in link flow are attributable to a small number of causes.  Some Websites are members of proprietary or cooperative networks and therefore they may share content or navigational structures with each other.  Some Websites are social and their members tend to follow small groups of Websites; the larger the social community, the more referrals it makes to other sites.  Some Websites reflect the interests of individuals or small groups of writers whose exposure to the broader Web is limited.

    These tendrils of connectivity create Local Web spaces, communities that are often tightly interlinked (compared to their connectivity to the rest of the Web) and which often are represented by gateway or hub sites that attract the majority of "outside" links and around which the smaller sites are situated.  (A relevant study on social networks revealed that their interconnectivity resembled offline social networks -- as best I have been able to determine, the paradigm extends across the Web).
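
    One simple way to look for the gateway or hub sites of a local Web space is to count, for each member, how many distinct outside sites link to it.  The sketch below assumes you already have a list of (source, target) site pairs and a set of sites you consider the community; both inputs and the function name are hypothetical, and this is only one of many possible hub measures.

```python
from collections import Counter


def hub_sites(links, community):
    """Count distinct outside sites linking to each community member.

    links: iterable of (source_site, target_site) pairs.
    community: set of sites treated as the local Web space.
    """
    outside_sources = {}
    for source, target in links:
        # Only links that cross into the community from outside are counted.
        if target in community and source not in community:
            outside_sources.setdefault(target, set()).add(source)
    return Counter({site: len(srcs) for site, srcs in outside_sources.items()})
```

    The member at the top of hub_sites(links, community).most_common() is the one attracting the most "outside" links in the data you supplied.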

    The size of a local Web space is measurable in varying degrees of definition or precision.  For example, should only a single month's data be used to measure the local Web space, you may include many incidental links from sites that are not really part of your community.  Comparing referral data from many months, up to two years, filters out incidental links and builds a picture of the Websites that consistently link to or drive traffic to a site through their links.
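
    The multi-month comparison can be sketched as a persistence filter: keep only the referrers that show up in at least a given number of monthly reports.  The input shape (a mapping of month label to referring sites) and the min_months parameter are assumptions made for the example, not part of any particular analytics package.

```python
from collections import Counter


def persistent_referrers(monthly_referrers, min_months=2):
    """Return referrers that appear in at least `min_months` monthly reports.

    monthly_referrers: dict mapping a month label to an iterable of sites.
    """
    months_seen = Counter()
    for sites in monthly_referrers.values():
        months_seen.update(set(sites))  # count each site once per month
    return {site for site, n in months_seen.items() if n >= min_months}


reports = {
    "2011-06": ["a.example", "b.example", "one-off.example"],
    "2011-07": ["a.example", "c.example"],
    "2011-08": ["a.example", "b.example"],
}
consistent = persistent_referrers(reports, min_months=2)
```

    Raising min_months tightens the definition of the community; lowering it to 1 reproduces the single-month view with all of its incidental links.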

    However, aggregating historical data creates a time-lapse panoramic view of the Web that never existed.  That is, many Websites that were active 2 years ago may no longer be functioning today, or they may now be dormant.  The activity associated with those sites from 2 years ago skews your picture of the local Web space.  But some Websites change their links very rapidly.  For example, Websites that publish a lot of new articles change their front page links on a daily or hourly basis (Science 2.0 being a good example).  Many blogs and forums scroll older crawlable content onto secondary pages.

    Links vanish quickly and the more links a Website attracts the more links it loses.  Time affects both the connectivity of the Web and our attempts to measure it.  Web metrics do not adequately take time into consideration.  There is no way to truly glimpse a real-time view of the Web.  During the time it takes your server to retrieve the latest version of a single RSS feed from another site (measured in seconds or fractions of seconds), thousands of new pages of content have been created across the Web and hundreds of Web pages have vanished.

    A more accurate picture emerges when you aggregate referral data from multiple Websites over a short time frame.  You see the "Living Web Space" that surrounds the community of Websites you're measuring.  Again, the fewer time segments you use in your measurement, the more incidental links you include in your results.  But extending a time segment also adds incidental links to a profile.  A reasonable compromise for most local Web spaces is looking at 3 months' worth of referrals in monthly or weekly segments, filtering out non-repeating links.
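
    A minimal sketch of that compromise follows, assuming the referral data from several sites has been pooled into (referrer, target) pairs and split into time segments, for example three monthly segments.  The function name and data layout are hypothetical; a link is kept only if it repeats in at least two segments.

```python
from collections import defaultdict


def living_web_space(segments, min_segments=2):
    """Return {referrer: set of targets} for links that repeat across segments.

    segments: list of iterables of (referrer, target) pairs, one per
    time segment, pooled from all the Websites being measured.
    """
    seen_in = defaultdict(int)
    for segment in segments:
        for link in set(segment):  # count a link once per segment
            seen_in[link] += 1

    space = defaultdict(set)
    for (referrer, target), count in seen_in.items():
        if count >= min_segments:  # drop non-repeating (incidental) links
            space[referrer].add(target)
    return dict(space)
```

    Referrers that map to more than one target in the result are sites that consistently send traffic to several members of the community you are measuring.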

    In measuring the connectivity of Websites we establish parameters that allow us to identify and track paths of influence, paths of viral communication, and the birth of new communities.  A Web community may spawn almost entirely without connection to other communities, but often there is a temporary burst of linkage from older communities that helps establish new communities.  The new communities then either die off or become large enough to sustain themselves.  Over time the older links vanish, sometimes to be replaced by new links, sometimes not.

    The problem of link decay was first documented by Tim Berners-Lee in 1998.  At the time he wrote: "There are no reasons at all in theory for people to change URIs (or stop maintaining documents), but millions of reasons in practice."  The reverse of link decay is the process of "link bursts", where, as noted above, significant numbers of links pointing to the same content appear within a relatively brief period of time.  These processes endlessly repeat themselves.  For that reason alone, contemporary Web measurements have relatively short lifespans of accuracy or reliability.
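
    Both processes can be observed, in a rough way, by comparing consecutive snapshots of the sites that link to a piece of content.  In the sketch below the snapshot format, the function name, and the burst_ratio threshold are all assumptions chosen for illustration; what counts as a "burst" is a judgment call, not an established standard.

```python
def link_churn(snapshots, burst_ratio=0.5):
    """Compare consecutive snapshots of linking sites.

    snapshots: time-ordered list of sets of linking sites.
    burst_ratio: hypothetical threshold; a period is flagged as a burst
    when new links number at least this fraction of the previous snapshot.
    """
    report = []
    for prev, cur in zip(snapshots, snapshots[1:]):
        gained = cur - prev  # links that appeared since the last snapshot
        lost = prev - cur    # links that decayed since the last snapshot
        report.append({
            "gained": len(gained),
            "lost": len(lost),
            "decay_rate": len(lost) / len(prev) if prev else 0.0,
            "burst": bool(prev) and len(gained) / len(prev) >= burst_ratio,
        })
    return report
```

    Because every entry describes only the interval between two snapshots, the report ages as quickly as the links themselves, which is the short lifespan of accuracy described above.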

    The practical limitations of measuring the Web from within link referral data aside, the chief value of these kinds of metrics is to provide a timeline of growth in visibility that can be compared to benchmark studies to assess the progress of a Website's marketing, popularity, reach, and influence.