
Introduction

The flow of information and influence in scientific societies is fascinating to its participants. Each develops intuitive models of the dynamics and structure of this flow. The challenge is to find data and methodology which permit rational exploration of these intuitions. One method, developed before the advent of the World Wide Web, relies on the co-citation metric [5]. The co-citation metric measures the distance between scientific documents. Two documents are close in this metric if they cite many of the same sources. Using this metric, maps can be built of the landscape of scientific literature. These maps show the relationship between fields and subfields and allow one to identify particularly influential publications. In an early study of the dynamics of citation maps, the field of cellular automata was mapped over a period of years [1].
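As a minimal sketch of the idea described above, one can treat each document as the set of sources it cites and score two documents as close when those sets overlap heavily. The text does not give a formula, so the Jaccard distance used here is an assumption made for illustration, and the function name and sample data are hypothetical.

```python
def citation_distance(refs_a, refs_b):
    """Distance in [0, 1] between two documents, given the sources each cites.

    Assumption: Jaccard distance on the sets of cited sources. The metric in
    the cited literature may be normalized differently.
    """
    a, b = set(refs_a), set(refs_b)
    union = a | b
    if not union:
        # Neither document cites anything: treat them as maximally distant.
        return 1.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical documents, each mapped to the set of sources it cites.
docs = {
    "doc_a": {"s1", "s2", "s3"},
    "doc_b": {"s2", "s3", "s4"},
    "doc_c": {"s5"},
}

# Pairwise distances; a map of the literature could be built from such a table.
names = sorted(docs)
distances = {
    (x, y): citation_distance(docs[x], docs[y])
    for i, x in enumerate(names)
    for y in names[i + 1:]
}
```

Here doc_a and doc_b share two of their four distinct sources and so lie closer to each other than either does to doc_c, which shares none.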

The phenomenal spread of the internet has opened new vistas in the study of reference to scientific literature. A vast storehouse of archives and cross-references is now easily and publicly available. Further, information now flows quickly enough that experimental perturbation is feasible.

The database on which the present study is based is admittedly meager; it consists of records of ftp transfers and http requests for two documents available over a period of months on the Artificial Life Online server at the Santa Fe Institute. Nonetheless, these data reveal interpretable trends, and support hope that a full-scale investigation would be fruitful. The motivations for mounting such a study are several: 1) The internet is a prime example of a complex adaptive system, a system consisting of many loosely interacting components, able to develop complex global behaviors in adaptive reaction to its environment. 2) A vast and explosively growing distributed database exists, while technical and intellectual tools for dealing with this new profusion of data are lacking. 3) The potential economic impact of these data is enormous, a fact sufficiently well elaborated in the popular press that we need not dwell on it here. 4) Though the scientific community has long used the internet, it is only recently that the volume of information distributed on electronic channels has seriously threatened to eclipse that of traditional channels, such as journal publication. Now is the time to understand the intrinsic dynamics of this distribution.








Howard A. Gutowitz
Sun Dec 10 22:56:22 MST 1995