The Web: Big, Indexed, Distributed, Permanent

The Web is huge: it contains more information than I can comprehend. It started exploding around November 1993, when NCSA Mosaic came out. Now, just about two years later, there are literally millions of different pages out there, information that previously did not exist. When I got serious about the Web in January 1994, I felt I had a pretty good handle on what was out there. No way now.

Distributed, Indexed

In my CS seminar at the University of New Mexico, "Living Computation", taught by David Ackley, we've been talking broadly about grand visions of where computers and networks could take us. One of the papers we read was by David Gelernter, on "Lifestreams".

Briefly, the idea of Lifestreams is that you store every tiny bit of information a person ever generates, but you don't structure it in any way: just dump it all in a big pile. Later, when you need something from your information cache, you construct a query on the fly and somehow sort through this enormous database to find your needle.
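
To make the idea concrete, here is a minimal sketch in Python of "dump it in a pile, query it later". It is not Gelernter's design, only an illustration of the description above: the class name Pile, its two methods, and the sample entries and dates are all made up for the example, and the query is a plain linear scan with no index at all.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Pile:
    """An unstructured, append-only heap of items (hypothetical name)."""
    items: list = field(default_factory=list)

    def dump(self, text, when):
        # No folders, no categories: just record the text and when it arrived.
        self.items.append((when, text))

    def query(self, *words):
        # All structure is imposed at query time: a linear scan that keeps
        # every item containing all of the given words, newest first.
        wanted = [w.lower() for w in words]
        hits = [(when, text) for when, text in self.items
                if all(w in text.lower() for w in wanted)]
        return sorted(hits, reverse=True)


# Illustrative sample data only.
pile = Pile()
pile.dump("Notes from the Living Computation seminar", datetime(1995, 11, 13))
pile.dump("Review of Nadja, shot partly in Pixelvision", datetime(1995, 11, 20))
pile.dump("SCREEN-L thread about Pixelvision cameras", datetime(1995, 2, 1))

for when, text in pile.query("pixelvision"):
    print(when.date(), text)
```

A scan like this touches every item on every query, which is exactly why searching a lifetime of data sounds hopeless. A search service presumably avoids that cost by building an index ahead of time.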

The article I read was deliberately futuristic; to me it seemed irresponsibly optimistic, like Star Trek. Not only is storing all that data difficult, but it has got to be impossible to search such a huge database in any useful fashion. But then I had an interesting experience with Lycos, one that gave a hint of what is possible. The message below is email I sent to my class:


From: nelson@santafe.edu (Nelson Minar)
Subject: the future, the Web
Date: Mon, 20 Nov 95 22:44:50 MST
Message-Id: <9511210544.AA04196@sfi.santafe.edu>

I'm in movie-browsing mode here on the Web, and thought I'd try to find references to "Pixelvision", a weird little motion picture camera that Fisher-Price made a few years ago as a children's toy. It's an extremely low-res digital camera that records onto audio tape, horrible quality.

But hipster quality - Pixelvision has gotten some fame after a few low budget independent film makers used it: the effect is remarkable. The new vampire film "Nadja" is (I think) the first commercial film to use Pixelvision. It's a great movie, and the Pixelvision scenes give a nice flavour to the film.

Anyway, I was curious to learn more about Pixelvision cameras. Obscure topic, but what the hell? I type in Pixelvision on Lycos, and up comes 23 references in about 3 seconds. Gopher archives of a discussion last February on the listserv list SCREEN-L. Independent film festival programs from the last year, three or four underground culture zines, two reviews of Nadja, etc. Oh, and a couple of references to a graphics accelerator board.

The thing that amazed me isn't just that I found references to this obscure topic, but the age of those references. Ten-month-old email! Year-old zines! Will this stuff ever be deleted? Once things make it onto the Web, will they ever disappear? We have enormous amounts of fast indexed distributed storage, right here in this simple technology of the Web and Lycos. Maybe Gelernter wasn't so far off.

A bit shaken,
Nelson

                                __                      
nelson@santafe.edu              \/              http://www.santafe.edu/~nelson/
PGP key 9D719FAD   Fingerprint 3B 9B 8E 58 1C 90 57 3E  B7 99 ED 13 65 2E 0B 24

Nelson Minar <nelson@santafe.edu>
Last modified: Mon Nov 20 23:22:52 MST 1995