
The Internet as an Information Management Technology: New Tools, New Paradigms and New Problems

Andrew Treloar, School of Computing and Mathematics, Deakin University, Rusden Campus, 662 Blackburn Road, Clayton, Victoria, 3168. Ph: +61 3 9244 7461 Fax: +61 3 9244 7460 Email: Andrew.Treloar@deakin.edu.au. WWW: http://www.deakin.edu.au/~aet/


Presented at the 47th Conference of the International Federation for Information and Documentation (FID), Omiya, Japan, October 1994. Last updated June 5, 1996.

Abstract

This paper briefly discusses the development of information management technologies to the present day. It then considers the development of the worldwide Internet as an evolving technology for managing information. The implications of some current trends on the Internet are discussed, with a focus on new tools, new paradigms for working with information, and some of the new problems this brings. Finally some challenges for the future are outlined.

Key Words: Information technologies, information management, Internet, networked information, information access tools, Networked Information Retrieval (NIR), Computer Mediated Communication (CMC).




1. Introduction

1.1 Information technologies

Throughout history one of the distinguishing characteristics of homo sapiens has been our use of information. Indeed it has been argued (Goody [1981], Diamond [1992]) that it was our ability to work with symbolic information through the medium of language that started our species on its rapid and continuing process of cultural (as opposed to biological) evolution.

For all of our history as a species, we have used a range of technologies to manage this information. Some of these have been 'soft' technologies, technologies in the broadest sense. These may not involve any physical artefact at all, but consist rather of ways of working with information using only our minds. Other technologies, such as computers, may be considered 'hard' technologies. In the middle are 'firm' technologies such as pen and paper that are relatively low-level but still quite powerful.

Regardless of their level, these information technologies can all be grouped according to the functions they perform. One common division by function is into communication, storage/retrieval, and analysis technologies. Communication technologies provide a way to move information through space (and optionally, time). They include things like talking (soft), writing (firm), or the telephone (hard). Storage/Retrieval technologies provide the ability to archive and access information. Examples might be memorisation (soft), writing (firm), and magnetic storage (hard). Analysis technologies support working with the information, processing it into other forms. Examples here might be the brain (soft), abacus (firm), computer (hard).

1.2 Information management

All of these technologies, regardless of technology level, have been used to support the management of information in its widest sense. Before considering information management, it is important to distinguish information from data or knowledge. UNESCO has defined them as follows (Monviloff [1990]):

"Information is some meaningful message transmitted from source to users. In this process information may be stored in information products and systems organized for the purpose of providing a memory in numerical, textual, sound and image forms. Information may also be communicated through interpersonal channels. The 'source' may be documentary material, institutions or people.

Data are facts, the raw material from which information is created for or by the user. Information once assimilated by an individual becomes personal knowledge. Personal knowledge once incorporated in books, files, software, personal messages, etc. serves as potential information for others."

There are almost as many definitions of information management (as opposed to data management) as there are practitioners. In fact, one could argue that information management as a discipline either exists on the boundaries between computer science, communication studies, and library/information science, or subsumes all of them (Agha [1992], Boaden & Lockett [1991]).

In this paper, I propose to use a broad definition of information management as an ongoing process with three more or less distinct phases. These are collection, analysis and communication. Each phase draws, to varying degrees, on the categories of information technology already described.

The collection phase is where the information worker is gathering the raw material to satisfy some information need (either their own, or on behalf of a client). The data they collect will ultimately be turned into useful information. The collected data may be completely unprocessed (raw) or at some remove from its source. In this case one might talk about secondary or tertiary data. Note that data and information are not absolute terms. What is data to one user may be useful information to another. The data collection phase may require access to a variety of sources in a variety of media, and will almost certainly involve the use of storage and retrieval technologies. If the data sought are in another location, then communications technologies will also be necessary to access these data.

The analysis phase revolves around turning data into information. A range of possible information products can always be derived from a collection of data, and the most appropriate product must be created for the ultimate end-user. Storage/retrieval technologies may be used to hold working information, and analysis technologies to support the process of working with the data.

The final phase involves providing the final information product(s) to the user. This needs to be done when the user requires it, and in an appropriate form. Consideration needs to be given to the media used to deliver the information, and the level at which the information is pitched. Communications technologies play a leading role in this phase.

1.3 Early developments

The earliest information technologies were 'soft' technologies, based on the wetware of our brains rather than on any hardware. Communication occurred through the use of language, and analysis took place without any assistance. Storage and retrieval used techniques such as rote memorisation of poems or stories, and the use of spatial recall methodologies. These used a mnemonic device called 'memory theatres', in which the material to be memorised was attached to a familiar location (building or space). Each piece of information was associated with a particular part of the location (hall, doorway, room, etc.). Mental images recalling each piece of information were placed in the various locations and the entire assemblage committed to memory. This technique is a very ancient one - the word topic is derived from the Greek topos, meaning place. For more details, refer to Burke [1985], pp. 99 - 101, and Yates [1966].

These technologies were slow, limited, and risky. Slow, because it took large amounts of time to commit significant amounts of information to memory. Limited, because they only worked well with structured text or information that could be associated with symbols; accounting data or lists of stocks were much less amenable to such techniques. Risky, because of the possibilities of introduced errors and the hazards associated with having information stored only inside the heads of a few individuals. With these earliest technologies, information could be transmitted across time only through serial memorisation from one individual to the next, and across space only by moving the individual physically.

The development of writing freed information from the confines of the human brain and allowed the storage and communication of information independent of a specific individual. Information could now be transmitted in physical form across time and space, and different human cultures evolved an amazing range of ways to codify information in written form. While writing information down was faster than teaching it to another or memorising it, it was nonetheless fairly slow. Information could be copied multiple times, but still not quickly enough to support a publishing industry. Retrieval was facilitated through the organisation of the written materials, and sometimes through additional indices or reference lists, but was still largely carried out by humans. Analysis of information was sometimes supported through written summaries or the use of writing materials for calculations.

The development of printing enabled the rapid duplication and transmission of information, as well as multiply redundant storage in numerous locations. Printing, through the increase in the number of texts in libraries, forced the development of improved ways of organising and accessing such material. It facilitated no significant developments in analysis, however. Until very recently, print has been the main storage mechanism for most human-readable information.

The second half of this century has seen explosive development in both computing and communications technologies (what most people mean today when they refer to information technology). Communications technologies such as fax machines, mobile phones, satellite systems, and fibre-optics have enhanced our ability to move information rapidly and in a range of forms. Computer technologies have enormously expanded our ability to store and retrieve information, as well as to analyse it in an increasing variety of ways.

These two technologies are now converging to fundamentally change the ways in which we work with information. Nowhere are this convergence and the changes it might bring clearer than on the Internet. Nowhere is the challenge greater for information professionals.

2. The Internet

The Global Multiprotocol Open Internet includes the IP Internet (what most people think of as the Internet proper), Bitnet, FIDONet, UUCP, and OSI. It can best be thought of as a 'network of networks', linking over 14,000 separate networks around the world. Some parts (Bitnet/FIDONet) have access primarily to email, while others (the IP Internet) have access to the full range of services. It is this latter group of networks that will be considered here.

At present, the Internet is going through a period of enormously rapid and exciting development. This makes it very difficult to predict how it might evolve. However, there are a number of trends that seem likely to continue to affect the development of this worldwide network.

Firstly, there is explosive growth in users connected, hosts providing information services, the range of information services available, and the tools to access these. New hosts, services and tools seem to appear almost weekly, making any figures out of date as soon as they are entered. By August 1994 the Internet included more than 3 million host (multi-user) systems, and over 30 million users reachable through email. Even keeping track of how to find out what is available is becoming a full-time task. These numbers are currently doubling every ten months, and the rate of increase is itself increasing with no apparent end in sight.

Secondly, the Internet is increasingly driven by the spread of microcomputers and Unix. All of the new information access tools are based around the client-server architecture, running on multiple platforms and using standardised protocols. Microcomputers and their attendant graphical user interfaces (GUIs) are ubiquitous at the client end with Unix-based systems both at the client end (as high-end workstations) and the server end.

Thirdly, the user population is becoming more diverse as private service providers expand access to the Internet beyond the traditional academic and research base. In Australia, it is now possible for anyone to get low-cost (c. US$10/month) access to the Internet in most capital cities through the Australian Public Access Network Association (APANA).

Fourthly, there is a growing commercialisation of the Internet. The number of commercial networks connected is already more than half of the total. Private information providers are rushing to bring their services to the Internet community, often, but not always, for a fee.

It seems inescapable that in much of the developed world, the Internet or something like it will be in people's homes very soon. The implications of this enormously expanded user base are just starting to be grappled with.

3. New Developments

All this change provides a range of new tools for information management, introduces new paradigms into the information management community, and presents new problems to be solved. Each of these has significant implications for the way in which we work with information.

3.1 New Tools

Much of the excitement associated with the Internet is related to the information services available and the tools used to access these (Treloar [1993]).

Some of the older tools like text-only email and mainframe implementations of telnet and ftp have been around for some time and are well known and understood. The newer tools differ in four important respects from these older implementations. Firstly, they make use of the client-server approach to provide quite different front-end (client) and back-end (server) programs, communicating via standardised protocols. Secondly, they often come in both GUI and terminal-based versions. The GUI versions usually require TCP/IP as the underlying data transfer mechanism, often over Ethernet, and provide the richest functionality and greatest ease of use. The terminal versions, despite being handicapped by limited displays and much lower bandwidths, often do a surprisingly good job. Thirdly, they support facilities in addition to those provided on a local machine, things that are very different to the normal operations of logging on and copying files. Fourthly, they are based around a global world view; that is, they require the Internet or something like it to be meaningful and truly useful. The best known of these newer tools are archie, WAIS and Gopher. Less well known are Veronica, WWW, Mosaic and MUDs/MOOs. There are also GUI versions of ftp and telnet.

Archie (Deutsch [1992]) currently provides a searchable database of files available from anonymous ftp sites worldwide. The database is kept current by periodically 'polling' the ftp sites for changes to their holdings. The archie database is invaluable for locating particular files, if the name or part of it is known. Bunyip Information Systems Inc. is currently seeking to extend its usefulness to provide general centralised searching of distributed collections of information, and has just announced searching of gopher menus through archie, rivalling Veronica (see below).
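
To make the idea of a centralised, searchable database of distributed holdings concrete, the sketch below (in Python, with invented site names and file lists) builds a tiny archie-style index from per-site file listings and searches it by filename substring. It illustrates the principle only, not the actual archie software.

    # Illustrative sketch only: a toy, in-memory version of an archie-style index.
    # Site names and file listings are invented for the example.
    site_listings = {
        "ftp.example.edu": ["/pub/tools/gopher2.1.tar.Z", "/pub/docs/rfc1436.txt"],
        "archive.example.org": ["/mirrors/wais/wais-8-b5.tar.Z", "/pub/tools/clients.txt"],
    }

    # Build a single searchable index mapping each filename to the sites holding it.
    index = {}
    for host, paths in site_listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1]
            index.setdefault(filename, []).append((host, path))

    def search(substring):
        """Return (filename, host, path) for every indexed file whose name matches."""
        return [(name, host, path)
                for name, locations in index.items() if substring in name
                for host, path in locations]

    print(search("gopher"))   # e.g. locate the gopher distribution by name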

WAIS (Kahle [1992], Stanton [1992] and Stein [1991]), standing for Wide Area Information Servers, is a distributed networked database system, based on an extension of the NISO Z39.50-1988 standard. This describes a model for an "originating application" (client) to query databases at one or more "targets" (servers). WAIS indexes the contents of text files. A WAIS search consists of specifying words or phrases, perhaps combined with boolean operators. It is also possible to ask a WAIS server to look for documents that are 'similar' to a specified sequence of text. The answer in either case is a list of document titles. Each document can be retrieved if desired. Hundreds of WAIS servers, often specialising in particular subject areas, are now available, running on various back-end systems. Both GUI and terminal-based WAIS clients exist, although the terminal client swais is somewhat clumsy to use. WAIS also allows retrieval of the documents themselves once found.
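
The following sketch illustrates, in Python, the two kinds of query described above: a boolean keyword search and a 'find documents similar to this text' search ranked by shared words. The documents are invented and the scoring is deliberately naive; this is not the Z39.50 protocol or the WAIS indexing engine, just the flavour of the queries a WAIS server answers.

    # Illustrative sketch only: WAIS-style queries over a handful of invented documents.
    documents = {
        "Weather satellite imagery":  "satellite images of cloud cover updated hourly",
        "Introduction to Gopher":     "a hierarchical menu system for internet resources",
        "WAIS server administration": "indexing text files and answering keyword queries",
    }

    def keyword_search(words):
        """Return titles of documents containing every query word (a simple AND)."""
        return [title for title, text in documents.items()
                if all(w.lower() in text.lower().split() for w in words)]

    def similar_to(sample_text):
        """Rank documents by how many words they share with the sample text."""
        sample = set(sample_text.lower().split())
        scores = {title: len(sample & set(text.lower().split()))
                  for title, text in documents.items()}
        return sorted(scores, key=scores.get, reverse=True)

    print(keyword_search(["keyword", "queries"]))    # boolean-style query
    print(similar_to("menu system for resources"))   # relevance-style query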

Gopher (Wiggins [1993], Stanton [1992]) too is based on a true client/server model. Client software running on a user's personal workstation is preferred due to the better user interface and access to still images, audio files, and other resources, but terminal gopher clients also exist. Gopher offers access to files and interactive systems using a hierarchical menu system. Users navigate through menus to locate resources, which may include documents of various types, interactive telnet sessions, or gateways to other Internet services. Once resources have been located, they can often be retrieved to the local workstation, depending on the resource type.
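
The Gopher protocol itself is very simple (it is documented in RFC 1436): the client opens a connection, sends the selector string for the menu or document it wants, and reads back tab-separated menu lines. The sketch below shows a minimal client in Python; the server name is a placeholder rather than a real host.

    # Minimal sketch of a Gopher client (see RFC 1436). The client sends a selector
    # string terminated by CR LF; the server replies with one menu line per resource.
    import socket

    def fetch_gopher_menu(host, selector="", port=70):
        with socket.create_connection((host, port)) as sock:
            sock.sendall(selector.encode("ascii") + b"\r\n")
            data = b""
            while chunk := sock.recv(4096):
                data += chunk
        # Each menu line: <type><display string> TAB <selector> TAB <host> TAB <port>
        for line in data.decode("latin-1").splitlines():
            if line == ".":          # a lone full stop marks the end of the menu
                break
            fields = line.split("\t")
            if len(fields) >= 4:
                item_type, display = fields[0][:1], fields[0][1:]
                print(item_type, display, "->", fields[2] + ":" + fields[3], fields[1])

    # fetch_gopher_menu("gopher.example.edu")   # hypothetical server name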

The difficulty with 'Gopherspace' is knowing which Gopher server contains the required resources. Enter Veronica, or the "Very Easy Rodent-Oriented Netwide Index to Computerized Archives." Like archie, Veronica polls its target servers periodically to build its database. A user connecting to Veronica specifies a keyword to search for and is returned a list of document titles from throughout Gopherspace that match. Currently Veronica suffers from performance problems and difficulties in interpreting its results, but these problems are being worked on.

In contrast to the hierarchical, menu-based, one-hierarchy-per-server organisation of Gopher, the World-Wide Web, also written as WWW or W3 (Berners-Lee [1992]), is a distributed hypertext document delivery and access system based on the Hypertext Markup Language (HTML) for its document format, and the Hypertext Transfer Protocol (HTTP) for its delivery system. In practice what the user sees is a structured document with headings, sub-headings, highlighted links, and embedded graphics. Clicking on a link might display another image, play a sound file, show a movie or move to another structured document altogether. The location of these various resources is transparent to the user, and may be their local machine, another machine in their building, or a machine on the other side of the world. Mosaic from the National Center for Supercomputing Applications is perhaps the best known (although not the only!) WWW client tool, and provides gateways to other services, such as WAIS, Gopher, NetNews and ftp, as well as excellent multimedia support. The migration last year of the Mosaic WWW client application from the X Window System to Windows and the Macintosh has enormously expanded the WWW community, as more and more people can access the Web from their desktops.
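
Underneath, an HTTP transaction is equally simple: the client sends a GET request for a document and receives back headers followed by the HTML, which it then renders, fetching any linked resources in the same way. The Python sketch below shows such a request at the socket level; the host and path are placeholders, not a real server.

    # Minimal sketch of an HTTP GET request as a WWW client might issue it.
    # The host and path below are placeholders for the example.
    import socket

    def fetch_html(host, path="/", port=80):
        request = ("GET {} HTTP/1.0\r\n"
                   "Accept: text/html\r\n"
                   "\r\n").format(path)
        with socket.create_connection((host, port)) as sock:
            sock.sendall(request.encode("ascii"))
            response = b""
            while chunk := sock.recv(4096):
                response += chunk
        headers, _, body = response.partition(b"\r\n\r\n")
        return headers.decode("latin-1"), body.decode("latin-1", errors="replace")

    # headers, html = fetch_html("www.example.edu", "/welcome.html")
    # The body is HTML: headings, <a href="...">links</a> and <img> references
    # that the client renders, retrieving each linked resource the same way.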

Multi-User Dimensions (MUDs) or MUD - Object Oriented systems (MOOs) originally evolved to support fantasy role-playing games set in mythical/fantasy surroundings. In a MUD/MOO, multiple users can simultaneously communicate, interact, and even create new parts of the shared dimension. In many ways, a well-populated MUD can be viewed as a virtual community. An example of such a community within a 'virtual university' can be experienced at Diversity University. The facilities provided by MOOs are starting to be used for serious scholarly discourse, enabling researchers to interact in a natural setting, and work cooperatively to solve problems or produce items of mutual interest. Examples include the Post Modern Culture MOO and a MOO for the Institute for Advanced Technology in the Humanities, both accessible over the Internet.

3.2 New Paradigms

These new tools and services in turn provide new paradigms for working with information (Treloar [1994]). When printing first appeared in Western Europe, the early printed works faithfully reflected the print, size, and organisation of the existing hand-written works (Bolter [1991], p. 3). It was only after some time that printers began experimenting with smaller type, more compact books, and such innovations as tables of contents and indexes. In the same way, most of us use the new technologies of the word-processor and electronic photocompositor to facilitate and improve the production of printed materials - the old paradigm for information distribution. The implications of these new paradigms for electronic information are only just starting to be considered. Each of them provides a new way to consider information and demands a new response from information workers.

3.2.1 Everyone a publisher

The Internet makes it possible for anyone to become a publisher of their own or anyone else's work. In theory this has always been possible, but in practice the costs of production and the difficulty of accessing distribution channels have proven to be a steep barrier. On the Internet, there are effectively no costs of production, and a wide range of ways to distribute work.

The production costs are effectively zero after the original has been created because the published work is accessed electronically. No physical copy needs to be printed by the publisher - any copies printed by the user will be at their cost and in their own time. The published work can be created in plain ASCII text (the lowest common denominator), in a popular word-processor format (Word/WordPerfect - allowing the inclusion of graphics), as PostScript code (for output on a laser-printer by the user), or in HTML format (for access through Mosaic or mounting on a WWW server).

The information once created can be distributed in a multiplicity of ways. It can be automatically sent out via an automated mailing list, in which case users do not need to specifically request it. It can be placed on a machine with anonymous ftp access, with only an abstract mailed out as notification. It can be placed on a gopher server, in which case it will appear in one of the hierarchical menus. It can be mounted as part of the World-Wide Web, with links pointing to it from other sites around the world. In all of these cases, the software to make this distribution possible is available free on the Internet, and the hardware requirements are very modest - any currently available microcomputer will do.

This ability of anyone to publish what they want to the world has the potential to fundamentally revolutionise the publishing industry and completely alter the way in which we access information.

3.2.2 Printless journals

A consequence of this revolution in publishing is that some forms of print-based information may disappear. The most likely initial candidate is the print journal. This is currently under threat both from the CD-ROM and the Internet. The print journal is in decline because of the cost for a yearly subscription, difficulties in locating information of interest, and delays in the publication process.

The cost is partly a function of printing costs for relatively small print runs, something that electronic publication will remove. It has to be admitted that the issue of obtaining subscription payments for network accessible information is still somewhat problematic. Selling a CD-ROM provides a much more reliable way at present for the publishers to earn their money.

The difficulties in locating information in journals stem from the impracticality of using a computer to search printed, human-readable text for combinations of keywords. The usual solution has been to create machine-readable indices to the print material. If the information is in electronic form to begin with, it is possible to search the entire text, rather than just an index or abstract.

The delays in publication are caused by a range of factors including a backlog of material coupled with a fixed production schedule, the need to rekey submissions and circulate galley proofs, and the communication delays between editor, author and reviewer. Electronic publication means that extra editions can be scheduled at little cost. Electronic submission (increasingly common with journals today) replaces rekeying with minor massaging of the submitted text. Electronic mail means that the turnaround time can be dramatically reduced. I recently submitted an article via electronic mail to its editor in the United States. He in turn sent it to a reviewer who returned her comments via email. I was able to resubmit my revised version almost immediately and the accepted text was placed directly into the publishing system for production in print form. Unfortunately, the final step of producing the print version for this journal is likely to take between six and nine months. As a result, an increasing number of journals (over 200 at last count) are appearing either in electronic and print form, or in electronic form alone.

An example of the latter is The Public-Access Computer Systems Review (PACS Review). This is an electronic journal about end-user computer systems in libraries, distributed at no charge on BITNET, Internet, and other computer networks. To subscribe to PACS-P, the list which provides notification of issues of the PACS Review, send e-mail to LISTSERV@UHUPVM1.UH.EDU with SUBSCRIBE PACS-P First Name Last Name in the body of the message. The journal publishes papers on topics such as campus-wide information systems, CD-ROM LANs, document delivery systems, electronic publishing, expert systems, hypermedia and multimedia systems, locally mounted databases, network-based information resources and tools, and online catalogues. The PACS Review has two sections: Communications (papers selected by the Editor-in-Chief and the Associate Editor, Communications) and Refereed Articles (papers that are peer reviewed by Editorial Board members using a double-blind review procedure). The PACS Review also includes columns and reviews of books, network resources, and software. It is published on an irregular basis. Issues may contain a single article or multiple articles. It is a journal of consistently high quality, easily the equal of many print publications.

We need to start thinking of print delivery of scholarly information as being an industry in slow but terminal decline.

3.2.3 Fluid information

A consequence of the electronic storage and distribution of information on the Internet is that information itself becomes a plastic rather than static commodity. We are familiar with books appearing in a small number of editions, numbers of years between each edition, and with comparatively minor changes from one edition to the next. By contrast, documents on the Internet change radically on a monthly basis. This occurs so frequently that documents sport version numbers, like computer software. Moreover, because of the speed of change on the Internet, not only the contents of documents are subject to change, but the documents/files themselves are ephemeral. Information more than two years old can be very difficult to locate. If it has not been archived to magnetic tape, it may simply have disappeared. Even current information may change location overnight as the result of a directory reorganisation.

This fluid information scene poses real problems for anyone trying to keep abreast of current developments. It also means that we need to start thinking of information stores as rapidly changing and dynamic - not the normal view of conventional information stores like libraries.

3.2.4 Hyper-everything

Electronic information stores make it possible to link information in new and exciting ways. This linking of information is variously called hypertext or hypermedia, depending on the content of the documents concerned (Bolter [1991]). At its simplest level, it is possible to have hypertextual links within a single (albeit complex) document. A familiar example of this is the help files available for many Windows applications. Here a high-level table of contents can be used to move directly to a particular section. Highlighted phrases can be clicked on to move to related material. There is no need to 'read' through the document in the familiar linear fashion. By introducing other types of information within a document (graphics, sound, moving video), one moves from hypertext to hypermedia. Lastly it is possible to have links between documents, allowing the creation of an information 'web', with no defined entry or exit points, only routes along which one can travel.

The most obvious, and fastest growing, manifestation of this trend on the Internet is the explosive expansion of the World Wide Web. WWW servers are presently appearing at a great rate, with possibly over 1,000 worldwide at the time of writing. Servers exist to provide access to the text and images of (print) magazines, genetic sequences and sequence maps, weather satellite imagery, high-energy physics reprints, and much more besides. Working with Mosaic, it is possible to almost 'feel' the electronic web of interconnections spreading and binding the Internet together.

The weakness of such hyperlinkages is the problem of knowing what links to follow to locate the desired information, and the lack of any search function outside the currently displayed document. Nonetheless, the Web is the most exciting development on the Internet at present. All sorts of information resources are being reworked in WWW format, and the implications of hypertextual access to originally stand-alone documents are still being worked through. In many ways, this is the most challenging of the new paradigms.

3.3 New Problems

New ways of working with information, while exciting, are not without cost. There are significant potential problems associated with the introduction of any new technology, and the Internet is no exception. The challenge for information workers is to appropriate the benefits of the Internet while avoiding or ameliorating the attendant problems. What are some of these problems, and how might information managers respond?

3.3.1 There's too much!

This is perhaps the most common response among new users of the Internet. They are overwhelmed by the sheer volume and variety of information accessible effortlessly (more or less) and at no cost (usually). USENET news provides more than 3,000 newsgroups, some of which receive up to 100 new items daily. There are in excess of 1,000 anonymous ftp sites worldwide, each with hundreds of accessible files. Thousands of listservers distribute email messages on specific topics to hundreds of thousands of readers. Everywhere one looks, there is more information available than one can possibly assimilate or process, even working 24 hours/day, 7 days/week.

There are a variety of possible responses to this. In some ways, the situation is not qualitatively worse than trying to interact with a large university library. The difference is that the library contains information in a limited number of forms (monograph, serial, microfilm/fiche), this information is organised according to some cataloguing system, it changes only slowly over time, and we have used and become familiar with libraries from a very early age.

The range of possible information types on the Internet is becoming less of a problem. The Multipurpose Internet Mail Extensions (MIME) standards are one way of dealing with differing types of information, and Adobe's Acrobat (Warnock [1992]) document interchange format is another. Mosaic is a tool designed to access a range of information formats. Standards are continuing to evolve and inter-operate, and this problem is clearly technically soluble.
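
As an illustration of how MIME labels each part of a message with its own content type, the sketch below uses Python's standard email package to assemble a message containing both plain text and an image. The addresses and the image bytes are stand-ins for the example.

    # Illustrative sketch: a MIME multipart message combining text and an image.
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email.mime.image import MIMEImage

    message = MIMEMultipart()
    message["From"] = "sender@example.edu"        # placeholder addresses
    message["To"] = "recipient@example.org"
    message["Subject"] = "Report with an attached figure"

    # Each part carries its own Content-Type header, so the receiving mail
    # reader knows how to present it.
    message.attach(MIMEText("The figure is attached below.", "plain"))
    fake_gif_bytes = b"GIF89a..."                 # stand-in for real image data
    message.attach(MIMEImage(fake_gif_bytes, _subtype="gif"))

    print(message.as_string()[:400])   # show the multipart headers and boundaries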

Organising the information is more problematic. The Internet does not have a central cataloguing department like a library (Caplan [1993]). The anarchic/chaotic nature of its administrative organisation, which is in some ways its strength, militates against a coordinated response to the information organisation problem. Nonetheless, progress is being made using a variety of technologies. The hypertextual 'everything is linked to everything else' model of the WWW is one approach, although finding things in a web of millions of nodes may be a non-trivial task. An alternative hierarchical approach is that taken by Gopher, where individual sites maintain their own hierarchies, with links to other sites as required. Hierarchies as a way of organising information have a long tradition behind them, but finding the correct path is not always easy. Finally, there is some promising research being done on classifying and accessing electronic information based on the semantics of its contents. The Essence system developed by Michael Schwartz at the University of Colorado (Hardy and Schwartz [1993]) exploits file semantics to index both textual and binary files, generating WAIS-compatible indexes. These can then be searched using a range of readily-available client applications.
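
The sketch below gives a flavour of semantics-based indexing in Python: the keyword extraction strategy is chosen from the file's type, so that source code and plain text are summarised differently. It is an illustration of the idea only, not the Essence implementation, and the file names are invented.

    # Illustrative sketch (not the Essence system): choose a keyword extraction
    # strategy from the file's type, the way a semantics-aware indexer can treat
    # source code differently from plain text.
    import re

    def extract_keywords(filename, text):
        if filename.endswith(".c"):
            # For C source, function names say more than the surrounding prose.
            return set(re.findall(r"\b(\w+)\s*\(", text))
        if filename.endswith((".txt", ".tex")):
            # For text, fall back to the distinct words themselves.
            return {w.lower() for w in re.findall(r"[A-Za-z]{4,}", text)}
        return set()

    print(extract_keywords("search.c", "int search_index(char *key) { return 0; }"))
    print(extract_keywords("notes.txt", "Semantic indexing of networked files"))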

3.3.2 Quality

Related to the quantity of information available on the Internet is its quality. Existing channels for the distribution of information, while sometimes slow, have at least acted as a filter to limit or stop the release of low-quality information. They have also, of course, limited the distribution of high-quality information that was at odds with the received wisdom as embodied in the referees used to vet submissions. These filters are largely absent on much of the Internet. The corollary of 'everyone a publisher' is that much will get published/distributed that is of low quality to most who receive it. Admittedly what is irrelevant to you may be fascinating to me, but some things are boring or time-wasting to us all. The puerile and scatological outpourings of near-adolescent college students are one example that springs to (my) mind; gratuitous advertising of dubious products in inappropriate newsgroups is another.

A number of strategies are evolving as Internet users respond to the need for quality control.

One is the growth of peer-reviewed electronic journals to parallel the equivalent in the print world. Such journals provide all the rigour of their print cousins, but with the advantage of much faster turnaround during the review process (all conducted using email) and very wide distribution. One excellent example in the library/information science world is the Public-Access Computer Systems Review, already discussed.

Another successful way to cope with the flood of varying quality information is to set up 'filtering' email distribution lists. Here, one person undertakes to monitor a field and summarise the most important items for distribution. An informal network of such people/groups can provide very high-quality high-level information which can be followed up if desired. Phil Agre's Red Rock Eater list (to join, send email to RRE-REQUEST@WEBER.UCSD.EDU with SUBSCRIBE firstname lastname in the subject line) and Tony Barry's Wombat list (to join, send email to MAJORDOMO@AARNET.EDU.AU with SUBSCRIBE WOMBAT firstname lastname in the body of the message) are examples in the fields of CMC/ethics and NIR tools respectively.

A third way to avoid low-quality information is for people to 'vote with their feet' (or fingers). A recurring problem with USENET newsgroups is the endless asking of basic questions by new users without any sense of the history of the group, or any understanding of what is appropriate. If the signal to noise ratio of a group degrades too far, advanced users will simply stop reading and contributing and move elsewhere. In this case the group may well degenerate and die of its own accord.

3.3.3 Roles for Information Professionals

A significant challenge for those in the wider information world is deciding how to respond to the Internet. This paper should have demonstrated that ignoring it, as some librarians tried to do initially with computers, is not an option. There is too much happening that is of fundamental importance to the way we all work with information. The Internet, or whatever evolves out of it, will be one of the foundation stones for anyone working with information at the start of the next millennium. How are computer scientists, librarians, and documentationalists to respond to this?

Computer scientists have a fundamental role to play in developing the tools and technologies needed to manage the growing amount of networked information, and add to its richness. Much of this work to date has, however, been performed by computer scientists alone; this can mean that some opportunities to learn from other fields have been lost. Further, computer scientists may not be the most appropriate people to create the actual information resources. To stretch the 'tools' metaphor, they may be very good at creating chisels; they are probably not all good sculptors.

Librarians/Information Scientists can bring a lot to this challenging new field, and many are already doing so. Some of the most visionary ideas about how the Internet might evolve are coming from those with a librarianship background. Librarians have a long history of managing complex information resources, and a number of well-tested tools at their disposal. They are playing an invaluable role in the debates dealing with how best to catalogue networked information resources, the task of creating and managing unique network identifiers, and the architecture of information spaces. But they too need to adapt and learn from others.

Documentationalists can provide a perspective on the role of the document. Many Internet users employ it to access documents of all types, and the challenge of how best to structure and manage those documents is a very real one. The fluidity of networked information and the range of internal designs made possible by hypertext pose very real challenges.

It may well be that what is needed is a new type of information professional, an information manager in the broadest sense, who exists on the boundaries of some of the older professions, but is able to draw on what is best from them. If such professionals exist today in nascent form, surely they will be found on the Internet!

4. Summary

The Internet provides a range of tools and services for storing and retrieving data and information, and for communicating that information to the world. The use of the Internet as an information management technology provides great opportunities for new ways of working with information. It also poses a number of problems for current views of information. The new information paradigms inherent in the Internet or whatever it evolves into require a considered response from the information profession. They also require an involvement from all facets of the profession in shaping the evolution of this global networking infrastructure. This paper has discussed some of the tools, outlined some of the paradigms, and suggested some of the potential problems. The challenge for us all is to become involved and shape what will be a significant, if not core, part of all our information futures.

5. References

Agha, Syed Salim [1992], "Reflections on information management as an integral core of information studies", Asian Libraries, July, pp. 53 - 60.

Berners-Lee, Tim et al. [1992], "World-Wide Web: The Information Universe." Electronic Networking: Research, Applications and Policy vol. 2, no. 1, pp. 52-58.

Boaden, R. and Lockett, G. [1991], "Information technology, information systems and information management: definition and development", European Journal of Information Systems, Vol. 1, No. 1, pp. 23 - 32.

Bolter, Jay David [1991], Writing Space - the Computer, Hypertext and the History of Writing, Lawrence Erlbaum Associates, N. J.

Burke, James [1985], The Day the Universe Changed, Little, Brown, and Co., Boston.

Bowman, C. M., Danzig, P. B., and Schwartz, M. F. [1993], "Research Problems for Scalable Internet Resource Discovery", Proceedings of INET '93.

Caplan, Priscilla [1993], "Cataloging Internet Resources." The Public-Access Computer Systems Review Vol. 4, no. 2, pp. 61-66.

December, John [1994a], Internet-Tools. Available in Text, Compressed Postscript, and HTML.

December, John [1994b], Internet-CMC. Available in Text, Compressed Postscript, and HTML.

Deutsch, Peter [1992], "Resource Discovery in an Internet Environment-- The Archie Approach." Electronic Networking: Research, Applications and Policy 2, no. 1, pp. 45-51.

Diamond, Jared [1992], The Third Chimpanzee - the evolution and future of the human animal, HarperCollins, New York.

EARN Association [1993], Guide to Network Resource Tools, Version 2.0. This document is available in electronic format from LISTSERV@EARNCC.BITNET. Send the command: GET filename where the filename is either NETTOOLS PS (Postscript) or NETTOOLS MEMO (plain text).

Foster, Jill, Brett, George, and Deutsch, Peter [1993], A Status Report on Networked Information Retrieval: Tools and Groups, Joint IETF/RARE/CNI Networked Information Retrieval - Working Group (NIR-WG)

Goody, Jack [1981], "Alphabets and Writing" in Williams, Raymond (ed.), Contact: Human Communication and its history, Thames and Hudson, London.

Hardy, D. R. and Schwartz, M. F. [1993], "Essence: A Resource Discovery System Based on Semantic File Indexing", Proceedings Winter USENIX Conference, San Diego, pp. 361 - 373.

Kahle, Brewster et al. [1992], "Wide Area Information Servers: An Executive Information System for Unstructured Files." Electronic Networking: Research, Applications and Policy, Vol. 2, no. 1, pp. 59-68.

Lynch, C. A., and Preston, C. M. [1992], "Describing and Classifying Networked Information Resources", Electronic Networking: Research, Applications and Policy, Vol. 2, No. 1, Spring, pp. 13-23.

Monviloff [1990], National Information Policy, UNESCO.

Neuman, B. Clifford [1992], "Prospero: A Tool for Organizing Internet Resources", Electronic Networking: Research, Applications and Policy, Vol. 2, No. 1, Spring.

Neuman, B. Clifford, and Augart, Steven Seger [1993], "Prospero: A Base for Building Information Infrastructure", Proc. INET '93.

Obraczka, K., Danzig, P. B., and Li, S. [1993], "Internet Resource Discovery Services", IEEE Computer, September, pp. 8 - 22.

Scott, Peter. [1992], "Using HYTELNET to Access Internet Resources." The Public-Access Computer Systems Review vol. 3, no. 4, pp. 15-21. To retrieve this file, send the following e-mail message to LISTSERV@UHUPVM1.UH.EDU : GET SCOTT PRV3N4 F=MAIL.

Savetz, Kevin [1993], Internet Services Frequently Asked Questions and Answers.

Schwartz, M. F., Emtage, A., Kahle, B. and Neuman, B. C. [1992], "A Comparison of Internet Resource Discovery Approaches", Computing Systems, Vol. 5, No. 4, Fall, pp. 461 - 493.

Treloar, A. [1993], "Towards a user-centred categorisation of Internet access tools", Proceedings of Networkshop '93, Melbourne.

Treloar, A. [1994], "Information Spaces and Affordances on the Internet", Proc. 5th Australian Conference on Information Systems (ACIS '94), Melbourne, September.

Warnock, J. [1992], "The new age of documents", Byte, June 1992, pp. 257 - 260.

Wiggins, Rich. [1993], "The University of Minnesota's Internet Gopher System: A Tool for Accessing Network-Based Electronic Information." The Public-Access Computer Systems Review vol. 4, no. 2, pp. 4-60. To retrieve this file, send the following e-mail messages to LISTSERV@UHUPVM1.UH.EDU: GET WIGGINS1 PRV4N2 F=MAIL and GET WIGGINS2 PRV4N2 F=MAIL.

Yanoff, Scott [1993], Special Internet Connections List, posted periodically to alt.internet.services. Finger yanoff@csd4.csd.uwm.edu for more information.

Yates, Frances, A. [1966], The Art of Memory, Penguin, Harmondsworth.
