Bibliographic Control for Archival Materials

by Moya K. Mason

Introduction: The Dichotomy that Exists Between Libraries and Archives

The goal of a library catalogue is to provide access to, and accurate searching strategies for, the materials that make up the collection: primarily published monographs and journals that the user can find by author, title, or subject searching. This requires developing unique machine-readable bibliographic records using AACR2, and collocating works by using references to bring together related items (Clack 1990, 2-4). Charles Cutter's vision of the library catalogue, with its primary focus on facilitating information retrieval, has been very influential in the field of librarianship. As Jennifer A. Younger states in After Cutter: Authority Control in the Twenty-first Century:

Controlled vocabulary for proper names and subjects in the catalog has been a cornerstone of bibliographic control in libraries since the objectives of the catalog were codified by Charles Cutter in the late nineteenth century. Those objectives, as he listed them, provided clear direction to the librarians who were creating a system for making library holdings known to users (Younger 1995, 133).

To achieve this in an online environment, libraries found that a great measure of authority control was needed to improve access, using tools such as the Library of Congress Subject Headings (Clack 1988, 37). Subject headings are a very powerful tool for information retrieval because they bring together similar works. Libraries also needed to connect thousands of library databases as a way to save money by sharing resources. In reality, it was the desire to share cataloguing copy and resources that prompted libraries to undertake retrospective conversion projects and to automate their library systems. MARC formatting and AACR2 were seen as the panacea for all the problems associated with computer technology and the expense of cataloguing in libraries. MARC formats are standards for the representation and communication of bibliographic and related information in machine-readable form. They are the means of organizing and keeping track of library materials and of bibliographic information pertaining to recorded knowledge. The format is "the most commonly used and widely accepted standard for bibliographic exchange" (Hickerson 1988, 556), bringing the universe of libraries much closer to fulfilling Charles Jewett's dream of a highly integrated network in the form of a truly national database (Spindler 1993, 332). How do archival materials fit into this scheme of things?

By the 1970s, University of Toronto Library Automation Systems (UTLAS) had begun using the MARC AMC format for archival materials; the USMARC format was approved by the Society of American Archivists in 1981 (Hickerson 1988, 556). MARC AMC's function is to connect a patron with an archival finding aid, which can lead them to more specific collection details (Tibbo 1994, 137). By 1986, 70,000 archival and manuscript records could be found in the RLIN database, integrated with library holdings (Hickerson 1988, 553). These catalogue records were contributed by forty-seven archival programs, including special libraries, art museums, state archives, and the National Archives, as well as university repositories. As H. Thomas Hickerson writes in Archival Information Exchange and the Role of Bibliographic Networks, "archivists and librarians share the same goal of information control and dissemination . . . and the development of MARC AMC was essential to the current level of archives/library integration," as was the revision of the Anglo-American Cataloguing Rules for Manuscripts (Hickerson 1988, 557). All the same, there are inherent problems in applying a system developed for one type of material to another. For instance, the MARC record for library materials is completed by describing an item in hand, using the title page and the publisher's information. That is often impossible with archival materials, which tend to have very generic titles that do little to distinguish one item from another; instead, "the items in hand are described in terms of the activity out of which they arose and the actions that have been taken on them" (Bearman 1989, 30). Standard library materials and original archival materials really have little in common.

This essay will briefly discuss how archivists are using the MARC AMC format to exchange information about archived materials, and will expose some of the problems and issues raised by archival use of automated library systems.

The Battle for Exposure: How Archivists are Trying to Shed Light on Interesting and Valuable Materials

Historically, archives were, and in many cases still are, like isolated islands in the South Pacific, with little contact or influence beyond their own immediate local circle. Their finding aids and inventories are kept in print or electronic form within the archives and provide a great service for those living in the area and for visiting scholars. Archival materials differ from library holdings in two fundamental ways. First, archives mostly deal with unpublished items; second, they handle items that are distinct, one-of-a-kind, and sometimes quite valuable. It is easy to see why sharing cataloguing copy was not the underlying motive when some archivists decided to begin contributing to large library databases: copy is not transferable from one collection to another.

The primary reason was to expose the collections to a national audience. Making collections easily accessible would open up a wealth of virtually untapped information resources, and that was a motivating factor. In addition, almost every institution was becoming electronically connected, so machine-readable records were certainly the way to go (Hannestad 1991, 93). Archivists were also aware that it helps at budget allocation time to show that materials are being used; providing access to archived items increases the likelihood that individual archives will stay open. The addition of MARC AMC records was also a selling feature for the big utilities like OCLC because they could offer their member libraries access to a wealth of important and not easily accessible information (Bearman 1989, 27). The inclusion of archival and manuscript materials in automated library systems was a prestigious move on the part of libraries as well, because these collections are "often considered the jewels of their institutional holdings" (Bearman 1989, 31). However, archivists need to keep in mind that if they are planning to put archival records in databases that hold tens of millions of items, they cannot continue creating records suitable for small, local OPACs (Tibbo 1994, 136). They need to give these MARC records a fighting chance by recognizing how important both the records and the bibliographic networks are to archival materials: both increase access and provide collocation between unpublished and published resources (Hickerson 1988, 569).

Not surprisingly, there are many problems associated with using library databases for archival materials. Some librarians have compared it to trying to find a needle in a haystack. Consider how few archival documents there are in OPACs, in relation to the numbers of books. As Matthew Benjamin Gilmore points out in Increasing Access to Archival Records in Library Online Public Access Catalogs:

In a case where archival materials are entered directly into an OPAC, and not into specially designed archival subsystems, the vague nature of many titles and the general subject headings may result in those materials disappearing into a void; they could be lost in a computerized catalog that can only be searched by traditional access points (Gilmore 1988, 610-611).

The other major problem with most bibliographic databases and OPACs is that they display results on a 'last-in, first-out' (LIFO) basis: the most recently added records are listed first, so the patron views only the first few items retrieved, and these will not necessarily include any from an archival collection (Tibbo 1994, 314). As Helen Tibbo remarks, only small collections can properly manage to have books and archival materials side by side. She believes that the RLIN and OCLC databases, with their huge numbers of MARC records, are definitely not suitable environments for archival information (Tibbo 1994, 314). Stephen E. Hannestad writes that the MARC AMC format is only functional for smaller archival settings because it was developed for library collections and therefore has limited hierarchical capabilities, which are needed to properly describe archival materials and their provenance (Hannestad 1991, 92). Tibbo also points out that enormous databases such as those developed by OCLC are great for copy cataloguing and inter-library loans, when an author or title is known, but when it comes to subject searching, "it is a relatively unexplored morass . . . the larger and more heterogeneous the database, the more difficult it is to conduct subject searches effectively" (Tibbo 1994, 313).
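As a rough illustration of the LIFO display problem described above, the following Python sketch shows how a handful of archival records can fall off the first screen of results once enough newer book records match the same search term. Everything in it is hypothetical: the `Record` fields, the `lifo_search` function, the page size of fifty, and the sample titles are assumptions made for the example and are not drawn from RLIN, OCLC, or any real OPAC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    seq: int           # order in which the record entered the database (assumed field)
    title: str
    is_archival: bool


def lifo_search(records, term, page_size):
    """Return the first page of title matches, most recently added first."""
    hits = [r for r in records if term.lower() in r.title.lower()]
    hits.sort(key=lambda r: r.seq, reverse=True)
    return hits[:page_size]


# Hypothetical database: three archival records loaded first, then 200 books.
records = [Record(i, f"Railway company records, box {i}", True) for i in range(3)]
records += [Record(i, f"Railway history volume {i}", False) for i in range(3, 203)]

page = lifo_search(records, "railway", page_size=50)
archival_seen = sum(r.is_archival for r in page)
print(len(page), archival_seen)
```

In this toy database all 203 records match the term, but the fifty the patron actually sees are the newest book records, and none of the three archival records appears among them.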

When the Library of Congress first developed its system of subject headings, the universe of knowledge was much smaller, and library materials were assigned a couple of headings to allow for the fact that patrons don't always know specific titles or authors. As a result, LCSH really did revolutionize information retrieval in libraries. Currently, some electronic records in the RLIN database have more than two hundred subject headings (Tibbo 1994, 317). Besides taking up space in expensive databases, such heavy indexing does not help users who get 16,000 hits when they are only interested in finding a few good items on a subject. As more information becomes available, one of the biggest problems for users will be that they retrieve too many records to deal with effectively. For example, many people are not willing to look at more than one hundred citations, with most refusing to go beyond fifty (Tibbo 1994, 316).

Solutions to Some of the More Fundamental Problems

So is there anything that archivists can do? Should archivists stop and realize that their contributions to large library databases do nothing more than provide "a false sense of security" (Tibbo 1994, 318)? Librarians can't suggest that users limit their searches to 'manuscripts' only, because very often archival materials have been MARC-coded in the book format, and this type of search strategy would omit valuable records (Tibbo 1994, 319). The problem of subject headings is a major one for the retrieval of archival materials. David Bearman has said that the best way to provide access to a particular item or collection is to have as many subject headings as possible (Bearman 1989, 27). As mentioned earlier, that method can prove to be far from desirable for users. As Helen Tibbo points out, "broad, undifferentiated topical headings, common to LCSH, do not appear to work well for retrieval from large electronic databases" (Tibbo 1994, 318). One possibility is to use the same indexing terms for collections that are quite similar, which would allow individual archives to cooperate with each other and help them to provide better information retrieval (Tibbo 1994, 318). Archivists need to find a way to differentiate their materials that enhances the retrieval of these important documents. One way is to develop surrogates, such as archival finding aids or MARC AMC records, which are basically well-defined, succinct abstracts of the materials. As Helen Tibbo points out:

A good surrogate eliminates noisy information that is found in all full texts and could cause an item to be retrieved when it should not be; a good surrogate also includes information that will facilitate its retrieval in response to appropriate queries (Tibbo 1994, 320).

Another way is to develop good in-house databases that can be connected to library OPACs, while concentrating on inventorying, constructing finding aids, and opening the lines of communication with other archives (Spindler 1993, 341).

Conclusions

So the question is whether bibliographic control is possible for archival materials. The bottom line is that it is getting harder and harder to provide bibliographic control for materials in a universe of knowledge that continues to grow in leaps and bounds. The development of better search tools, more effective searching skills, and experienced librarians or archivists acting as sifters of information for clients will be necessary for the advancement of information retrieval. Without highly advanced searching skills, patrons will not be able to find the information they need and will require help to navigate through the electronic universe that we have created for them. As Stephen Hannestad writes:

While the Continental archivists at the turn of the century were required to be knowledgeable in Palaeography and even Sigillography, by the turn of the Twenty-first Century American archivists will be required to be knowledgeable in at least the application of automation, if not in computer science (Hannestad 1991, 93).

For now, MARC AMC records and bibliographic databases are the best game in town for archivists, at least until some kind of cooperative plan can be put in place to facilitate a better way to get the information out there for the public. Until then, archival materials will have to hope that somehow a patron will come along and find them.

Bibliography

Bearman, David. Archives and Manuscript Control with Bibliographic Utilities: Opportunities and Challenges. American Archivist, Vol. 52, no. 1 (Winter 1989), pp. 26-39.

Clack, Doris H. Authority Control and Linked Bibliographic Database. Cataloging & Classification Quarterly, Vol. 8 (1988), pp. 35-46.

Clack, Doris H. Authority Control: Principles, Applications, and Instructions. Chicago: ALA, 1990.

Gilmore, Matthew Benjamin. Increasing Access to Archival Records in Library Online Public Access Catalogs. Library Trends, Vol. 36 (Winter 1988), pp. 609-623.

Hannestad, Stephen E. Clay Tablets to Micro Chips: The Evolution of Archival Practice into the Twenty-First Century. Library Hi Tech, Issue 36, Vol. 9, no. 4 (1991), pp. 75-96.

Hickerson, H. Thomas. Archival Information Exchange and the Role of Bibliographic Networks. Library Trends, Vol. 36 (Winter 1988), pp. 553-571.

Spindler, Robert P. and Richard Pearce-Moses. Does AMC Mean "Archives Made Confusing?" Patron Understanding of USMARC AMC Catalog Records. American Archivist, Vol. 56 (Spring 1993), pp. 330-341.

Tibbo, Helen R. The Epic Struggle: Subject Retrieval from Large Databases. American Archivist, Vol. 57, no. 2 (Spring 1994), pp. 310-326.

Younger, Jennifer. After Cutter: Authority Control in the Twenty-First Century. Library Resources & Technical Services, Vol. 39 (April 1995), pp. 133-141.
