A Web World Full of Library Cats:
WorldCat, Open WorldCat, and WorldCat.org
by Laura Baas
While the Online Computer Library Center (OCLC) has long been a leader in developing online catalogs and a resource-sharing network, it continues leading libraries into the networking era through innovative research and initiatives—not the least of which is the movement of bibliographic data onto the web. Dempsey (2006) writes, “Put data where the user is. Sometimes the user will come to a catalog; other times the catalog needs to go to the user” (n.p.). Recent OCLC initiatives—namely Open WorldCat and WorldCat.org—embrace the web as a platform upon which library data should be placed in order to meet and serve users.
Open WorldCat and WorldCat.org both draw bibliographic information from WorldCat’s central database to connect users directly with information. Galvin (2006) writes, “OCLC has recast existing library services and created new ones with WorldCat at the core” (p. 17). The rationale underlying recent WorldCat incarnations is laid out in OCLC’s strategic plan “Extending the OCLC Cooperative” (2000). The plan underscores the urgency of making WorldCat’s information accessible beyond the library portal; it seeks to implement WorldCat “in many versions from many paths: through individual library portals…And through information partner portals (e.g., through database aggregators, Web search engines, and Web portals)” (OCLC, 2000, p. 12). Dempsey (2006) predicts, “…discovery of the catalogued collection will be increasingly disembedded, or lifted out, from the ILS system, and re-embedded in a variety of other contexts” (n.p.). Both Open WorldCat and WorldCat.org “disembed” catalog content and forge unprecedented open paths to WorldCat data. Ashmore and Grogg (2006) assert that these two resources put a user-driven, web-based spin on traditional library resources because they offer immediate access and they are freely available on the web (in contrast to being located five clicks deep on a library’s website or hidden behind a library wall). The bulk of this paper delineates these two initiatives as they exemplify the opening of library bibliographic data to the world of the web. Challenges to and critiques of these two initiatives, particularly the allowances made to keyword searching and the cost of inclusion and exclusion for individual libraries, are also addressed.
A Different Breed of Cat: Open WorldCat
With WorldCat’s recent manifestations, searching WorldCat no longer means the same thing to all people. OCLC’s online union catalog and shared cataloging system began in 1971 and has since blossomed into the union catalog of more than 84 million records known as WorldCat (WorldCat Facts, n.p.). FirstSearch and Connexion have long offered libraries paid access to this central WorldCat database. El-Sharbini comments that this subscription-only access to WorldCat “left the valuable resources of world libraries outside the emerging information mainstream that we now call The Web” (p. 57). Simply put, Open WorldCat places the valuable bibliographic resources of world libraries onto the Web.
Nilges (2006) asserts “two driving forces” lie behind OCLC’s development of Open WorldCat—access and cooperation (p. 431). First, since many users choose to conduct their information seeking on the open Web, it follows that integrating library materials into the open Web will augment access. Second, by forging partnerships with prominent search engines and web sites and providing them with the data collected from extant inter-library cooperation, libraries’ collections and services gain visibility and entrance into the average web searcher’s daily workflow. Users no longer need to enter through the FirstSearch interface at a library or as a validated user in order to access WorldCat and “find” that which they seek (Galvin, 2006, p. 16). “Find in a Library” is OCLC’s chosen brand phrase that consistently prefaces every Open WorldCat record available through the Google and Yahoo! indices.
The Open WorldCat “Find in a Library” search display includes basic bibliographic data pulled from WorldCat’s MARC record fields. Through its Registry initiative, OCLC maintains a burgeoning directory of links to library catalogs, which are then used to take users directly to the page in the library’s OPAC that corresponds to the item found via Open WorldCat (Nilges, 2006, p. 440). OCLC has been compiling statistics to buttress its claim that opening up WorldCat to the world will increase libraries’ visibility and value. Examples include the following: users who enter a library’s website through “Find in a Library” links click on another link 15-20 percent of the time, and users coming from an IP address that OCLC recognizes have frequently been clicking through to authenticated links (p. 443). Libraries are able to use the “Find in a Library” link from within their own catalog records as well. If a local search shows that an item is unavailable, users can click on the Open WorldCat link and expand the search ever outward.
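The registry-based linking described above can be pictured as a lookup from a library to a catalog URL pattern. The following is only a minimal sketch under stated assumptions: the library symbols, hosts, and template syntax are invented for illustration and do not reflect the actual format of OCLC’s Registry.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical registry: library symbol -> OPAC URL template.
# Symbols, hosts, and the {isbn} placeholder are assumptions for illustration.
REGISTRY = {
    "LIB1": "https://catalog.example.edu/search?isbn={isbn}",
    "LIB2": "https://opac.example.org/record/isbn/{isbn}",
}


def deep_link(library_symbol: str, isbn: str) -> Optional[str]:
    """Return a direct link to the item's record in a library's OPAC, if registered."""
    template = REGISTRY.get(library_symbol)
    if template is None:
        return None  # this library has no registered catalog link
    # Percent-encode the identifier so it is safe inside a URL.
    return template.format(isbn=quote(isbn, safe=""))


print(deep_link("LIB1", "9780000000002"))
```

A library missing from the registry simply yields no link, which is one way to picture why the directory needs to keep growing for the service to be useful.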
Ashmore and Grogg (2006) suggest that making WorldCat data available on the web represents a shift in the nature of the web: “In essence, Open WorldCat represented a shot at improving the quality of information on the Open Web by infusing it with organized and ‘vetted’ records” (p. 47). However, Open WorldCat on the open Web is not quite the utopia it at first appears. For starters, Google and Yahoo! have chosen to index only a portion of WorldCat’s data. The ramification of this indexing decision is that users searching through a search engine can reach only a small portion (currently about 3-4 million) of the more than 84 million records WorldCat made available. Even for the items that have been indexed, the perpetual hurdle Open WorldCat faces is making the service’s links visible to users amid the morass of search engine results. It is difficult to find Open WorldCat records in popular search engines without searching for them specifically, although indexed items will usually come up fairly soon if one uses the “find in a library” phrase along with the item information in the search box. The WorldCat.org website advises users to include the following limits with search terms: “find in a library” in Google and “site:worldcatlibraries.org” in Yahoo! (Other Ways, n.p.). Still, Open WorldCat links will continue to manifest low relevance in the Google economy until people start linking to them (for example, libraries could link to these records from within their OPACs). Open WorldCat holds value for those who find and follow its “Find in a Library” links. Nevertheless, its shortcomings moved OCLC in August 2006 to create a more stable and consistent presence on the web for WorldCat bibliographic data: WorldCat.org.
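The two search limits quoted above amount to simple string conventions. As a rough illustration, the sketch below builds the limited query for each engine; the function name and the URL parameter name "q" are assumptions made for the example, not part of any documented interface.

```python
from urllib.parse import urlencode


def limited_query(item_terms: str, engine: str) -> str:
    """Add the limiter the WorldCat.org site recommends for the given engine."""
    if engine == "google":
        # Google: include the brand phrase as an exact phrase.
        return f'"find in a library" {item_terms}'
    if engine == "yahoo":
        # Yahoo!: restrict results to the Open WorldCat host.
        return f"site:worldcatlibraries.org {item_terms}"
    raise ValueError(f"no known limiter for {engine!r}")


# urlencode shows how the limited query would travel as a URL parameter;
# the parameter name "q" is an assumption.
print(urlencode({"q": limited_query("moby dick melville", "google")}))
print(limited_query("moby dick melville", "yahoo"))
```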
Destination Station: WorldCat.org
The World Wide Web has become an information discovery point for many people. By making bibliographic data and holdings information publicly available, WorldCat.org seeks to become a discovery point for library resources. Like Open WorldCat, WorldCat.org exists to augment access to and visibility of library collections by placing them on the open Web. Thus, WorldCat.org is interrelated with Open WorldCat, but it differentiates itself by being a destination site in its own right. Barbara Quint asserts WorldCat.org to be “the single-most important bibliographic database for monographs and library holdings ever seen on the planet” (p. 81). While Quint’s proclamation may seem dramatic, it does underscore that WorldCat.org’s existence marks a recognition that library holdings need to be visible to people at the point of need in order to be accessed and used. For many searchers today, the point of information need is the web search.
Dempsey (2006) suggests that now that users interact with consolidated information resources such as Amazon, iTunes, and Google, which offer unified discovery and shape user expectations of service, information professionals need to investigate ways to enhance access to library information without locking it down behind library catalogs. Dempsey (2006) continues, “So, unlike the major online presences, our systems have low gravitational pull, they do not put the user in control, they do not adapt reflexively based on user behavior, they do not participate fully in the network experience of their users” (n.p.). He asserts that catalog data must be harnessed to leverage a discovery environment outside libraries’ control in order to bring people back into the library catalog environment (e.g., Open WorldCat). Integrating the discovery-to-delivery process will increase libraries’ gravitational pull and, subsequently, increase use of library resources.
Unlike recent web behemoths, libraries have been recording item data and creating metadata for decades. Dempsey (2006) gives the name “leveraged discovery experience” to the practice of harnessing this data for users’ benefit: lowering transaction costs and unifying the discovery experience in order to draw web users back toward library services (n.p.). Open WorldCat and WorldCat.org provide access to library collections when and where it makes sense to web-based information seekers. They provide a new “discovery” view of the WorldCat database on the open Web that adds value to library resources.
Data Sources: WorldCat and the User
While having access to library resources through the web adds value, users also value that to which they are contributors. WorldCat.org enables user access to WorldCat data through a variety of contexts in an attempt to gain loyal stakeholders who repeatedly use and contribute to library services. Galvin (2006) proposes, “The ability for online users to create identities, do their own work and even contribute to the library experience makes them stakeholders in their institutions. WorldCat enhancements are encouraging this behavior” (p. 16). Each record displays item information in tabs that have been built into WorldCat.org. The information contained in these tabs is partly generated from bibliographic records and partly user-generated content. For example, the details tab contains space for bibliographic data about the item pulled from WorldCat’s database and for users to add table-of-contents and note information. The reviews tab enables registered users to write their own reviews. Users cannot tamper with reviews that have been added by others (although OCLC does reserve the right to delete inappropriate content). In order to add content to WorldCat.org, a user must register and create a WorldCat login and password.
WorldCat.org goes beyond user participation in catering to its users. The website’s search design is equally user-centric. The site defaults to a simple search using the increasingly prevalent one-box keyword search. For searchers used to web search engines, WorldCat.org’s streamlined one-box search interface signals a respite from the oft-times convoluted OPAC. Users enter a term, WorldCat.org applies its search algorithm, and the results return as a list of possible items that may be further narrowed through faceted browsing.
Since keyword searching is imprecise at best and egregious at worst, faceted browsing is crucial for narrowing result sets to find the item sought. Depending on the search terms, WorldCat.org’s search results can mix all types of items. After the initial search is conducted, limiters (facets) appear in the Refine Your Search panel on the left-hand side of the interface: author, content, format, language, and year. Faceted browsing lets a user dynamically filter a large result set by these categories. The facet labeled Content is built on the OCLC Conspectus, which provides a subject hierarchy of three levels: division, category, and subject (OCLC Conspectus, n.p.). While the Conspectus is not nearly as detailed as typical classifications, it does provide a simple, consistent backbone for content designations.
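Faceted browsing as described here boils down to two operations on a result set: counting the values that occur under each facet, and keeping only the records that match the values a user selects. The sketch below illustrates the idea with invented sample records; the field names echo the facets named above, but the data and function names are assumptions, not WorldCat.org’s implementation.

```python
from collections import Counter

# Invented sample records; field names mirror the facets named in the text.
RESULTS = [
    {"title": "Cats of the World", "author": "Smith", "format": "Book",
     "language": "English", "year": 2001},
    {"title": "Cats", "author": "Jones", "format": "DVD",
     "language": "English", "year": 2005},
    {"title": "Les chats", "author": "Smith", "format": "Book",
     "language": "French", "year": 1998},
]


def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(record[facet] for record in records)


def refine(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        record for record in records
        if all(record[facet] == value for facet, value in selected.items())
    ]


# The counts are what a facet panel would display next to each value.
print(facet_counts(RESULTS, "format"))

# Selecting two facet values narrows the list step by step.
for record in refine(RESULTS, format="Book", language="English"):
    print(record["title"])
```

Because each selection only removes records, a user can keep narrowing without ever retyping the original keyword search.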
In addition to simplifying the search interface, WorldCat.org also streamlines the results interface. WorldCat.org FRBRizes its results, grouping the editions of a work under a single entry, so that searches return fewer results. If a book has multiple editions, an editions tab appears on the results screen, and clicking it reveals all of the related editions. Gathering edition information within a tab both keeps it available and keeps it from cluttering the results screen. Streamlining both the search and display interfaces manifests an effort to reach outside the library environment to gain the loyalty of users who do not know how to use or do not wish to use library language. The results screen also contains a green “Buy it Now” button for users to gain direct access to the item they seek (the button provides OCLC with a source of funding for WorldCat.org beyond FirstSearch subscriptions).
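The effect of FRBRizing a result list can be sketched as grouping edition-level records under a shared work key. The example below is a deliberately crude illustration: the records are invented, and matching on normalized author and title is an assumption made for the sketch rather than a description of OCLC’s actual work-clustering method.

```python
from collections import defaultdict

# Invented edition-level records.
EDITIONS = [
    {"title": "Moby Dick", "author": "Melville, Herman", "year": 1851},
    {"title": "Moby Dick", "author": "Melville, Herman", "year": 1992},
    {"title": "Walden", "author": "Thoreau, Henry David", "year": 1854},
]


def work_key(record):
    """Crude work identifier: case-folded author plus title (an assumption)."""
    return (record["author"].casefold(), record["title"].casefold())


def group_by_work(records):
    """Collect edition records under the work they belong to."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record)
    return works


works = group_by_work(EDITIONS)

# One line per work on the results screen; the editions stay one click away.
for (author, title), editions in works.items():
    print(title, "-", len(editions), "edition(s)")
```

Three edition records collapse into two entries here, which is the same trade the editions tab makes: a shorter results list with the edition detail held in reserve.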
WorldCat.org also uses the web interface to facilitate user ease of access by offering a variety of web tools such as the downloadable search box and the browser and toolbar extensions. Boyd (2006) asserts, “OCLC has recognized that in order for libraries to remain relevant in Web 2.0, an interactive online world in which users expect to be able to interact and have instant results, they will have to change their approach” (p. 1). The downloadable search box may be added to any website, thus enabling entrance into WorldCat.org results from remote sites. Similarly, users can add the WorldCat.org button to the Firefox browser or to the Google or Yahoo! toolbar and have instantaneous access to WorldCat.org’s simple search results.
WorldCat.org: Relevance Ranking
Yet all of these dazzlingly convenient search and access options are moot if the simple keyword search does not return relevant results. A search engine’s relevancy is determined in large part by the algorithm it employs. OCLC has not fully disclosed WorldCat.org’s underlying relevancy algorithm, but it has revealed some general details. Hane (2006) asserts about a dozen factors are involved: “Title takes the highest priority; other factors include frequency of terms, number of library holdings, author names, publication year, etc” (p. 1). Since title words receive the most weight, a search that includes a title word is likely to rank relevant results higher than one that uses a word from the subject or author field. Also, search limits cannot be employed from the outset but must be selected after the initial results are returned. OCLC’s reluctance to fully disclose the search algorithm and to provide a plethora of limiting options may be partly because WorldCat.org remains in beta, and OCLC is constantly refining the algorithm based on user feedback. However, the reluctance may also arise because WorldCat.org is intended for simple searching by users; OCLC still points to FirstSearch as the interface of choice for conducting advanced searches.
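Although the actual algorithm is undisclosed, the general shape of a weighted relevance score can be illustrated. The sketch below is purely hypothetical: the weights, the logarithmic holdings boost, and the sample records are all assumptions chosen to show how giving title matches the highest weight pushes a title hit above a subject hit.

```python
import math

# Hypothetical field weights; OCLC has not published its actual values.
WEIGHTS = {"title": 3.0, "author": 1.5, "subject": 1.0}


def score(record, query):
    """Sum weighted term frequencies, plus a small boost for library holdings."""
    terms = query.casefold().split()
    total = 0.0
    for field, weight in WEIGHTS.items():
        words = record.get(field, "").casefold().split()
        # Term frequency: every occurrence of a query term in the field counts.
        total += weight * sum(words.count(term) for term in terms)
    # Widely held items get a modest logarithmic boost (an assumption).
    total += 0.5 * math.log1p(record.get("holdings", 0))
    return total


# Invented records with equal holdings, so only the matched field differs.
RECORDS = [
    {"title": "Ocean Life", "author": "Jones", "subject": "Whales", "holdings": 10},
    {"title": "Whales", "author": "Smith", "subject": "Marine mammals", "holdings": 10},
]

ranked = sorted(RECORDS, key=lambda r: score(r, "whales"), reverse=True)
for record in ranked:
    print(record["title"], round(score(record, "whales"), 2))
```

With holdings held equal, the record matching on title outranks the one matching on subject, which is the behavior the quoted description implies.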
In this way, the proprietary FirstSearch database does retain some searching advantages, at least for librarians familiar with its commands: its help guides explain how searching in particular ways retrieves the items prescribed by the search criteria. Another difference is that FirstSearch orders results by the number of libraries holding an item, while WorldCat.org orders results mainly by its relevancy algorithm. WorldCat.org is purposely designed to be a keyword-only search interface; for many web users, this is near nirvana.
Recent Changes to Beta WorldCat.org
Keyword searching is not a one-size-fits-all option. Beall (2007) asserts relevancy ranking to be “a mysterious, inconsistent, and unnatural means of sorting search results, and a source of perpetual search fatigue...” (p. 49). Beall fears that “cheap and abundant keyword searching is beginning to replace metadata-enabled searching” (p. 50). WorldCat.org is a curious mix: it uses library bibliographic data to support keyword searching. The search engine has great value by virtue of its placement in users’ workflow, but librarians should also be wary of the shortcomings of keyword searching and continue to alert patrons to the value of using metadata-enabled search engines such as WorldCat through FirstSearch or the local OPAC.
Despite the popularity of keyword searching, there are times and there are users that call for advanced search options. After barely a month with only the one-box search, WorldCat.org responded to user feedback and added an advanced searching option to its interface. However, the simple one-box search is still the default on the home page and the “Advanced search” is still fairly basic (mainly involving adding additional search boxes for the title and author indices). Advanced search offers additional limits in the form of limiting results to items in a particular language, format, and publication date range. Advanced Search also incorporates the ability to search by the precise numeric fields of ISBN and OCLC number.
Overall, WorldCat.org’s advanced search comes nowhere near the more than fifty keyword and phrase indices available through WorldCat in FirstSearch, nor is it intended to. WorldCat.org is intended for the convenience of the average web user and the enhanced visibility of library data on the web. Since WorldCat.org provides entrance into metadata-enabled search engines, it is a valuable resource for today’s libraries.
Another recent change is that WorldCat.org is now programmed to determine the geographic location for a user based on IP address, even when the user has never before entered a postal code. The faceted browse feature has been enhanced so that users can now expand the abbreviated results by clicking on a “Show More” link. Each facet defaults to show five results, so the “Show More” option effectively adds to browsing functionality. WorldCat.org has recently added access to supplemental links pertaining to an item on the main results page under the heading “Web Resources”.
Yet, as with Open WorldCat, WorldCat.org falls short of library and information science utopia. The keyword searching pitfalls have already been discussed. In addition, WorldCat’s opening to the Web reaches only as far as its FirstSearch subscriptions. Libraries that contribute to WorldCat but do not subscribe to FirstSearch do not have the option to display their holdings in search results. As a result, users who search WorldCat.org may conclude that libraries near them do not hold an item when it is possible that their home library owns the item but simply does not subscribe to FirstSearch.
Mitchell (2006) laments this closed silo and asserts that true progress on the data front will be made when all libraries “(freely) contribute their own data, anyone can (appropriately) use and reuse the data [allowing others to build APIs based on WorldCat data],” and OCLC recognizes itself as just “a piece of an uncontrollable puzzle rather than a black hole that sucks all data and clicks to Ohio...”(n.p.). Open WorldCat continues to be limited by indexing and linking decisions largely beyond libraries’ control. By contrast, WorldCat.org holds only itself back. As long as some libraries’ holdings are excluded, WorldCat.org will remain a partial reflection of library holdings and will provide only partial access to the valuable data libraries have carefully collected and organized for decades.
Conclusion
Catalogers know the importance of providing users with multiple access points. Placing library data into new contexts such as the open Web has the potential to greatly augment access to this data. In order for this potential to be realized and for the movement of WorldCat.org’s data onto the open Web to have significant impact on users and the value of libraries, Boyd (2006) asserts libraries must move forward in two ways: catalogers must continue providing full and accurate records, and libraries must enable deep linking to their catalogs so users can jump straight from the web (for example, from WorldCat.org) into the relevant record (p. 4). Simply making the data available to users is not enough; the data must be manipulated to maintain and capitalize on its underlying bibliographic organization in ways that users find accessible.
In sum, catalogers and cataloging services will continue to be an invaluable aspect of organizing information. Catalogers create records by combining cataloging rules with knowledge of how the recorded data will enable users to find, gather, select, evaluate, and navigate items. These records provide the foundations of catalogs, and each carefully recorded piece of data must be put to work in serving users. Open WorldCat and WorldCat.org make use of the bibliographic information that catalogers have been recording for decades. In this way, these OCLC cats are valuable tools that will help libraries reap the benefits of the networked era, as long as the libraries subscribe to FirstSearch, of course.