From Page to Pixel: The Evolution of the Academic Library

Volume 5, Issue 1, Winter 2015

By Alexander Gelfand

If you want to get a good sense of what Ground Zero looks like in the ongoing transformation of academic research libraries, just walk down to the lower level of Lehman Social Sciences Library, in the basement of the International Affairs Building.

There, amid the oversized atlases of the Map Room and the hushed stacks of the Business Collection, you will find an unassuming wooden door leading to the Center for Digital Research and Scholarship (CDRS). This is where Rebecca Kennison, director of CDRS, and her team—a diverse set of professionals with expertise in everything from information science and multimedia production to nonprofit communications and fundraising—develop and deploy a host of technological tools to help faculty and graduate students across the University manage and share their research.

If that sounds like a broad mandate, it is. Among other things, the staff of CDRS show researchers how to deposit their materials in the Academic Commons, Columbia’s online repository of scholarly work; offer guidance on developing the digital data-management plans now required by major grant agencies like the National Science Foundation and the National Institutes of Health; and build full-blown digital platforms like the Women Film Pioneers Project, a collaborative online database that contains information on hundreds of women who worked in the silent film industry. (The project was initially conceived as a multivolume print reference by Jane Gaines, a professor of film here at Columbia.)

You can almost feel all of that activity buzzing away in the background of the windowless, warren-like confines of CDRS. Step back out into the lower level of Lehman, however, and the buzz is gone. The business stacks, redolent of old leather and bookbinding glue, are deserted, the carrels are empty, and the microfiche cabinets are adorned with signs explaining that most of their contents are now available online. A handful of students occupy the Quiet Study Area, but none peruse the neatly shelved foreign newspapers that surround them. Instead, their eyes are fixed on the glowing screens of their laptops and tablets.

Things look very different upstairs. The lone figure seated at the building’s sole remaining analog microfilm reader is a technician doing routine maintenance. But every available workstation in the sprawling Digital Social Science Center is occupied, and the group study rooms hum with lively conversation. Unlike the stacks below, the study rooms smell of coffee and people—the characteristic aroma of students at work.

Similar scenes play out across campus. The Robert M. Rosencrans Reading Room in Butler Library, for example, is filled to capacity, but you won’t hear many pages being turned; rather, the silence is broken only by the staccato clicking of fingers on keys and the occasional startup chime of an Apple laptop. Much the same is true in the third-floor Reference Room, whose wirelessly connected inhabitants seem as oblivious to the print volumes surrounding them as they are to the ornately worked gilt ceiling and triple-tiered electric chandeliers that loom over their heads.

Just around the corner, however, the hardware at the Digital Humanities Center (DHC) is getting a workout, as a student uses a digital microfilm scanner to scroll at high speed through images displayed directly on a computer monitor. And downstairs in the Studio@Butler, a “collaboratory” for educators, scholars, and librarians that is funded by Columbia University Libraries and the Graduate School of Arts and Sciences, an intense discussion about good pedagogical practice is taking place between Mark Phillipson, ’88CC, director of the Teaching Center, and a group of graduate teaching assistants. Run jointly by the DHC and the Teaching Center, the Studio partners with the Columbia Center for New Media Teaching and Learning (CCNMTL), CDRS, and others to offer a variety of programs. On any given day, you might find Alex Gil, digital scholarship coordinator for the DHC, leading a workshop on natural language processing with Python; or representatives of CDRS offering tips on how to keep digital data safe and accessible; or, as was the case one afternoon this past September, faculty and graduate students from the Department of Latin American and Iberian Cultures holding a “researchathon,” the scholarly equivalent of a hackathon, to help a Ph.D. candidate find digital resources for his dissertation project.

Whether any of this surprises you will depend on when you last set foot inside an academic research library. But all of it speaks to the technological revolution that is reshaping such institutions across the country.


As Elliott Shore, executive director of the Association of Research Libraries, likes to point out, the structure and organization of the research library are deeply rooted in 19th-century ways of thinking: specifically, in the idea that complex problems can be solved by breaking them down into discrete tasks and handing them to people with special expertise (i.e., certified librarians trained in cataloging, preservation, and the like); and in a production-economy mindset that frames libraries as knowledge factories that acquire, process, and make available discrete products for consumers (i.e., scholars and researchers).

That system worked well for all concerned as long as those products remained relatively fixed and well-defined, and as long as the research library itself held a monopoly on physical access to them. Needless to say, those conditions no longer apply. The rise of networked data and the World Wide Web effectively destroyed the research library’s monopoly on information, placing vast amounts of material in the hands of anyone with Internet access. And the sources and repositories of that information (wikis and blogs, video files and e-books, sensor data and cloud storage systems) have become far more diverse and ephemeral than traditional print volumes and journals, just as the questions surrounding their control and use—who owns them and where they are located, how they should be preserved and made accessible to researchers—have proliferated.

As a result, libraries and librarians have arrived at a rather strange place: one characterized by great uncertainty, yet also by great opportunity. On the one hand, many librarians have come to question everything from their training to their relevance; to ask what, exactly, they are supposed to do, and how they are supposed to do it, in an age when researchers are far less likely to set foot inside libraries at all, and when traditional printed matter—indeed, textual data in general—occupies an ever-diminishing proportion of the information they are expected to tame. On the other hand, they are perhaps more vital to the enterprise of scholarship than ever before. As information becomes richer and more mutable, harder to capture and easier to miss, librarians—who are, after all, experts in information management—are poised to become the researcher’s best friend. Who better to help you drink from a fire hose than a professional fireman? Just don’t expect today’s librarians to come packing the same gear, much less the same skills, as their predecessors.


Jeffrey Lancaster, Ph.D. ’11, Chemistry, sits at his desk on the ground floor of the Northwest Corner Building, rummaging through a small cardboard box.

“This is a fragment of a Roman urn from the Rare Book & Manuscript Library,” he says, pulling out a smallish piece of white plastic bearing a human figure in bas-relief. “And this is a self-assembled DNA nanocapsule,” he notes, holding up a pair of delicate, nested cylinders made from the same material.

Lancaster, who is emerging technology coordinator for the Digital Science Center (DSC), a unit of the Science and Engineering Library, printed both objects using the MakerBot Replicator 2 3-D printer that sits in a corner of his office. Anyone in the Columbia community can submit a 3-D printing request, and over the past year, Lancaster has generated everything from a model of a supermassive black hole to a pair of chopsticks. Yet he sees the printer simply as a tool for engaging faculty and students of all stripes (scientists, journalists, art historians) in conversations about how technology can offer them new ways of doing their work. Could a mathematician, for example, render her equations as 3-D models using the licensed software packages available through the Center and print them for her students to see and touch? Could an archaeologist use computer-aided design software to print examples of ancient Greek pottery?

Barbara Rockenbach, director of the Humanities and History Libraries, views the scanning and optical character recognition capabilities of the DHC in much the same way. A researcher might come in simply to scan a manuscript and convert it to digital form. But a staffer can use that as an opportunity to introduce more sophisticated applications, like pattern recognition software that can mine texts for interesting motifs—something that Gil recently helped a faculty member do with Supreme Court Justice Sonia Sotomayor’s autobiography.

It bears noting that not everything worth mentioning about Columbia’s version of a 21st-century academic research library is digital in nature. Jim Neal, M.A. ’73, History, the outgoing Vice President for Information Services and University Librarian, notes that Columbia has undertaken unusually close partnerships with institutions such as Cornell and NYU to develop shared collections that will further expand the University’s already vast print holdings and make it easier for scholars to access information. ReCAP (Research Collections and Preservation Consortium), the massive print repository that Columbia shares with Princeton University and the New York Public Library in Forrestal, New Jersey, is now the largest in the world: the temperature- and humidity-controlled facility contains 11 million items and fills more than 250,000 requests each year from libraries around the world. (Columbia’s own library system, officially known as Columbia University Libraries/Information Services, ranks among the top five academic libraries in North America; its 21 individual libraries hold more than 12 million volumes, 12 miles of manuscripts, and 800,000 rare books.) And Rockenbach emphasizes that the DHC is only able to work its technological magic because it sits on top of “an amazing print collection that continues to grow”—namely, the two million volumes housed in Butler’s stacks. “There are still moments when the print book matters,” she says. And moments, too, when nothing matters more than an experienced librarian with knowledge of your particular research area (medieval history, Chinese politics, global climate change) who can point you toward the right resources, including unique special collections like those in the C. V. Starr East Asian Library, which contains the largest collection of Tibetan-language materials outside China, and the Rare Book & Manuscript Library, whose holdings range from cuneiform tablets to printing presses.

Even those collections, however, are being reshaped by the digital tsunami, either because the print materials they contain are slowly being scanned and digitized, or because they are home to more and more material that was digital from the start. The Human Rights Web Archive at the Center for Human Rights Documentation and Research includes more than 50 million pages of content in 60 languages. And when the late poet and activist Amiri Baraka donated his papers to Columbia, he handed the University a hard drive with 15 years of e-mail on it. Preserving that kind of born-digital content—some of it created with now-obsolete hardware and software, some of it containing links to external websites or to audio or video files—presents a huge challenge for today’s librarians.

So, too, does acquiring the technical skills required to keep pace. Someone has to build the interfaces that allow researchers to find what they need. Someone also has to organize and format all of the underlying data in a way that renders it useful. And that someone is, increasingly, your friendly neighborhood librarian—who must, as a consequence, now know something about metadata and database design, interface usability, and digital preservation.

Toward that end, Neal, who earned his own degree in library science at a time when punch cards were considered cutting edge, has long advocated hiring personnel who have the skills and experience that a contemporary librarian needs, regardless of whether they hold an M.L.S. or equivalent professional credential. Hence staffers like Lancaster, who as a grad student built an app to help his lab mates search the digital versions of science journals. Or Gil, who holds a Ph.D. in English from the University of Virginia and was among the first to participate in the latter’s Praxis Program, which gives graduate students hands-on training in the digital humanities. Or Kennison, who spent most of her career prior to joining CDRS in science publishing and was employee number one at the nonprofit open-access publisher Public Library of Science (PLOS). None have degrees in library science—“I always used libraries, but this is the only time I’ve ever worked in one,” Kennison says—but all possess some combination of subject matter expertise, research experience, and technical skill, attributes that characterize the hybrid professionals, or “hybrarians,” who make up a growing percentage of those who now find work as research librarians.

To be fair, however, those same attributes can be found among those who hold traditional credentials as well. Mark Phillipson earned his Ph.D. in English from the University of California, Berkeley; worked for the search portal Excite during the dot-com boom; and picked up his Master of Library and Information Science while teaching full time at Bowdoin College, where he garnered national attention for his pioneering use of wikis, before joining CCNMTL as a senior developer and eventually leading the Teaching Center. (As a 19-year-old Columbia undergraduate on federal work-study, Phillipson guarded Butler’s stacks, and spent hours erasing the pencil marks that graduate students left in the margins of books—an ironic task for someone who would go on to promote the use of online collaborative tools that allow students to digitally annotate texts without defacing them.) And Rockenbach, who holds dual Master’s degrees in Art History and Library and Information Science, worked not only in the Yale libraries but also for the online text and image repositories JSTOR and ARTstor before coming to Columbia. In an effort to learn the latest digital humanities tools and methods, she, Gil, and the other members of Humanities and History are working together on a project-based training program, the Developing Librarian Project. (The first cohort is currently building an online history of Morningside Heights, to be unveiled in January 2015.)

These new-model librarians see themselves—and wish to be seen—not as service providers who are confined to storing, finding, and retrieving things on demand, but as research partners who can help faculty and students navigate an increasingly complex information environment. That trend is reflected in many ways: in the outreach that Gil does to various humanities departments in order to get a sense of the kinds of research questions they’re asking already, and to offer suggestions on how technology might help them ask new ones; in the letters that Kennison writes in support of grant applicants who must demonstrate that they have the infrastructure and support necessary to make their digital research data accessible yet secure; and in the Digital Center internship program, which gives graduate students the opportunity to work with librarians and technologists on projects of their own choosing.

Those projects are themselves indicative of just how much times have changed. José Tomás Atria, a Ph.D. student in the Department of Sociology whose dissertation research involves mining a massive collection of digitized criminal transcripts from the historic Old Bailey Court in England, is using his internship at the Digital Social Science Center (DSSC) to develop online interfaces that will make it easier to share his work with other researchers and the general public. And Buck Wanner, a Ph.D. student in Theatre, is using his year as a DHC fellow to build a database of the rehearsal and performance spaces used by theatrical choreographers in New York City from 1970 to the present, along with their private residences—information he plans to render visually with the help of the librarians at the DSSC, who have particular expertise with mapping software. It’s the kind of project, Wanner says, that could probably be done using analog tools and published as a print monograph rather than as an online resource. But it would also probably take five years to complete instead of one, and the end product—a single, specially bound copy stored under restricted access in a physical archive—probably wouldn’t be seen by more than a handful of people.

Not surprisingly, Rockenbach sees the digital capabilities and technological guidance offered by the Columbia libraries as tools for attracting and retaining faculty and graduate students. They certainly helped sway Emily Clark, a Ph.D. student in ethnomusicology, who decided to come to Columbia after earning a Master’s in Information Studies from the University of Texas at Austin. Ethnomusicologists have been quick to use technology to give communities access to the sound recordings that researchers make of their music; and before she ever arrived on campus, Clark had already corresponded with Aaron Fox, a professor of music who has been repatriating recordings from the University’s archives to an indigenous Alaskan group through a password-protected website. Fox, in turn, put Clark in touch with Rockenbach and Kennison, who pointed her toward other innovative uses of digital technology in her field. One day, they may even help her package and present her dissertation, which, like much contemporary research, might well include forms of data (video, audio, interactive multimedia) that are not easily captured in print—a challenge that was directly addressed by a recent event titled “What Is a Dissertation? New Models, New Methods, New Media” that was jointly organized by the Studio@Butler and CUNY and live-streamed over the Web.

All of this speaks to the way in which the library has become a place where faculty and students can mingle and collaborate: a place where they can think, as Phillipson says, about new projects and new scholarship that can be built around library materials. Free from the disciplinary constraints of any one department or the agenda of any particular institute, the library today represents a rare neutral space for scholarly activity, one where researchers of all kinds can come together to engage with their subjects, with technology, and with one another in fresh and provocative ways.

As the custodians of that space, librarians are now, more than ever, stewards of the shared scholarly endeavor that lies at the heart of any great research institution. Part information management consultants, part research advisers, and part technology gurus, they are the bridge between the library’s past and its future, and perhaps the University’s as well.