The e-book conversation so far has been dominated by concerns over the market share and intentions of Amazon, and, in the library world, whether libraries can license access to frontlist e-books from the major publishers, and if they can, at what price, under what terms, and through which intermediary. But there is a greater, long-term concern with the way our e-book future is shaping up—preservation.
As the e-book market develops on a licensed-access model, librarians caution that we are tumbling toward a digital memory hole in which large portions of our literary heritage could one day be lost altogether. They cite a host of difficulties in keeping digital content safe, usable, authentic, accessible, and discoverable, owing to a matrix of legal questions, format issues, digital rights management problems, and a fractured business ecosystem driven by proprietary platforms.
“Preservation is the ‘global warming’ issue for e-books,” says Robert Wolven, associate university librarian for bibliographic services and collection development at Columbia University Libraries and the outgoing co-chair of the American Library Association’s Digital Content Working Group. “Everyone knows that if we don’t do something now, we’ll be in big trouble later.”
Owning the Future
“Preservation is often acknowledged as important,” says Sheila Morrissey, a senior researcher at ITHAKA, a not-for-profit organization that helps the academic community use digital technologies to preserve and advance the scholarly record. In the early days of a still-developing e-book market, Morrissey acknowledges, it is easy to “shunt aside” preservation issues. But libraries and publishers need to take joint action, she says, to ensure that the e-books we are reading for enjoyment and research can survive over the long term.
Morrissey recently co-wrote a report on the subject for the Digital Preservation Coalition (DPC) called “Preserving E-books” with Amy Kirchhoff, an archive services product manager at Portico, a digital archive. In the report, Morrissey and Kirchhoff conclude that the burden of ensuring “long-term, permanent access” to licensed e-books is dangerously ill-defined, and that the preservation of e-books is in a tenuous state, rife with ambiguous responsibilities, rights, and questions about cost.
“A key issue is the ownership model for e-books, or, more correctly, the non-ownership model,” Kirchhoff explains. Because most e-books are licensed to a library or to an individual reader, and not owned, libraries cannot ensure the stability of their e-book collections the way they have maintained physical collections over the years. “Publishers can, and have, removed content, or modified e-book content,” she adds, “and there are few explicit protocols for propagating such modifications to e-book content.”
The ability to modify e-books has thus far been framed as a positive for consumers, enabling ready access to updates and other premiums. But it cuts both ways, Kirchhoff notes.
“You do not want to turn on your e-book reader one day and find, as some readers have already, that all or part of a book has been deleted, and that the original version has not been preserved.” Indeed, e-books can easily disappear, or corrections, deletions, or other changes could be made without alerting readers, possibly altering the historical record.
In the present licensed-access e-book environment, perpetual access and preservation rights are the exception rather than the norm, and no best practices have yet been established for contracting with content producers, the DPC report acknowledges. The report concludes that when licensed content resides in the cloud, it is never truly in the possession of the library (or a consumer) and “there is no guarantee” of perpetual access or preservation, particularly if the e-book aggregator or service provider does not have explicit long-term preservation rights to the book.
Some publishers do offer “perpetual access” to “purchased” content. But that access is often murky. After all, what does a perpetual access promise mean if a library must continue to pay to access content it has “purchased” from a third party, with platform maintenance and service fees? What if that platform’s infrastructure no longer exists, or is based upon obsolete formats? What if the content becomes available only through a larger and more expensive bundle that the library does not want to license or purchase?
“Those most likely to suffer from this kind of loss—readers—are not in a position to act,” Wolven observes. “And those who could do something [libraries and publishers] have other priorities right now. But someone is going to have to support the up-front costs to reap future benefits.”
Meanwhile, the potential digital memory hole is also expanding because of the various digital rights management (DRM) technologies and proprietary e-book formats. DRM can significantly impede the preservation of e-books, and changes in DRM technology, such as Adobe’s recently announced “non-backwardly compatible” change, can make some content forever opaque: even if you can see the object, DRM can prevent access to it.
“Some publishers enforce their license with digital rights management technologies, and without an appropriate ‘key’ or, over the long term, services that process that key, a ‘DRM’ed’ e-book today will not be readable in the future,” Morrissey says.
DRM issues are compounded by the multitude of e-book formats. For example, the iBooks format is a variant of ePub, but it has proprietary extensions that prevent it from being read in applications other than iBooks and iBooks Author. Kindle devices and applications will not render books in ePub format. And Nook, iBooks, and Kobo will not render books in MOBI (or KF8 or AZW) format. The various formats have also created a proliferation of ISBNs, making it difficult for preservation institutions and libraries to identify what they are attempting to preserve.
“The proprietary nature of these formats especially in combination with their DRM regimes comprise a long-term preservation risk,” the DPC report states, although, in a glimmer of hope, it adds that ePub3 meets many of the requirements for a “preservation-robust format.”
Although major questions loom, the future is far from lost. The DPC report concludes that libraries and preservation institutions can negotiate permanent preservation rights and ensure that preservation copies do not include DRM technologies or other formats that prevent their future use. “Most publishers would like to preserve their content,” Morrissey says. “So these issues can all be dealt with at a contractual level.”
While she concedes that publishers’ business models and the mission of libraries and other cultural memory institutions may not always coincide, Kirchhoff says that fruitful arrangements can be, and have been, made. She cites the HathiTrust’s collective model, the subscription services of Portico, and CLOCKSS, an archive based on the successful LOCKSS program (Lots of Copies Keep Stuff Safe). These services preserve publisher titles in a secure dark archive, meaning that content in the archive is not made available without a trigger event, such as business failure or a catastrophic occurrence. In addition, some European countries have undertaken government mandates to ensure preservation of their digital cultural record.
“The experience of all these institutions is that it is possible to agree, to cooperate to ensure the long-term accessibility of the e-books that are being published,” she says. “Sometimes legislative action is part of this process, as in legal deposit, but publishers have also been very willing to cooperate voluntarily, often committing significant resources, to ensuring the survival of our cultural heritage.”
Notably, e-books’ legal and technical problems, including rights, DRM, and format issues, can all be solved by any number of preservation services, Kirchhoff says. The biggest problem is getting e-books into a preservation service in the first place.
Currently, the U.S. mandates that electronic-only serial publications be deposited at the Library of Congress; there is no similar requirement for e-books. This lack of a legal deposit requirement is especially troublesome for the preservation of born-digital, e-book-only titles, particularly the growing number of self-published e-books.
Self-published e-books represent “a considerable institutional challenge for preservation,” the report states. Self-published e-books have no mandatory legal deposit, and they are not required to use an ISBN (for example, Amazon assigns an internal identifier). And most glaringly, there is no “simple scalable way to contact and negotiate with all these independent authors” concerning preservation issues. “How do we even discover what the universe of those e-books is, so we can tell what needs to be preserved?” Morrissey asks.
“Libraries are the guarantors that future generations will still be able to read what authors are writing today,” Wolven says. But without a concerted effort, he fears, many e-books written today could be lost and “won’t be available to libraries at all.” Addressing e-book preservation now is critical, he adds, because today’s complex web of technical and legal issues would make a later “rescue effort” for lost e-books difficult.
The good news is, it’s still early. “Today’s e-books aren’t going to disappear next week or next year,” he says, which gives libraries and publishers some time to work things out. “The situation is far from hopeless. Most publishers do care about preservation, and there are any number of avenues to explore,” Wolven concludes. “But it’s up to libraries to point those out, urge things in the right direction, and make it clear that they care.”