At a December 2 pre-motion conference, a federal magistrate judge agreed to extend the discovery deadline in a high-profile lawsuit filed by four major publishers against the Internet Archive over its scanning and lending of print library books.

At an hour-long in-person hearing in New York, Judge Ona T. Wang told the parties to continue discussions regarding three ongoing discovery disputes and extended the discovery deadline from December 17 to January 31, 2022. She also left the door open for another extension if the parties’ discovery issues are still not resolved by February.

The conference was ordered after the Internet Archive had raised two discovery disputes with the court since August, and the publishers raised an issue of their own last month.

In the first dispute, Wang heard each side's position on a sweeping request from the Internet Archive, first brought to the court in an August 9 filing, in which IA attorneys are seeking monthly sales data for all books in print by the four plaintiff publishers (Hachette, Penguin Random House, HarperCollins, and Wiley) dating back to 2011.

At the hearing, IA attorney Joseph Gratz told Wang that the IA needed a broad range of data to show that its scanning and lending has not harmed the publishers’ bottom lines. He conceded, however, that the IA does not need 10 years' worth of data and would be happy to work with a sample of “comparable” titles. But since the publishers are refusing to cooperate on compiling such a list, Gratz said, the IA is seeking the data to develop its own comps.

Plaintiffs' attorney Elizabeth McNamara countered that providing a decade’s worth of sales data would be unwieldy and burdensome in the extreme for the publisher plaintiffs. It would involve "billions of lines of data" that would take many months and significant resources to prepare, McNamara said. Furthermore, the data wouldn’t help the Internet Archive’s defense because "books are not comparable" and because any such exercise still would not address a fundamental claim in the case, McNamara said: that the IA is not paying to license the publishers' content.

For her part, Wang expressed concern over the sheer scope of the request, which she said was clearly not “proportional,” and urged the parties to continue talking about a compromise involving a more realistic dataset. “You don’t need gold-plated shears to prune a bush,” Wang told Gratz, suggesting that the IA’s request sounded “like an inconceivably huge burden” when a more reasonable set of data might get the job done. Wang told the parties to keep talking, but warned that if they could not make progress on a compromise she would order briefs on the issue, “which will be expensive for both of you.”

Wang then heard the Internet Archive’s complaint that the Association of American Publishers, claiming legal privilege, is refusing to comply with a subpoena to hand over a range of internal communications. Once again, Wang told the parties to keep talking and declined to order briefs on the dispute at this juncture. At that point, plaintiffs’ attorney Scott Zebrak raised a jurisdictional issue, noting that the AAP subpoena was served in Washington, D.C., not in New York. Wang cut Zebrak off when he began to address the IA's "overly broad" subpoena, telling the parties to work out the subpoena's jurisdictional issue first.

Finally, Zebrak told Wang that the AAP’s complaints over the Internet Archive's document production were largely resolved. In a November 19 filing, Zebrak had accused the Internet Archive of "stonewalling" and trying to “run out the clock” on discovery. But Zebrak told the court that four of the plaintiffs' seven “categories” of concern were mostly dealt with, though some concerns remained. Gratz told the court that the IA has turned over all the documents it has and would continue to address any lingering concerns the plaintiffs may have.

The infringement lawsuit was first filed in June 2020 by Hachette, HarperCollins, John Wiley & Sons, and Penguin Random House, and coordinated by the AAP. It alleges that the Internet Archive’s program to scan and lend print editions of library books, under an untested legal theory known as controlled digital lending, is copyright infringement on a massive scale. IA lawyers counter that the program respects copyright and is protected by fair use.

Wang ordered the parties to submit a status letter by January 14.