Wednesday, September 27, 2006

Columbia Journalism Review

By Siva Vaidhyanathan

Last May, Kevin Kelly, Wired magazine’s “senior maverick,” published in The New York Times Magazine his predictive account of flux within the book-publishing world. Kelly outlined what he claimed will happen (not might or could — will) to the practices of writing and reading under a new regime fostered by Google’s plan to scan millions of books and offer searchable texts to Internet users.

“So what happens when all the books in the world become a single liquid fabric of interconnected words and ideas?” Kelly wrote. “First, works on the margins of popularity will find a small audience larger than the near-zero audience they usually have now. . . . Second, the universal library will deepen our grasp of history, as every original document in the course of civilization is scanned and cross-linked. Third, the universal library of all books will cultivate a new sense of authority . . . .”

Kelly saw the linkage of text to text, book to book, as the answer to the information gaps that have made the progress of knowledge such a hard climb. “If you can truly incorporate all texts — past and present, multilingual — on a particular subject,” Kelly wrote, “then you can have a clearer sense of what we as a civilization, a species, do know and don’t know. The white spaces of our collective ignorance are highlighted, while the golden peaks of our knowledge are drawn with completeness. This degree of authority is only rarely achieved in scholarship today, but it will become routine.”

Such heady predictions of technological revolution have become so common, so accepted in our techno-fundamentalist culture, that even when John Updike criticized Kelly’s vision in an essay published a month later in The New York Times Book Review, he did not so much doubt Kelly’s vision of a universal digital library as lament it.

As it turns out, the move toward universal knowledge is not so easy. Google’s project, if it survives court challenges, would probably have modest effects on writing, reading, and publishing. For one thing, Kelly’s predictions depend on a part of the system he slights in his article: the copyright system. Copyright is not Kelly’s friend. He mentions it as a nuisance on the edge of his dream. To acknowledge that a lawyer-built system might trump an engineer-built system would have run counter to Kelly’s sermon.

Much of the press coverage of the Google project has missed some key facts: most libraries that are allowing Google to scan books are, so far, providing only books published before 1923 and thus already in the public domain, essentially missing most of the relevant and important books that scholars and researchers — not to mention casual readers — might want. Meanwhile, the current American copyright system will probably kill Google’s plan to scan the collections of the University of Michigan and the University of California system — the only libraries willing to offer Google works currently covered by copyright. In his article, Kelly breezed past the fact that the copyrighted works will be presented in a useless format — “snippets” that allow readers only glimpses into how a term is used in the text. Google users will not be able to read, copy, or print copyrighted works via Google. Google accepted that arrangement to limit its copyright liability. But the more “copyright friendly” the Google system is, the less user-friendly, and useful, it is. And even so it still may not fly in court.

Google is exploiting the instability of the copyright system in a digital age. The company’s struggle with publishers over its legal ability to pursue its project is the most interesting and perhaps most transformative conflict in the copyright wars. But there are many other battles — and many other significant stories — out in the copyright jungle. Yet reporters seem lost.

Copyright in recent years has certainly become too strong for its own good. It protects more content and outlaws more acts than ever before. It stifles individual creativity and hampers the discovery and sharing of culture and knowledge. To convey all this to readers, journalists need to understand the principles, paradoxes, licenses, and limits of the increasingly troubled copyright system. Copyright is not just an interesting story. As the most pervasive regulation of speech and culture, the copyright system will help determine the richness and strength of democracy in the twenty-first century.


It’s not that the press has ignored copyright. Recent fights have generated a remarkable amount of press. Since Napster broke into the news in 2000, journalists have been scrambling to keep up with the fast-moving and complicated stories of content protection, distribution, and revision that make up the wide array of copyright conflicts.

During this time of rapid change it’s been all too easy for reporters to fall into the trap of false dichotomies: hackers versus movie studios; kids versus music companies; librarians versus publishers. The peer-to-peer and music-file-sharing story, for instance, has consistently been covered as a business story with the tone of the sports page: winners and losers, scores and stats. In fact, peer-to-peer file sharing was more about technological innovation and the ways we use music in our lives than about any sort of threat to the commercial music industry. As it stands today, after dozens of court cases and congressional hearings, peer-to-peer file sharing remains strong. So does the music industry. The sky did not fall; our expectations did.

The most recent headline-grabbing copyright battle involved The Da Vinci Code. Did Dan Brown recycle elements of a 1982 nonfiction book for his bestselling novel? The authors of the earlier book sued Brown’s publisher, Random House U.K., in a London court in the spring of 2006 in an effort to prove that Brown lifted protected elements of their book, what they called “the architecture” of a speculative conspiracy theory about the life of Jesus. In the coverage of the trial, some reporters — even in publications like The New York Times, The Washington Post, and The San Diego Union-Tribune — used the word “plagiarism” as if it were a legal concept or cause of action. It isn’t. Copyright infringement and plagiarism are different acts with some potential overlap. One may infringe upon a copyright without plagiarizing and one may plagiarize — use ideas without attribution — without breaking the law. Plagiarism is an ethical concept. Copyright is a legal one.

Perhaps most troubling, though, was the way in which the Da Vinci Code story was so often covered without a clear statement of the operative principle of copyright: one cannot protect facts and ideas, only specific expressions of ideas. Dan Brown and Random House U.K. prevailed in the London court because the judge clearly saw that the earlier authors were trying to protect ideas. Most people don’t understand that important distinction. So it’s no surprise that most reporters don’t either.

Reporters often fail to see the big picture in copyright stories: that what is at stake is the long-term health of our culture. If the copyright system fails, huge industries could crumble. If it gets too strong, it could strangle future creativity and research. It is complex, and complexity can be a hard thing to render in journalistic prose.

The work situation of most reporters may also impede a thorough understanding of how copyright affects us all. Reporters labor for content companies, after all, and tend to view their role in the copyright system as one-dimensional. They are creators who get paid by copyright holders. So it’s understandable for journalists to express a certain amount of anxiety about the ways digital technologies have allowed expensive content to flow around the world cheaply.

Yet reporters can’t gather the raw material for their craft without a rich library of information in accessible form. When I was a reporter in the 1980s and 1990s, I could not write a good story without scouring the library and newspaper archives for other stories that added context. And like every reporter, I was constantly aware that my work was just one element in a cacophony of texts seeking readers and contributing to the aggregate understanding of our world. I was as much a copyright user as I was a copyright producer. Now that I write books, I am even more aware of my role as a taker and a giver. It takes a library, after all, to write a book.


We are constantly reminded that copyright law, as the Supreme Court once declared, is an “engine of free expression.” But more often these days, it’s instead an engine of corporate censorship.

Copyright is the right to say no. Copyright holders get to tell the rest of us that we can’t build on, revise, copy, or distribute their work. That’s a fair bargain most of the time. Copyright provides the incentive to bring work to market. It’s impossible to imagine anyone anteing up $300 million for Spider-Man 3 without a reasonable belief that copyright laws would limit its distribution to mostly legitimate and moneymaking channels.

Yet copyright has the potential of locking up knowledge, insight, information, and wisdom from the rest of the world. So it is also fundamentally a conditional restriction on speech and print. Copyright and the First Amendment are in constant and necessary tension. The law has for most of American history limited copyright — allowing it to fill its role as an incentive-maker for new creators yet curbing its censorious powers. For most of its 300-year history, the system has served us well, protecting the integrity of creative work while allowing the next generation of creators to build on the cultural foundations around them. These rights have helped fill our libraries with books, our walls with art, and our lives with song.

But something has gone terribly wrong. In recent years, large multinational media companies have captured the global copyright system and twisted it toward their own short-term interests. The people who are supposed to benefit most from a system that makes ideas available — readers, students, and citizens — have been excluded. No one in Congress wants to hear from college students or librarians.

More than ever, the law restricts what individuals can do with elements of their own culture. Generally the exercise of copyright protection is so extreme these days that even the most innocent use of images or song lyrics in scholarly work can generate a legal threat. Last year one of the brightest students in my department got an article accepted in the leading journal in the field. It was about advertising in the 1930s. The journal’s lawyers and editors refused to let her use images from the ads in question without permission, even though it is impossible to find out who owns the ads or if they were ever covered by copyright in the first place. The chilling effect trumped any claim of scholarly “fair use” or even common sense.


For most of the history of copyright in Europe and the United States, copying was hard and expensive, and the law punished those who made whole copies of others’ material for profit. The principle was simple: legitimate publishers would make no money after investing so much in authors, editors, and printing presses if the same products were available on the street. The price in such a hyper-competitive market would drop to close to zero. So copyright created artificial scarcity.

But we live in an age of abundance. Millions of people have in their homes and offices powerful copying machines and communication devices: their personal computers. It’s almost impossible to keep digital materials scarce once they are released to the public.

The industries that live by copyright — music, film, publishing, and software companies — continue to try. They encrypt video discs and compact discs so that consumers can’t play them on computers or make personal copies. They monitor and sue consumers who allow others to share digital materials over the Internet. But none of these tactics seem to be working. In fact, they have been counterproductive. The bullying attitude has alienated consumers. That does not mean that copyright has failed or that it has no future. It just has a more complicated and nuanced existence.

Here is the fundamental paradox: media companies keep expanding across the globe. They produce more software, books, music, video games, and films every year. They charge more for those products every year. And those industries repeatedly tell us that they are in crisis. If we do not radically alter our laws, technologies, and habits, the media companies argue, the industries that copyright protects will wither and die.

Yet they are not dying. Strangely, the global copyright industries are still rich and powerful. Many of them are adapting, changing their containers and their content, but they keep growing, expanding across the globe. Revenues in the music business did drop steadily from 2000 to 2003 — some years by up to 6.8 percent. Millions of people in Europe and North America use their high-speed Internet connections to download music files free. From Moscow to Mexico City to Manila, film and video piracy is rampant. For much of the world, teeming pirate bazaars serve as the chief (often only) source of those products. Yet the music industry has recovered from its early-decade lull rather well. Revenues for the major commercial labels in 2004 were 3.3 percent above 2003. Unit sales were up 4.4 percent. Revenues in 2004 were higher than in 1997 and comparable to those of 1998 — then considered very healthy years for the recording industry. This while illegal downloading continued all over the world.

Yet despite their ability to thrive in a new global/digital environment, the companies push for ever more restrictive laws — laws that fail to recognize the realities of the global flows of people, culture, and technology.

Recent changes to copyright in North America, Europe, and Australia threaten to chill creativity at the ground level — among noncorporate, individual, and communal artists. As a result, the risk and price of reusing elements of copyrighted culture are higher than ever before. If you wanted to make a scholarly documentary film about the history of country music, for example, you might end up with one that slights the contribution of Hank Williams and Elvis Presley because their estates would deny you permission to use the archival material. Other archives and estates would charge you prohibitive fees. We are losing much of the history of the twentieth century because the copyright industries are more litigious than ever.

Yet copyright, like culture itself, is not zero-sum. In its first weekend of theatrical release, Star Wars Episode III: Revenge of the Sith made a record $158.5 million at the box office. At the same time, thousands of people downloaded high-quality pirated digital copies from the Internet. Just days after the blockbuster release of the movie, attorneys for 20th Century Fox sent thousands of “cease-and-desist” letters to those sharing copies of the film over the Internet. The practice continued unabated.

How could a film make so much money when it was competing against its free version? The key to understanding that seeming paradox (less control, more revenue) is to realize that not every download equals a lost sale. As the Stanford law professor Lawrence Lessig has argued, during the time when music downloads were 2.6 times those of legitimate music sales, revenues dropped less than 7 percent. If every download replaced a sale, there would be no commercial music industry left. The relationship between the free version and the legitimate version is rather complex, like the relationship between a public library and a book publisher. Sometimes free stuff sells stuff.


Here’s a primer for reporters who find themselves lost in the copyright jungle: American copyright law offers four basic democratic safeguards against the censorious power of copyright, a sort of bargain with the people. Each of these safeguards is currently at risk:

  • First and foremost, copyrights eventually expire, thus placing works into the public domain for all to buy cheaply and use freely. That is the most important part of the copyright bargain: We the people grant copyright as a temporary monopoly over the reproduction and distribution of specific works, and eventually we get the material back for the sake of our common heritage and collective knowledge. The works of Melville and Twain once benefited their authors exclusively. Now they belong to all of us. But as Congress continues to extend the term of copyright protection for works created decades ago (as it did in 1998 by adding twenty years to all active copyrights), it robs the people of their legacy.
  • Second, copyright restricts what consumers can do with the text of a book, but not the book itself; it governs the content, not the container. Thus people may sell and buy used books, and libraries may lend books freely, without permission from publishers. In the digital realm, however, copyright holders may install digital-rights-management schemes that limit the transportation of both the container and the content. So libraries may not lend out major portions of their materials if they are in digital form. As more works are digitized, libraries are shifting to the lighter, space-saving formats. As a result, libraries of the future could be less useful to citizens.
  • Third, as we have seen, copyright governs specific expressions, but not the facts or ideas upon which the expressions are based. Copyright does not protect ideas. But that is one of the most widely misunderstood aspects of copyright. And even that basic principle is under attack in the new digital environment. In 1997, the National Basketball Association tried to get pager and Internet companies to refrain from distributing game scores without permission. And more recently, Major League Baseball has tried, but so far has failed, to license the use of player statistics to limit “free-riding” firms that make money facilitating fantasy baseball leagues. Every congressional session, database companies try to create a new form of intellectual property that protects facts and data, thus evading the basic democratic right that lets facts flow freely.
  • Fourth, and not least, the copyright system has built into it an exception to the power of copyright: fair use. This significant loophole, too, is widely misunderstood, and deserves further discussion.

Generally, one may copy portions of another’s copyrighted work (and sometimes the entire work) for private, noncommercial uses, for education, criticism, journalism, or parody. Fair use operates as a defense against an accusation of infringement and grants confidence to users that they most likely will not be sued for using works in a reasonable way.

On paper, fair use seems pretty healthy. In recent years, for example, courts have definitively stated that making a parody of a copyrighted work is considered “transformative” and thus fair. Another example: a major ruling in 2002 enabled image search engines such as Google to thrive and expand beyond simple Web text searching into images and video because “thumbnails” of digital photographs are considered to be fair uses. Thumbnails, the court ruled, do not replace the original in the marketplace.

But two factors have put fair use beyond the reach of many users, especially artists and authors. First and foremost, fair use does not help you if your publisher or distributor does not believe in it. Many publishers demand that every quote — no matter how short or for what purpose — be cleared with specific permission, which is extremely cumbersome and often costly.

And fair use is somewhat confusing. There is widespread misunderstanding about it. In public forums I have heard claims such as “you can take 20 percent” of a work before the use becomes unfair, or, “there is a forty-word rule” for long quotes of text. Neither rule exists. Fair use is intentionally vague. It is meant for judges to apply, case by case. Meanwhile, copyright holders are more aggressive than ever and publishers and distributors are more concerned about suits. So in the real world, fair use is less fair and less useful.


Fair use is designed for small ball. It’s supposed to create some breathing room for individual critics or creators to do what they do. Under current law it’s not appropriate for large-scale endeavors — like the Google library project. Fair use may be too rickety a structure to support both free speech and the vast dreams of Google.

Reporters need to understand the company’s copyright ambitions. Google announced in December 2004 that it would begin scanning in millions of copyrighted books from the University of Michigan library, and in August 2006 the University of California system signed on. Predictably, some prominent publishers and authors have filed suit against the search-engine company.

The company’s plan was to include those works in its “Google Book Search” service. Books from the library would supplement both the copyrighted books that Google has contracted to offer via its “partner” program with publishers and the uncopyrighted works scanned from other libraries, including libraries of Harvard, Oxford, and New York City. While it would offer readers full-text access to older works out of copyright, it would provide only “snippets” of the copyrighted works that it scans without the authors’ permission from Michigan and California.

Google says that because users will only experience “snippets” of copyrighted text, their use of such material should be considered a fair use. That argument will be tested in court. But whether those snippets constitute fair use is just one part of the issue. To generate the “snippets,” Google is scanning the entire works and storing them on its servers. The plaintiffs argue that the initial scanning of the books itself — done to create the snippets from a vast database — constitutes copyright infringement, the very core of copyright. Courts will have to weigh whether the public is better served by a strict and clear conception of copyright law — that only the copyright holder has the right to give permission for any copy, regardless of the ultimate use or effect on the market — or a more flexible and pragmatic one in which the user experience matters more.

One of the least understood concepts of Google’s business is that it copies everything. When we post our words and images on the Web, we are implicitly licensing Google, Yahoo, and other search engines to make copies of our content to store in their huge farms of servers. Without such “cache” copies, search engines could not read and link to Web pages. In the Web world, massive copying is just business as usual.

But through the library project, Google is imposing the norms of the Web on authors and publishers who have not willingly digitized their works and thus have not licensed search engines to make cache copies. Publishers, at first, worried that the Google project would threaten book sales, but it soon became clear that the project offers no risk to publishers’ core markets and projects. If anything, it could serve as a marketing boon. Now publishers are most offended by the prospect of a wealthy upstart corporation’s “free-riding” on their content to offer a commercial and potentially lucrative service without any regard for compensation or quality control. The publishers, in short, would like a piece of the revenue, and some say about the manner of display and search results.

Copyright has rarely been used as leverage to govern ancillary markets for goods that enhance the value or utility of the copyrighted works. Publishers have never, for instance, sued the makers of library catalogs, eyeglasses, or bookcases. But these are extreme times.

The mood of U.S. courts in recent years, especially the Supreme Court, has been to side with the copyright holder in this time of great technological flux. Google is an upstart facing off against some of the most powerful media companies in the world, including Viacom, News Corporation, and Disney — all of which have publishing wings. Courts will probably see this case as the existential showdown over the nature and future of copyright and rule to defend the status quo. Journalists should follow the case closely. The footnotes of any court decision could shape the future of journalism, publishing, libraries, and democracy.


Google aside, in recent years — thanks to the ferocious mania to protect everything and the astounding political power of media companies — the basic, democratic checks and balances that ensured that copyright would not operate as an instrument of private censorship have been seriously eroded. The most endangered principle is fair use: the right to use others’ copyrighted works in a reasonable way to promote important public functions such as criticism or education. And if fair use is in danger then good journalism is also threatened. Every journalist relies on fair use every day. So journalists have a self-interest in the copyright story.

And so does our society. Copyright was designed, as the Constitution declares, to “promote the progress” of knowledge and creativity. In the last thirty years we have seen this brilliant system corrupted and captured by the very industries that the old laws fostered. Yet the complexity and nuanced nature of copyright battles make it hard for nonexperts to grasp what’s at stake.

So it’s up to journalists to push deeper into stories in which copyright plays a part. Then the real challenge begins: explaining this messy system in clear language to a curious but confused audience.

Siva Vaidhyanathan is an associate professor of culture and communication at New York University. He is the author of Copyrights and Copywrongs: The Rise of Intellectual Property and How It Threatens Creativity and The Anarchist in the Library. He blogs at