Copyright in Databases

I'm going to have more to say about data, databases, and intellectual property rights in the coming months. This longish post provides a basic primer on how U.S. copyright law applies to databases.

A. Copyright

Copyright attaches to an original work of authorship that has been embodied in a fixed form. The “work” to which copyright attaches can be the structure of the database or a relatively small part of a database, including an individual data element, such as a photograph. It is therefore possible for a database to contain multiple overlapping copyrighted works or elements. To the extent that a database owner has a copyright, or multiple copyrights, in elements of a database, the rights apply only to those copyrighted elements. The rights are to reproduce, publicly distribute or communicate, publicly display, publicly perform, and prepare adaptations or derivative works.

1. Standards for obtaining copyright

a. Originality

Copyright protects only an author’s “original” expression, which means expression independently created by the author that reflects a minimal spark of creativity. A database owner may have a copyright in the database structure or in the user interface with the database, whether that be a report form or an electronic display of field names associated with data. The key is whether the judgments made by the person(s) selecting and arranging the data require the exercise of sufficient discretion to make the selection or arrangement “original.” In Feist Publications, Inc. v. Rural Telephone Service Company, the United States Supreme Court held that a white pages telephone directory could not be copyrighted. The data—the telephone numbers and addresses—were “facts” which were not original because they had no “author.” Also, the selection and arrangement of the facts did not meet the originality requirement because the decision to order the entries alphabetically by name did not reflect the “minimal spark” of creativity needed.

As a practical matter, this originality standard prevents copyright from applying to complete databases – i.e. those that list all instances of a particular phenomenon – that are arranged in an unoriginal manner, such as alphabetically or by numeric value. However, courts have held that incomplete databases that reflect original selection and arrangement of data, such as a guide to the “best” restaurants in a city, are copyrightable in their selection and arrangement. Such a copyright would prohibit another from copying and posting such a guide on the Internet without permission. However, because the copyright would be limited to that particular selection and arrangement of restaurants, a user could use such a database as a reference for creating a different selection and arrangement of restaurants without violating the copyright owner’s copyright.

Copyright is also limited by the merger doctrine, which appears in many database disputes. If there is only a small set of practical choices for expressing an idea, the law holds that the idea and expression merge, and the result is that there is no legal liability for using the expression.

Under these principles, metadata is copyrightable only if it reflects an author’s original expression. For example, a collection of simple bibliographic metadata with fields named “author,” “title,” and “date of publication” would not be sufficiently original to be copyrightable. More complex selections and arrangements may cross the line of originality. Finally, to the extent that software is used in a database, software is protectable as a “literary work.” A discussion of copyright in executable code is beyond the scope of this entry.

b. Fixation

A work must also be “fixed” in any medium permitting the work to be perceived, reproduced, or otherwise communicated for a period of more than a transitory duration. The structure and arrangement of a database may be fixed any time that it is written down or implemented. For works created after January 1, 1978 in the United States, exclusive rights under copyright shower down upon the creator at the moment of fixation.

2. The Duration of Copyright

Under international treaties, copyright must last for at least the life of the author plus 50 years. Some countries, including the United States, have extended the length to the life of the author plus 70 years. Under U.S. law, if a work was made as a “work made for hire,” such as a work created by an employee within the scope of employment, the copyright lasts for 95 years from the date of first publication or 120 years from creation, whichever expires first.
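The U.S. terms described above reduce to simple arithmetic, which can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, it covers only works created on or after January 1, 1978, and it ignores special cases such as joint, anonymous, or pseudonymous works. For a work made for hire it takes the earlier of the two dates, which is how the U.S. statute resolves the case where both a creation date and a publication date exist.

```python
from typing import Optional


def us_copyright_expiry_year(
    *,
    author_death_year: Optional[int] = None,
    work_for_hire: bool = False,
    creation_year: Optional[int] = None,
    publication_year: Optional[int] = None,
) -> int:
    """Return the last calendar year of protection (hypothetical helper).

    Assumes a work created on or after January 1, 1978.
    """
    if work_for_hire:
        if creation_year is None:
            raise ValueError("creation_year is required for a work made for hire")
        from_creation = creation_year + 120
        if publication_year is None:
            # Unpublished: only the 120-year term from creation applies.
            return from_creation
        # Published: 95 years from publication or 120 from creation,
        # whichever expires first.
        return min(publication_year + 95, from_creation)

    if author_death_year is None:
        raise ValueError("author_death_year is required for an individual author")
    # Individual author: life of the author plus 70 years.
    return author_death_year + 70


if __name__ == "__main__":
    print(us_copyright_expiry_year(author_death_year=2000))
    print(us_copyright_expiry_year(work_for_hire=True, creation_year=1990,
                                   publication_year=1995))
```

A database compiled by employees in 1990 and published in 1995 would, under this sketch, be measured from publication, because 95 years from 1995 ends before 120 years from 1990.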

3. Ownership and Transfer of Copyright

Copyright is owned initially by the author of the work. If the work is jointly produced by two or more authors, such as a copyrightable database compiled by two or more scholars, each has a legal interest in the copyright. When a work is produced by an employee, ownership differs by country. In the United States, the employer is treated as the author under the “work made for hire” doctrine and the employee has no rights in the resulting work. Elsewhere, the employee is treated as the author and retains certain moral rights in the work while the employer receives the economic rights in the work. Copyrights may be licensed or transferred. A non-exclusive license, or permission, may be granted orally or even by implication. A transfer or an exclusive license must be done in writing and signed by the copyright owner. Outside of the United States, some or all of the author’s moral rights cannot be transferred or terminated by agreement. The law on this issue varies by jurisdiction.

4. The Copyright Owner’s Rights

The rights of a copyright owner are similar throughout the world although the terminology differs as do the limitations and exceptions to these rights.

a. Reproduction

As the word “copyright” implies, the owner controls the right to reproduce the work in copies. The reproduction right covers both exact duplicates of a work and works that are “substantially similar” to the copyrighted work when it can be shown that the alleged copyist had access to the copyrighted work. In the United States, some courts have extended this right to cover even a temporary copy of a copyrighted work stored in a computer’s random access memory (“RAM”).

b. Public Distribution, Performance, Display or Communication

The United States divides the rights to express the work to the public into rights to distribute copies, display a copy, or publicly perform the work. In other parts of the world, these are subsumed within a right to communicate the work to the public.

Within the United States, courts have given the distribution right a broad reading. Some courts, including the appeals court in the Napster case, have held that a download of a file from a server connected to the internet is both a reproduction by the person requesting the file and a distribution by the owner of the machine that sends the file. The right of public performance applies whenever the copyrighted work can be listened to or watched by members of the public at large or a subset of the public larger than a family unit or circle of friends. Similarly, the display right covers works that can be viewed at home over a computer network as long as the work is accessible to the public at large or a subset of the public.

c. Right of Adaptation, Modification or Right to Prepare Derivative Works

A separate copyright arises with respect to modifications or adaptations of a copyrighted work so long as these modifications or adaptations are themselves original. This separate copyright applies only to these changes. The copyright owner has the right to control such adaptations unless a statutory provision, such as fair use, applies.

5. Theories of Secondary Liability

Those who build or operate databases also have to be aware that copyright law holds liable certain parties that enable or assist others in infringing copyright. In the United States, these theories are known as contributory infringement or vicarious infringement.

a. Contributory Infringement

Contributory copyright infringement requires proof that a third party intended to assist a copyright infringer in that activity. This intent can be shown when one supplies a means of infringement with the intent to induce another to infringe or with knowledge that the recipient will infringe. This principle is limited by the so-called Sony doctrine, by which one who supplies a service or technology that enables infringement, such as a VCR or photocopier, will be deemed not to have knowledge of infringement or intent to induce infringement so long as the service or technology is capable of substantial non-infringing uses.

Two examples illustrate the operation of this rule. In A&M Records, Inc. v. Napster, Inc., the court of appeals held that peer-to-peer file sharing is infringing but that Napster’s database system for connecting users for peer-to-peer file transfers was capable of substantial non-infringing uses and so it was entitled to rely on the Sony doctrine. (Napster was held liable on other grounds.) In contrast, in MGM Studios, Inc. v. Grokster, Ltd., the Supreme Court held that Grokster was liable for inducing users to infringe by specifically advertising its database service as a substitute for Napster’s.

b. Vicarious Liability for Copyright Infringement

Vicarious liability in the United States will apply whenever (1) one has control or supervisory power over the direct infringer’s infringing conduct and (2) one receives a direct financial benefit from the infringing conduct. In the Napster case, the court held that Napster had control over its users because it could refuse them access to the Napster server and, pursuant to the Terms of Service Agreements entered into with users, could terminate access if infringing conduct was discovered. Other courts have required a greater showing of actual control over the infringing conduct.

Similarly, a direct financial benefit is not limited to a share of the infringer’s profits. The Napster court held that Napster received a direct financial benefit from infringing file trading because users’ ability to obtain infringing audio files drew them to use Napster’s database. Additionally, Napster could potentially receive a financial benefit from having attracted a larger user base to the service.

6. Limitations and Exceptions

Copyright’s limitations and exceptions vary by jurisdiction. In the United States, the broad “fair use” provision is a fact-specific balancing test that permits certain uses of copyrighted works without permission. Fair use is accompanied by some specific statutory limitations that cover, for example, certain uses in the classroom and certain uses by libraries. The factors to consider for fair use are: (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work. The fact that a work is unpublished shall not itself bar a finding of fair use if such finding is made upon consideration of all the above factors.

In countries whose copyright law follows that of the United Kingdom, a more limited “fair dealing” provision enumerates specific exceptions to copyright. In Europe, Japan, and elsewhere, the limitations and exceptions are specified legislatively and cover some private copying and some research or educational uses.

7. Remedies and Penalties

In general, a copyright owner can seek an injunction against one who is either a direct or secondary infringer of copyright. The monetary consequences of infringement differ by jurisdiction. In the United States, the copyright owner may choose between actual or statutory damages. Actual damages cover the copyright owner’s lost profits as well as a right to the infringer’s profits derived from infringement. The range for statutory damages is $750 to $30,000 per copyrighted work infringed. If infringement is found to have been willful, the upper limit increases to $150,000 per work. The amount of statutory damages in a specific case is determined by the jury. There is a safe harbor from statutory damages for non-profit educational institutions if an employee reproduces a copyrighted work with a good faith belief that such reproduction is a fair use.
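Because statutory damages are assessed per copyrighted work infringed, the exposure for copying many works out of a database scales linearly with the number of works. The following Python sketch shows that arithmetic using only the figures given above. The function name is hypothetical, and the sketch deliberately ignores the educational safe harbor and any other adjustment a court might make.

```python
from typing import Tuple

# Per-work figures taken from the paragraph above.
MIN_PER_WORK = 750
MAX_PER_WORK = 30_000
MAX_PER_WORK_WILLFUL = 150_000


def statutory_damages_range(works_infringed: int, willful: bool = False) -> Tuple[int, int]:
    """Return (lowest, highest) total statutory damages in U.S. dollars.

    Hypothetical helper: the actual award within the range is set
    case by case, so this only bounds the possible exposure.
    """
    if works_infringed < 0:
        raise ValueError("works_infringed must not be negative")
    ceiling = MAX_PER_WORK_WILLFUL if willful else MAX_PER_WORK
    return works_infringed * MIN_PER_WORK, works_infringed * ceiling


if __name__ == "__main__":
    low, high = statutory_damages_range(10)
    print(f"10 works, not willful: ${low:,} to ${high:,}")
    low, high = statutory_damages_range(10, willful=True)
    print(f"10 works, willful: ${low:,} to ${high:,}")
```

For ten copyrighted photographs copied out of a database, for example, the range under this sketch runs from $7,500 to $300,000, with a ceiling of $1,500,000 if the infringement is willful.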

A separate safe harbor scheme applies to online service providers when their database is comprised of information stored at the direction of their users. An example of such a database would be YouTube’s video sharing database. The service provider is immune from monetary liability unless the provider has knowledge of infringement or has control over the infringer and receives a direct financial benefit from infringement. The safe harbor is contingent on a number of requirements, including that the provider have a copyright policy that terminates repeat infringers, that the provider comply with a notice-and-takedown procedure, and that the provider have an agent designated to receive notices of copyright infringement.
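The structure of the safe harbor as summarized above, a set of conditions that must all be met plus two alternative disqualifiers, can be made explicit in a short Python sketch. It encodes only this post's simplified description, not the statute itself, and every name in it is hypothetical. In practice each of these yes-or-no inputs is a contested legal question.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Hypothetical yes/no summary of an online service provider."""
    has_repeat_infringer_policy: bool
    follows_notice_and_takedown: bool
    has_designated_agent: bool
    knows_of_infringement: bool
    controls_infringer: bool
    direct_financial_benefit: bool


def safe_harbor_applies(p: Provider) -> bool:
    """Apply the simplified test described in the paragraph above."""
    # All of the listed requirements must be satisfied.
    meets_conditions = (
        p.has_repeat_infringer_policy
        and p.follows_notice_and_takedown
        and p.has_designated_agent
    )
    # Either knowledge alone, or control together with a direct
    # financial benefit, removes the protection.
    disqualified = p.knows_of_infringement or (
        p.controls_infringer and p.direct_financial_benefit
    )
    return meets_conditions and not disqualified


if __name__ == "__main__":
    compliant = Provider(True, True, True, False, False, False)
    print(safe_harbor_applies(compliant))
```

The grouping matters: control over the infringer without a direct financial benefit, or a benefit without control, does not by itself defeat the safe harbor under this reading.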

Case Examples

In cases arising after the Feist decision, the courts have faithfully applied the core holding that facts are in the public domain and free from copyright even when substantial investments are made to gather such facts. There has been more variation in the characterization of some kinds of data as facts and in application of the modicum-of-creativity standard to the selections and arrangements in database structures.

On the question of when data is copyrightable, a court of appeals found copyrightable expression in the “Red Book” listing of used car valuations. The defendant had copied these valuations into its database, asserting that it was merely copying unprotected factual information. The court disagreed, likening the valuations to expressive opinions and finding a modicum of originality in these. In addition, the selection and arrangement of the data, which included a division of the market into geographic regions, mileage adjustments in 5,000-mile increments, and a selection of optional features for inclusion, entitled the plaintiff to a thin copyright in the database structure.

Subsequently, the same court found that the prices for futures contracts traded on the New York Mercantile Exchange (NYMEX) probably were not expressive data even though a committee makes some judgments in the setting of these prices. The court concluded that even if such price data were expressive, the merger doctrine applied because there was no other practicable way of expressing the idea other than through a numerical value and a rival was free to copy price data from NYMEX’s database without copyright liability.

Finally, where data are comprised of arbitrary numbers used as codes, the courts have split. One court of appeals has held that an automobile parts manufacturer owns no copyright in its parts numbers, which are generated by application of a numbering system that the company created. In contrast, another court of appeals has held that the American Dental Association owns a copyright in its codes for dental procedures.

On the question of copyright in database structures, a court of appeals found that the structure of a yellow pages directory, including a listing of Chinese restaurants, was entitled to a “thin” copyright, but that copyright was not infringed by a rival database that included 1,500 of the listings because the rival had not copied the plaintiff’s data structure. Similarly, a different court of appeals acknowledged that although a yellow pages directory was copyrightable as a compilation, a rival did not violate that copyright by copying the name, address, telephone number, business type, and unit of advertisement purchased for each listing in the original publisher’s directory. Finally, a database of real estate tax assessments that arranged the data collected by the assessor into 456 fields grouped into 34 categories was sufficiently original to be copyrightable.

