Administration testimony on H.R. 354, the
"Collections of Information Antipiracy Act"
before the House Subcommittee on Courts and the Judiciary,
March 18, 1999



The Administration views database protection legislation from a number of perspectives: as a creator of data and a user of it; as an advocate of both economic incentives for socially useful investment and open, market-based competition free from artificial barriers; and as an entity committed both to effective law enforcement and to the First Amendment. Reconciling these perspectives is difficult in any context. The digital economy's rapid and unpredictable change makes this challenge even greater.

The Administration believes strongly in free markets, in which firms can meet demand for new products and services without having to overcome artificial barriers that keep consumers hostage to an undesirable status quo. However, we also recognize that there are circumstances in which markets need legal mechanisms in order to work efficiently. The Feist decision conclusively eliminated one form of legal protection for databases and created uncertainty concerning the extent of legal protection applicable to data collection and dissemination. Undeniably, Feist has altered the landscape, but the topography is still changing in ways that pull in different directions as to the nature and extent of protection that is needed.

In particular, the emerging digital environment has significant implications for this issue. It has become commonplace to observe that information is the currency of our economic age. That puts a premium on designing a legal regime that creates sufficient incentives to maximize investment in data collection -- to expand the available universe of information -- without putting in place unjustified obstacles to competition and innovation. Moreover, digital technology permits the creation of an infinite number of perfect copies at the touch of a button and therefore expands dramatically the risk that, in the absence of adequate legal remedies, piracy, or the threat of piracy, will deter investment in database creation. For all of these reasons, it is important to calibrate new private rights carefully -- to optimize overall economic and social benefits, to prevent unfairly undermining investments and agreements premised on the present law, and to preclude new opportunities for thwarting competition.

The U.S. Government has a unique stake in database legislation because it collects, manages, and disseminates massive amounts of information, possibly more information than any other entity in the world. In all these processes, it interacts with the private sector in a variety of ways. In addition, federal agencies are engaged in funding research, resulting in tremendous amounts of information that the government does not undertake to manage itself.

These activities represent enormous investment in highly complex knowledge management processes that are vital to human health, the environment, national security, scientific progress, and technological innovation -- and, in turn, to the economy as a whole. Changes in ground rules for the use and reuse of information must be designed to minimize disruption of these critical activities and to avoid imposition of new costs that could hinder research.

The sections which follow discuss the Administration's efforts to study database protection and access issues (Part II) and summarize the six principles which we believe should guide both domestic legislative and international treaty efforts in this area (Part III). Next, we elaborate on each principle, the Administration's concerns relating to that topic, and the range of possible solutions on which we believe interested parties should focus (Part IV). Finally, we offer some additional points that should be included in any database protection legislation.


In response to legislative proposals in Congress and developments in the World Intellectual Property Organization, the Administration devoted substantial energy in 1998 and 1999 to studying database protection and access issues. The Administration review of these issues has included a variety of mechanisms and fora:

In addition to these efforts, the Administration has carefully studied a wide range of reports, studies, legal opinions and legislation on database protection and access from the United States, Canada, Japan, and the European Union.

The Administration continues to discuss these issues with concerned parties and to examine specific topics and areas where we believe further information will help both the legislative process and any future study of the effects of database protection that might be mandated by legislation.


On August 4, 1998, in response to Senate consideration of then H.R. 2652, the Administration set out the principles that it believes should govern database protection legislation.

Now, as then, the Administration supports legal protection against commercial misappropriation of collections of information. We believe that there should be effective legal remedies against "free-riders" who take databases gathered by others at considerable expense and reintroduce them into commerce as their own. This situation has arisen in recent case law, and we believe that digital technology increases opportunities for such abuses.

At the same time, the Administration's concerns with the provisions of H.R. 354 are similar to those we expressed with respect to H.R. 2652, including the concern that the Constitution imposes significant constraints upon Congress's power to enact legislation of this sort. From a policy perspective, the Administration believes that legislation addressing collections of information should be crafted with the following principles in mind:

1. A change in the law is desirable to protect commercial database developers from commercial misappropriation of their database products where other legal protections and remedies are inadequate.

2. Because any database misappropriation regime will have effects on electronic commerce, any such law should be predictable, simple, minimal, transparent, and based on rough consensus in keeping with the principles expressed in the Framework for Global Electronic Commerce. Definitions and standards of behavior should be reasonably clear to data producers and users prior to the development of a substantial body of case law.

3. Consistent with Administration policies expressed in relevant Office of Management and Budget circulars and federal regulations, databases generated with Government funding generally should not be placed under exclusive control, de jure or de facto, of private parties.

4. Any database misappropriation regime must carefully define and describe the protected interests and prohibited activities, so as to avoid unintended consequences; legislation should not affect established contractual relationships and should apply only prospectively and with reasonable notice.

5. Any database misappropriation regime should provide exceptions analogous to "fair use" principles of copyright law; in particular, any effects on non-commercial research should be de minimis.

6. Consistent with the goals of the World Trade Organization (WTO) and U.S. trade policy, legislation should aim to ensure that U.S. companies enjoy available protection for their database products in other countries on the same terms as enjoyed by nationals of those countries.

With these principles in mind, the Administration turns to an analysis of H.R. 354.


A. First Principle -- Protect against commercial misappropriation

A change in the law is desirable to protect commercial database developers from commercial misappropriation of their database products where other legal protections and remedies are inadequate.

The Administration supports enactment of a statute to protect database creators against free riding -- the unauthorized taking and distribution of database material with resulting infliction of commercial harm (loss of customers) on the database creator. Indeed, there is considerable, if not complete, consensus that this kind of free riding can occur without additional legal protection for non-copyrightable databases and that such legal protection is necessary to prevent a diminution in database creation.

Section 1402 is the operative core of H.R. 354, providing what we refer to as the "basic prohibition" of this proposal to protect databases through a misappropriation model. Section 1402 prohibits unauthorized commercial misappropriation of a substantial amount of a database; it also appears to prohibit unauthorized extraction or "use" of data from a database by an individual, no matter how the information is used. The scope of protection provided by H.R. 354 thus comes close to replicating, albeit in different terms, that provided by copyright law to works eligible for copyright protection.

We do not believe that protection of that breadth is appropriate in the database context. As a policy matter, we must weigh the need to protect database creators against the potential impact on scientific research in particular, and the dissemination of information within the society generally. It therefore makes sense to focus any prohibition on the precise activities that pose the commercial threat -- "use" is simply too broad. Indeed, the breadth of the prohibition has required concerned parties to focus considerable attention on expanding the list of statutory exceptions to make clear that various activities would not be affected by the prohibition. We believe it more appropriate to narrow the prohibition so it is targeted on conduct like the troubling acts of commercial misappropriation identified in the Warren Publishing and ProCD cases.

Moreover, the First Amendment concerns are more significant here than in the copyright context. The Constitution itself provides for protection of copyright, and therefore requires an accommodation between copyright and free speech concerns. Also, copyright by its terms protects the author's expression, not basic facts. In the database context, there is no express constitutional authorization and the prohibition is directed against dissemination of facts. We therefore believe that a prohibition on the "use" of facts would present serious First Amendment concerns.

H.R. 354's basic prohibition consists of three elements, imposing liability on any person who "extracts or uses in commerce" all or a substantial part of a database so as to cause "harm" to the "actual or potential market" of the database creator. In our view, all three of these elements should be focused more precisely on the commercial free-riding situation.

To begin with, the "extract[s] or use[s]" language should be narrowed. One approach would be to limit the reach to a person who, without authorization "extracts for commercial distribution or distributes in commerce" all or a substantial part of a database. The substitution of "distribution" in place of "use" would clarify that the Act is directed at active behavior, rather than receptive activities such as viewing, reading, or analyzing. "Extracts for commercial distribution" would cover any replication preparatory to distribution in commerce.

At the same time, we believe that "distributes in commerce" should be understood broadly to encompass any mechanism for making data available, including offering access over a network, distribution within a firm, and sharing with business partners. Also, "in commerce" would not be limited to for-profit motives but would extend to free distribution that would diminish the commercial market for the database. Cf. [INSERT CITE to NET Act]

While the Administration continues to believe misappropriation for commercial purposes should be the focus of any legislative efforts, we recognize that systematic acts of "extraction" by individuals could conceivably undermine the legitimate market for a database product. We are not familiar with any reported cases or incidents of this kind, but we recognize that such harm could occur if misappropriation by individuals is systematic, repeated, or widespread. Such damage from cumulative acts of misappropriation may occur when it becomes customary in a particular economic sector or field of research to use information from particular data sources without authorization. At present, if there is no contract with the individual or his/her organization, the investor in a database has no effective civil remedy against such acts. We believe that one of the greatest challenges in drafting database protection legislation is providing database producers with some type of protection against such patterns of repeated individual activity without prohibiting noncommercial uses of data by individuals which most people believe should be treated as "fair uses." Our suggested language concerning "extraction for distribution" and "distribution" does not address this issue and we look forward to working with the Subcommittee to address this issue.

Second, the Subcommittee should consider whether the requirement of "harm" in section 1402 should be elevated to "substantial harm" as a means of shielding de minimis activities from any possible liability. We know that some proponents of H.R. 354 have expressed concern with a "substantial harm" standard because they believe that judges would compare the standard unfavorably to copyright law, which requires only "harm." We agree that it is important to anticipate how judges would administer any new law, but we believe that a "substantial harm" standard is familiar to courts from other areas of American law. Appropriate legislative history could direct judges away from unintended comparisons to copyright law or areas of the law where "substantial harm" has been interpreted to be a higher standard than intended in this bill.

At the same time, some critics of H.R. 354 have suggested that the proper trigger for liability is whether the misappropriation "so reduce[s] the incentive to produce the product or service that its existence or quality would be substantially threatened," a test from the National Basketball Association v. Motorola case. We do not think this "diminution of incentive" test is workable as a component of the basic prohibition; it does not comport with the Administration's principle (described below) that a database protection law should be predictable, simple, and transparent. A database user cannot be expected to know anything about the incentive structure that led another party to produce the database, thus the user has no way to judge in advance whether or not her acts will make her liable. We are concerned that the "diminution of incentive" test requires a counterfactual proof ill-suited for our legal system; court cases would involve much more complicated proofs than would be incurred with a harm test.

Third, we suggest reexamination of the concepts of "actual" and "potential" market. We are very concerned that as presently drafted these concepts are broader than market definitions used in other areas of the law, could be subject to manipulation by private entities, and could too easily expose legitimate business practices to substantial liability. We urge the Subcommittee to consider an objective definition tied to the product's current actual customer base or the market commonly exploited by similar products or services. We are concerned that any broader definition might deter entrepreneurs from developing new products and services that add significant value and do not compete directly with the original database. We believe that the Subcommittee should consider, individually and perhaps in combination, the notions of "principal market" drawn from unfair competition law and "neighboring market" proposed in the Hatch draft, as well as whether the concept of permitted transformative uses is as important as, or more important than, the definition of the market in targeting the free-riding we wish to prohibit.

The Department of Justice has serious constitutional concerns that the First Amendment restricts Congress's ability to enact legislation such as H.R. 354. We reiterate that these constitutional concerns are closely related to the scope of the basic prohibition, discussed above, as well as issues discussed below, including the range of permitted uses, resolution of the "perpetual protection" problem and the possibility of "sole source" situations.

B. Second Principle -- Keep it simple, transparent, and based on consensus

Because any database misappropriation regime will have effects on electronic commerce, any such law should be predictable, simple, minimal, transparent, and based on rough consensus in keeping with the principles expressed in the Framework for Global Electronic Commerce. Definitions and standards of behavior should be reasonably clear to data producers and users prior to the development of a substantial body of case law.

This principle informs all of our analysis. We believe that database legislation should be squarely directed at behavior that is widely acknowledged to be unfair and has been documented as a problem worthy of a legislative response. This will ensure that the legal system is not used to threaten litigation in borderline cases in a manner that may inhibit the flow of factual information and the vigor of free market competition.

We also believe that in introducing this new form of protection, some of the burden of promoting transparency and predictability should be borne by those who benefit. In particular, we suggest that the Subcommittee consider a notice system (e.g., a "D" in a circle) to warn database users when a database producer is asserting protection under the law. This will also help reduce the costs of identifying multiple cascading interests that are likely to aggregate more frequently in databases than in works of authorship. In this regard, we applaud the addition of the "good faith" of the defendant as a factor in allowing "permitted use" under Section 1403, although for reasons discussed below, we believe that additional changes to the "permitted use" section are appropriate.

C. Third Principle -- Preserve access to government data

Consistent with Administration policies expressed in relevant Office of Management and Budget circulars and federal regulations, databases generated with Government funding generally should not be placed under exclusive control, de jure or de facto, of private parties.

1. Exemption of government data

The U.S. Government collects and creates enormous amounts of information, possibly more than any other entity in the world. State and local governments in the United States also gather and generate tremendous amounts of data. Broadly defined, government-generated data touches every sector of the economy and civic life. Government-funded data ranges from crime statistics to data on subatomic particles; from geological maps to court opinions; from immigration statistics to digital images of distant galaxies.

As a general rule, the Administration believes that database protection law should not protect government investment in generating data. There are three reasons for this conclusion. First, database protection proposals are premised on the need to provide an incentive for investment in data gathering; in the case of government-funded information, no incentive is needed. If a government decides that it is in the public interest to collect information on smog levels, education scores, or solar flare activity, it will do so. Second, there is a widespread sentiment that once data generation has been paid for with government funds, taxpayers should not have to pay "twice" for the same data.

Finally, the U.S. Government has historically pursued policies that strongly favor public funding of the creation and collection of information. The Administration believes that these policies have contributed greatly to the success of America's high technology and information industries as well as the strength of our democratic society. The Administration has stated elsewhere:

"Government information is a valuable national resource. It provides the public with knowledge of the government, society, and economy -- past, present, and future. It is a means to ensure the accountability of government, to manage the government's operations, to maintain the healthy performance of the economy, and is itself a commodity in the marketplace."

The Administration believes that the free flow of government-generated data is an important engine of economic growth; it will be an increasingly important resource for any society intent on creating jobs, businesses, and wealth in the "Information Age." Often, government-generated information is also critical to the health and safety of the population; we must ensure that any database protection law does not hamper the dissemination of such information.

H.R. 354 addresses the issue of government-generated data with the following section 1404(a) exclusion:

"Protection under this chapter shall not extend to collections of information gathered, organized, or maintained by or for a government entity, whether Federal, State, or local, including any employee or agent of such entity, or any person exclusively licensed by such entity, within the scope of the employment, agency, or license. Nothing in this subsection shall preclude protection under this chapter for information gathered, organized, or maintained by such an agent or licensee that is not within the scope of such agency or license, or by a Federal or State educational institution in the course of engaging in education or scholarship."

The Administration believes that this provision serves the general policy goal of making all forms of government information available to the public, but we believe the language is too narrow to satisfy this goal fully.

To begin with, we suggest that the Subcommittee examine existing definitions of "government information" for more inclusive descriptions of government-sponsored data collection. For example, OMB Circular A-130 states that "the definition of 'government information' includes information created, collected, processed, disseminated, or disposed of both by and for the Federal Government." In particular, we believe that the present language does not adequately cover situations in which the government contracts for or provides grants for information gathering. For example, some government contracts expressly state that the private entity is not an "agent" or "licensee" of the government, removing the data gathering from the ambit of section 1404(a). One way to address this would be to include language making clear that information collected "under government contract" is covered by the provision. Another possibility would be inclusion of language making clear that the 1404(a) exclusion also applies to data gathering "funded by the government."

In crafting broad statutory language that includes works created by government contract as government collections of information, a distinction should be drawn between (a) compilations of data made as a necessary element of a government-funded activity, and (b) compilations of data made by private entities over and above the activity being funded by the government. This appears to be the intent of the section 1404(a) language that:

"Nothing in this subsection shall preclude protection under this chapter for information gathered, organized, or maintained by [a government] agent or licensee that is not within the scope of such agency or license . . ."

This test also should be modified to account for government contractors and grantees who are neither licensees nor agents. For example, standards for when a database is necessary for a government contract could be developed from existing standards for when government agencies must collect data.

We also note that 1404(a) is currently worded so that data gathered by state-funded colleges and universities may enjoy protection under the bill. This same provision appeared in H.R. 2652 and the Committee report for that bill indicated that the statutory language was intended to ensure that "institutions that happen to be government owned should not be disadvantaged relative to private institutions when producing databases . . ." The Administration respectfully disagrees with this reasoning; we believe that public universities should fall within a broad definition of government institutions which generate collections of information. Instead of trying to draw a distinction between public universities and other government institutions, it might be more appropriate to concentrate on the distinction between public research and privately funded research at public institutions.

Higher education institutions are also a fertile ground for situations in which a database's generation is partially funded by the government. In such circumstances, what is fair to the researcher and to the public? The Hatch discussion draft would have placed outside the protection regime those databases "the creation or maintenance of which is substantially funded by [a] government entity." Without conducting a detailed analysis of the Senate discussion draft provisions, we agree that databases produced with substantial government funding should be treated like government databases, at least in the absence of a specific contrary provision in the government contract or grant.

2. Dissemination of government-generated data and the potential for "capture"

Once data has been generated with public funding, there remains the goal of disseminating that data as broadly as possible. For many government agencies, the responsibility to make government-generated information widely available is a statutory obligation. Dissemination of government-generated data has always involved a mix of public and private resources. Through the Congressionally-mandated Federal Depository Library Program, the federal Government uses public libraries, libraries of public universities, and libraries of private institutions to make government-funded information widely available to citizens. In hundreds of cases ranging from the court system to the U.S. Geological Survey, private entities gather raw, government-generated data and then process, verify, and repackage the data to produce value-added products which are then widely disseminated.

Once there are such commercial products, any decisions to devote public resources to disseminate the raw government data further must be weighed against other demands for government resources. If government-generated data does not remain available to the public from government sources, there is the potential for capture of data, with one or a few private entities becoming the "sole source" for valuable social data.

When a U.S. Government work is integrated into a private, value-added product, copyright law requires that the U.S. Government portion remain unprotected and available for copying. The Administration has considered whether a parallel solution to the "capture" problem with collections of information would be appropriate: requiring private entities to identify government information in their value-added products and exclude such information from any database protection regime. The problem with this approach is that a private entity may make a considerable investment in gathering government data from disparate sources, bringing it together, and distributing it. This "value-added" would be lost - and the incentive for it destroyed - if all the data could be freely appropriated on the grounds that it is government-generated data in a private database.

On the other hand, not requiring that the government-generated data integrated into a private product remain outside the database protection regime creates the risk of "capture." Many people believe that this is a significant danger in the case of published court opinions, for which there are only two major private publishers. Even when government-generated data remains available to the public from the government, it may be much more difficult to obtain than the private, value-added product. If only because the government does not advertise, it may appear that the private entity is the sole source for the government-generated data (in either raw or value-added form).

The Administration does not have any single proposal that will solve all of these issues. We do, however, have a few specific suggestions to address, to some degree, the capture and sole-source problems with government-generated data.

First, we recommend that the findings and/or legislative history of any database protection bill include express language recognizing the importance of keeping government-generated information in the public domain and urging agencies whose grants and/or contracts involve a significant amount of data generation to include provisions in the grants and/or contracts which push grantees and contractors to make research results available to the public in a non-commercial form. The Administration would support language calling for a government study to address this issue and offer recommendations to agencies, either individually or collectively, on how to improve non-commercial access to government-generated data resulting from research. At the same time, our recent experience with legislative mandates to amend OMB Circular A-130 counsels against any attempts at this time to impose any uniform access requirements on the wide range of government agencies.

Second, we believe that any database protection law along the lines of H.R. 354 should include a requirement that any private database producer whose database includes a substantial amount of government-generated data should note that fact with reasonably sufficient details about the government source of the data. By this, we mean, for example, "This database was compiled with substantial amounts of data from the National Weather Service, National Oceanic and Atmospheric Administration, Department of Commerce, Washington, D.C." but not "this database was compiled with information from the Department of Defense." In other words, the disclosure should reasonably direct the user to the government source. Defendants could be given an express defense where the database producer has included substantial amounts of government-generated information and failed to make such a disclosure.

We believe that such a requirement (and defense) would eliminate some apparent sole source situations by pointing the database user to alternative sources for the information. If the worth of the database producer's product was truly in the "value-added," consumers would stay with the private product. Such disclosures might also give government agencies a stronger incentive to maintain the raw data and keep it available to citizens, thus eliminating at least some sole source situations. Generally, we are hopeful that the digital environment and the Internet will, over time, make it possible for government agencies to provide more government-generated information at less cost through public channels.

D. Fourth Principle -- Avoid unintended consequences

Any database misappropriation regime must carefully define and describe the protected interests and prohibited activities, so as to avoid unintended consequences; legislation should not affect established contractual relationships and should apply only prospectively and with reasonable notice.

1. Prior contractual relationships

The Administration believes that any database protection law should expressly state that its provisions may not be used to enlarge or limit any rights, obligations, remedies, or practices under agreements entered into prior to the effective date of the law. This is especially important because today, many, if not most, commercially valuable databases are licensed rather than sold. The purpose of such statutory language would be to avoid unbalancing the contractual relationships that have been freely entered into before a database protection bill becomes law. This is a matter of notice and fairness. Providers of databases should not be permitted to assert limitations on use not contemplated at the time of the contract. Similarly, neither database users nor those under contract to produce databases should be able to take unfair advantage of a change in the law to assert rights where existing contracts may be silent.

For new contracts and licenses, parties will be able to negotiate with full knowledge of the application of this new law and can adjust the terms accordingly.

2. Prospective Application

We agree wholeheartedly that there should be no liability for conduct prior to the statute's effective date. With respect to situations in which the investment in the database occurred prior to the law's effective date, the situation is more complex. Based on a strict economic analysis, coverage of such databases is not necessary -- the investment occurred without the legal protection. On the other hand, there is some, albeit uncertain, legal protection now. That uncertain protection is provided by copyright, by what people still believe to be copyright protection, and by state law. On balance, and especially in the context of a misappropriation approach, we believe that Section 4 of H.R. 354 takes an appropriate approach toward this issue.

3. The term of protection

Advocates of database protection have proposed database protection terms of up to 25 years. Alternative views have ranged from criticizing 15 years as too long to the minimalist bill's proposal for more limited rights of unlimited duration. The Administration currently believes that there is no single, optimal term of protection for the wide range of products subject to protection as "databases" or "collections of information."

In the absence of strong indicators of the optimal term for an ex ante incentive structure, we believe there are two virtues to the 15-year term of protection. First, it corresponds to the term of protection established in the European Union's Database Directive; this may facilitate emergence of an international standard while allowing us to concentrate on important issues like permitted uses and the flow of government-generated data. Second, we believe that 10-15 years roughly coincides with a substantial number of data producers beginning to maintain their records in digital formats. The presence of such digital archives of raw data is important in helping to avoid as many sole-source situations as possible.

Finally, the Administration would be troubled by any efforts -- present or future -- to establish a term of protection exceeding 15 years. While we recognize that there are and will be some data products which have substantial value after 15 years, the purpose of database protection legislation is to provide an incentive for the creation of new databases; we are doubtful that there are or will be many databases developed with a cost-recovery business plan going beyond 15 years.

4. The "perpetual protection" problem

Some critics of database protection have claimed that while proposals like H.R. 354 call for a fixed term of protection (15 years in this case), they actually raise the specter of "perpetual" protection for non-copyrighted databases. We believe that this is a serious issue that requires careful consideration. The critics' concern about "perpetual protection" has two foundations.

  1. "Perpetual protection" from "maintaining": the problem with the "organizing" and "maintaining" criteria
    The first source of concern is the word "maintaining" in the basic prohibition. By including "maintaining" as a ground for protection, some database producers may assert that simply maintaining data collected long ago qualifies that data for continuing protection. H.R. 354 seeks to address this problem with the following provision that differs from H.R. 354's predecessor, H.R. 2652, in the bolded text:
    "1408(c) Additional Limitation - No criminal or civil action shall be maintained under this chapter for the extraction or use of all or a substantial part of a collection of information that occurs more than 15 years after the portion of the collection that was extracted or used was first offered for sale or otherwise in commerce, following the investment of resources that qualified that portion of the collection for protection under this chapter. In no case shall any protection under this chapter resulting from a substantial investment of resources in maintaining a pre-existing collection prevent any use or extraction of information from a copy of the pre-existing collection after the 15 years has expired with respect to the portion of that pre-existing collection that is so used or extracted, and no liability under this chapter shall thereafter attach to such acts of use or extraction."

    The final sentence of section 1408(c) apparently is intended to eliminate the possibility of "maintenance" being used to perpetuate protection for data entries.

    The Administration agrees with Chairman Coble that this potential problem must be addressed. We are concerned, however, that this approach is too complex. We believe that a simpler, more predictable legal regime would be produced by eliminating "maintaining" as a ground for protection in the basic prohibition. In fact, we urge the Subcommittee to consider whether either "maintaining" or "organizing" is needed as an event triggering protection under the statute. We believe that substituting "collecting" for "gathering" and making it the sole basis for protected investment would address this perpetual protection issue and better focus the statute.

    The present legislation is motivated by the need to correct the loss of protection for "industrious collection" under the "sweat of the brow" doctrine. Adding protection for "organizing" and "maintaining" would expand the protected investment well beyond what was historically allowed by the courts that embraced that doctrine. The Warren Publishing and ProCD cases involve collecting in the traditional sense, while there is no history or definition for "organizing" or "maintaining." Some aspects of maintaining data, such as checking and adding facts, are really aspects of "collecting" and should be recognized as such. We also believe that "collecting" data captures much of the value in "organizing" data that can be lost to free-riders. On the other hand, merely mounting a database on a server is part of maintaining it, but mounting data for access does not suffer from the free-riding problem of collecting (i.e., it is an expense that must be borne by the misappropriator as well as the original publisher).

    For all these reasons, we think it only necessary to protect "collecting."

  2. Concern for de facto "perpetual protection"
We also believe that there is a potential "perpetual protection" problem that is more complicated. This problem is rooted in the need to provide some type of protection for revisions of databases. Legislation that provided protection to new databases but not to revisions of databases would skew investment. There would be a disincentive to revise proven, useful databases in favor of creating new databases. Reassembling (largely) the same information in a new database would be inefficient not only for data gatherers, but also for data users who -- in order to use the most current data -- would have to accustom themselves to the format of the new database. Therefore, any database protection legislation should offer protection for revisions of existing databases, so that new iterations of a protected database are themselves protected. But this means that eventually there may be unprotected data entries [from iterations of the database older than 15 years] intermingled with protected data entries [from more recent iterations].

This gives rise to a potential problem. In the classic case of a copyrighted book, the text loses protection at the end of its term, although new, revised versions of the text may enjoy fresh periods of protection. This means that one can find unprotected texts of The Raven or Leaves of Grass in libraries all over the country. At the same time, new versions of these books can be under some copyright protection (including new introductions, abridgements, "notes," artwork, etc.). It is possible to compare the two versions -- old, unprotected and new, protected -- side-by-side.

In the digital, on-line environment, content producers may choose not to sell copies of their works; access to a database may instead be licensed to users. The advantage is that the database user can receive the most current version of the compilation. The disadvantage is that the user may lack access to an old version of the database to compare old and new entries. The question is, how can a user, accessing only the newest version of a database that has gone through many iterations, distinguish unprotected data entries from protected data entries?

In Appendix D we give a simple example of how this problem would arise.

While the Administration believes that the new language of section 1408(c) helps ensure that the bill provides no de jure perpetual protection, we remain concerned that the digital environment could produce de facto perpetual protection because users would be unable to distinguish protected and unprotected data and, therefore, would be chilled in their use of unprotected data. Such inadvertent extension of the protection afforded by H.R. 354 could exacerbate other concerns, including the "sole source" issue and the constitutionality of the law.

There have been varied proposals to address this problem. One proposal has been to "tag" data entries so that older, unprotected data can be distinguished from protected data. We are not in a position to comment on the feasibility, technological or economic, of this suggestion. Another proposal - which is set out in the Senate discussion draft - would be the establishment of a deposit system to ensure that older, unprotected versions of the databases would be available to the public. We believe that the storage demands of such a deposit system would exceed anything the Copyright Office or even the Patent and Trademark Office now handles. It is also not clear how the costs of such a deposit system should be apportioned.

At different junctures in this statement, we have recommended establishing express statutory defenses to remedy possible problems in a database protection law; we make the same type of suggestion here. Where the database that is the subject of litigation is the descendant of a now unprotected database and has substantial elements in common with that unprotected database, the defendant should be able to raise, as a defense, that the most recent unprotected iteration of the database is not reasonably publicly available.

In other words, if Smith Industries has been issuing the "Smith Industrial Database" annually since 1980 and in 1999 sues someone for unauthorized distribution of the "1999 Smith Industrial Database," the defendant can raise as a defense that the 1983 Smith Industrial Database is no longer reasonably publicly available. If the 1983 database is reasonably publicly available, there is no such defense.

The virtue of this approach in comparison to mandatory "tagging" or deposit systems is that it allows each private enterprise to determine how to make its now unprotected database available to the public. Moreover, the database producer does not have to make this final decision until the term of protection is over. Some concern has been expressed about this proposal by database producers who produce continuously updated databases; their situation in relation to this proposed defense merits examination. But, as we said above, we propose the defense when the protected database "is the descendant of a now unprotected database and has substantial elements in common with that unprotected database." We believe that for many continuously updated databases, the most recent database would have almost no elements in common with its 15-year-old ancestor.

5. The "sole source" problem

There has been much discussion of what is called the sole-source problem: that many markets for data will be supplied by only one database provider. The sole-source problem arises most acutely when one entity controls access to a unique, unreplicable collection of information. This control may arise either purposefully, as with an exclusive contract with the data's original generator, or incidentally, when the data's original generator ceases to maintain it. Other practical sole-source situations can arise when an existing database operates as a natural monopoly, that is, it is possible, but not economically efficient, for someone else to build the dataset independently.

Even now, a sole-source provider may use contracts to preserve its market position against free riding by would-be competitors. Any form of database protection carries with it the possibility that it could further insulate a sole-source database provider against potential competition. Consequently, it will be important that any database protection legislation incorporate provisions that guard against the possibility that sole-source database providers will employ their new rights to the detriment of competition in related markets.

A partial answer to the sole-source problem is a savings clause such as the one in H.R. 354, providing that nothing in the bill operates to the detriment of federal antitrust law. Thus, for example, database owners would be as subject as any other economic actors to the application of the essential facilities doctrine, which prohibits owners of assets that are essential to the ability to compete in a market, and are not feasible to replicate, from refusing to deal with firms that need that access. On the other hand, this doctrine has been invoked relatively rarely, and understandably so: part of the incentive for the development of any valuable product or service is the hope that the product or service will be so attractive to consumers that it will become dominant. Regularly compelling access to valuable products and services could diminish their developers' incentives to invest in them in the first place.

At the same time, in markets such as data collection and dissemination, where natural-monopoly characteristics suggest that consumer choice among competing database products and services will not be common, some safety valve over and above the rarely used essential facilities doctrine may be necessary to ensure that database providers are not able to deny access to firms that require it in order to compete in downstream markets. Additional possibilities include the development of doctrines comparable to the misuse doctrine in patent and copyright law or, in extreme cases, the idea/expression merger doctrine in copyright law.

As with some other problems we have identified above, however, much of the concern arising from the sole-source problem can be eased by defining both the protected activity and the prohibited conduct narrowly. If the bill protects only data collection and generation, it will be covering value-adding conduct that enhances welfare, even though only one firm may find it worthwhile to engage in collecting and disseminating a particular type of database. Similarly, to the extent that the bill prohibits only distribution and extraction for the purpose of distribution, while conversely permitting transformative uses of data, it would leave data providers free to add value and enter markets that the original data collector's work alone was incapable of serving.

E. Fifth Principle -- Balance protection with permitted uses

Any database misappropriation regime should provide exceptions analogous to fair use principles of copyright law; in particular, any effects on non-commercial research should be de minimis.

Given the difficulty of foreseeing how "substantiality," "extraction" and other legislative terms will play out in a complex and rapidly changing environment, we expressed concern last summer that H.R. 2652 lacked a balancing mechanism analogous to the fair use doctrine in copyright sufficient to address the wide range of circumstances in which information is aggregated, used, and reused. We were especially concerned that the section 1203(d) exception for non-commercial research and educational uses did not ensure that legitimate non-commercial research and educational activities would not be disrupted by the prohibition against commercial misappropriation. Last year, we also were concerned with equitable issues of access and use that may be especially important in markets exclusively served by a single data producer.

In reviewing the permitted acts provisions of H.R. 354 (section 1403), we would like to suggest, as an initial matter, that the Subcommittee rearrange the various "permitted acts" to move more clearly from absolutely shielded activities for all persons (such as use of insubstantial parts (1403(b))) to the more limited shields on activities set out in 1403(a)(2). We propose that the Subcommittee reorder section 1403 as shown in Appendix C. We believe that this reordering would provide legislation that is easier to understand and a clearer platform for full discussions on whether the permitted activities adequately address policy and constitutional concerns. This proposed reordering is separate from any substantive recommendations.

As to the substantive elements of the permitted acts section, the Administration is pleased that H.R. 354 limits the liability for nonprofit educational, scientific, and research purposes to uses that harm directly the actual market and that the legislation now includes as section 1403(a)(2)(A) provisions for "additional reasonable uses" similar to the fair use provisions of section 107 of the Copyright Act. However, we are concerned that the last sentence in section 1403(a)(2)(A) could be interpreted as overriding the criteria in 1403(a)(2)(A) with a standard that differs in form but not in substance from the basic operating provisions of Section 1402. The Administration would not agree with any intent to override a "fair use"-like balancing test; on the other hand, if the last sentence of 1403(a)(2)(A) is intended only to restate the basic prohibition without disturbing the balancing test, it is extraneous language. We therefore recommend its deletion.

We recognize the desire to avoid the precise fair use terminology of the Copyright Act in order to make clear that the legislation is grounded in misappropriation rather than intellectual property. However, in the interests of transparency and predictability, we believe that the fair use principles of copyright are a sound platform to build on. Providing the safeguard of familiar fair use criteria can help minimize any unintended consequences of the untested basic operating provisions of Section 1402. We believe that this would give courts the tools they need to do justice in particular situations.

The fair use factors may need to be framed or supplemented to allow courts to take into account that the subject matter is industrious collection rather than original expression, that the protected interest is purely economic, and that the proscribed behavior is a form of unfair competition. Courts might also be called on to recognize the unique conditions of some database markets. But we believe that the vast experience of courts in using the judicially crafted principles of fair use should be built into database protection legislation. It is worth noting that in the 23 years since Congress codified the fair use factors, it has neither narrowed nor expanded their application. While it may be appropriate to diverge from copyright fair use in creating the permitted uses regime for database legislation, the differences between the two should be clearly understood and recognized by concerned parties.

Finally, we would reiterate a point made earlier: the scope of the basic prohibition will determine the weight that the permitted uses section must bear in judging both the policy and constitutionality of any database protection legislation.

F. Sixth Principle -- Ensure protection for U.S. companies abroad and promote harmonization

Consistent with the goals of the World Trade Organization (WTO) and U.S. trade policy, legislation should aim to ensure that U.S. companies enjoy available protection for their database products in other countries on the same terms as enjoyed by nationals of those countries.

There has been some discussion in the United States about the effects of the European Union's 1996 Database Directive (EU Directive) on American database producers. The EU Directive requires European Union Member States to provide sui generis protection for databases, but denies this protection to nationals of any foreign country unless that country offers "comparable protection to databases produced" by EU nationals.

The Administration opposes such "reciprocity" requirements, both domestically and internationally. We believe that commercial laws (including intellectual property and unfair business practices laws) should be administered on national treatment terms, that is, a country's domestic laws should treat a foreign national like one of the country's citizens. This principle is embodied in Article 3 of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement) as well as more generally in the Paris Convention for the Protection of Industrial Property and the Berne Convention for the Protection of Literary and Artistic Works.

The Administration believes that Congress should craft U.S. database protection legislation to meet the needs of the American economy. A database protection law properly balanced for the robust digital economy of the United States will serve as a model for other countries who hope to build businesses, employment, and economic activity in the new millennium.

At the same time, we believe that a law along the lines of H.R. 354 (with proper attention to the concerns we have identified) will amply provide protection "comparable" to that provided by national laws implementing the EU Directive. From the perspective of a private database producer, a misappropriation law as discussed in both the last and current Congress would, we believe, provide a cause of action and meaningful remedies in the same range of situations in which the laws implementing the EU Directive provide a cause of action and meaningful remedies.

Although we believe that a law along the lines of H.R. 354 would provide American database makers with protection under the EU Directive's reciprocity provision, the Administration would, for the reasons stated above, oppose any effort to put automatic reciprocity provisions into American law in this area. United States Trade Representative Charlene Barshefsky cited the reciprocity provision of the EU Directive as a subject of concern in announcing the Administration's 1998 Special 301 Review.

Rather than applying either pure national treatment or reciprocity, a U.S. database protection law could include provisions similar to section 104 of Title 17. This section establishes points of attachment for copyright protection -- for example, membership in a treaty protecting copyrighted works, works of stateless persons, and works subject to a Presidential Proclamation. Under section 104(5), the President may proclaim foreign works eligible for protection in the U.S. if he finds that the foreign country grants U.S. nationals full national treatment under its intellectual property laws. The Administration would support an appropriately crafted provision of this sort, to allow the President to affirmatively grant or deny database protection to foreign nationals on the appropriate finding by Executive branch agencies such as the USTR and/or Department of Commerce.

G. Additional Issues

1. Gradations of Criminal Liability

While we agree with Chairman Coble's decision to shield non-profit researchers and educators from any criminal liability under section 1407, we believe that the existing criminal provisions should be further refined, particularly by drawing a distinction between misdemeanor and felony conduct and requiring minimum amounts of damage under each. This will expand the range of charging options available to prosecutors. We have attached our recommendation for statutory language as Appendix A.

2. Data-Gathering Activities of Law Enforcement Agencies

We believe it is important to make clear that the legitimate data-gathering activities of law enforcement and intelligence agencies will not be affected by the bill. While we believe that intelligence gathering and national security activities are already shielded from liability by 1402 in that these activities will not cause "harm to the actual or potential" market of the product, we propose an additional statutory provision and legislative history as shown in Appendix B to confirm that these activities fall outside the bill's reach.

3. Administration Study

The Administration believes that, given our limited understanding of the future digital environment and the evolving markets for information, it would be desirable for the bill to include a provision for an interagency review of the law's impact at periodic intervals following implementation of the law. This would be consistent with laws and proposed laws in other emerging technologies where Congress has mandated examination of a new law's economic impact. Such a government study should be conducted jointly by the Department of Commerce, the Office of Science and Technology Policy, and the Department of Justice in consultation with the Register of Copyrights and other parties.

We believe that such a study should not be limited to any one set of issues or concepts; rather, it should explore issues including: database pricing before and after enactment of the law; database development before and after enactment of the law; international protection for American database producers; access issues; and "sole source" databases.

Office of Science and Technology Policy
1600 Pennsylvania Ave, N.W.
Washington, DC 20502