Lies, Damned Lies, and Statistics

NIH Releases Final Rule on Data-Sharing Requirement. The Result? Not So Bad.

The new policy takes some good — even surprisingly good — steps forward.

One of the best ways to detect research malpractice is to re-examine the data. That’s why one of the best ways to improve research is to require that data be shared — otherwise, no one can re-examine it in the first place.

With a 2020 budget of $42 billion, the National Institutes of Health is the biggest research funder in the United States. Its policies affect untold numbers of research projects at every research university. For years, it has toyed with a data-sharing requirement but has never gone quite far enough.

Last year, the NIH released a proposed data-sharing requirement and asked for public comments. Public comments are a great way to influence a regulatory agency, since the agency is required by law to take comments into account. Even better, regulatory agencies often value hearing from neutral and respected parties, such as AV, rather than only from businesses or organizations that have an ax to grind because they are the ones subject to regulation. Thus, I wrote up comments on behalf of AV, and got a couple of other foundation leaders to sign on (hopefully increasing the impact).

The NIH just released its Final Rule on Thursday. The result? Not so bad. The Final Rule is definitely improved along several dimensions where we (and others) filed comments.

First, the draft rule only said that NIH-funded researchers should include a plan for data sharing. This was too weak. And it wasn’t even much different from the current policy, which is easily evaded by writing up a “plan” that doesn’t actually include any sharing of data. Our comments therefore said that instead NIH should “establish a firm default expectation” that data “must be shared” with only a few exceptions.

The Final Rule doesn’t go quite this far, but it does significantly strengthen the requirement: “NIH expects that in drafting Plans, researchers will maximize the appropriate sharing of scientific data.” NIH Director Francis Collins additionally commented that the new policy “establishes the baseline expectation that data sharing is a fundamental component of the research process.” This is a move in the right direction.

Second, the draft rule only applied to data that support “research findings” or “scholarly publications.” This was also too weak: journals and scholars can be reluctant to publish null results, so lots of data might as well be on the trash heap even though other scholars would benefit from knowing about those null results. Our comments therefore argued that the rule should cover unpublished data as well.

Somewhat surprisingly, the NIH’s Final Rule did take up this point, and the Final Rule now applies to data “of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.” This is potentially a huge advance.

Third, the draft rule only said that it “encourages the use of established repositories for preserving and sharing scientific data.” We said that this didn’t go far enough: When scholars share only on a personal website or Dropbox, the data can disappear or be altered in short order. We said that the NIH should require the use of established repositories that won’t disappear next week. Moreover, we said that the NIH should have a default expectation that grants include budget line items to support repositories for the work of data sharing.

The Final Rule didn’t go quite as far as I would have liked, but it does now say that the NIH “strongly encourages the use of established repositories to the extent possible.” The NIH further issued supplemental guidance on how to find the right data repository, and better yet, issued supplemental guidance stating that NIH budget requests can include the costs of curating data, creating the supporting documentation, and paying the fees for a repository (including an example of a grant that prepays for 10 years of storage). This is really important: Data sharing will become more of a reality when it is a routine budget item.

Changing an institution like the NIH is a bit like trying to steer the Titanic with a single oar. But even the NIH is feeling the political pressure (not just from our comments, but from many other sources) to move toward a more robust data sharing policy. The new policy takes some good — even surprisingly good — steps forward.