The Lumen Database, a research project designed to facilitate research on takedown notices, has made a significant change to the way it offers data. While it will still be possible for researchers to access data with a special login, regular users will have to apply via email to see the details of a notice in full. The changes have been made in preparation for an expansion of its operations.
Thanks to projects like Google’s Transparency Report, much-needed light can be shone on the murky area of takedown notices.
Users of Google’s service can see almost every detail of a copyright claim, but for accurate research it’s necessary to visit the Lumen Database, a research project that hosts millions of notices submitted by some of the biggest Internet companies.
The resource has become an essential tool for researchers and reporters interested in the cease-and-desist landscape. However, new changes at the resource will mean that the majority of users will now have less initial access to data.
In a nutshell, takedown notices presented in Lumen’s database will no longer list the precise URLs targeted by copyright holders. Instead, the notices only list how many URLs were targeted at specific domains.
Lumen has removed the specific URL details, which are absolutely crucial if one is to even begin researching the effects of a particular takedown notice. However, every redacted notice carries a hyperlink to a system through which an unredacted copy can be requested.
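For readers curious about what reducing a notice to per-domain counts involves, here is a minimal Python sketch. It is not Lumen's code and says nothing about how Lumen actually implements redaction; the function name `summarize_by_domain` and the example URLs are hypothetical, invented purely for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_by_domain(urls):
    """Collapse a list of targeted URLs into per-domain counts.

    Hypothetical helper: the full URLs are discarded and only the
    number of URLs per hostname is kept.
    """
    counts = Counter()
    for url in urls:
        # urlsplit().hostname is None when the string has no network location.
        host = urlsplit(url).hostname or "(unknown)"
        counts[host] += 1
    return dict(counts)


# Hypothetical URLs, not taken from any real notice.
example_urls = [
    "https://example.com/page/1",
    "https://example.com/page/2",
    "https://files.example.org/download?id=7",
]

summary = summarize_by_domain(example_urls)
for domain, count in sorted(summary.items()):
    print(f"{domain} - {count} URL(s)")
```

The sketch shows why the redacted view is of limited use for research: once the paths are dropped, two very different pages on the same domain become indistinguishable.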
Regular users wanting to properly research a notice now have to enter their email address to receive a single-use link to view it in full.
However, it now transpires that researchers and journalists will be able to obtain a special login to the Lumen Database that its operators hope will provide an experience that’s largely unchanged. That means we’ll continue to bring news on interesting takedowns and report on various trends.
That being said, the bigger question is why Lumen has taken this decision. Lumen project manager Adam Holland informs TorrentFreak that it’s all about expanding and improving the service.
“Lumen wants to remain a vibrant and valuable feature of the landscape with respect to research, journalism, and public awareness around takedown requests. We believe that we have been successful at doing this over the years and that some great work has come out of, or been predicated on, our data,” Holland says.
“But we also feel that it’s both possible and necessary for Lumen to continue to grow and improve. One obvious way in which to do so is to expand the number and type of notices we receive, as well as the range of institutions from which we receive them. We’ve heard from some companies that although they’d like to share notices with us, for a variety of idiosyncratic reasons, they don’t feel that they can do so under the current Lumen schema.”
Sensitivity over the amount of information made available by Lumen under default settings will also play an important role as the platform expands. Holland says that DMCA complaints will form just part of the project moving forward, with other forms of takedown notices from all over the world augmenting the database.
“We wish to be conscious of the concerns of those sending this broader variety of notices,” he says.
As readers will probably recall, the Lumen project has previously been subjected to criticism by copyright holders. We asked Holland if this had played a part in the decision to redact notices for more casual users of the resource, who some allege may have used it to obtain links to infringing content.
“Our traffic metrics simply don’t bear out any suggestion that the database is a viable tool for those seeking access to infringing or unauthorized content. But, we have always endeavored to strike a balance,” he explains.
“We think that the new framework allows the research community to stay informed while in no way compromising research done with the database. It also — importantly — reduces the significant workload associated with database maintenance, which will free up Lumen staff to do more productive things.”
We put it to Holland that there will probably be some members of the public who won’t enjoy jumping through additional hoops to gain full access to notices. However, he says that Lumen doesn’t really have a good sense from its traffic volumes how many people use the resource for specific reasons.
But while reduced access will probably be disappointing to some, there are those who see this development as a double-edged sword.
TorrentFreak spoke with a representative from an anti-piracy company who told us that less visibility for URLs will be welcomed by his clients.
“As a DMCA agent for copyright owners, I can say that Lumen and its predecessor Chilling Effects have long been seen as making a mockery of Google’s takedown procedure – why delist search results if those same results are all still listed in a notice linked at the bottom of the page?” he said.
“But I appreciate that the DMCA process can be and has been easily abused, so it’s important to have some kind of ability to check on potential censorship and/or erroneous takedowns.
“So while my clients will surely welcome a change that makes it trickier to access infringing material, I share the concerns of those who may feel that this places obstacles in the way of legitimate research and accountability.”
Finally, it’s worth noting the large effort expended by the Lumen team to keep the project going. The platform is currently receiving up to 70,000 notices per day (mostly filed under the DMCA) with many requiring redactions to preserve privacy.
These can be handled automatically, but Holland explains that manual redactions take place frequently, with a single notice potentially taking 20 minutes or more to process.
Lumen kindly provided a list of companies and institutions that contribute (or have contributed) to the database. Any parties interested in joining this group are invited to contact the project.
The Internet Archive
PGPSMedia [not currently sending]
Reddit [not currently sending]
Stripe [not currently sending]
Tuebl [not currently sending]
UC Berkeley – Infosec and policy
UC Berkeley – California Digital Library
UC Berkeley – Open Computing Facility
Many thanks to TorrentFreak for the breaking news.