Crawled open notebook science
Yesterday, I did a Google search for a procedure I developed in grad school on OpenWetWare, my former open notebook where all my original content was placed. In the search results, I came across a website that had crawled my notebook and reposted a page on their site. This website seems to be about AIDS research, but my original notebook entry has nothing to do with AIDS research. You can take a look at it here. While I have come across other blog-type websites that repost my original content, this is the first I’ve seen of a “medical” website reposting my content.
This website is obviously a site that is culling content from other places on the web. The science I did was completely unrelated to any form of human health research. It was basic research, research done on a fundamental level with zero clinical implications. What happens to the people who are searching the internet for information on a scary subject such as AIDS and come across my content there? Now, I’m happy that in the future, sites that crawl and repost information will help to perpetuate my original content. However, the information is now on a site that presents itself as being about a deadly human disease. What if the person reading it does not possess enough information to filter my post and realize that it has zero pertinence to them? As a scientist and an educator, this concerns me a lot.
The persistence of ONS
It would seem that the persistence of original content in open notebook science will continue through repostings. However, without context for the content, that science is useless. This is not a good thing and has the capability of destroying the science being done in open science. I have no clue how to fix it, and this should be discussed at ScienceOnline2012, which I need to remember to register for.
Now, the original reason why I decided to start a new notebook on WordPress.com was because there are implications about using a service that has not been proven to be externally funded. For instance, what happens when the service is no longer available and all your content is gone? Of course, there are services such as the Wayback Machine that attempt to archive webpages, but from what I understand, services such as this are not well known and are basically an “after-thought” and an “exercise for a librarian” in the scientist’s mind.
Also, as gruesome as it sounds, I will eventually die. What happens to my scientific research then? Doing simple searches for some of the most popular sites around (in this era) on what to do when a user is deceased and how to access/memorialize their accounts shows that there is no standard procedure on the subject. If you have your own server and are using it to post your open notebook science, what happens to your content when you are no longer able to pay the fees associated with owning a domain? From what I understand, unless it has been archived, poof, it’s gone. This is not a good thing since open notebook science is a treasure trove of data, not only for science in the present, but for future historians and other scientists. It’s even a giant data source for linguists of the future because I’ll bet you $1 (which will probably grow into a lot of money in the future) that the language we speak and write now will not be the same in 500 years.
Both scenarios, culled content posted on unrelated websites and open notebook science vanishing due to death or lack of funds, are things that need to be discussed. Open scientists need tools to keep open notebook science persistent. Does this mean creating a non-profit to house our data, or does it mean that university libraries need to create programs to archive ONS?