At The Lexicon of DH workshop run by the CUNY Digital Fellows on September 29, our class's very own Jojo Karlin led an informative and engaging discussion of what it means to do digital humanities work, and how to do it. In the course of the workshop, a participant posed a question that was very interesting (to this librarian at least): Is there a database for all the data that is available? Perhaps if there were, I'd be out of a job!
The same participant asked, for argument's sake, where they could go for information on US theater productions in the last century. I suggested the Internet Theater Database, a resource I was already familiar with (though its scope is limited); there's also the Internet Broadway Database (scope also limited). But the inquirer was asking where to go if they did not already know of a data set. So I quickly tried to find such a database of data sets during the workshop. Not surprisingly, I have not been able to find this comprehensive database of all data (at least not yet). Of course, there are several reasons why this database does not exist, not least the price of information and the commercialization of data, which often limit how information is shared, in addition to the cost of compiling, hosting, and designing such a database.
Nonetheless, as in much of the DH community, arguments for sharing data are strongly rooted in the idea that openness will create an opportunity for growth and development. So, I wanted to share some of the projects and resources I have found that are working towards an amalgamation of the open data sets out there…
Conveniently enough, last week I was doing some collection development for the library I work at and came across a recommendation from the publication Choice for Data USA. According to the “About” page, this project aims to place “public US Government data in your hands. Instead of searching through multiple data sources that are often incomplete and difficult to access…” As with much of the data currently available through this kind of portal, it is largely produced by government agencies.
During one of our initial class discussions, we went around and stated what projects or tools interested us or what we wanted to know more about. I said that I was interested in the DPLA (Digital Public Library of America), which does a great job of aggregating data from a range of cultural institutions throughout the country (and makes it pretty easy to access the data): https://dp.la/info/developers/
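For anyone curious what "pretty easy" might look like in practice, here is a minimal sketch in Python of a keyword search against the DPLA items API. It rests on a few assumptions of mine rather than anything stated above: that the search endpoint is `https://api.dp.la/v2/items`, that it accepts `q`, `api_key`, and `page_size` parameters, and that each result carries its title under `sourceResource`. Check the developer documentation linked above before relying on any of this. The function names and the `DPLA_API_KEY` environment variable are hypothetical, chosen only for illustration.

```python
import json
import os
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for item searches; verify against the DPLA developer docs.
DPLA_ITEMS_ENDPOINT = "https://api.dp.la/v2/items"


def build_items_url(query, api_key, page_size=5):
    """Build a search URL for the (assumed) DPLA items endpoint."""
    params = {"q": query, "api_key": api_key, "page_size": page_size}
    return DPLA_ITEMS_ENDPOINT + "?" + urlencode(params)


def extract_titles(response):
    """Collect item titles from a decoded JSON response.

    Assumes each record in "docs" keeps its title under "sourceResource",
    either as a single string or as a list of strings.
    """
    titles = []
    for doc in response.get("docs", []):
        title = doc.get("sourceResource", {}).get("title")
        if isinstance(title, str):
            titles.append(title)
        elif isinstance(title, list):
            titles.extend(title)
    return titles


if __name__ == "__main__":
    # DPLA_API_KEY is a hypothetical variable name holding your own API key.
    api_key = os.environ.get("DPLA_API_KEY")
    if api_key:
        with urlopen(build_items_url("theater", api_key)) as reply:
            for title in extract_titles(json.load(reply)):
                print(title)
    else:
        print("Set DPLA_API_KEY to run a live search.")
```

Run with a key set, this should print a handful of titles matching "theater". Without one, it only prints a reminder, so nothing is sent over the network.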
Where would the state of information be if Google didn’t have some handle on aggregating data?
Open Knowledge International has put together a catalog of open data portals from around the world: http://dataportals.org/
Their mission statement is very noble indeed: “We want to see enlightened societies around the world, where everyone has access to key information and the ability to use it to understand and shape their lives; where powerful institutions are comprehensible and accountable; and where vital research information that can help us tackle challenges such as poverty and climate change is available to all.”
And a few other sources I’ve found (and there are so many more!):
- http://about.jstor.org/service/data-for-research
- https://www.icpsr.umich.edu/icpsrweb/
- http://data.un.org/
- http://gdeltproject.org/about.html
- http://wiki.dbpedia.org/
- http://www.pewinternet.org/datasets/
- http://www.datacenter.org/research-tools/web-resources/
Forbes.com published a helpful list of 33 free big data sets earlier this year: http://www.forbes.com/sites/bernardmarr/2016/02/12/big-data-35-brilliant-and-free-data-sources-for-2016/#87796d267961
So much data is out there, but not in one place. Also, for the workshop participant inquiring about theater data, there does not seem to be a database for them. For the record, though, they were only inquiring about theater data as a hypothetical and did not claim to actually be interested in that line of scholarship! It is my hope that more projects develop, and that the spirit of openness adopted by many scientific and government communities spreads across the spectrum of disciplines and industries, so that one day there will be a database of all data.