What a cool idea! I don’t know anything about him except what I learned when I saw his exhibition at the Guggenheim last year, but it seems like a digital, interactive database would be the perfect home for his work. It’s already so… catalog-y (not a technical term) that it just seems to make sense.
Dannabelle, the Today series, in and of itself, offers you quite a lot to work with. It looks like in this initial stage it was challenging to work with some of the data the way that it was entered. I think you can go ahead and talk about those challenges in formatting and how you might need to address them. It would be possible for you to take that column and break it out into 3 different rows. Doing so could really help you to pinpoint the days that you have, and then it offers you some potential opportunities to talk about what's missing, and to try to figure out whether it's missing because it was destroyed by the artist or because of something else… Looking forward to next steps!
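One way to sketch that kind of restructuring, assuming a hypothetical single date column (the real dataset's format will likely differ), is with Python and pandas:

```python
import pandas as pd

# Hypothetical sample; the real Today-series data may use other date formats.
df = pd.DataFrame({"date": ["1973-01-12", "1973-01-14", "1973-02-02"]})

# Break the single date column out into separate year/month/day columns.
parsed = pd.to_datetime(df["date"])
df["year"] = parsed.dt.year
df["month"] = parsed.dt.month
df["day"] = parsed.dt.day

# Gaps between consecutive dates can point to missing (or destroyed) paintings.
df["gap_days"] = parsed.diff().dt.days

print(df)
```

Once the dates are split out, sorting and diffing them makes the "what's missing" question concrete: any gap larger than one day is a date with no surviving painting.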
Dannabelle, I, too, know nothing about the artist. He does seem to be incredibly data driven. My response has to do with the potential larger meaning you hope to derive from your data analysis. What do you think it will tell you about the artist and about the world he occupied and traveled through? Have you drawn any inferences or conclusions about him from what you've graphed so far? Do we know anything about his personal creative and artistic motivations and inspirations, his drive to produce art in this incredibly meticulous and regimented way? I know this sounds like Old School art criticism, but as a historian I'm fascinated with the historical context in which this art was created.
Thanks Carolyn! That’s what I’m hoping. My worry is how to make that happen…
Thanks for the feedback, Dr. Rhody. I certainly have had some issues and concerns with formatting. I'm still continuing to add to the dataset and consider and reconsider its structure. Which column are you referring to that I should break up into three different rows?
Also, I have been struggling with using R for this project, but Chris Alen Sula's Tableau workshop was really helpful!
Thanks for the questions, Dr. Brier. Part of my interest in analyzing and visualizing this particular On Kawara series has to do with the fact that he seems to be responding to contemporaneously "current" events in whatever part of the world he happened to be. Kawara described the process of the "Time Series" as an "everyday meditation," which seems to refer to the repetitive and ritualistic act of painting according to the rules he set forth in the series. Each painting is tied to its historical and geographic context. As I mentioned, the paintings were stored in boxes with newspaper clippings published on the same day. I want to see the pattern that emerges, if there is one, between how Kawara painted (keeping in mind the sizes and colors of his canvases) and the events of the day.
I really liked the Guggenheim exhibition, and so this project is very exciting to me! Personally I felt that the laborious ritual that the artist undertook, which is apparent when one faces the exhibited series that just goes on and on, has a very meditative quality (to a sublime extent, I might say). This was my initial and only encounter with the artist's work, and I think a large part of my impression came from the materiality of the work. So naturally I am curious about the different insights that your analysis will bring to this work, as you abstract and modulate the data in various formats.
Your idea of incorporating international events is also very interesting and led me to some Googling. At first I assumed that the DPLA would have a fully digitized newspaper archive; it seems that it is largely an ongoing project, as described in this link: https://dp.la/info/2015/11/09/dpla-announces-knight-grant-to-research-the-potential-integration-of-newspaper-content/
Chronicling America, by the NEH and Library of Congress, and the Europeana Newspapers project that are mentioned in the above link seem interesting, but in the case of the former collection the time range doesn’t fit your project.
So this isn't much help, but I did find something that might be of interest: while poking around the New York Times' archive (http://query.nytimes.com/search/sitesearch/), I learned that the NYT hosts scanned images of every front page on the web. For example, the URL for Feb 14, 1983 is: http://www.nytimes.com/images/1983/02/14/nytfrontpage/scan.jpg
Change the dates and you get other front pages as well. Unfortunately it is fairly low-res, but it could be something to start with. (Since Jul 6 2012 they also provide higher-res pdf scans: http://www.nytimes.com/images/2012/07/06/nytfrontpage/scan.pdf)
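Since the URL only varies by date, fetching front pages for a whole date range can be scripted. A small sketch (the URL pattern is taken from the examples above; whether every date is actually available on the server is not guaranteed):

```python
from datetime import date

def nyt_front_page_url(d, fmt="jpg"):
    """Build the NYT front-page scan URL for a given date.

    Higher-res PDF scans exist from Jul 6, 2012 onward (pass fmt="pdf").
    """
    return ("http://www.nytimes.com/images/"
            f"{d.year}/{d.month:02d}/{d.day:02d}/nytfrontpage/scan.{fmt}")

print(nyt_front_page_url(date(1983, 2, 14)))
# → http://www.nytimes.com/images/1983/02/14/nytfrontpage/scan.jpg
```

Looping this over the dates of Kawara's paintings would pair each Date Painting with that day's front page automatically.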
Another thing that popped into mind was Wikidata, which one might be able to search using a given date; their data items have date properties (https://www.wikidata.org/wiki/Wikidata:List_of_properties/Generic), which could be used to return events that happened on a certain date. I am, however, not at all familiar with the Wikidata API, so this might be a long shot. Looking forward to how the project goes!
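As a rough, untested sketch of what such a lookup might look like (assuming the public SPARQL endpoint at query.wikidata.org and the "point in time" property P585; the query itself is just illustrative and may need tuning):

```python
# Sketch of a Wikidata SPARQL query for items whose "point in time" (P585)
# falls on a given date. The actual HTTP request is left commented out so
# this runs without a network call.
def events_on_date_query(iso_date):
    return f"""
    SELECT ?event ?eventLabel WHERE {{
      ?event wdt:P585 "{iso_date}T00:00:00Z"^^xsd:dateTime .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }}
    LIMIT 50
    """

query = events_on_date_query("1983-02-14")
# import requests
# r = requests.get("https://query.wikidata.org/sparql",
#                  params={"query": query, "format": "json"})
# events = r.json()["results"]["bindings"]
print(query)
```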
Yes, the Guggenheim exhibition was very haunting and effective, I think in part because of the atmosphere of the architecture itself. And this project kind of does away with the material experience of looking at On Kawara’s work.
I was just playing around with the NY Times front page scans for my chart. Since I am not as well travelled as Kawara, and not as familiar with the local news media of many of the places he visited, I may, just for the duration of this project/class, incorporate only the NY Times front page scans. Thanks for the suggestions and feedback!
Sounds like a very interesting project. I'd love to see if there is any correlation between the content changes of zines and current (female) politics throughout history. Or do zine creators have their own agendas?
I’m looking forward to hearing more about your project Jenna.
Jenna, What a rich dataset to work with. It makes much sense to begin with the metadata, and one thought that I had in terms of thinking about the metadata is that it could be useful to hear just a little bit about how you went about creating a regularized vocabulary for description. It sounds as though this may be a process that combines several common practices in librarianship, and it’s a process I’m sure many of us would benefit from hearing more about. For the purposes of this project, I’m not sure that mapping everything to MARC records is necessary, and may create an unnecessary barrier at this stage to having a chance to really play around with the data. I definitely suggest pursuing OpenRefine for doing some generalized clean up, but you may find that using the .txt file and starting with some very generalizable text analysis could be useful. I’m sensitive to the concerns of scope creep, but looking at the rich collection of images you also have available to you, I’m wondering if something like Lev’s big-data image analysis tool could be an interesting way to approach talking about the intersections between visual and textual cultures in the collection. Looking forward to next steps!
Thank you, Lisa! I’m so relieved to get your feedback and direction.
I’ll skip mapping the records for now and will work with one dataset or the other. Only the MARC records have controlled vocabulary. The records I exported into the spreadsheet are informal/natural language. So would it make more sense to use the MARC .txt than the Excel file? That would require less clean up, since the data is more structured. The records from the Access file are possibly richer and contain more information about the creators, but maybe for this project easier is better?
Again, thank you!
Hi, Jenna. Great project that I think will be visually and conceptually exciting. I agree with Lisa that we need to get a much clearer sense of the type and quality of the metadata that you will be working with. That will help determine the most productive ways for you to dive into this admittedly large project (which, by my calculation, will cover anywhere from 3,000 to 6,500 zines). I agree with Lisa's idea to consider using Lev Manovich's visual analysis tool on some sample of the zines you are considering. That may be too big a project to start with, but I wouldn't sell the visual component short as you move forward.
Thank you, Steve!
If I’m going to use just the MARC records for now it’s a little over 4,000, I think, some using AACR2R2 & some RDA cataloging. I’ll work on it this week. I took a couple of days off, so maybe I can even take advantage of Digital Fellows office hours.
Quick GitHub suggestion: in README.md, don't start each paragraph with the pound/hash sign (#). GitHub, and markdown generally, use # as the equivalent of <h1> in HTML. To render text as a plain paragraph, just drop the # and leave a blank line between paragraphs.
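For example, a minimal README.md:

```markdown
# This renders as an <h1> heading

This renders as a normal paragraph.

A blank line starts a new paragraph.
```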
Here’s a good guide: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet.
Oh my goodness, thank you! That is very helpful. I see a hash sign and automatically think “comment!” Can you tell I’ve been spending all my time in R? 🙂
No problem! I love markdown as an easy-to-use markup language, but the hash sign can throw off users at first.
Not dull at all. Your project is super timely! I almost wrote “relevant,” but after the discussion in class I hesitate to use that word, since it’s subjective. Well, relevant to me, and to anyone who cares about electoral politics in the United States.
Carolyn, I'll echo Jenna's sentiments about timeliness! It would be useful to hear how you made decisions and what served as your guide while manipulating your data in R. Did you use an existing package that others who do similar work use? How might you check your work? There's a lot to work with here, and certainly a lot to think about in terms of the relationship between migration and causation/correlation to election results. Really impressive work, though, branching out to new technologies and experimenting in public. Looking forward to next steps!
Carolyn: Now that the election's over and Clinton "won" Virginia, your analysis is even more timely, in that you were correct to want to analyze the evolution of Virginia from purple to blue in an election that swung the other way. I'd like to hear more about how you plan to analyze in- and out-migration, and which data points you will be using from the IRS dataset on income. Are you analyzing every county in the state (which has many), or will you focus on several, especially the ones in northern Virginia which have experienced the greatest inflow and which have changed the most politically? I was also puzzled by your use of the term "American values" in your post and wasn't entirely clear on what you meant. Is that Pew's term or your own? If the former, how do they define those values?
Somehow in our little chats I missed that your project is on Occupy. That makes it all the more exciting to me. I have a few friends who were in the OWS Archives Working Group, focused on collecting the digital output of the movement. http://www.nycga.net/group-document-categories/the-occupy-wallstreet-archives
I’m sympathetic to your initial negative outcome and all the rewriting you’ll have to do.
Can I librarian you a little more? I have a friend who wrote an article for First Monday: http://firstmonday.org/ojs/index.php/fm/article/view/3845/3280 She was part of the People’s Library at OWS and being a well-funded post-doc at the time (that exists–in the private sector), visited Occupy sites all over the world–maybe even Istanbul??? I’m hunting around for other writings from Jessa. I saw her give a presentation or two on Occupy…
I’m excited to follow your work!!!
Great sources Jenna, thanks very much! Pls feed me more, I’d appreciate any sort of info.
I’ll definitely ask you more about your friends.
Élif, you've chosen a challenging dataset to work with, one that requires acquiring the data, figuring out what the metadata even means, figuring out what the dates represent, cleaning/regularizing it, and then trying to think about what can be done with it to help advance your own research interests. What's valuable from your experience has as much to do with what didn't work as it does with what did. If you haven't already, I'd suggest taking your dataset in to the Digital Fellows Office Hours or to PUG to work on transforming the HTML into a table of some form. It might be helpful to hear more detail about the steps you took to map the dataset, as I didn't see location names in your list of metadata. Let's talk in the future about places to look for help on archival practices. Looking forward to next steps!
Thanks very much for your comment Lisa.
I’ll be at the DF office next Tuesday. Hope we can find a solution to move it forward.
Élif, this is an important AND difficult project. I really like the focus of your research, connecting the Occupy movements in Istanbul and NYC, but as described, I'm not clear where the U.S. piece fits in. Maybe you intend that to come later, after you've figured out how to capture and display the video materials from Turkey. In any case, I think figuring out how to organize (say, in a .csv file) the metadata about the 29 videos you have found on the bak.ma site would be a good start (for example, using a temporal scale to indicate when the videos were shot and trying to correlate that to major political moments in Turkey). I am puzzled by where your post ended in the mapping space, though. Am I wrong to assume that the videos are concentrated in a few key cities in Turkey, if not mostly in Istanbul? Yet the map you chose to use is a topographic map of all of Turkey, which would be a particularly challenging map to display your data on. Wouldn't a standard Mercator view of Turkey with major cities included be a better choice for your display? Also, given what Chris Sula had to say in class last week about how the NYC police were interested in his mapping of Occupy groups, do you have any concern about how the Erdogan government might make use of your mapping project? If so, that could pose problems for making your data public. Just a (political) thought.
Thanks a lot for your comment. Actually, and unfortunately, the comparative study between NY and Istanbul will be carried out in a very conventional dissertation format. Here, in the metadata project that I proposed for #dhpraxis16, I want to work on something that is partly connected to my research, or that could become a next step after my doctoral research. I've also noticed that I didn't explain clearly: all the images that I've posted above are taken from bak.ma's website. They also have a feature called "view on map," but, like I said, it doesn't function properly. Moreover, bak.ma is built on an open access media archive program called pan.do/ra, and I have to check this mapping function with the program's developers. But you are right, a topographic map is hard for users. And security issues… There are different opinions about them, as Chris also mentioned, and it's definitely one of the core parts of my doctoral research. One side argues that these images are publicly available, so there is no harm in collecting, archiving, and sharing them; the other side disagrees, saying that sharing something publicly doesn't mean it should directly become part of an archive. But in the case of bak.ma, for example, users upload images themselves. So I can say it's still a gray area. I hope I'll discuss it better in my dissertation.
I would like to respond sympathetically in particular to the third concern you highlight in Cameron Blevens’s “Digital History’s Perpetual Future Tense” and the possibility for the spread of misinformation to a wide audience.
Our readings have focused on the increased popularity and legitimacy of digital humanities projects among academic communities. It is certainly comforting to be in a classroom with people who have an open view of DH. The first class I took as a MALS student on the DH track was a comparative literature course with literary scholars. On the second day of class, a major discussion took place among all these PhD candidates, with an overwhelming majority of my classmates complaining that DH methods are illegitimate methods of analysis. It's a criticism that is easy enough to see as close-mindedness. Last week, somebody asked what value digital literary studies projects provide to people outside the academy. I think that is, though difficult for somebody like me who has a degree in English, an important thing to consider.
To return to the last concern Carolyn raised: when considering audiences outside the academy, DH practitioners are up against misinformation and perhaps more entertaining information. I think it is important to consider the concerns detractors raise and the forms that misinformation comes in. How does Donald Trump or anybody come to cling to the belief that Muslims celebrated on 9/11 or that Obama founded ISIS? (I mean, beyond idiocy.) I would wager that the Web, the tool with so much potential for digital humanists, helps to disseminate such beliefs.
You raise some very important, vexed, and charged issues surrounding our readings about digital history, democratization of the past, and the responsibilities of a historian in the age of constantly available media. I’ll be interested to hear today your thoughts about The History Manifesto’s articulation of the role of the historian… but I thought I might also point you to an interesting experiment by Mills Kelly from George Mason University. In 2012, he offered a course called Lying About the Past: http://globalaffairs.gmu.edu/courses/1124/course_sections/6500. Yoni Applebaum does a pretty good summary of the experiment in his piece in The Atlantic: How the Professor Who Fooled Wikipedia Got Caught by Reddit: http://www.theatlantic.com/technology/archive/2012/05/how-the-professor-who-fooled-wikipedia-got-caught-by-reddit/257134/.
Your post raises the question: How does digital technology change the role of pedagogy in the history classroom? for the public historian?
Looking forward to continued conversation!
Dannabelle: Thanks for your comment! I agree completely that the internet (rather than digital tools in general) is the main culprit, although talk radio and cable news also bear large responsibility. I'd be interested in exploring how the intersection of these more traditional media with the internet has changed the audience for and/or efficacy of conservative propaganda (perhaps a more fitting question for our course last semester!).
For what it’s worth, despite my above reservations, I very much disagree with those literature students you encountered. It seems to me that any good defense of the humanities in general (like Lisa’s last week) will apply equally well to digital humanities. Regardless, I think the validity of any method has everything to do with the particular questions one is trying to answer, so their response does seem rather close-minded.
Lisa: Thank you for the links! Looking forward to checking them out, and discussing more in class!
I'm curious whether the suspect evidence behind Time on the Cross would have been as closely interrogated if the conclusions weren't so offensive. I wonder if most digital humanities projects are subject to closer scrutiny than more traditional projects, or, on the other hand, whether some works get a pass because those troubled by DH aren't prepared to interrogate computational data.
I realize this question isn’t central to the whole of Carolyn’s provocative (in the best possible way) post, but as we talk of how digital humanities projects and scholar/practitioners are evaluated by their peers, I think it’s worth considering.
And yes! I think about expertise vs. gatekeeping in academic and activist contexts and how I’d like to get past them as binary options.
I have nothing to contribute to point number three as I am baffled by a political campaign that is reality indifferent.
These links are super helpful, and I’m grateful for all of the resources y’all Digital Fellows provide.
And yet, it's hard for those of us working full time with traditional schedules to take advantage of most of the wonderful opportunities. For example, I'd be interested in the Python Learners and Text Analysis Working Groups. I'd also be psyched to get help from you during office hours… if office hours were, say, after class!
Also–the ITP workshops page says Fall 2015. Is that a typo? It lists some Johanna Karlin teaching a workshop. Is she your identical cousin?
Jenna! The Fall 2015 is a typo — those are indeed this semester’s workshops! And yes – you’ve caught me in my formal self (which I hardly recognize)– Johanna is my full name and so they must have put me in based on my CUNY id or something…. mysteries!
I’m looking forward to attending the Digital Fellows’ NLTK workshop and the ITP’s Python workshop. Also hope to go to a text analysis working group meetup!
This is really interesting and made me want to abandon HTML4! My most recent web development class (run by Girl Develop It, a national organization that runs low-cost tech classes) only covered HTML4. Would you recommend that academics learning web development from scratch start with HTML5?
I think that’s the best practice. Generally, it’s always good to start with the most recent iteration of any programming language (with the exception of Python 3 vs Python 2). This stops you from getting stuck in your habits like me and keeps your projects up to date.
Luckily, your class is still very useful! Making the jump to 5 from 4 shouldn’t be any problem at all.
Thanks for your reply! Now I'm interested in your take on Python 2 vs. 3. Is it still OK to start learning Python 2 in 2016? For example, I've read (just on the internet) that using the book Learn Python the Hard Way is a no-no, in part because the author hasn't updated his lessons to 3.
(Coincidentally for anyone interested, Girl Develop It is actually running an HTML5 class: http://www.meetup.com/girldevelopit/events/234180248/)
You'll get a few different answers to this. When I first started using Python, Python 3 was the no-no because it wasn't widely supported by third-party libraries. But that was years ago, and we've gotten to a point where most support it now. In my opinion, there's nothing wrong with starting with 2, since there aren't too many important differences; most of the big changes you can still use in 2. The only thing to keep in mind is that 2 is legacy now and won't ever be updated again; you'll have to move to 3 eventually.
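For the curious, the differences a beginner hits first really are small. In Python 3:

```python
# 1. print is a function in Python 3 (in 2 it was a statement: print "hello").
print("hello")

# 2. Dividing two ints yields a float in 3 (in 2, 3 / 2 == 1);
#    use // when you want floor division.
assert 3 / 2 == 1.5
assert 3 // 2 == 1

# 3. Strings are Unicode by default in 3 (no u"" prefix needed).
assert isinstance("café", str)
```

Everything else (functions, loops, classes, the object-oriented concepts that matter for a beginner) carries over between the two versions essentially unchanged.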
I seriously recommend Learn Python the Hard Way! He does a good job at moving slowly and explaining concepts clearly (a lot of software developers have a problem with this, for some reason). That’s actually what I used to brush up before the interview for the job I have right now. It will get you started on the path to learning the language. Yes, it will be outdated in some areas, but all that matters for a beginner is getting down the concepts of object-oriented programming. You’re not going to be building out a new operating system that will need years of endless support on the first day.
Great job explaining how, when, and where you collected your data. You may want to consider whether or not any of the images contained in the tweets are images of text longer than the character limit for Twitter. Also, are the Tweets from the dates of the event only? How many people use the hashtag before/after the event? We can begin to ask around about topic modeling applications that can handle Italian. In the meantime, you may also want to consider TXM, which is a European-funded text analysis tool… I think I remember them saying that Italian was one of the languages they supported. Looking forward to next steps….
Dear professor, thank you very much for the suggestion.
Luri, really interesting project using tweets, and the dataset you've focused in on seems promising, especially given the broad political and policy themes. Nice, decent dataset size (1,500 tweets from 795 discrete individuals). Any sense, demographically, of who these people are in class, geographic, age, and ethnic terms? I'm glad there's a European text analysis tool, per Lisa's note, that works with Italian (my fingers are crossed, as we say in English)! Looking forward to the next step. One other thought: do you mean "participant reaction" rather than "participate reaction"? The former uses the noun; the latter the verb.
Dear professor Brier,
sorry for this very late reply.
Thank you for your suggestions.
I like how your post, like the chapter, begins with anxiety, and generally a personal take on the process of writing, and scholarly writing in particular.
I have to admit I'm not entirely clear on your argument (not because you didn't lay it out well, but because I'm an undereducated neanderthal), and I'm not sure Fitzpatrick herself is suggesting throwing out the current mode of scholarly communications. Or maybe she is? I'm not sure either. I think that what she's pushing back on is a "do things the same way forever" attitude, particularly in a market driven environment. But maybe that viewpoint is informed by the last chapter in the book, on the university and the desire to smash the state that scholarly communications is in right now, where university press marketing departments can overrule editorial boards.
Oh I completely agree, I don’t think she’s suggesting throwing the current mode out at all. That’s actually what I loved about her approach: it’s adaptive rather than reactive. (Granted, I have not finished the book yet, so I may be off base there–we can discuss on Wednesday!) But my understanding is that she is just as interested in the production of better scholarship as she is in keeping academia relevant, changing the nature of scholarly publication, etc.
Jenna–Aha! I think I figured out the source of your confusion (which has nothing to do with anyone’s education or neanderthalness and everything to do with my sloppy wording and general resistance to formal citations in blog posts :/ ) When I mentioned not throwing out traditional scholarly forms, I’m referencing an argument from Clifford Lynch on page 84. It’s a case for making digital work that is in some way recognizable as or translatable to print, on the grounds that it will provide the new work with “scholarly legitimacy.” Fitzpatrick’s point is that digital scholarship will be limited by a focus on making work that is easily translated between print and digital; my argument is that there is a stronger case to be made for keeping argumentative essays around than that they will be useful for coaxing skeptical colleagues into accepting digital work. Does that make sense?
Sorry for the confusion, and thanks for drawing my attention to that bit of careless sentence-crafting! Now I want to edit my original post based on your feedback…or would that be too meta?
The research process here makes good sense: you’ve started with a solid question, a sense of the modalities you’d like to use to approach the question, and research components (GIS shape files, layers, and source materials) that would be useful. It may be helpful at the end here to go back to your original question about the way in which geography–and particularly mobility and location within that geography–are important components of your research question to consider what data points exist in your resource materials and what you may be missing. Have you had a chance to play around with any of that data yet? Looking forward to hearing more…
Marti, this is a great topic and I look forward to seeing how the mapping portion of it plays out. Will this convey a sense of how specific neighborhoods or streets tended to concentrate homosexual life and culture in Weimar Berlin, as a similar map of Manhattan in the 1940s and 1950s might? Also, you really start by defining homosexuality broadly but end up only focusing in on gay men (at least insofar as I can ascertain from the titles and descriptions of the several source books and guides you will be using). I don’t think you want or mean to equate homosexuality with men alone, which is what George Chauncey sort of did in his book Gay New York. Will any of these Weimar sources also tell us something about lesbian life in Weimar Berlin?
First of all, thank you so much for sharing your insights on my data project. I really appreciate it.
I did not mean to sound like I would exclusively focus on male homosexuals—I actually intend the contrary. Even before the arrival of the Auden-Isherwood circle in the late 1920s, Weimar Berlin was an affirming space for female and male homosexuals, transvestites and transsexuals. So, with this data project I plan on highlighting some of the areas and specific places in Berlin where all these social groups used to congregate, especially in Berlin-Kreuzberg and Berlin-Friedrichshain. Some names that come to mind are the Paradiesgärtlein [Little Garden of Eden], the Sankt-Margaretenkeller or even the famous ElDorado.
The sources I have been working with do talk about both female and male homosexuals, but I am still doing some more research on this topic. For instance, in Gay Berlin, Robert Beachy—even though he does concentrate on male homosexuals—quotes a similar project that does give more emphasis to lesbians. Unfortunately, I don’t recall the name of that book right now, but I will make sure to look into it as well.
On a separate note, I also got the chance to speak with Prof. Rhody last week about the mapping software I was using (QGIS) and the progress I was making so far. She also recommended that I look into NeatLine, and I am thinking of experimenting a little bit with CARTO, although the latter does not offer the possibility of georeferencing old maps.
I love your project, Lauren! “Define and defend” indeed. I will be interested to see what you come up with and how your work could be replicated in my library.
Lauren, the question of physical "measuring" in conversation with the kind of "measuring" that happens through the forms of computational analysis you mention wanting to try brings up lots of interesting questions about value and decision making. I'm not sure I follow quite yet which specific relationships you're interested in diagramming. This is likely because, as someone who is not formally trained as a librarian, I'm not as steeped in the awareness of how rich or complicated or unresolved the relationship between Library of Congress subject headings and call numbers might be. Perhaps that's an area in which the diagram itself could prove illustrative. I'm also curious to hear more about the way in which those correlations between subject heading and call number map to degree offerings and… perhaps… student and faculty scholarly production? In other words, do the proportions of reference materials physically and quantitatively demonstrate a correlation to the number of capstone projects students produce for various disciplines? Perhaps that is a bit too far of a reach, though. Looking forward to your next steps!
Lauren, I love that you've spent as much time on figuring out why you are doing this project (which has very practical, real world consequences for you and your library's patrons) as how you are executing it. Your project uses big data to help explain and justify how your institution might be transformed in the future. My mind went to the same scope-creep kind of considerations as yours did (for example, it would be cool and useful to graph usage in each subject area category, as you suggest, but that's another huge data project in its own right). Just correlating the areas of academic interest/focus in the collection seems like an ambitious enough place to start. Like Lisa, I thought a table of the LOC subject listings would be a good place to start. As a historian, I was thrilled at how large a word "History" is in your Wordle! I think the biggest problem in getting the data ready for analysis will be those instances (of which I assume there are many) where the subject of the item doesn't easily map against your departmental structure at the college (e.g., many schools don't have discrete academic programs in the social sciences, such as Sociology or Political Science, choosing instead to lump them together in agglomerated programs called "Social Sciences," which is how they are at several CUNY community colleges).
Achim, this is a very promising project and I’m pleased with how far you’ve been able to carry it forward already. It would be interesting for you to speculate, even at this early stage, about what kinds of data about gender and anti-woman attitudes in Korea you might discover when doing the sentiment analysis of tweets you propose for the next stage. Are you assuming that those who self-identify as feminist, both female and male (I assume), will be more open to other kinds of identities and political/ideological positions? What larger questions about gender and identity in contemporary Korea is this data analysis likely to reveal?
Achim, this is really cool work. I am really interested in the computational aspect, because it is in line with my learning goals, and just as interested in how you constructed your project and the decisions you made. I have recently had the idea to collect posts from the subreddit The_Donald after discovering that I understand so little of the ideology and even rhetorical devices and memes employed there. I think my project will take a very similar direction to what you describe, though as I think of it, its scope already seems rather tremendous!
Achim, what a provocative and interesting project topic! I can’t believe someone was so disturbed by feminism that they were driven to join ISIS, and that so many men (is it just males?) share the sentiment. I wonder if some of them are “just” trolling, or if that can be determined through analysis. Anyway, I hope you are able to move forward with it using machine learning techniques. It seems like an intimidating process, but with so much potential. Would it be feasible to do it with a smaller or sample portion of your dataset? Like just using the “feminist” hashtag?
Thanks for sharing, Tom. I was really interested in going but couldn’t, so it’s good to get a feel for what transpired.
I like the idea of being excited about making things accessible to all, rather than seeking to be in compliance. I know I struggle with it though. I have a color-blind colleague who always complains about Excel’s default chart colors, which I am often guilty of using.
I am intrigued by the idea of augmenting visualization with physical or aural representations. I think the challenge posed is rewarding, not limiting, and pushes us to be more thoughtful about what we do with our materials.
It was a great lecture (and our seminar was pretty well represented there).
I too am guilty of not giving enough attention to “accessibility from first principles” and focusing on compliance (alt text, avoiding red/green coloring). Miele challenged me to think of ways to go deeper.
This is a great summation of Joshua Miele’s talk, Tom!
Miele’s discussion of a rethinking of the meaning of accessibility was particularly strong and convincing. Before I studied English, I spent a semester as an architecture student. A small but significant part of what drove me away was the unwelcoming, sort of patronizing culture in the wood shop (where we had to build our models)! Sure, people of all genders were welcome in the facilities, but it was well known that women who asked for adjustments (all users of the machines had to ask shop workers to change settings) were subject to a level of flexing and condescension, rather than the discussions about their actual projects that our male counterparts seemed to receive. I wonder how different it would be if there were just one woman working in the shop.
That being said, as Miele mentioned, (gender, ethnic, class, etc.) diversity and accessibility are distinct issues that require different strategies to address. In my academic and professional experience, there has been very little talk and consideration about accessibility. His call for a “top-down” approach, placing people of different abilities in positions of power, as his work demonstrates, is far more effective than perfunctory 508 standards.
I was definitely moved by the talk to get involved in improving accessibility and am looking forward to contributing to YouDescribe.
Greg, there’s a lot here, and you’ve managed to take your project really far in a short period of time! You may want to consider as you move ahead how doing this kind of project with a constructed concordance relates to doing similar text analysis with the full text of the novel. Which of the tools that you’ve experimented with do you think you might be likely to continue using? Which of these experiments seemed more valuable to you than others? A really capacious approach to so many different technologies here. Looking forward to next steps!
Greg, I’m impressed by how far you’ve already taken this. My first question in all such data projects is: How much do you trust the data you are using, that is, the Rosenbloom Concordance? Is it generally accepted in the field as a reasonable capture of an admittedly utterly unique piece of writing? You’ve tried so many different approaches and tools already, it’s probably a good moment to step back and assess carefully and critically what you’ve learned so far before you plunge ahead with ever more intricate data analysis. Congratulations on your fearless immersion into one of the tougher pieces of writing around.
I completely understand the skepticism regarding this data and I think that if I were to continue with this dataset, I would take the complete text file of FW and run it through a couple of concordance generator scripts with GUIs and compare the results to see how accurate the original set was. The reliability of the data even went over my head when I was creating it, so it’s definitely good to get some grounding like this and really ask these questions.
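Concretely, the cross-check I have in mind could look something like this in Python (the text snippet and the concordance counts below are made-up stand-ins, not actual Rosenbloom data):

```python
import re
from collections import Counter

def word_counts(text):
    """Case-folded word frequencies from raw text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

# Stand-ins for the full FW text file and a few concordance entries.
text = "riverrun, past Eve and Adam's, from swerve of shore to bend of bay"
generated = word_counts(text)
concordance = {"riverrun": 1, "of": 2}  # hypothetical concordance counts

# Any entry where the freshly generated count disagrees is worth a look.
discrepancies = {w: (generated[w], n) for w, n in concordance.items()
                 if generated[w] != n}
print(discrepancies)  # an empty dict means the two sources agree
```

Scaling that up to the full text and the whole concordance would at least flag where the original dataset diverges from a fresh count.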
Kate, this is an important and interesting project. There are obviously two major issues with Wikipedia: the preponderance of male editors and writers who develop and oversee/make determinations about content of the entries; and the very real gender mismatch of the total number of entries on male vs female subjects (regardless of who is writing and editing those entries). The Wikimedia Foundation is very aware of the problem and has engaged in outreach efforts to expand the number of entries on women and on the number of women contributors to Wikipedia. The Graduate Center’s own Michael Mandiberg has helped lead those efforts, as you can see in this piece in the New Yorker: http://www.newyorker.com/tech/elements/a-feminist-edit-a-thon-seeks-to-reshape-wikipedia and in this ArtNews piece: http://www.artnews.com/2014/02/06/art-and-feminism-wikipedia-editathon-creates-pages-for-women-artists/.
Your initial linked data work could be expanded and deepened. I suspect playing around with the Wikidata more would help overcome the frustrating “timed out” problem. Is your intention to dig deeper into this data or to move on to something else?
Thanks for your feedback, Dr. Brier! I hadn’t seen the articles you linked to, so thank you. In terms of the final project, I will be moving on to something else. I do hope to continue to explore this issue, but at this point I want to do so by taking action rather than continuing to reinforce the issue of gender gaps in the data of which many are already aware. I haven’t been able to attend any edit-a-thons yet, but I hope to do so at the next opportunity as well as do some editing on my own and contribute to linked data-focused Wikipedia efforts. Perhaps at a later point I can redo my queries to see how things have changed. I do like Mandiberg’s statement in the New Yorker article regarding community and empowerment, though. I agree that taking action on these issues can bring about additional good in the world that perhaps cannot be captured by data.
What a great corpus to work with, Claire. Looking forward to hearing more as you move forward. You may want to try going to the Python User’s Group meeting (PUG) on Wednesdays from 12-2 for help with the Python libraries you’re interested in using. Also, there is a “text analysis” group starting up that you may also find to be a useful resource moving forward.
Claire, I’m impressed by the sheer size of the CRS corpus and wonder if it might make more sense to pull out a subset to work with at this stage rather than wrestle with the unwieldy .txt files in their entirety. Perhaps you could pick one or two especially rich years for reports, or compare ones from different Congresses, say one from 2009 and one from 2014 (it’s not exactly clear from your blog post or the Every CRS Report website what years are included). Your commitment to open government data is admirable, and I think letting the public know what can be found in such a rich corpus of digital material will be extremely beneficial. It will be interesting also to see if the CRS remains as open under the “New Regime.”
Mary, it’s been exciting to see your project grow in the past couple of weeks. There are a lot of open-ended questions that your dataset presents, and you’ve done quite a bit to try to get a sense of where the challenges are. I would strongly encourage you to look at other projects related to transcription and digitization of historical ledgers as you move ahead. In particular, Kathryn Tomasek’s work (http://journalofdigitalhumanities.org/2-2/encoding-financial-records-by-kathryn-tomasek/) may be helpful to you. Here’s her project site as well: http://www.encodinghfrs.org/. Looking forward to next steps!
Mary, Lisa’s suggestion is a good one. It’s always a good idea to see how others have handled similar datasets of historical materials. I was pleased with the final paragraph of the post, where you attempted to say what kinds of insights or knowledge you might derive from systematically analyzing the data in the ledger, rather than the casual ways most historians would approach this kind of data. That said, I wasn’t entirely clear what you were trying to say about comparing the information in the ledger with that contained in city directories. In what ways is the ledger “better” data than city directories? One potentially fruitful approach, something that I spent a lot of time doing myself, would be tracing individuals into the city directories. One thing that would tell you is how geographically mobile individuals were in this post-Revolution era, as the national government was coming into being and moving from NYC to Washington, DC. The city directories can sometimes provide additional personal info on individuals that your ledger may not contain. And as I mentioned in my office, you might want to use Burrows and Wallace’s Gotham to help set the broader context of the city’s history in those years.
Hi Iuri, nice work! The idea of using Twitter as a participatory reading platform is fascinating. Personally, I find that platforms offering collaborative annotations like the one used for Debates in the DH (http://dhdebates.gc.cuny.edu/) or Medium (https://medium.com) help me get to the point more easily. TwLetteratura seems to be somewhere in the middle ground of such platforms and live conference tweets.
I wonder if the participatory mode of reading also impacts the content of the created text. For example, do the word frequencies in the tweets differ from the word frequencies in the original text? If so, I also wonder how it would be different from a single person’s review—although single person review may not exist, and this is perhaps way out of your project’s scope.
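To sketch what I mean, a first pass could compare relative word frequencies between the two texts (the two strings below are invented examples, standing in for the source passage and the tweets):

```python
import re
from collections import Counter

def rel_freqs(text):
    """Relative word frequencies, case-folded."""
    words = re.findall(r"\w+", text.lower())
    return {w: n / len(words) for w, n in Counter(words).items()}

# Invented stand-ins for a source passage and the tweets rewriting it.
original = "the fox jumps over the dog"
tweets = "fox jumps dog lol"

orig_f, tweet_f = rel_freqs(original), rel_freqs(tweets)
# Positive values mark words that take up more space in the tweets
# than in the original passage.
shifts = {w: round(tweet_f[w] - orig_f.get(w, 0.0), 2) for w in tweet_f}
print(shifts)
```

Words with large positive shifts (here the invented “lol”) would be the readers’ additions rather than the author’s vocabulary.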
Also, it was interesting to see the participation rate dropping; something to bear in mind when conducting a digital project.
Thank you very much for your comment, and sorry for my late reply.
My intention was to uncover the relation between the original text and the tweets: by analyzing the tweets’ content, I would like to identify the patterns users follow when participating in a project like this. Your suggestion about comparing word frequencies between the original text and the tweets sounds fascinating: maybe I will develop it for a paper in the future.
I agree about the participation rate data: it is very interesting, and it would be even more interesting to understand whether this drop was due to the kind of text chosen or to something else.
Thank you very much again, and I hope to see you again during next semester.
Interesting. I do think there’s a real divide between historians and journalists (and academics and journalists in general). In The Awl recently, a two-part conversation between an academic, Jo Livingstone, and an editor for The Guardian, David Wolf, opened with the following headnote: “Editors think that scholars are bad writers, and they say so often and rudely. Academics think that journalists are lazy thinkers, and they’re no more polite. Neither is right, I think, but the fields are so twain that nobody really bothers to think about the why or the how or the what next except super-intellectual magazines that nobody reads.”
Do you think the influence of public history on digital history makes it more copacetic with digital journalism, since both fields aim to educate (or at least reach) wide audiences in new forms and formats? And do you think digital journalism sees digital history less as an arena for chiding “lazy thinkers” and more as one for engaging journalists, however limited their “early” drafts of history may be?
See below for links to The Awl articles mentioned above.
– Part I: https://theawl.com/can-the-academic-write-part-i-24fdaf8bf422#.t9ri6jc7i
– Part II: https://theawl.com/can-the-academic-write-part-ii-b243bc91b44a#.rah22vp2b
As an observer to both fields, I hope so! The Slate listicles seem to be an example of digital journalists reaching out to digital historians in the way you describe. Your first question made me think more on the grounds of public history and data journalism (as a subset of digital journalism). Both aim to educate wide audiences, certainly, but another similarity could be that both lack “argument-driven” scholarship and reporting, as Cameron Blevins claimed for digital history. When the New Republic cited Nate Silver saying, “we’re not trying to do advocacy here. We’re trying to just do analysis. We’re not trying to sway public opinion on anything except trying to make them more numerate”, I saw a parallel in the digital history projects Blevins describes.
New Republic article: https://newrepublic.com/article/117068/nate-silvers-fivethirtyeight-emptiness-data-journalism
Eduard: This is an important project, and I appreciate how you’ve struggled trying to figure out how to narrow down your inquiry and data collection. I think you’ve touched on an important response to lynching: the strong motivations for African Americans to leave the South and migrate North and West. That Great Migration has three distinct waves: the western migration (to places like Arkansas and Oklahoma) of formerly enslaved people in the Deep South after the Civil War up through the end of the 19th century; the first wave of the G.M. (usually younger people and families born after the C.W.) during and after WWI; and the large migration north and west during and after the Great Depression and especially during and after WWII. If you tried to map the out-migration of African Americans by county from especially violence-plagued southern locations, you’d have an interesting mapping project that goes well beyond the NY Times’s effort, however good that was. You might look at Isabel Wilkerson’s recent book on the G.M., The Warmth of Other Suns, for general ideas about how and where African Americans migrated. Happy to talk more about the historical and conceptual aspects of this project, if that would prove helpful.
Thank you, Professor,
Thank you for your encouragement. I really learned about the Great Migration through Wilkerson’s work and narratives. I plan to scour her sources for data, as she mentions a number of things I am interested in, including the positive selection of migrants and possible sources of data on migration. I think getting it down to the county level will be difficult, but I was thinking that using the position of railroad lines to make some inferences could work in the absence of good data.
I will take you up on your offer after I make some more progress during the break.
Thanks very much Jojo!
I love the connection between flesh and data, Achim. Unfortunately, I don’t know Latour very well. Are you using him to say that we reify data, or that data have materiality and we erase it? Or maybe something else entirely that I’m not grasping?
We think of data as something virtual and represented, but, like everything else, it has its own materiality—on your hard drive, on a remote server in northern California connected to the internet.
@tlewek thanks for the feedback and sorry for the late reply.
I do think that, as you pointed out, the materiality of data is sometimes overlooked. And that can be a way of distinguishing between the human subject and data—the technological artifact. But perhaps the distinction is not as obvious as one might think; as human beings with flesh, we might be closer to our data than, say, a reality(us)-representation(data) relationship would suggest, because the tools used to operate on data are increasingly the same as the tools that operate on actual human beings.
An artwork on view at the Glass Room (https://theglassroomnyc.org/) exhibition seems relevant. Heather Dewey-Hagborg (http://deweyhagborg.com/) is an artist who works with, among other fields, bioengineering and computation; a number of her projects engage with DNA forensics. There exist companies that provide computational predictions of someone’s appearance based on their DNA, for example (https://snapshot.parabon-nanolabs.com/). One of Dewey-Hagborg’s projects, Invisible (http://biogenfutur.es/), provides an open source toolkit for erasing the user’s DNA traces. I find the parallels and merging between physical and digital interesting (bio surveillance and data surveillance; digital traces and DNA traces; statistical analysis and prediction in both cases), and perhaps revealing that the two are not that different—at least in the context of technological development.
Dewey-Hagborg’s article provides more critical context about her take on bioengineering technology (that is applied in social contexts), and many of her points strikingly resonate with critiques of big data / AI:
Sci-Fi Crime Drama With a Strong Black Lead, on the New Inquiry http://thenewinquiry.com/sci-fi-crime-drama-with-a-strong-black-lead/
Hi Carolyn, I enjoyed your follow-up. Looking at the Library of Resistance list (and formatting rules) being compiled in real time is very interesting! On one hand, I feel that a familiar, shared document platform like Google Docs can serve its purpose in an immediate context. On the other hand, as this list is already over 300 items long and spans 12 pages after only 3 days, I can definitely see the value of your proposal. Perhaps a public Zotero group is a slightly better option as the list grows? Negotiating timeliness and robust organization seems like an important issue here.
A few things popped into my mind as I was reading, so here goes (although you are probably aware of much of this):
The Open Syllabus Project. Your focus on social mission might make OSP too comprehensive for your use, but the related texts that come below a search result in OSP’s Explorer (http://explorer.opensyllabusproject.org/) could be useful in certain ways. If they open up their API soon enough, it might also be something to explore.
Social bookmarking services could serve as proof of concept for the putting-together-and-sharing part. The design you showed us last time reminded me of are.na specifically.
Also, throwing in another reading list by Francis Tseng in the collection: http://speculatingfutures.club/
I’m excited about this project!
It sounds similar to, but way more sophisticated than, something I was part of a while ago: http://radicalreference.info/readyref. The creators being librarians, the “ready reference” resources were more like online subject guides or structured, annotated bibliographies than syllabi. I wonder if research guides could be an adjunct to the syllabi? Or a module? I wonder what other modules there could be for syllabi?
Tom, we’ve talked a bit about your project ideas, and there’s quite a bit to go on with even a small collection of words here. If I haven’t already, I’d be remiss not to mention Tanya Clement’s work with Gertrude Stein (an example can be found here: http://tanyaclement.org/2012/01/12/sounding-steins-texts-by-using-digital-tools-for-distant-listening/). Her dissertation, I believe, is available through the University of Maryland and could be quite instructive for you. Looking forward to next steps!
Thanks for commenting, Lisa! I’ve heard about Tanya Clement before, and this looks like a great place to start building out my project a bit more.
This proposal has come a long way since your first proposal, Jenna, and in a very good way. I will withhold my comments until after class today. However, I would challenge the group to look at other kinds of projects that create similar types of resources (the Modernist Journals Project, for example) to think about potential features, use cases, and pitfalls. You might want to talk a little bit about the collections you’re working with in your proposal and in your presentation tonight–as the materials may offer some of your class colleagues a personal stake in the project. How well do you know the people around the table? Are there collections that you could start with, or an existing list of potential zines you could point people to, that might make working on your project personally and academically relevant? Another thing to consider as you move ahead is what you think the “workflow” might be from 2 perspectives: 1) adding and curating records, and 2) information retrieval and use. Perhaps in conversation with folks tonight, you could ask them for some feedback about sites/tools/resources they use regularly and what makes the workflow (think the library catalogue, Amazon, Google) either appealing or frustrating. Perhaps tonight you can also use people in the class to help brainstorm where there are workshops, tutorials, or other ways of learning skills that no one around the table has at this moment.
Looking forward to hearing more about the conversation and to reading other class colleagues’ comments here over the coming week.
Eduard, these are important questions you are raising. Correct me if I misunderstood your intent, but I think that “almost entirely digital” sources is less about where the information comes from and more about the shifting modes of documenting, circulating, and analyzing history. But I do agree with your concern in that certain entities (e.g. government, corporate) hold more power in creating these data, which come in forms that are easier to use. But I also think this was the case in traditional research as well. So in addition to digital v. traditional, we could also draw the line between state-provided v. other sources of information. But even this is not a comprehensive distinction, because everyone is outputting data and everywhere new boundaries are drawn—which I suppose is the challenge of history in a digital age.
Your project about lynching (and the non-abundance of lynching-related data) made me think of Mimi Onuoha’s Missing Datasets project (http://mimionuoha.com/thoughts/), where she explains: “Calling something ‘missing’ automatically implies that it should exist. . . . For every dataset where there’s an impetus for someone not to collect, there’s a group of people who would benefit from its presence.” Put another way, how can we take into account the source of the historical information, and how can we create and preserve data that powerful institutions care less about?
Hi Gregory, you may know about this but just throwing in a link to Lang-8, a “language-exchange social network” as they define themselves: http://lang-8.com/
Michael, I like the way you puzzled your way through this, which is the essence of what it takes to do digital data work like this. You grabbed what you could from other work and you adapted what was available to figure out how to proceed with your own data inquiry. There are obviously no cookie-cutter answers to data questions or decisions. But reading your post, I still don’t have a clear sense of exactly what you think the presence or absence of ellipses means in Joyce. You haven’t explained the larger import of that absence or presence, at least insofar as I can understand it (and I say this as someone who is not a Joyce scholar in any way, shape or form). Also, several actual examples of Joyce’s technique with ellipses would be useful.
Good summary of the workshop, Claire. Like you, I left with more questions than answers, but those questions pushed me to reconsider, refine, and strengthen some of the many vague ideas I had.
Interesting take, Claire. I definitely agree that we need to move away from “the vapid embrace of the digital”—the “digital” alone won’t change the infrastructures of the academy or the place of the humanities within them. In fact, this course has been at its best, I think, when it has encouraged explorations of how we do the humanities. If the digital humanities can foreground those explorations, then I think there’s some (cautious) cause for hope.
It was so hard last week to proceed as if nothing had changed. I appreciate your working your personal experiences of last week into your post. <3
This is interesting; I’m going to have to check it out. Search engines on museum websites are usually limited to categories, artists, and the dates of their works. I’ve had to go through pages and pages of digital archives just to find what I was looking for, but I’ve never been on the MoMA website for this. Modern art isn’t something I like; I prefer the Met.
Very fun and out there. Thanks!
Hey Tom, thanks for this, super thoughtful!
I appreciated Jojo’s explanation of APIs! I would also like to hear more in the future about the ethics of the Selfie Project and other examples of the use of personally identifiable information in digital humanities projects.
Thanks for compiling this great list Jojo. I’m not sure exactly which step will put me at ease, but having all this in one place should expedite that!
Thanks, Jenna! I was wondering why it didn’t show up!!
The Chronicle of Higher Education published an article in 2013 that discussed how libraries were starting to support peer-reviewed, open access alternative publishing practices inside the academy: http://www.chronicle.com/article/Hot-Off-the-Library-Press/136973
More information can be found at Amherst College Press’s website about the mission of one such endeavor: https://acpress.amherst.edu/about/
One of the articles mentioned yesterday during class was Jill Lepore’s “Can the Internet Be Archived?”, published by The New Yorker back in January, 2015 (http://www.newyorker.com/magazine/2015/01/26/cobweb).
Lepore not only talks about the continuous “overwriting, drifting and rotting” involved in Web preservation, but also discusses the importance of the Internet Archive in San Francisco. Throughout the article, other topics brought to the table include the relationship between material and digital archives, the restrictions that come with copyright, and the actual workings of the Wayback Machine.
Thanks for sharing, Brian.
I initially had the question: “What does this data mean? Is there a definition file?”
I then found out how GTFS works: https://developers.google.com/transit/gtfs/reference/?csw=1
Very usable data, but I got a little overwhelmed. I’ll leave it to the professionals.
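For anyone else curious: the GTFS files are plain CSVs, so even Python’s standard library gets you started (the two stops below are made-up sample rows, using the column names from the GTFS reference):

```python
import csv
import io

# Made-up sample rows in the shape of a GTFS stops.txt file; a real
# feed would be read from disk and contain thousands of rows.
sample = """stop_id,stop_name,stop_lat,stop_lon
101,Van Cortlandt Park-242 St,40.889248,-73.898583
103,238 St,40.884667,-73.900870
"""

# Index the stops by their stop_id for easy lookup.
stops = {row["stop_id"]: row for row in csv.DictReader(io.StringIO(sample))}
print(stops["101"]["stop_name"])
print(float(stops["103"]["stop_lat"]))
```

The same DictReader pattern works for the other feed files (routes.txt, trips.txt, stop_times.txt), which link to each other through shared ID columns.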
You’ve really taken the technical aspects of this project and run with them, which is quite impressive. I’d be interested to hear a bit more about what challenges the limitations in the dataset (working by building number, for example) create for the kinds of questions you might be able to explore with it. For example, what do we need to know about schools in NYC and their admissions/enrollment in order to understand how location and building, as spatial identifiers, are significant when considering crime rate? There’s a lot of rich material to work with here… very excited to see your work progress.
The readings about the use of DH in History made me ask myself whether, especially in History, DH are just a tool or whether they are instead deeply influencing the discipline itself.
I am a scholar in Literature, and I want to study some digital projects that have applied DH to Literature. During my research, I continuously ask myself whether DH are influencing not only the way we produce and ‘consume’ Literature (a term popularized by the rise of social networks, which changed the way we regard literary objects), but also the way Literature itself has changed and is changing because of the presence of DH. I think that DH are changing Literature itself: basically, they are changing the world where we live, our perception of it, and also the style used to describe this ‘new’ world. Do you think the same is happening to History?
This site is part of the CUNY Academic Commons, an academic social network for the entire 24-campus CUNY system.
Unless otherwise stated, all content on the CUNY Academic Commons is licensed under a Creative Commons license.