Data Visualization – Presented by Micki Kaufman

Micki Kaufman, former class visitor and creator of Quantifying Kissinger, hosted a data visualization workshop on October 31st. She showed how to set projects up in Gephi and Tableau, and also demonstrated the power of smaller text analysis tools like AntConc and MALLET. Her initial project dealt with thousands of documents and the constraint of metadata that was originally undefined. That raises two questions: how do you create raw data from documents, and how do you visualize it in useful ways that spark humanities-based questions?

We started the workshop with a dataset. In this case, Micki creatively took the sign-up data for the session and turned it into a Google spreadsheet, which we all populated with our own attributes. The metadata included GC program, years at the Graduate Center and departmental involvement, to name a few. After we populated these fields, she went over how text files and spreadsheets are the same in terms of content editing. We then imported the data into Gephi, which showed us initial clusters in a network visualization based on each student’s program and department. From there we went on to explore several forms of modularity analysis, whose results would change heavily if we modified any of the data.
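
As a rough illustration of the spreadsheet-to-network step, here is a minimal Python sketch that turns attribute rows into an edge table. The rows, the column names (`name`, `program`, `department`) and the rule "link two people who share an attribute" are all assumptions made for this example and are not the workshop's actual data or method. The `Source`, `Target` and `Weight` headers are the ones Gephi's spreadsheet importer looks for in an edge table.

```python
import csv
import io
from itertools import combinations

# Hypothetical sign-up rows; the real workshop spreadsheet is not reproduced here.
rows = [
    {"name": "A", "program": "MALS", "department": "History"},
    {"name": "B", "program": "MALS", "department": "English"},
    {"name": "C", "program": "PhD", "department": "History"},
    {"name": "D", "program": "PhD", "department": "Sociology"},
]


def build_edges(rows, attributes=("program", "department")):
    """Link every pair of people who share at least one attribute value.

    The weight counts how many of the listed attributes the pair shares.
    """
    edges = []
    for a, b in combinations(rows, 2):
        weight = sum(1 for attr in attributes if a[attr] == b[attr])
        if weight > 0:
            edges.append((a["name"], b["name"], weight))
    return edges


def edges_to_csv(edges):
    """Serialize the edges as CSV text with Source/Target/Weight headers."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buf.getvalue()


edges = build_edges(rows)
edge_csv = edges_to_csv(edges)
print(edge_csv)
```

Changing a single cell in `rows` can add or remove edges, which is one way to see why the clusters shift so easily when the data is modified.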

One of the more important questions raised by this visualization was the choice of color. Color choice is important because you need to not only present a gripping visualization, but also consider questions of disability and culture. The color red, for example, can mean different things in different cultures, and might not be the right color for a certain variable if you know your intended audience well enough. Also, how do you make the visualization accessible to someone who might be colorblind?
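
One possible answer to the colorblindness question, sketched in Python: assign categories to colors from the Okabe-Ito palette, a set of eight colors designed to remain distinguishable under the common forms of color vision deficiency. This palette was not part of the workshop as far as this post records; it is swapped in here as an example, and the department names and the `assign_colors` helper are hypothetical.

```python
# Okabe-Ito palette: eight colors chosen to stay distinguishable under
# the common forms of color vision deficiency.
OKABE_ITO = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
    "#000000",  # black
]


def assign_colors(categories, palette=OKABE_ITO):
    """Map each distinct category to one palette color, in sorted order."""
    unique = sorted(set(categories))
    if len(unique) > len(palette):
        raise ValueError("more categories than distinguishable colors")
    return {cat: palette[i] for i, cat in enumerate(unique)}


# Hypothetical department labels, used only for illustration.
colors = assign_colors(["History", "English", "Sociology", "History"])
print(colors)
```

The function refuses to invent a ninth color, since a palette stretched past its size stops being reliably distinguishable. A palette like this addresses perception only; the cultural meaning of a color still has to be judged against the intended audience.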

After our initial visualizations in Gephi, we moved on to topic modeling. Topic modeling is a way of using programming to identify potential patterns in a corpus. For Micki, this meant going through the thousands of documents she had scraped and finding trends to use in a visualization. However, topic modeling is fuzzy: its topics are not discrete and mutually exclusive. When you run a topic modeling script, you’re bound to get overlap and differing results each time, which makes a good case for looking for the topics that recur across multiple modeling sessions.
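
The idea of looking for topics that recur across sessions can be sketched in Python as follows. This is not Micki's method or anything MALLET does on its own; it is one simple approach, assuming each run has already produced a list of top words per topic. The word lists are invented placeholders, and the Jaccard threshold of 0.5 is an arbitrary choice.

```python
def jaccard(a, b):
    """Overlap between two word lists: shared words / all distinct words."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def stable_topics(runs, threshold=0.5):
    """Return topics from the first run that have a close match in every other run."""
    stable = []
    for topic in runs[0]:
        if all(
            max(jaccard(topic, other) for other in run) >= threshold
            for run in runs[1:]
        ):
            stable.append(topic)
    return stable


# Invented top words from three hypothetical runs over the same corpus.
# Topic order differs between runs, as it typically does in practice.
runs = [
    [["war", "vietnam", "troops", "bombing"], ["china", "visit", "peking", "trade"]],
    [["china", "peking", "visit", "mao"], ["vietnam", "war", "troops", "paris"]],
    [["war", "troops", "vietnam", "bombing"], ["meeting", "schedule", "call", "office"]],
]

recurring = stable_topics(runs)
print(recurring)
```

Here the first topic finds a good match in both later runs, while the second has no counterpart in the third run and is dropped. Matching on word overlap rather than topic number matters because topic numbering carries no meaning from one run to the next.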

We then moved on to Tableau, another piece of data visualization software. In Tableau, we used State of the Union (SOTU) data that Micki provided to play around with visualizations a bit. The visualization plotted words and their weights over the history of the addresses. In one case, we looked at why the word “tonight” appears more heavily in the 80s than in any other decade. Micki noted that it was due to the shift from print to television for SOTU addresses.
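
The kind of word weight behind that chart can be approximated with a few lines of Python. This sketch assumes "weight" means a word's share of all words spoken in a decade, which may differ from how the workshop dataset was actually prepared. The three "speeches" are invented toy strings, not real SOTU text.

```python
import re
from collections import defaultdict


def relative_frequency_by_decade(speeches, word):
    """Share of all tokens in each decade that are `word`.

    `speeches` is an iterable of (year, text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in speeches:
        decade = year // 10 * 10
        tokens = re.findall(r"[a-z']+", text.lower())
        hits[decade] += tokens.count(word)
        totals[decade] += len(tokens)
    return {d: hits[d] / totals[d] for d in sorted(totals) if totals[d]}


# Invented toy data for illustration only.
toy = [
    (1975, "the state of the union is strong"),
    (1982, "tonight we gather and tonight we begin"),
    (1985, "we meet tonight"),
]

freq = relative_frequency_by_decade(toy, "tonight")
print(freq)
```

Dividing by the total word count keeps a long address from outweighing a short one, so a rise in the result reflects a change in usage and not just longer speeches.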

To close, we went over a couple of pieces of text analysis software, Micki’s 3D model of her project and the potential future of visualizations in VR. Overall it was an excellent seminar that taught everyone a great deal, not only about setting up visualizations, but also about figuring out what you want to get out of data visualization.


