Visualization, Literary Study, and the Survey Class

I hope I’ve not missed the boat on the pre-un-conference-idea-generating-posts! In brief, I’d like to meet up with people interested in a web project that visually weights, by color, simple semantic relations in literary texts, and/or in putting together an NEH grant for said project. Caveat: I’m not an expert on this. Here’s my initial proposal, though in retrospect it looks rather stilted and sad:

For the past year or so, I’ve been interested in putting together a small team of like-minded folks to help bring to fruition a data visualization project that could benefit less-prepared college students, teachers in the humanities, and researchers alike. Often, underprepared or at-risk educational populations struggle to connect literary study with the so-called “real world,” leading to a saddening lack of interest in the possibilities of the English language, much less literary study. I am currently working with Doug Eyman, a colleague at GMU, to develop a web application drawing on WordNet—and particularly the range of semantic similarity extensions built around WordNet—to visually mark up and weight by color the semantic patterns emerging from small uploaded portions of text. This kind of application can not only help students attend more fully to the structures of representation in literature and the larger world around them—through the means of a tool emphatically of the “real world”—but also enable scholars to unearth unexpected connections in larger bodies of text. Like literary texts to many students, the existing semantic similarity tools available through the open source community can seem inaccessible, even foreign, to a lay audience; this project seeks to lay open the language that so many fear, while enabling the critical thinking involved in literary analysis. Ultimately, we hope to extend this application with a collaborative and growing database of user-generated annotations, and perhaps with time, to fold in a historically-conscious dictionary as well. We are seeking an NEH Digital Humanities startup grant to pursue this project fully, and I’d like the opportunity to throw our idea into the ring at THATcamp to explore its problems as well as possibilities, even gathering more collaborators along the way.

Here’s a hand-colored version of something like what I’m thinking; I used WordNet::Similarity to generate the numbers indicating degree of relatedness, and then broke those numbers into a visual weighting system. Implementation hurdles do come out pretty clearly when you see how the numbers are generated, so I’m hoping someone out there will have better insights into the how of it all.
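To make the "numbers into a visual weighting system" step concrete, here is a minimal sketch in Python. It is not the project's actual code. It assumes the relatedness scores have already been computed elsewhere (for instance by WordNet::Similarity) and normalized to the range 0 to 1. The function names, the four-color palette, and the sample scores are all hypothetical, made up purely for illustration.

```python
import html

# Hypothetical four-step palette, from "barely related" to "strongly related".
PALETTE = ["#f0f0f0", "#fdd49e", "#fc8d59", "#d7301f"]


def score_to_bucket(score, n_buckets=4):
    """Map a relatedness score in [0, 1] to a bucket index 0..n_buckets-1."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be normalized to the range 0..1")
    # A score of exactly 1.0 would land one past the end, so clamp it.
    return min(int(score * n_buckets), n_buckets - 1)


def colorize(words, scores, palette=PALETTE):
    """Wrap each scored word in an HTML span whose background reflects its bucket.

    `scores` maps lowercase words to normalized relatedness scores;
    words without a score are passed through uncolored.
    """
    pieces = []
    for word in words:
        score = scores.get(word.lower())
        if score is None:
            pieces.append(html.escape(word))
        else:
            color = palette[score_to_bucket(score, len(palette))]
            pieces.append(
                f'<span style="background:{color}">{html.escape(word)}</span>'
            )
    return " ".join(pieces)


if __name__ == "__main__":
    # Invented placeholder scores, not real WordNet::Similarity output.
    sample_scores = {"sea": 0.9, "ship": 0.6, "storm": 0.3}
    print(colorize("The ship met a storm at sea".split(), sample_scores))
```

Equal-width buckets are the simplest possible choice. Many similarity measures cluster near one end of their range, so quantile-based cutoffs might spread the colors more usefully in practice.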

To a related, larger point: I always have the sneaking suspicion that this has been done before. Jodi Schneider mentioned LiveInk, a program that reformats text according to its semantic units so that readers can more effectively grasp and retain content. This strikes me as similar, as well, to the kinds of issues raised by Douglas Knox: using scale and format to retrieve “structured information.” Do the much-better-informed Campers out there know of an already-existing project like this? I wish the checklist of visual thinking tools that George Brett proposes were already here!

9 Responses to “Visualization, Literary Study, and the Survey Class”

  1. David Staley Says:

    I would be very interested in discussing this idea with you. I have been thinking about such a project for years!

    David J. Staley

  2. Jodi Schneider Says:


    Earlier this week, I ran into a company (LiveInk) that does what they call Visual-Syntactic Text Formatting.

    It seems akin to what you’re doing.

    While you want “to visually mark up and weight by color the semantic patterns emerging from small uploaded portions of text”, LiveInk visually reformats texts, adding line variation, and breaking at semantic units. You might be interested in their work, which includes a couple of papers in the reading literature, as well as a product called ClipRead. I’m demoing ClipRead 2.0 (which I don’t know anything about yet since it’s still sitting in my inbox).

    Check their homepage, or if you want an online demo, they also have an old online service, to get a sense of what they do:

  3. THATCamp » Blog Archive Says:

    […] that seems to be on the minds of other THATCampers (at least per their blog posts) such as Tonya Howe and Amanda Watson. How best can we use such visualizations in our research and/or teaching? At what […]

  4. thowe Says:

    Excellent, David–I’m also looking forward to seeing the installations you’ll be sharing over the weekend. Let’s definitely put our heads together!

  5. thowe Says:

    Jodi– LiveInk is something new for me… I think what they’re doing with re-formatting text can be very useful indeed, and I’ll have to spend more time with it. Thanks for the tip!

  6. Using Wordle in the classroom (1 of 2) - Says:

    […] See also the Literature (general) page at Many Eyes, featuring 16 members and 54 visualizations. Visualization, Literary Study, and the Survey Class, by Tonya Howe – THATCamp (18 June, 2009) See also Tonya Howe’s page at Many Eyes How […]

  7. Jodi Schneider Says:

    Noticed a new paper on phrase nets for visualizing texts:

    Mapping Text with Phrase Nets
    Frank van Ham, Martin Wattenberg, Fernanda B. Viégas
    IEEE InfoVis 2009

    “We present a new technique, the phrase net, for generating visual overviews of unstructured text. A phrase net displays a graph whose nodes are words and whose edges indicate that two words are linked by a user-specified relation. These relations may be defined either at the syntactic or lexical level; different relations often produce very different perspectives on the same text. Taken together, these perspectives often provide an illuminating visual overview of the key concepts and relations in a document or set of documents.”

  8. Gabrielle Dean Says:

    I came across your post while surfing for something else entirely, but I wanted to mention a now-defunct project with some similarities to yours that you may find interesting–since you asked about precursors. The project (commercial) was called TextArc; it was sort of an early version of cloud-tagging of literary texts, which provided interesting ways of understanding relationships between textual strings and words.

    Good luck–would be interested to find out what happens.

  9. THATCamp CHNM 2009 » Blog Archive | Cerisia Cerosia Says:

    […] via THATCamp CHNM 2009 » Blog Archive. […]