Data Visualizations for Text: How to Show the Writing Process with the Writing Graph

How long it took to type each character. Try the interactive demo inside.

TEXT EDITORS (and the files they work on) reveal surprisingly little about the history of editing. Sophisticated tools provide revisions to browse, while others limit you to undo/redo. By adding temporal metadata to files, apps can display more than just the product--they can show process. This post introduces the writing graph, a timeline for viewing editing activity.

THE HISTORY OF MANIPULATING TEXT is rich with innovation (e.g., water-soluble cave paint, clay tablets, printing presses, copy & paste, 💩, etc.). But for the most part, the narrative converges on a digital standard of the late 20th century: Adding and removing characters via a caret in a sequence of lines we call the “document” (metaphorical trappings included).
Vi, Word & Notes. The more things change, the more they stay the same.

A quick preface: In teaching data vis, there are always students eager to share new visualizations. These are usually tweaks on canonical visualizations, or sometimes combinations thereof (like a 4D scatterplot of small multiples). Unsurprisingly, many researchers have spent a lot of time refining the art of visual communication; visualizations are like structural joinery--it’s good to maintain suspicion of anything “new.”
Data Visualizations for Text.
word cloud of this very post. Can you find the only emoji?

Having said that, I think there is a relative dearth of text visualizations. The somewhat (in)famous “word cloud,” which maps word count to font size, comes to mind; apps like iA Writer gracefully color words by part-of-speech, and academic projects abound, but this is still a small number of visualizations compared to other domains. One reason is the ubiquity of plain text in operating systems and inter-app communication. Another is, “if it ain’t broke, don’t fix it”.

But innovation can be fruitful, and divergent design thinking can lead to emergent use cases (click for my blog post). A growing community of new media artists and creative coders would love tools that felt less utilitarian (e.g., Word) and more exploratory (e.g., Max); as a generative writer once told me, “musicians are spoiled.” It’s good to remember that niche technologies originally designed for expert needs (e.g., hands-free voice recognition for jet fighters) often find their way to the rest of us (e.g., Siri for commuters).
Max enables interactive sounds via a complex (yet playful) interface. Imagine a visual programming language designed for textual synthesis.

WHO WOULD WANT to see this dimension of the writing process? Psychologists are interested in processes; they time participants because it helps them infer mental processes, like whether people are thinking fast or slow. In many studies, response time is a dependent measure (how long it takes you to make a moral judgment, solve a puzzle, foveate on a target, etc.). If you ever participate in an experiment, you might assume everything you do (including waiting for the “experiment” to start) is being timed. It should be said that scientists, like designers, know that a measure like time-to-completion tells you nothing more than time-to-completion; you might have paused on a word because you were conjuring synonyms, or you might have been distracted by a notification to appease a social network. That’s why experiments often analyze groups of people over multiple trials: to “wash out the noise” of any given individual or situation.
Introducing the Writing Graph

This visualization arose after talking with talented poets at Brown. As a psychologist I was intrigued to learn more about the creative process, to look inside their work. Visualizing typing over time gives readers a kind of x-ray vision into the ebbs and flows of a piece, something otherwise only conveyable in performance art. This visualization can also be used for self-reflection, to look back, for example, on old journal entries to see what content flowed and what content was rife with hesitation. More practically speaking, experts like computer programmers might use the temporal metadata to track when a particular line of code was not only committed, but authored.

THE APPROACH is simple. I draw a rectangle under each glyph, the height of which represents how long it’s been since the last activity. Activity can be defined in a number of ways (e.g., last click, last edit, etc.). Here is a proof of concept:

You can only type on a single line (which goes off-screen) or backspace. Press enter to reset. View the source (click this if your browser plugins block this embedding).
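
For readers who cannot run the embed, here is a minimal sketch of the same idea in Python. It is not the demo's source: the names `Glyph`, `apply_keystrokes`, `bar_heights`, and `render` are hypothetical, "activity" is assumed to mean any keystroke (backspace included), and the bars are drawn as text blocks rather than rectangles.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    delay: float  # seconds since the previous activity


def apply_keystrokes(events):
    """Replay (timestamp, key) events into a list of glyphs with delays.

    Assumption: any keystroke counts as activity, and "\b" is backspace.
    """
    glyphs = []
    last_time = None
    for timestamp, key in events:
        # The first keystroke has nothing before it, so its delay is zero.
        delay = 0.0 if last_time is None else timestamp - last_time
        last_time = timestamp
        if key == "\b":
            if glyphs:
                glyphs.pop()  # backspace removes the last glyph and its bar
        else:
            glyphs.append(Glyph(key, delay))
    return glyphs


def bar_heights(glyphs, max_height=8):
    """Scale each delay to an integer bar height from 0 to max_height."""
    longest = max((g.delay for g in glyphs), default=0.0)
    if longest == 0:
        return [0 for _ in glyphs]
    return [round(g.delay / longest * max_height) for g in glyphs]


def render(glyphs, max_height=8):
    """Put the text on the first row and a column of blocks under each glyph."""
    heights = bar_heights(glyphs, max_height)
    rows = ["".join(g.char for g in glyphs)]
    for level in range(1, max_height + 1):
        rows.append("".join("█" if h >= level else " " for h in heights))
    return "\n".join(rows)


if __name__ == "__main__":
    # Type "hi", hesitate, type and delete "x", hesitate again, type "!".
    events = [(0.0, "h"), (0.2, "i"), (1.2, "x"), (1.5, "\b"), (2.5, "!")]
    print(render(apply_keystrokes(events)))
```

Scaling to the longest pause keeps every bar on screen, at the cost of making bars incomparable across documents; a fixed seconds-per-row scale would be the obvious alternative.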

You can imagine complicating this ad infinitum: coloring the bars by the color of the sky at the time & location of the keystroke; normalizing the bars by difficulty of reaching for different keys on desktop/mobile; showing total time per document/paragraph/line/word/character; etc.
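
As one example of these variations, total time per word could be computed by summing the per-character delays. This is only a sketch under stated assumptions: `time_per_word` is a hypothetical helper, words are split on whitespace, and the pause before a space is not charged to any word.

```python
def time_per_word(glyphs):
    """Sum per-character delays into (word, total_seconds) pairs.

    glyphs is an iterable of (char, delay) pairs.
    Assumption: whitespace separates words and its own delay is dropped.
    """
    totals = []
    word, seconds = "", 0.0
    for char, delay in glyphs:
        if char.isspace():
            if word:
                totals.append((word, seconds))
            word, seconds = "", 0.0
        else:
            word += char
            seconds += delay
    if word:
        totals.append((word, seconds))  # flush the final word
    return totals


if __name__ == "__main__":
    typed = [("h", 0.0), ("i", 0.25), (" ", 0.5), ("y", 1.0), ("o", 0.5)]
    print(time_per_word(typed))
```

The same loop extends to lines or paragraphs by changing the separator test.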

You can also imagine looking beyond typing (there are other ways to add/remove characters), like showing how a paragraph and its alternatives were copyedited, or how the complex web of undo/redo was collapsed. Some techniques will be more expensive, requiring a re-engineering of low-level functions (e.g., rendering text layout, handling selections, etc.) we take for granted. And some will take a toll on familiarity and/or learnability.