Ever since the publication of Franco Moretti’s fascinating book Graphs, Maps, Trees: Abstract Models for Literary History in 2005, scholars in the digital humanities have used his terms “close reading” and “distant reading” to distinguish traditional modes of reading and analyzing texts from computer-based methods that reveal the structure and content of texts—particularly large corpora—in new ways. There have been heated discussions about the value of one approach over the other. An argument that has particularly caught my attention is whether the distance imposed by computer-mediated methods may strip away the richness, depth, and subtlety that have long been the hallmarks of excellence in humanistic scholarship. Can diagrams, bar charts, or network maps possibly convey the human warmth and drama of literature, religion, and history? And what happens to a researcher’s critical capacity and sensitivity when she uses a software program to study complex sources?
Moretti’s terms and the debates that have swirled around them remind me of concerns about GIS (geographic information systems) in geography, my home discipline, in the 1990s. Cultural geographers and other humanists criticized social-scientific geographers, who used GIS most heavily, for uncritically wielding a tool that was inherently problematic. They argued that the mathematical lattice of Cartesian coordinates that forms the basis of location in GIS was fundamentally positivistic, and that it woefully simplified the complexity of places and phenomena by reducing them to points, lines, polygons, and pixels. Ethical arguments about GIS were fueled by J. Brian Harley’s sharp critique of maps as instruments of oppressive power. Harley’s arguments were historically specific, focused on maps created during the age of European overseas conquest and colonialism. Nevertheless, his ideas struck a chord among geographers who worried that the expert knowledge GIS required was again concentrating power in the hands of social elites. The abstract, machine-made aesthetic of GIS maps and the artificial precision they suggested compounded these objections.
The good news is that years of debate in geography sensitized people in every branch of the discipline to the fact that all methodologies, including GIS, are socially constructed. Recently, many humanistic and social science geographers have embraced so-called critical GIS. This approach calls on analysts to be aware of whose opinions, actions, or values their questions, data, and maps reflect, and to be alert to whether their scholarship reinforces existing social inequalities or exposes them to fresh examination. Critical GIS became a cornerstone of community mapping projects, which argued for indigenous people’s land rights or celebrated local diversity. I see the ideas of critical GIS reflected in the global democratization of cartography enabled by Google Earth, OpenStreetMap, and other low-tech, online mapping interfaces. Anyone with access to a computer can now make their own maps—and they are!
Geography’s experience gives me hope that current arguments over distant versus close reading will produce positive outcomes by heightening awareness on both sides. At the same time, I believe we still have much to learn, and to appreciate, about the very human practices involved in digital scholarship. This takes me back to those two key terms, “close” and “distant” reading. The usual implication is that close reading is the more humane approach. It stimulates the reader’s senses, gives one ample time to cogitate, reread, add pencil marks in the margin, absorb details, read aloud, all while holding the book or manuscript in one’s hands. Close reading is an intimate, personal encounter with a text. Distant reading by definition involves a nonhuman computer. We follow the rules of software and computer operating systems, however counterintuitive, so that eventually we can see patterns and relationships that are otherwise invisible to us. Where close reading is familiar, part of our human heritage, distant reading still feels foreign and somewhat mysterious. The visualizations that distant readings produce are also abstract, such as clusters of dots on a map, bars of different length in a chart, rows of values in a table, or nodes and lines in a network diagram. These representations have no soul, no face, no tone of voice. They can be thrilling for researchers who recognize what they reveal, but they always require verbal interpretation, and even then they can seem cold, abstruse, artificial, detached.
This comparison omits two crucial aspects of digital humanities and social science research that are profoundly humane yet rarely discussed. One is the slow, careful translation of textual or visual sources into data that a computer program can understand. The other is the struggle to deal with the inevitable errors and uncertainty that are generated in the act of digital translation. Translation, error, and uncertainty have loomed large in my own experience as a practitioner of historical GIS (HGIS). I hope that explaining how they have influenced my work might help others see digital scholarship in a new light.
Acts of translation
The translation metaphor applies to HGIS at several levels. First, one can think of GIS as a mathematical language. Its underlying structure or grammar is much more rigid and limited than any verbal language because every bit of data is recorded as binary code, in ones and zeroes. This helps explain why any trait that you intend to query or display in a GIS database must be one thing or another. For example, if you draw a line in GIS to represent a river, the line declares that the river is here, not there. If the river’s actual location changes with flooding and drought, you can add data on seasonal variation, but every instance will be defined with similar precision. You can classify landscape features in any number of ways, but each feature must be described as one type or another, or as this combination, not that. Time, another attribute, can be entered in words, such as seasons, or as numbers, such as years. Behind the scenes, however, if you do not specify the day/month/year of a “time stamp,” the program will assign one for you when you ask it to display the data, or it will exclude the data from your analysis because the format is not recognized.
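To make the point about time stamps concrete, here is a minimal sketch in Python. It does not reproduce the behavior of any particular GIS program. The function name `to_time_stamp` and its `fill_missing` switch are hypothetical, invented only to show the two outcomes described above: a partial date is either padded with a precision the source never claimed, or it is dropped because the format is not recognized.

```python
from datetime import date
from typing import Optional


def to_time_stamp(text: str, fill_missing: bool = True) -> Optional[date]:
    """Translate 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' into a full calendar date.

    Hypothetical helper for illustration only. With fill_missing=True, an
    absent month or day defaults to 1. With fill_missing=False, incomplete
    dates return None, as do values in an unrecognized format.
    """
    parts = text.strip().split("-")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        return None  # e.g. "winter": words are not a recognized format
    if len(parts) < 3 and not fill_missing:
        return None  # incomplete date, so the record would be excluded
    # Pad missing month/day with 1, then keep exactly three components.
    year, month, day = ([int(p) for p in parts] + [1, 1])[:3]
    try:
        return date(year, month, day)
    except ValueError:
        return None  # impossible date such as month 13


print(to_time_stamp("1859"))                      # padded to a full date
print(to_time_stamp("1859", fill_missing=False))  # excluded
print(to_time_stamp("winter"))                    # not recognized
```

Either choice is an act of interpretation: padding asserts a day and month that no source supplied, while exclusion silently removes the record from the map.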
These constraints are least problematic with quantitative information like that in the US census. Categorization requires more thought, and becomes more like translation, when the source is written in natural language. The work of translation begins with deciding which fields, or categories, will best capture the information your source or sources contain. After designing the database comes the even closer reading of data entry. My example here is the key source for my HGIS of antebellum US iron works, a thick tome titled The Iron Manufacturer’s Guide to the Furnaces, Forges, and Rolling Mills of the United States (1859). The book is crammed full of information organized into individual entries that include each iron company’s name, location, owners, and details of production. Although the Guide was exceptionally well suited to translation into GIS, data entry progressed at a snail’s pace. Every sentence had to be dismantled into its constituent parts. Ambiguous or missing data, such as date of construction, necessitated additional archival research. Locations were as precise as a street address or as vague as “about three miles up the west branch of the Susquehanna.” Placing iron companies on the GIS base map filled most of a sabbatical winter, as I checked locations in the Guide against topographic maps and other sources.
It may sound tedious, but I loved the slow work. Inching through the entries made me deeply familiar with the Guide, its strengths and weaknesses, and the industry it described. My research assistant and I learned at an almost visceral level which regions of the country were best and least well documented in the Guide. We found a much more diverse industry than previous studies had described. Locating hundreds of iron works while reading their histories also helped me figure out why firms in some places were prone to fail while others survived. Translating the Guide into digital form was indeed close reading. It was a kind of analytical meditation.
I had a different immersive experience extracting contour lines from an 1874 topographic map of the battlefield of Gettysburg. Capturing the elevation data was the first step toward creating a digital rendering of the historical terrain for a GIS study of what commanders could and could not have seen when they made key decisions during the battle. No scanning software at the time could readily distinguish between the source map’s many features, which included fences, field boundaries, building footprints, dots for pine trees, and little broccoli-shaped symbols for hardwoods and fruit trees. So I traced the contours by hand onto large Mylar sheets. Much of one summer went to those hours at the light table, drawing slowly while listening to my favorite music. It felt as if the landscape were entering my nervous system through my eyes and hands. Another year of work went into processing the linear data, asking visual questions of the digital terrain, and writing my interpretation of what the GIS results showed. Many people contributed to that project. My understanding of the battlefield and what happened there, however, was rooted in the intimate experience of drawing its contours, literally retracing the representation of the ground that US Army Corps of Engineers cartographers had laid down 130 years before.
No translation is complete until the original source is expressed in another language. In HGIS, data are expressed most eloquently in maps, whose visual language works in the imagination very differently than words do. While we must acknowledge maps’ limitations, I believe they are no more flawed than textual expression. They have different flaws, and different virtues, including the capacity to present complexity in a glance. Like any visual form, map design can be highly emotional. Unfortunately, generations of academic cartographers were trained in a scientific style that was then codified in default design templates in GIS programs. As cartographers break out of that mold to expand their artistic range, particularly to create spatial narratives, I expect that maps generated with GIS will also become more expressive.
Errors and uncertainty
My last examples highlight the issues of how digital translation can introduce errors and uncertainty. In the Gettysburg project, every contour line that I traced differed slightly from the original line on the 1874 map. Across the map as a whole, my errors nudged the already imperfect contour lines this way and that. But even that double imperfection was an acceptable compromise in order to create a dataset that I thought would better approximate the historical terrain than existing alternatives—and it would be a digital terrain that would begin to answer my question about what commanders could, and could not, have seen during the battle.
A very different project has raised more acute issues related to uncertainty and error. Since 2007, I have worked with a group of historians and geographers to explore the many geographical aspects of the Holocaust. One of my particular projects has been studying SS-administered concentration camps and their associated labor camps. My team’s first task was to populate a GIS database of camps that we had received in fairly skeletal form from the United States Holocaust Memorial Museum, which had used the database to make simple maps of camp locations for the USHMM Encyclopedia of Camps and Ghettos. To study labor in the camps, as well as the dynamics of the SS camps’ explosive growth and demise during World War II, we aimed to extract more information from Encyclopedia entries. We soon discovered that the entries reflected scholars’ incomplete knowledge of SS camps. Many smaller camps lacked information about when the camp was established or closed, what kind of forced labor inmates did, how many inmates the camp held, and so forth. Even some of the best-documented camps’ histories were incomplete or vague on key points related to labor. We faced a question that is common in the historical database business: Should we analyze only the data that were completely certain, or should we include less certain data that might nevertheless be revealing? For example, if an entry said that the camp’s inmates “were reported to have worked on V-2 rockets,” should we list rocket manufacturing as a kind of forced labor at that camp and map it along with others, or omit it from maps of rocket-building activity, or give that camp a separate color on the map as being uncertain?
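One way to keep all three options open is to record the degree of certainty as its own field and postpone the decision until the map is drawn. The sketch below is a hypothetical illustration in Python, not our project’s actual database design. The names `LaborRecord`, `certainty`, and `map_symbol_classes` are assumptions, and the camp labels are invented placeholders rather than data from the Encyclopedia.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LaborRecord:
    camp: str        # placeholder label, not a real camp name
    labor_type: str  # category chosen by the person entering the data
    certainty: str   # "documented", or "reported" for hedged wording


def map_symbol_classes(
    records: List[LaborRecord],
    labor_type: str,
    include_uncertain: bool = True,
) -> Dict[str, List[str]]:
    """Group camps by certainty so each group can get its own map color.

    include_uncertain=False reproduces the "omit it" option; True keeps the
    uncertain camps but in a separate class instead of merging them.
    """
    classes: Dict[str, List[str]] = {}
    for record in records:
        if record.labor_type != labor_type:
            continue
        if record.certainty != "documented" and not include_uncertain:
            continue
        classes.setdefault(record.certainty, []).append(record.camp)
    return classes


# Invented example records for illustration only.
records = [
    LaborRecord("Camp A", "rocket manufacturing", "documented"),
    LaborRecord("Camp B", "rocket manufacturing", "reported"),
    LaborRecord("Camp C", "quarrying", "documented"),
]
print(map_symbol_classes(records, "rocket manufacturing"))
print(map_symbol_classes(records, "rocket manufacturing", include_uncertain=False))
```

The field does not resolve the uncertainty. It only keeps the source’s hedged wording visible to whoever queries or maps the data later.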
The SS camps project also challenged my assumptions about how well we can know, and represent, the past. The turning point came one day when Alex Yule, an undergraduate research assistant at Middlebury College, asked me bluntly, “What was Auschwitz?” That is, what did we mean when we referred to that famous place, and to what category did it belong in our database? At various and overlapping times during World War II, Auschwitz was an extermination camp where hundreds of thousands of people died, a labor camp, a work-education camp, a penal camp, a prison camp, and a transit camp where prisoners briefly stayed en route to another place. The name also referred to the adjoining town, whose origins dated to 1270 AD. The initial camp was greatly enlarged by the building of inmate barracks, crematoria, and other facilities collectively called Birkenau, or Auschwitz II. The manufacturing giant IG Farben established a large factory nearby at Auschwitz III, known as Monowitz. How should we model the ontology of Auschwitz, given its multifaceted history and geography?
Such questions are increasingly central to human geography and digital spatial history. Most of us know from personal experience that places have many meanings and names, and that they change over time. As scholars in the humanities and social sciences seek ways to represent the multiplicity of place, we will find allies in philosophy as well as GIScience and other fields of computational studies where uncertainty and error, ontology, and the meaning of space-time have long been the focus of study.
When I was in graduate school at University of Wisconsin, a wise historian told me that the longer you do research, the more you realize you do not know. I have also learned that there are many ways of knowing, and that recognizing the limitations of any particular approach is as important as appreciating what it can reveal. This came home to me most powerfully when my research group’s Holocaust maps prompted moral questions from humanists in the audience. In response to dot maps of SS concentration camps or maps tracing the sequence of construction at Auschwitz they asked, “Where are the victims? Where in your maps are the people who suffered and died in the Holocaust?” GIS mapping is superb for presenting a synoptic view of a complex historical event, but it is not designed to convey human emotion or the drama of extreme circumstances. My colleagues and I have taken up this challenge by turning to a new kind of source: video interviews and transcriptions of survivor testimony. These accounts reveal how victims responded to Nazi-made places like camps and ghettos, as well as how they tried to make their own places, however fleeting and partial, in the midst of the chaos of dislocation and violence. While working with GIS has heightened my awareness of the complexity of the past, listening to Holocaust testimony makes me realize how much we take the places in our lives for granted. We hope to apply text-mining methods to large sets (corpora) of testimony transcripts, but only after we spend much time listening closely, learning how narrated memories express place, movement, and spatial awareness. Becoming more aware, more attentive, is a primary goal of humane scholarship. If we read and look and listen closely, whatever tools we use to augment our perceptions can make us better scholars.