1. Beyond the split between ‘literacy’ and ‘numeracy’
The following considerations aim to show that, and why, the Digital Humanities are a part of the Humanities. But with what arguments can this be explained and justified? As a narrative in explicit opposition to the scenario of a neoliberal displacement of hermeneutics by computer-generated research methods, approaches are currently being developed which identify the procedures of the Digital Humanities as genuinely hermeneutic and critical Humanities. This is illustrated by new terminologies such as “Critical Digital Humanities”, “computational criticism” (Dobson X), “screwmeneutics” and “hermenumericals” (van Zundert). Even the computer itself is ennobled as a “hermeneutical instrument” (van Zundert 331). We see: the Digital Humanities are rehabilitated as part of the Humanities by means of a fusion between hermeneutics and computation.
At first glance, this seems to make sense especially when we interpret this position as a strategy to prevent Snow’s dualism of culture and science from becoming established once again within the Humanities themselves. Let us recall: C. P. Snow’s split between ‘literacy’ and ‘numeracy’, between quantifying-empirical and narrative-interpreting cultures of knowledge, is today considered to be unacceptable. On the one hand, the sciences do not only collect their data but also have data to interpret, be it by theory or narration. On the other hand, numbers and counting also play an important role in the Humanities – think of the dating of letters, books, and documents, of designing catalogs of works, concordances, and keyword indexes.
Against this background, the methodology of ‘twinning’ interpretation and computability is questionable. The reason is that the fusion of these very different methodologies – computing and interpretation – preserves a problematic narrative about what is essential for the Humanities, and what is the essence of doing research as a humanist scholar.
Note that our critique of implementing hermeneutics into Digital Humanities does not aim to play off data-driven quantification procedures against the inconclusive work of interpretation; we are not arguing from the perspective of a positivist, neoliberal understanding of science. Rather, it is a matter of critical recognition that proclaiming hermeneutics and interpretation as the defining key methodologies of the Humanities has always obscured the extent to which the traditional Humanities depend on scholarly procedures situated beyond interpretation. The practice of academic work in the Humanities requires interpretation, but is not limited to it; humanist research needs, and is also based on, collecting, dating, classifying, annotating, and commenting on its objects, which are often spatiotemporally situated and thus empirically investigable. To absolutize interpretation as the royal road and gateway of the traditional Humanities conceals the role of precisely these basal activities. What culture is and what ‘culture’ means is not to be reduced to the ephemeral realm of fluctuating sense and meaning, but is rooted in the cultural techniques, media, and artifacts that delineate and configure the relationship to the world and to the self. Thus, the Humanities always have to deal with materials of concrete factuality, which have to be probed and processed in order to become analyzable and interpretable as objects of research in the first place. Acknowledging the Digital Humanities as a genuine part of the Humanities means not accentuating hermeneutics as a privileged key methodology, but keeping in mind the indispensable work with the material and medial dimensions as the basis of almost all Humanities research (Unsworth).
In what follows, we want to clarify the relationship between Digital Humanities and traditional Humanities in terms of an alternative narrative, which we call the ‘cultural technique of flattening’ (Krämer, “Flattening As Cultural Technique”). But first, we have to clarify what we designate as the ‘depth rhetoric of interpretation’, the narrative underlying contemporary attempts to unify computation and interpretation.
2. The ‘depth rhetoric of interpretation’ as a narrative in the Humanities
What does it mean to speak of a ‘depth rhetoric of interpretation’? We are all too familiar with a stereotype: thoughtful, ‘rich’ thinking is associated with depth, whereas deficient, ‘poor’ thinking is associated with superficiality. Depth is ennobled, and superficiality is disavowed. We call this pattern, which holds in the Humanities as well as in everyday thinking, the ‘depth rhetoric of interpretation’.
Within the Humanities, Wilhelm Dilthey established hermeneutics in 1910 as a method specific to the Humanities by detaching it from the mere interpretation of texts – to which Friedrich Schleiermacher had previously still limited hermeneutics (Ineichen 121) – and broadening it to the interpretation and understanding of all human manifestations of life. With Dilthey, hermeneutics advanced to become the most prominent and defining methodology of the Humanities. Without explaining it here, we want to stress an implication of the hermeneutic attitude with regard to comprehending human expressions: ‘understanding’ means to comprehend precisely what cannot be perceived. It follows that to interpret is to permeate the empirical, i.e. spatiotemporally situated surface of symbolic artifacts and human practices, in order to reveal a deep region behind the sensually perceptible and superficial. What is essential lies ‘behind’ the appearances; it can no longer be grasped sensually, but is accessible only by pervading the visible surface.
This narrative of looking for something behind the phenomena already begins with Platonism, insofar as what is true lies behind the situated phenomena. And – to make a broad jump to critical modernity – it returns in a modified form in the contemporary ‘hermeneutics of suspicion’ (Ricoeur), which locates the genuine meaning of a text behind its wording (Bude). In opposition to this practice of symptomatic reading, the concept of ‘surface reading’ is currently under discussion (Best and Marcus).
But back to the rhetoric of depths. Of course, that’s not the whole story. Like Ariadne’s thread, diagrammatologically reconstructible cognitive procedures subsequently run through the history of Western epistemes: spatial, two-dimensional relations are used to represent and operate on abstract, mostly non-spatial issues in a way that generates cognitive insights (Krämer, Figuration, Anschauung, Erkenntnis). Already Plato himself – we have to differentiate between Plato and Platonism – developed procedures in which operating on surfaces is constitutive for acquiring knowledge: for example, his method of dihairesis, a visual binary diagrammatic decomposition of concepts, can be found in several of Plato’s dialogs and dihairesis was frequently practiced in his ‘academia’ as a diagrammatic procedure on boards (Krämer, “Is There a Diagrammatical Impulse with Plato?”; Philip).
Charles Sanders Peirce, who is usually regarded as the founder of ‘diagrammatic reasoning’ (Shin; Stjernfelt), is by no means the beginning, but rather the intermediate résumé in the evolution of diagrammatical modes of thinking. Higher-level thinking is impossible without the use of collectively developed, written and/or drawn sign systems (Giardino; Bender and Marrinan). Scientific work remains dependent on inscriptions of all kinds (Rheinberger): be it writings, tables, schemes, diagrams, graphs or maps including their hybrid forms (Châtelet; Giaquinto). Even more fundamental: The ‘cultural technique of flattening’ plays a productive role not only in the sciences: technical-architectural designs, artistic compositions, administrative bureaucracy, not to forget the ubiquity of tickets, scoreboards, and debit cards also rely on flat inscriptions and visualizations. This ‘diagrammaticity’ within cultural practices culminates in the contemporary ubiquity of computer screens and smartphones.
But why is ‘artificial flatness’ so productive, both theoretically and practically? What is the secret of its creativity and success?
3. What does ‘artificial flatness’ mean?
An information society without the use of stable, moving or animated displays is unimaginable. Screens that represent something are ubiquitous. Moreover, as media archaeology and screenology (Huhtamo) have shown, there is a long history of using flat media. But more is at stake than conceptualizing interfaces and their prehistory. The research field of diagrammatic reasoning has already brought to light that complex thinking operations – carried out with the means of writing, graphics, and images – operate in the two-dimensionality of surfaces. Everything that is, everything that is not yet, and even what can never be, such as logically impossible objects that can be drawn but not physically constructed, can be projected into the two-dimensionality of a surface and be manipulated and analyzed there (Krämer, “Reflections”). From cave paintings and skin tattoos to pictures, writings, tables, graphs, maps, computer screens, and smartphones, the Ariadne thread of a ‘cultural technique of flattening’ extends through the history of civilizations. Empirically, there are no flat objects; two-dimensionality is a purely conceptual construct. But by illustrating, inscribing, and labeling surfaces, the depth dimension is subtracted: We treat the surface as if it were flat: we have to turn the picture to see the screen …
The cognitive and aesthetic (Summers) creativity of artificial flatness is based on the fact that it forms a translation manual between time and space: Between the one-dimensionality of time and the three-dimensionality of living space, the medium of two-dimensionality intervenes as a middle and a third, opening up the transformation of temporal sequences into spatial structures and vice versa. Just think of the flow of speech that is transferred into discrete letter sequences of phonetic writing, which in turn can be turned into the fluidity of speech.
But the theory and history of the cultural technique of flattening are not the issues here. What is crucial for our considerations is the connection between artificial flatness and digitality. The possible creativity of Digital Humanities is based precisely on the fact that they participate in the cultural technique of flattening – using computers and computation. In understanding this connection in its historical dimension, we must realize that there is already an embryonic digitality of the alphanumeric.
4. The embryonic digitality of alphanumeric media and basic scholarly practices
The digital is usually associated with the use of computers, and ‘digitization’ with (more or less) the combination of computation, datafication, and networking. But there is digitality before the computer. If we understand ‘digital’ and ‘digitality’ as the decomposition and splitting of a continuum into discrete elements that can be coded and (re)combined arbitrarily, the alphabet represents a prototype of a digital system. And whatever is coded alphabetically can in principle be counted (Piper, “There Will Be Numbers”; Piper, Enumerations). The alphabet’s specificity consists in dissecting the phonetic flux of speech into stable letters; the acoustic is discretized into something visual and – last but not least – countable. Yet the alphabetic transcription of oral language should not be understood as a kind of pure mapping or transmission of temporal fluidity into a spatial structure. Although speech has pauses for taking breath, there is no precedent in oral talk for the blanks, gaps, and indented sections of phonetic writing. It is the discreteness and disjunctivity of the alphabetic script that first suggest a perceptible objecthood and reification of language as a visually observable entity. The idea that verbal language is a system per se, separable from facial expression, gesture, prosody, and deixis, derives from the mediality of its alphabetic modeling (Krämer, “Writing, Notational Iconicity, Calculus”). Alphabetic writing is not a reproduction of oral language, but its cartography.
Yet the alphabet is more than the notation of spoken language. It is an efficient register of ordering and a procedure of sorting which remains neutral with respect to the sorted content. Alphabetically structured corpora of information make knowledge intersubjectively addressable and accessible: encyclopedias, lexicons, and dictionaries, but also library catalogs, concordances, and keyword indexes as well as the private note boxes of scholarly work utilize the epistemic functions of alphabetical sorting. The alphanumeric draws on artificial flatness.
Numerical counting goes hand in hand with alphabetical listing. Number has always had a functional existence in the Humanities, just as – conversely – interpretation has in the sciences. Without alphanumeric notation and labeling in catalogs, concordances, bibliographic information, author signatures, historical data, etc., research subjects in the Humanities cannot be obtained. It is a self-misunderstanding of the Humanities to disregard their liaison with number and counting: The medium of writing includes not only letter writing but also number writing in the decimal positional system. And humanistic scholarly practices feed on these media of inscription.
Since the 13th century, concordances have been produced, which might be considered databases ‘avant la lettre’ (Manovich). The creation of computer-processable concordances began in the 20th century with the work of Josephine Miles (1911–1985) and Roberto Busa (1913–2011), both pioneers of the Digital Humanities: they began with handwritten concordances, but then delegated the task to the operations of a computer in order to obtain machine-produced concordances. What makes them pioneers of Humanities computing is that they used computational procedures to debate, solve, and criticize theoretical questions in their respective disciplines (Sagner Buurma and Heffernan; Jones).
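What a machine-produced concordance amounts to operationally can be hinted at with a minimal sketch. It is not a reconstruction of the procedures used by Miles or Busa; the function name build_concordance, the context width, and the sample sentence are assumptions chosen purely for illustration.

```python
import re
from collections import defaultdict


def build_concordance(text, width=3):
    """Map each word form to its occurrences, each with `width` words of context."""
    tokens = re.findall(r"[a-z]+", text.lower())
    concordance = defaultdict(list)
    for i, token in enumerate(tokens):
        left = " ".join(tokens[max(0, i - width):i])
        right = " ".join(tokens[i + 1:i + 1 + width])
        concordance[token].append((i, left, right))
    # Headwords in alphabetical order, as in a printed concordance.
    return dict(sorted(concordance.items()))


# Illustrative sample sentence (an assumption, not taken from the sources cited).
sample = "In the beginning was the Word, and the Word was with God."
concordance = build_concordance(sample, width=2)

for position, left, right in concordance["word"]:
    print(f"{position:>3}  {left:>12} [word] {right}")
```

The sketch shows that the concordance is obtained by counting positions and sorting headwords alone; no step requires an understanding of what the words mean.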
Associated with the embryonic digitality of alphanumeric sign systems is the epistemic use of diagrams. What a diagram is may be controversial. Yet it is essential for us that diagrams are two-dimensional projections of mostly non-spatial issues, where the spatial positioning of the visual elements on the surface is relevant for what we can recognize and infer from the diagram and what we can do with it (Bender and Marrinan; Coliva; Giaquinto; Krämer and Ljungberg).
However, diagrams do not interpret themselves – in this, they are related to numbers. The meaning of a diagram is coupled to a commenting text or is rooted in the implicit routines of a historically situated use of diagrams. It was not only with the empirical measurement and social statistics of economy and society in the 19th century that diagrams became ubiquitous. Already in the early period of science around 1600, tableaus connected the idea of the order of knowledge with the idea of spatiality, overview, and visibility (Siegel). The medieval episteme is infused with forms of visualization (Lutz Eckart et al.).
And – with reference to antiquity – Euclid’s mathematics, with which we connect the origin of scientific mathematics, cannot do without the use of diagrams as original cognitive devices (Catton and Montelle; Manders). In short: Within the framework of the cultural technique of flattening, the diagram forms a cognitive technique and a form of reasoning.
The syntacticity of this cognitive technique not only inspired and extended human mental capacities, for example in the form of written calculation, but also generated the technology of calculating machines. The symbolic machine of arithmetic is realizable in two ways: by humans with paper and pencil, and by machines with gears or current impulses. Or to move on to the first computer program: When Ada Lovelace (1843) published the first executable and ‘running’ program for Charles Babbage’s Analytical Engine, she was aware that the artifice of Babbage’s model – a paper machine designed as a diagram on paper – was to operate not with numbers but with graphic signs. The functionality of Babbage’s Analytical Engine is not calculation, but computation. Ada Lovelace recognized that the transition from the mathematics of numbers to the logistics of operating with numerals requires making the mathematical procedures completely explicit as perceivable and schematic operations. Lovelace’s program for the computation of the Bernoulli numbers – whose publication makes her the pioneer of software development – had the form of a table. It is laid out as a tabular surface showing which states the machine parts assume at different points in time during the computation.
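The idea of a computation spelled out as a table of successive states can be illustrated with a short sketch. It does not reproduce Lovelace’s own procedure or her indexing of the Bernoulli numbers; as an assumption made for illustration, it uses the standard modern recurrence (the sum over k from 0 to m of C(m+1, k)·B_k equals 0 for m ≥ 1, with B_0 = 1) and records one table row per elementary step. The names bernoulli_trace, term, and accumulator are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli_trace(n_max):
    """Return Bernoulli numbers B_0..B_n_max and a table of intermediate states."""
    numbers = [Fraction(1)]  # B_0 = 1
    table = []               # one row per elementary step of the computation
    for m in range(1, n_max + 1):
        accumulator = Fraction(0)
        for k in range(m):
            term = comb(m + 1, k) * numbers[k]
            accumulator += term
            table.append({"m": m, "k": k, "term": term, "accumulator": accumulator})
        # Solve the recurrence for B_m (convention with B_1 = -1/2).
        numbers.append(-accumulator / (m + 1))
    return numbers, table


numbers, table = bernoulli_trace(4)
for row in table:
    print(row["m"], row["k"], row["term"], row["accumulator"])
print([str(b) for b in numbers])
```

Every row of the printed table is a perceivable state of the procedure; the result can be read off the surface of the table without any appeal to what a Bernoulli number ‘is’.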
We have hinted here at the notational iconicity of writing (Krämer, “Writing, Notational Iconicity, Calculus”) and the implicit and explicit diagrammatics of alphanumerical sign space in order to emphasize a central idea: Contrary to what the pejorative sense of flatness or superficiality signals and communicates, the ‘cultural technique of flattening’ is integral to our knowledge practices. We have to understand the cultural technique of flattening as a productive epistemic technique – and a cultural asset.
5. ‘Epistemology of latency’: the culturally unconscious manifested by data-driven processes
But how can the connection between the cognitive power of flattening and the data-driven research methods of the Digital Humanities be explained?
In what follows, we simply use the term ‘computer’ to refer to interconnected algorithms, machines, and protocols. In data-driven research methods, where large data corpora are accessible to machine processing, the computer functions like a microscope and telescope within worlds of data: Computers reveal in data corpora what mostly remains invisible to limited human perception. A kind of forensic machine (Kirschenbaum, Mechanisms) has emerged that uncovers traces in the form of patterns in data configurations. Recall the numerical character of this kind of trace: computer-processable traces are mostly statistical, hence numerical constellations. Since neither traces nor data, and certainly not numbers, are self-interpreting, it is clear that only the research motives, creativity, and synthesizing work of human interpreters produce meaning and content from data. Numeric results are put into contexts and perspectives that are not already implicit in the numbers themselves. It is only through humans that numbers are connected with hypotheses, theories, and narrations.
But at this point a further distinction becomes important. What about the relation between ‘flattening out’ and semantics? To be clear about that, we propose to differentiate between two modalities of meaning and semantics, namely intrinsic and extrinsic meaning:
(i) ‘Intrinsic meaning’ implies that reading a sign is to perform a certain operation or to relate a sign to other signs given inside the system, or even outside – provided that an unambiguous mapping relation is given. This intrinsic meaning is operational and superficial; it remains an activity performed on the surface of syntactic pattern recognition and transformation.
(ii) ‘Extrinsic meaning’ consists in reading and understanding patterns of signs in such a way that they are related to something which no longer belongs to the ontology of patterns: This transition in ontological character may be a matter of changing perspectives – including the switch from the observer to the participant perspective – or of involving lifeworld situatedness and embeddedness, or generally of accepting the metaphorical, the ambivalent, and the paradoxical. Extrinsic meaning is about contexts, which can no longer be grasped and reconstructed as empirical relations on the surface of textual, pictorial, or musical configurations. It is about the interpretation of interpretations as well as about pre-reflexive attitudes. These modalities of understanding outline the genuine field of hermeneutics.
However, the either-or between intrinsic and extrinsic meaning is a pure conceptual construction; as a description of real phenomena, this dichotomization falls short.
That we nevertheless distinguish intrinsic-operational from extrinsic-interpretative meaning makes some sense: The history of civilizations reveals a dynamic whose tendency and telos is to reconfigure activities related to extrinsic meaning as intrinsic, interpretation-independent operations. On the level of interpretative work, something is going on that is familiar from dealing with technology: being able to use, perform, and control an artifact without being competent to understand. This is characteristic of technicity – regardless of whether an operation is performed by humans or machines. We can use a device without understanding it: That’s how driving a car, using a dishwasher, applying arithmetic algorithms, or navigating the internet is done.
Our hypothesis is that the Digital Humanities explore which areas of humanistic research can be transformed, coded, and formatted in such a way that they can be analyzed as procedures of intrinsic operativity. At best, they do so in such a manner that novel and innovative research questions can be asked. But this is not a sine qua non: In the pioneering days of novel methodologies, the confirmation of results previously obtained in the traditional way may already signify progress.
We see: The Digital Humanities can be understood as an extrapolation and radicalization of the productive dimensions of the ‘cultural technique of flattening’. Its critical role towards the traditional Humanities may consist in making explicit just that kind of ‘surface orientation’ which – mostly concealed by the ‘depth rhetoric’ of hermeneutics – has always formed a genuine dimension in scholarly work (Kirschenbaum, The Remaking of Reading), but which has mostly remained a blind spot in Humanities’ self-understanding.
The philosopher Hans Blumenberg (1986) has emphasized the role model of textuality for the world’s ontology by using the metaphor ‘readability of the world’. Under conditions of digitalization, this ‘readability of the world’ is transformed into a ‘machine readability and operability of the data universe’. Within this perspective, automated data analysis and text and picture mining can reveal what is hardly visible to human eyes or not even intended by authors and originators, and what usually remains inaccessible and unconscious to those who read and perceive texts and pictures. The data-driven procedures of the Digital Humanities can become tools of an epistemology of latency (Kirschenbaum et al.); they unlock the self-inaccessible (Rieger), visualize what is invisible to humans (Nassehi), and make the implicit explicit (Ernst). Data-driven processes can bring to light and disclose the culturally unconscious embodied in symbolic forms and practices. Of course, this is only half the truth: it is clear that with the complexity of digitalization, black-boxing is increasing too. But in the context of our argument, it is important to note the tendency to make the implicit digitally explicit – as long as we do not forget that there is at the same time a tendency towards opacity, culminating in the inaccessible internal models that emerge in so-called ‘self-adapting’ algorithms. But back to computationally making something explicit: Let us exemplify this phenomenon by the digitalized textual practices of computational philology, as texts still form the material of many Humanities studies (Jannidis; Burnard).
6. Metamorphosis of texts in data corpora
When texts are transformed into machine-processable data by digital coding, they fundamentally change their ontological status as objects or entities. A new modality of textuality is created. For scholarly analysis in the Humanities, the XML-based TEI encoding format is suitable not only for searching sign patterns in text corpora, but also for visualizing complex textual relationships; it is independent of the diversity of operating systems and programs and thus ensures ‘technical interoperability’ (Jockers and Thalken). A code still represents operative writing that functions in two directions: on the one hand, it can be written and read by humans, and on the other hand, it is not only ‘readable’, but also processable by computers. This is sufficiently known. But what matters here is that markup languages make what is implicit in the notational iconicity of texts explicit in the grammar of TEI: the latent is made manifest by encoding.
In reading, we follow semantic conventions: We distinguish headings from running text, tables from content, proper names from other types of words; we separate meta-linguistic information (author, year of publication, publisher, etc.) from the object-linguistic text. All this tacit knowledge, these unspoken maxims of text reading, linguistic knowledge, editorial and commentary annotations, in other words: everything that concerns not only the text but also its context, is brought to the surface of what is processable by machines. The metamorphosis of a human-readable text into machine-analyzable data has taken place by virtue of a ‘surface technology’.
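How markup turns such tacit reading conventions into something a machine can address may be sketched as follows. The fragment is a simplified, invented TEI-like document (its title, heading, and sentence are assumptions made for illustration, and it is not claimed to satisfy any particular TEI schema); the element names teiHeader, titleStmt, head, and persName and the namespace are those of TEI, and the queries use only Python’s standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample document; metadata, heading, and names are explicit elements.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>A Sample Letter</title>
        <author>Anonymous</author>
      </titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <head>On Reading</head>
        <p>Yesterday <persName>Ada Lovelace</persName> wrote to
           <persName>Charles Babbage</persName>.</p>
      </div>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# Meta-linguistic information is separated from the object-linguistic text.
title = root.find(".//tei:titleStmt/tei:title", TEI_NS).text
# Headings are distinguished from running text.
headings = [h.text for h in root.iterfind(".//tei:body//tei:head", TEI_NS)]
# Proper names are distinguished from other words.
persons = [p.text for p in root.iterfind(".//tei:persName", TEI_NS)]

print(title, headings, persons)
```

What a human reader recognizes without noticing (this is a title, this is a heading, this is a name) appears here as an explicit element that can be selected by a path expression.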
The nature and notion of ‘reading’ are thus changing radically, or to put it more precisely: The meaning of ‘reading’ expands decisively. Comprehending reading by humans has become pattern tracking and identification by machines. This is precisely what ‘distant reading’ means. And it is true for ‘distant viewing’ in image studies as well.
Only coded textuality forms the direct subject of computer-philological research. That metamorphosis transforms the original, continuous text into discrete, structural models such as word lists and their frequencies, vectors as elements of a matrix, or data points in a coordinate system. Phenomenally, these no longer have anything to do with ‘text’ in the traditional human sense. The implicit knowledge of scholarly interaction with text – mostly practiced unconsciously in the hermeneutic setting – is made explicit and coagulates into manifest form in the visual diagrammatics of coded textuality. Relations in textual and pictorial data volumes are analyzed as surface signatures by conceptualizing similarities in data patterns as numerically expressed neighborhood relations, computing their proximity or distance measures, and visualizing the statistical results in two-dimensional graphs. Let us cursorily explain this ‘distant superficiality’ by two familiar examples of computer philology: digital stylometry and topic modeling. Our aim is to show that both data-driven practices can be reconstructed as operating on textual surfaces. The ‘meaning’ that matters within this field of machine operativity is purely intrinsic meaning.
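The transformation of texts into word lists, vectors, and distance measures can be made concrete with a minimal sketch. The three toy ‘texts’, the function names, and the choice of cosine distance as the proximity measure are assumptions made for illustration; other measures are equally possible. The sketch also assumes that no text is empty, since an empty text would yield a zero vector.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Word list with frequencies: the text as a bag of discrete tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def to_vectors(texts):
    """Turn texts into rows of a matrix over a shared, alphabetically sorted vocabulary."""
    counts = [word_frequencies(t) for t in texts]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[w] for w in vocabulary] for c in counts]
    return vocabulary, matrix


def cosine_distance(u, v):
    """0.0 for vectors pointing the same way, 1.0 for vectors sharing no word."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


# Toy texts, invented for illustration.
texts = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "numbers are not letters",
]
vocabulary, matrix = to_vectors(texts)
print(vocabulary)
print(cosine_distance(matrix[0], matrix[1]))  # close neighbors
print(cosine_distance(matrix[0], matrix[2]))  # no shared words
```

Nothing in the rows of this matrix resembles a ‘text’ in the traditional sense, yet the neighborhood relations between the rows are computed entirely from what stands on the textual surface.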
7. Digital stylometry and topic modeling
Even before the use of computers, quantifying, comparative analyses between styles of epochs, genres, and authors were used in ‘stylometry’ (Holmes; Horstmann, “Stilometrie”). However, stylometric attention was primarily focused on intentional stylistic phenomena, such as the linguistic peculiarities of texts and authors, up to the idiosyncratic literary use of punctuation. It was – in the pre-digital era – about what expresses individuality and specificity: be it the singularity of an author, an era, or a genre. But now a shift towards considering non-intentional features is becoming apparent. One of the most successful methods of digital stylometry – the Burrows delta measurement (Burrows) – examines unintentional text elements that are hardly ever manipulated by authors: Function words, for example, as the most frequently occurring words in a text (Jannidis and Lauer 180).
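The operational core of a Delta-style measurement can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Burrows’s original implementation: the most frequent words are taken from the pooled reference texts, their relative frequencies are standardized as z-scores (here with the population standard deviation, one possible choice), and Delta is the mean absolute difference of the z-scores. The function names and the toy texts are invented; realistic applications use long texts and many more features.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(tokens):
    counts = Counter(tokens)
    return {word: count / len(tokens) for word, count in counts.items()}


def burrows_delta(references, disputed, n_words=10):
    """Return a Delta value for each reference text relative to the disputed text."""
    tokenized = {name: tokenize(text) for name, text in references.items()}
    pooled = Counter(tok for toks in tokenized.values() for tok in toks)
    features = [word for word, _ in pooled.most_common(n_words)]
    profiles = {name: relative_frequencies(toks) for name, toks in tokenized.items()}
    target = relative_frequencies(tokenize(disputed))

    # Mean and spread of each feature across the reference texts;
    # features without any variation carry no information and are dropped.
    stats = {}
    for word in features:
        values = [profile.get(word, 0.0) for profile in profiles.values()]
        spread = pstdev(values)
        if spread > 0:
            stats[word] = (mean(values), spread)

    def z(profile, word):
        mu, sigma = stats[word]
        return (profile.get(word, 0.0) - mu) / sigma

    return {
        name: mean(abs(z(target, word) - z(profile, word)) for word in stats)
        for name, profile in profiles.items()
    }


# Toy reference texts, invented for illustration.
references = {
    "A": "the cat sat on the mat and the dog sat by the door",
    "B": "a cat is not a dog but a cat is a cat for that",
    "C": "to be or not to be that is the question so to speak",
}
deltas = burrows_delta(references, disputed=references["A"])
print(min(deltas, key=deltas.get), deltas)
```

The smaller the Delta, the closer the two frequency profiles. The result is a ranking of candidates by distance, not a proof of authorship.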
There have been some spectacular successes in author attribution: In 1994, the play Edward III, published anonymously in 1596, was attributed with high probability to Shakespeare rather than to Marlowe by using a neural network based on the distribution of the function words but, by, for, no, not, so, that, the, to, with (Merriam and Matthews). A novel published under a pseudonym in 2013, The Cuckoo’s Calling, could be attributed with 85% probability to J. K. Rowling using the delta measurement, and she then acknowledged her authorship (Juola).
It would be a misunderstanding, however, to assume that algorithmic stylometry produces the ‘stylistic fingerprint’ of an author. In contrast to the physiologically referenced fingerprint, style in the context of computational philology is a probabilistic concept (Doležel), which is calculated statistically. Moreover, writing styles are cultural artifacts that change in the course of the literary biography of individual authors. Author attributions in digital stylometry can only be determined with probability, never with certainty; not unlike in criminal investigation (Tweedie). To summarize our point: digital stylometry measures and maps those text properties that are primarily based on unintended features. The machine makes manifest what is latent in a text – mostly unrecognized by authors and readers alike.
This is also true for topic modeling, our second example. Within the horizon of distributional semantics, according to which the contexts in which a word occurs are an indicator of its meaning, the co-occurrence of words is statistically diagnosed and scored. The algorithm analyzes latent regularities implicit in the text surface as a pure area of word inscriptions (Blei; Heyer et al.).
And we know that a topic is a statistical phenomenon (Blei et al.) related to word neighborhoods in texts and should not be confused with a theme, central idea, or motif of texts. It is about similarities between word surfaces in coded data corpora. Topics can only be indicators and symptoms of thematic structures of texts and that for human interpreters only. Yet by topic modeling, overall questions can be investigated in large collections of texts that can never be received, read, and processed by humans, including non-classical texts that are not part of the disciplinary canon.
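The raw signal on which such procedures build, the co-occurrence of words within documents, can be counted with a few lines of code. The sketch below is emphatically not a topic model in the sense of Blei et al.; it only tallies which word pairs share a document. The function name cooccurrences and the four miniature ‘documents’ are assumptions made for illustration, and stop-word handling is omitted.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrences(documents):
    """Count, for each unordered word pair, the number of documents containing both."""
    pairs = Counter()
    for document in documents:
        words = sorted(set(re.findall(r"[a-z]+", document.lower())))
        pairs.update(combinations(words, 2))
    return pairs


# Miniature documents, invented for illustration.
documents = [
    "ship sea storm",
    "sea storm sail",
    "king crown court",
    "crown king throne",
]
pairs = cooccurrences(documents)
print(pairs[("sea", "storm")])   # words that keep appearing together
print(pairs[("king", "sea")])    # words that never share a document
print(pairs.most_common(2))
```

Two clusters of neighboring words emerge from the counts alone. Whether one of them deserves to be called ‘seafaring’ and the other ‘monarchy’ is a judgment that only a human interpreter can add.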
It should be kept in mind, however, that topic modeling is an unsupervised procedure in which the researchers determine the parameters and interpret the results, but have no insight into the automatic modeling itself. The form of predictability that guarantees that repeated measurements lead to the same results is given only to a limited extent. Studies on the repeatability and robustness of topic modeling results have shown that they can be reproduced only to an extent of 50–80% (Heyer et al. 363).
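One simple way of quantifying how far two modeling runs agree is to compare the top words of their topics. The following sketch is one possible measure among many and is not claimed to be the method behind the figures reported by Heyer et al.; the Jaccard overlap, the best-match averaging, and the two invented runs are assumptions made for illustration.

```python
def jaccard(a, b):
    """Share of words two topics have in common, relative to all their words."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def run_agreement(run_a, run_b):
    """Match each topic of run_a to its most similar topic in run_b and average."""
    best = [max(jaccard(topic, other) for other in run_b) for topic in run_a]
    return sum(best) / len(best)


# Two invented runs: each topic is represented by its top words.
run_a = [
    ["sea", "ship", "storm", "sail"],
    ["king", "crown", "court", "throne"],
]
run_b = [
    ["king", "crown", "court", "queen"],
    ["sea", "ship", "storm", "wave"],
]
print(run_agreement(run_a, run_b))
```

In this toy case the topics reappear in a different order and with one word exchanged each, so that the agreement is partial rather than complete.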
Strategies are being explored to address this lack of reproducibility. However – and this is central to the humanistic nature and signature of data-driven research –, slightly different results do not have to be a deficiency: Rather (Goldstone and Underwood), a diversity of perspectives in the results identified is precisely an expression that ‘what a text is saying’ can hardly be separated from ‘how a text is saying something’ and ‘in what context it is placed’. Different perspectives on a text corpus can claim validity precisely because they have been calculated from the text data and reveal the multifaceted richness not only of the art of interpretation but also of the ‘art of observation’ in the Humanities.
8. Digitality opens up new types of proximity to materials
The telescopic and microscopic potentials of data analysis open up not only the well-known distant reading/viewing, but also a surprising closeness to the material itself. With a digital technology that presupposes precisely a maximum of disembodiment of its virtualized objects of research, a novel possibility for investigating the physicality and materiality of the objects is simultaneously opened up: what lies hidden beneath the surfaces in real texts, pictures, and artifacts can be brought to the machine-analyzable surfaces. Where real parchment scrolls can only be unrolled at the cost of their destruction, their virtual variants make exactly this possible (Liu et al.; Rosin et al.). In the project ‘Universal Leonardo’ (http://www.universalleonardo.org/), Leonardo da Vinci’s digitized paintings are shown at a resolution that no museum can offer: it is now possible to reveal what was previously only accessible through X-ray and infrared images of the real painting. The ‘Perseus Library’ (http://www.perseus.tufts.edu/hopper/) records archaeological finds (coins, vases, statues), ‘Monasterium’ (http://monasterium.net:8181/mom/home) is a document archive that makes the whole of Europe accessible, and the Digital Mozart Edition (https://dme.mozarteum.at/) is used by hundreds of people every day.
In short, cultural artifacts no longer remain mere objects of reading and viewing, but open up, alongside ‘distant reading’ and ‘viewing’, practically hand-tactile forms of interaction (Rieger 488). With Digital Humanities a new horizon and modality of material practices is emerging in the Humanities.
9. New risks in the Humanities’ digital ecosystem … but problems with some overly general criticism of the Digital Humanities too
Undoubtedly, digitization in the Humanities also creates new kinds of problems. In the midst of an informational ecosystem that is constantly changing its parameters at high speed, the sustainability and reusability of research methodologies and their results – we can also say: the research data management strategies – are a delicate issue, critical in every sense: Parts of the cultural heritage are in danger of dropping out of digital reception as quickly as they entered it. For example, securing the sustainable use of digitized editions – compared to the millennial preservation of parchments, manuscripts, and books – is a major problem, if not a minefield. Not only do primary data have to be stored, but structured data interfaces have to be provided whose digital functionalities remain usable even when projects end or when personnel are replaced or move to another workplace. In such situations, critical digital editions degenerate into mere collections of material, lose their manageable functionality, and forfeit their digital value – if they have not disappeared from the ‘scene’ altogether.
What should not be underestimated is the danger that the Digital Humanities do not reflect on how many hidden presuppositions go into the construction of their algorithms, tools, and modules (Geman et al.), which in turn preform the expected results. It is well known and widely studied how strongly the biases incorporated in the training data become effective as discrimination in the later application of the learning algorithm (Noble; Eubanks). Moreover, a growing gap between the technical effort and the real meaning of the results is becoming noticeable (Bishop): What is statistically correct and interpretable by humans need not be humanistically meaningful as well (Sculley and Pasanek 420).
A general critique of digital literary studies is developed in Nan Z. Da’s article. Her claim is that the results of quantifying analyses of large text corpora, where they are not trivial, lack statistical robustness and exactness of measurement. For Da, digital literary studies reduce literature to numerical relations and restrict delicate interpretative work to quantifying comparisons. However, the subtlety that Da demands as a virtue of work in the Humanities is by no means practiced in her own treatment of the research field she criticizes. Her negative assessment rests on a far too small selection of only eight projects; she does not concede that the Digital Humanities are a new cluster of scientific practices that are still in the tinkering and trial-and-error stage and, moreover, face the difficult task of combining informatics and Humanities styles of thinking. She passes over the fact that deviating measurement results may be due to the multidimensionality of cultural artifacts, which sets limits to the demand for evidence and repeatability.
The core problem of her criticism, however, is her general suspicion that the Digital Humanities want to replace interpretive work in the Humanities. But this diagnosis is mistaken: digital methods are not a replacement, but rather a supplement and addition to the arsenal of methods in the Humanities. Both the cultural objects – as a result of their transformation into machine-readable data – and the research questions – which relate to volumes of data no longer manageable by humans – differ from traditional Humanities objects and questions. They by no means make traditional subjects obsolete, but bring new, previously undiscovered aspects into play.
But in all these procedures it is still true: the data-based and data-driven procedures themselves are quantitative, mostly statistical computational procedures in an artificially produced and coded data universe. These procedures themselves are not performing hermeneutics; but at all stages, they are indispensably interwoven with human decisions and interpretations; they are complementary to the interpretative methods of the Humanities.
The manner in which quantifying data processing becomes fruitful for research questions within the Humanities will now be addressed using the example of a pioneering woman of the Digital Humanities, Josephine Miles. She is usually eclipsed by Roberto Busa, who is still considered the ‘founding father’ of the Digital Humanities (Jones). But there is a ‘founding mother’ too.
10. Josephine Miles: a nearly forgotten female pioneer of the Digital Humanities
After the death of a colleague, Josephine Miles (1911–1985), an English professor at Berkeley, takes over his orphaned project of a concordance of the poetic works of John Dryden (1631–1700) (Montgomery and Hubbard; Wimmer). She finishes the concordance with computer assistance (Sagner Buurma and Hefferman 3) by completing 240,000 index cards distributed over 64 card index boxes – the legacy of her predecessor. Almost all words in Dryden’s poetic oeuvre are given an ‘address’ by coding the title of the poem and numbering its lines: A form of coordinate system is created by means of which a word occurrence is clearly localized as a spatial position within the text. A list of ‘stop words’ to be excluded is also created, since their sheer frequency would deprive them of any significance for the work (Miles and Teiser 75). In collaboration with a team, the data of the index cards are transferred to punched cards and entered into a computer provided by the Department of Electrical Engineering and the Computer Laboratory of the University of California, Berkeley. The machine sorts and processes what were originally manual entries and puts the results into alphabetical order. The machine output is then visualized by Miles in the form of tables and reproduced photomechanically.
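The addressing scheme described above can be made concrete with a small sketch. It is an illustration only, not a reconstruction of the punched-card workflow: the poem codes, the sample lines (invented, not quotations from Dryden), and the stop-word list are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical poem codes mapped to invented sample lines
# (these are NOT quotations from Dryden).
POEMS = {
    "AA": ["The king was weary of the crown", "and the crown was heavy"],
    "MF": ["A crown of verse the poet wore"],
}

# Assumed stop-word list: words too frequent to be worth indexing.
STOP_WORDS = {"a", "and", "of", "the", "was"}


def build_concordance(poems, stop_words):
    """Map each indexed word to its list of (poem_code, line_number) addresses."""
    concordance = defaultdict(list)
    for code, lines in poems.items():
        for line_no, line in enumerate(lines, start=1):
            for word in re.findall(r"[a-z']+", line.lower()):
                if word not in stop_words:
                    concordance[word].append((code, line_no))
    # Return the entries in alphabetical order of the headwords.
    return dict(sorted(concordance.items()))


concordance = build_concordance(POEMS, STOP_WORDS)
for word, addresses in concordance.items():
    print(word, addresses)
```

Each headword ends up with a list of ‘addresses’ consisting of a poem code and a line number, which is the coordinate system the concordance relies on; the alphabetical sorting corresponds to the step the machine performed on the punched cards.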
But it is not so much the concordance project that makes Josephine Miles a pioneer of the Digital Humanities as the fact that she consistently uses quantifying methods even in her ordinary, non-computerized literary research. Her work in literary studies employs – independently of computers – quantifying methods of ‘surface reading’, the results of which enable her to revise and correct important theorems of English literary studies. This connection between surface reading, quantification, and interpretive revision of fundamental assumptions of one’s discipline is what matters for us. From her dissertation in the 1930s onward, she is concerned with the analysis of word frequencies in literary works, both of a single author and within literary epochs and across epoch boundaries (Miles, Wordsworth; Miles, Pathetic Fallacy; Miles, Major Adjectives). She meticulously and manually transfers the results of her counting into lists and tables, thus creating a form of textuality that can no longer be read like a literary text, i.e. in narratological terms. What were originally literary texts as source material now become tabular works under her ‘counting and computing hands’. A transcription takes place that turns readable texts into machine-analyzable corpora.
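What such a transcription of texts into tables amounts to can be sketched in a few lines. The sketch is only an illustration under stated assumptions: the two miniature ‘texts’, the labels ‘Author A’ and ‘Author B’, and the chosen vocabulary are invented for this example and do not reproduce any of Miles’ actual counts or categories.

```python
import re
from collections import Counter

# Invented miniature corpus; the author labels are hypothetical.
CORPUS = {
    "Author A": "bright day bright hill and a cold stream",
    "Author B": "cold thought and cold truth and a bright idea",
}

# Hypothetical list of words whose frequencies are to be compared.
VOCABULARY = ["bright", "cold"]


def frequency_table(corpus, vocabulary):
    """Return, per text, the raw count and the rate per 1,000 words for each word."""
    table = {}
    for label, text in corpus.items():
        tokens = re.findall(r"[a-z']+", text.lower())
        counts = Counter(tokens)
        table[label] = {
            word: (counts[word], round(1000 * counts[word] / len(tokens), 1))
            for word in vocabulary
        }
    return table


table = frequency_table(CORPUS, VOCABULARY)
for label, row in table.items():
    print(label, row)
```

The rate per 1,000 words is included because raw counts from texts of different length are not directly comparable; which normalization is appropriate in a given study is a decision made by the scholar, not by the counting procedure.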
Miles’ studies aim to challenge inherited interpretive schemes of her field. For example, she observes that Wordsworth does not use metaphors to describe emotions – as assumed in her scientific community – but that his poetic language is mostly literal: “that he did very little else but just state literally” (Miles and Teiser 65). The assumption of a genuine metaphoricity of poetic articulation thus proves to be a prejudice of modern poetic conception, with which, in the name of the present, the past of literary phenomena is concealed rather than revealed. Miles’ quantification work of ‘distant reading’ opens up a research field that holds the possibility of grasping a literary reality more closely and accurately than has been possible through the canonically limited readings of her discipline.
Miles arrives at further, equally explosive corrections: Blake’s romantic eccentricity and his rebelliousness are precisely not to be read off his language, insofar as his wording remains thoroughly a product of his time (Miles, Major Adjectives). John Donne, usually considered a sacralizing-metaphysical poet, unfolds a language that, unlike what his standardized characterization suggests, is hardly riddled with metaphor but is of conceptual clarity and argumentative thoroughness. And the idiom of almost concrete poetry often attributed to Wordsworth is based – as she shows – on a highly selective, non-representative choice on the part of his interpreters, who ignore a large part of his work: If his entire oeuvre is taken into account, it becomes apparent that a rather more abstract and generalizing tendency prevails in his work, while poetic concreteness in the use of language characterizes only the one poem that is then considered representative of the entirety of his work (Sagner Buurma and Hefferman).
To summarize: Josephine Miles’ linguistic form analysis becomes a tool of literary criticism. Text surfaces are analyzed in their word relations in such a way that questions become answerable which do not relate to the text surface, but rather concern general problems of literary theory and literary history. The interpretation-neutral, quantifying distance assumed in the counting of words opens up precisely the possibility of a microscopic view of texts, which can then lead to new interpretive conclusions and, in Miles’ case, does indeed lead to them. The decomposition into text modules – in this case, words and their frequencies – creates a new object of humanistic work. This object is given coordinates; it is clearly localized in its ‘place’ and ‘position’ in an oeuvre spread out as a textual surface. What emerges is the cartography of texts from the perspective of localizable word occurrences. The immediate object is no longer the continuous literary text, but the lists and tables in which the results of the counting work are recorded.
Incidentally, lists and tables are forms of textuality that have a long history in the Humanities. A transcription of the initial object ‘narrative text’ into a diagrammatically structured textual form takes place. This form embodies not simply an arrangement of signs, but of data, which can still be written and read by humans, but which is distinguished by being in principle machine-readable and machine-analyzable. The use of the computer in the John Dryden concordance serves to mechanize extremely laborious, mindless routine work that for centuries had to be done on concordances by Humanities scholars. The innovation that makes Josephine Miles a pioneer of the Digital Humanities lies not in the transfer of tedious textual work to the machine but in the use of quantifying, data-related work on textual surfaces for genuine Humanities questions.
(1) The narrative of a ‘cultural technique of flattening’ presented in this essay describes the use of artificial flatness (writing, images, diagrams, tables, maps…) as an achievement of civilization and a creative power without which neither modernity nor the dynamics of complex civilizations in general can be understood. Digitization and the Digital Humanities are extrapolating and radicalizing these potentials.
(2) In recent attempts to designate the Digital Humanities as an integral part of the Humanities, a narrative of a fusion between computation and hermeneutics is deployed: computation is interpreted as a simultaneously hermeneutic process (Dobson) and the computer itself as a hermeneutic instrument (van Zundert). But this position reinforces a traditional but problematic self-image of the Humanities. This self-image hypostasizes interpretation and hermeneutics as the key methodology of the Humanities. Within the traditional Humanities, this approach remains blind to the always already given materiality, the data reference, and the ‘thingness’ of Humanities objects and to the related basic scholarly practices that collect, date, order, label, distinguish, and annotate cultural objects and thus enable their interpretation in the first place.
(3) Can the productivity of the Digital Humanities within the Humanities be understood in a perspective that does not simultaneously imply a problematic absolutization of hermeneutics as the gravitational center of the Humanities? Such a perspective is unfolded by the narrative of the ‘cultural technique of flattening’ developed here. The use of inscribed and illustrated surfaces is a productive capacity, without which complex bureaucracy, all sciences, and many arts, architecture and technology are unimaginable. The application of two-dimensional inscriptions and drawings creates a laboratory for arts and technical designs, a workshop of writing and reasoning, and an experimental space for diagrammatic operations.
(4) The data-driven procedures of the Digital Humanities are related to these practices of operating in two-dimensionality and quantification. Historically, modes of working in the Humanities emerged in close liaison with operations involving artificial flatness. What is represented on a surface always forms an empirical, and therefore countable, fact. Thus, the number has always played an essential role in the generation and representation of knowledge in the Humanities: Concordances, library signatures, catalogs of works, historical timelines and dating tables, and diagrammatic inscriptions of all kinds bear witness to this. Alphanumeric practices are rooted in the anthropotechniques of flattening out.
(5) Datafication forms the core of contemporary digitization. The ‘readability of the world’ (Blumenberg) turns into the machine readability and analyzability of data corpora. For humans, data can be received as meaningful signs, as exemplified by the dating of days as sequences of digits that have significance only in the context of calendrical practices. This is not valid for machines. When symbolic forms (text, image, film, music…) and social interactions are transformed into large machine-analyzable data corpora, the forensic power of computer-generated surface analytics can be used to discover latent, implicit, and even unconscious structures in volumes of data unmanageable by humans.
(6) The Digital Humanities are a complement to the methodological arsenal of the Humanities. Computational methods neither replace interpretation and hermeneutics, nor do they merge with them. No number, no datum interprets itself. As part of the Humanities, the Digital Humanities have to deal critically with the scope and limits of data-driven methods. Moreover, the ‘sting of the digital’ emanating from the Digital Humanities can also consist in a critical revision of the self-image of the Humanities: The ecosystem of humanistic work is grounded not only in doing interpretation but in the always also empirical materiality and mediality of its research objects: the interactions between people, symbolic and technical artifacts, and their societal relations in history and the present. The narrative of the ‘cultural technique of flattening’ connects some dimensions of traditional scholarly work in the Humanities with its digital offspring.