<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "https://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1-mathml3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.2" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">1832</journal-id>
      <journal-title-group>
        <journal-title>Journal of Cultural Analytics</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2371-4549</issn>
      <publisher>
        <publisher-name>Center for Digital Humanities, Princeton University</publisher-name>
      </publisher>
      <self-uri xlink:href="https://culturalanalytics.org/">Website: Journal of Cultural Analytics</self-uri>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">55795</article-id>
      <article-id pub-id-type="doi">10.22148/001c.55795</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>From the Archive to the Computer: Michel Foucault and the Digital Humanities</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Schmidgen</surname>
            <given-names>Henning</given-names>
          </name>
          <xref ref-type="aff" rid="author-aff-1">
            <sup>1</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Dotzler</surname>
            <given-names>Bernhard</given-names>
          </name>
          <xref ref-type="aff" rid="author-aff-2">
            <sup>2</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Stein</surname>
            <given-names>Benno</given-names>
          </name>
          <xref ref-type="aff" rid="author-aff-1">
            <sup>1</sup>
          </xref>
        </contrib>
      </contrib-group>
      <aff id="author-aff-1">
        <label>1</label>
        <institution-wrap>
          <institution content-type="edu">Bauhaus University, Weimar</institution>
        </institution-wrap>
        <institution-wrap>
          <institution-id institution-id-type="ROR">https://ror.org/033bb5z47</institution-id>
        </institution-wrap>
      </aff>
      <aff id="author-aff-2">
        <label>2</label>
        <institution-wrap>
          <institution content-type="edu">University of Regensburg</institution>
        </institution-wrap>
        <institution-wrap>
          <institution-id institution-id-type="ROR">https://ror.org/01eezs655</institution-id>
        </institution-wrap>
      </aff>
      <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2023-03-21">
        <day>21</day>
        <month>3</month>
        <year>2023</year>
      </pub-date>
      <pub-date publication-format="electronic" date-type="collection" iso-8601-date="2022-12-30">
        <year>2022</year>
      </pub-date>
      <volume>7</volume>
      <issue seq="8">4</issue>
      <issue-title>Theorytellings: Epistemic Narratives in the Digital Humanities</issue-title>
      <elocation-id>55795</elocation-id>
      <history>
        <date date-type="received" iso-8601-date="2022-06-14">
          <day>14</day>
          <month>6</month>
          <year>2022</year>
        </date>
        <date date-type="accepted" iso-8601-date="2022-11-30">
          <day>30</day>
          <month>11</month>
          <year>2022</year>
        </date>
      </history>
      <permissions>
        <license license-type="open-access">
          <ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/">
              http://creativecommons.org/licenses/by/4.0
            </ali:license_ref>
          <license-p>
              This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0">Creative Commons Attribution License (4.0)</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
            </license-p>
        </license>
      </permissions>
      <self-uri content-type="pdf" xlink:href="https://culturalanalytics.org/article/55795.pdf"/>
      <self-uri content-type="xml" xlink:href="https://culturalanalytics.org/article/55795.xml"/>
      <self-uri content-type="json" xlink:href="https://culturalanalytics.org/article/55795.json"/>
      <self-uri content-type="html" xlink:href="https://culturalanalytics.org/article/55795"/>
      <abstract>
        <p>Michel Foucault famously introduced the method of “discourse analysis” in the humanities, especially in historiography. In particular, in his <italic>Archaeology of Knowledge</italic>, originally published in 1969, Foucault argues for making the history of knowledge the object of discourse analyses. In the context of the current surge of interest in discourse analysis in the field of computer science, however, there are hardly any references to Foucault, partly because he never defined a methodological process that could be operationalized. Nonetheless, we argue for re-reading the <italic>Archaeology of Knowledge</italic> in the context of computer science and the digital humanities. As a matter of fact, there are considerable affinities between Foucault’s search for the regularities of discourse and current projects dealing with the digitization of texts, their indexing, distributional features, stylometry, etc. We show that these projects were already quite prominent in Foucault’s day, to the point that historian Emmanuel Le Roy Ladurie could assert, in 1968, that “the future historian will be a programmer.” A year later, Foucault’s <italic>Archaeology of Knowledge</italic> actively responded and constructively took up the challenge – which, given the recent advances in machine learning and computational linguistics, strikes us as a crucial move today.</p>
      </abstract>
      <kwd-group>
        <kwd>DH theory</kwd>
        <kwd>discourse analysis</kwd>
        <kwd>computer science</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction<xref ref-type="fn" rid="fn1">1</xref></title>
      <p>At the height of student protests in May 1968, the French historian Emmanuel Le Roy Ladurie made a remarkable announcement in the pages of the weekly magazine <italic>Le nouvel observateur</italic>. Perhaps sympathizing with the students in the streets, Le Roy Ladurie postulated <italic>la fin des érudits</italic>, the “end of the scholars.”</p>
      <p>The occasion and basis of this prophecy, however, was not the crisis of the university but the increasing use of computers in historical research. Pointing to examples from the United States and France, he explained that for historical projects employing information technology, promising perspectives were opening up: “One of the clearest directions is the analysis of vast <italic>corpora</italic> of documents whose data are of capital importance but whose scope has so far thwarted researchers’ efforts” <xref ref-type="bibr" rid="ref-161963">(Le Roy Ladurie, “La fin des érudits: L’historien de demain sera programmeur ou ne sera pas” 3)</xref>.</p>
      <p>Le Roy Ladurie cites as his example the work of the US-American medievalist David Herlihy, who, starting in the 1950s, made use of computers to study Italian land registers. He could just as well have chosen his own work on the farmers in the Languedoc from 1966 or other work done in the orbit of the <italic>Annales</italic> school that had deployed electronic computing devices to process historical and sociological data. The conclusion Le Roy Ladurie draws from these projects was to make waves: “[t]he historian of tomorrow will be a programmer or he won’t be at all” (3; see also 1973<xref ref-type="bibr" rid="ref-161964"/>).<xref ref-type="fn" rid="fn2">2</xref></p>
      <p>A year later, in 1969, Michel Foucault publishes the volume that to this day is referred to as his book of method, <italic>The Archaeology of Knowledge</italic>. On the very first page, it evokes a central theme of the <italic>Annales</italic> school, the so-called <italic>longue durée</italic>. Foucault writes, for example: “For many years now historians have preferred to turn their attention to long periods” <xref ref-type="bibr" rid="ref-161945">(Foucault, <italic>Archaeology</italic> 3)</xref>.<xref ref-type="fn" rid="fn3">3</xref></p>
      <p>The allusions are clear later in the book as well, for example when Foucault explains that “the building-up of coherent and homogeneous corpora of documents” (10) is a decisive methodological problem of contemporary historical studies, or when, in specifying new methods in historical research, he lists “the quantitative treatment of data, the breaking-down of the material according to a number of assignable features whose correlations are then studied, interpretative decipherment, analysis of frequency and distribution” (11). Our argument is that Foucault here refers to the deployment of computers in the humanities and social sciences that Le Roy Ladurie had so prominently described a short time earlier.</p>
      <p>This reference might come as a surprise, not only because it shifts the first flourishing of what we now call digital humanities fifty years into the past <xref ref-type="bibr" rid="ref-161937 ref-161975">(Burdick et al. 121–36; Sterne 17–33)</xref>. It also seems to jar with the popular image of Foucault frequenting the archives or at least regularly visiting the <italic>Bibliothèque nationale</italic>. But a different, more fitting image emerges when we take the historical, institutional, and intellectual context more fully into account. As early as the 1950s, Foucault was interested in “the statistical theory of information” <xref ref-type="bibr" rid="ref-161946">(Foucault, “La psychologie” 136)</xref>; in 1966, he speculated about bringing together “the analysis of languages” and “information processing” <xref ref-type="bibr" rid="ref-161947">(Foucault, “Message ou bruit” 560)</xref>; and a little later, he observed that “in the thickness [<italic>épaisseur</italic>] of natural processes . . . the structure of the message,” that is to say, ultimately, encoded information, could be discovered <xref ref-type="bibr" rid="ref-161948">(Foucault, “Hyppolite” 784)</xref>.</p>
      <p>These and other remarks by Foucault become understandable against the backdrop of the flourishing of cybernetics and molecular biology in 1960s France <xref ref-type="bibr" rid="ref-161943 ref-161960 ref-161952">(Erdur; Kay 275–76; Geoghegan)</xref>. Prepared by the intense engagement with the relationship between life, language, and technology that shapes the work of Foucault’s most important academic mentors, Jean Hyppolite and Georges Canguilhem<xref ref-type="bibr" rid="ref-161939"/>, Foucault sets out in <italic>The Archaeology of Knowledge</italic><xref ref-type="bibr" rid="ref-161945"/> to sketch, retrospectively, the methodology that his analyses of the history of psychiatry, clinical medicine, biology, political economy, and linguistics had followed or, rather, the methodology these studies had sketched over time.</p>
      <p>As far as we know, Foucault never made use of a computer to conduct his discourse analyses. Personal computers for private use were not developed until the mid-1970s. His elaboration of the “archaeological” method, however, reflects the emerging automation of such analyses. At the same time, his reflections contain remarkable parallels with the so-called distributional hypothesis, which has been a central element in statistical semantics since the pioneering work of the linguist Zellig Harris<xref ref-type="bibr" rid="ref-161954 ref-161955 ref-161956"/> and its popularization by John Rupert Firth<xref ref-type="bibr" rid="ref-161944"/>.</p>
      <p>Computer science today summarizes this hypothesis by formulas such as “[w]ords which are similar in meaning occur in similar contexts” (<xref ref-type="bibr" rid="ref-161973">Rubenstein and Goodenough 627</xref>; see also Harris). The Foucault of the <italic>Archaeology</italic> would not disagree. That is why the procedure he sketches—even if its methodological status remains unclear—can contribute to stimulating cooperation between computer scientists and scholars in the humanities and social sciences today.</p>
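      <p>A minimal sketch in Python may illustrate how the formula “words which are similar in meaning occur in similar contexts” can be made operational. It is an illustration only, under loud assumptions: the toy corpus is invented, the function names <monospace>context_vectors</monospace> and <monospace>cosine</monospace> are hypothetical, and the window size of two words is an arbitrary choice. Each word is represented by counts of its neighbouring words, and two words are compared by the cosine of the angle between their count vectors.</p>
      <code language="python"><![CDATA[
```python
from collections import Counter
import math
import re


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Invented toy corpus, for illustration only (an assumption, not historical data).
corpus = [
    "the physician examined the patient in the clinic",
    "the doctor examined the patient in the hospital",
    "the physician treated the fever with a remedy",
    "the doctor treated the fever with a remedy",
    "the merchant sold grain at the market",
]

vectors = context_vectors(corpus)
sim_close = cosine(vectors["physician"], vectors["doctor"])
sim_far = cosine(vectors["physician"], vectors["merchant"])
print(sim_close > sim_far)
```
]]></code>
      <p>In this toy corpus, “physician” and “doctor” share all of their contexts and therefore receive a higher similarity than “physician” and “merchant.” Nothing in the procedure refers to the meaning of the words; only their distribution is counted.</p>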
      <p>In the following, we discuss some issues that could or, in our view, should be of interest in this cooperation. Essential elements here are the understanding of “discourses,” i.e. large aggregates of utterances, as central components of cultural and social life; the problem of their delimitation in recourse to the objects specific to them; and the question of the possibility of automating the delimitation and analysis of such discourses by means of computer technology.</p>
      <p>Thus, we are not concerned with an application or operationalization of Foucault’s method, and we are not arguing that all practitioners in the digital humanities should read Foucault. Foucault’s concept of discourse is notoriously underdetermined and its meaning also changes over time. What the <italic>Archaeology of Knowledge</italic> does accomplish, however, is a sophisticated discussion of fundamental problems with the use of computer technology in the humanities, especially historiography. Our argument is that digital humanities scholars who are theoretically interested, and who are willing to share that theoretical interest with other humanities scholars, can benefit from this discussion, even if the latter are not, or not yet, working in the digital field.</p>
      <p>We proceed in three steps. In parts 1 and 2 we present the method that Foucault describes in the <italic>Archaeology of Knowledge</italic>. Special attention is given on the one hand to the discursive aspects of <italic>object</italic>, <italic>style</italic>, <italic>concepts</italic> and <italic>themes</italic> and on the other hand to the regularity of discourses. In parts 3 and 4 we present and discuss two historical attempts to automate individual aspects of discourse analysis. We focus on the General Inquirer, developed by a team around the Harvard psychologist Philip Stone<xref ref-type="bibr" rid="ref-161976"/> and presented in 1966, as well as the program developed by French linguist and philosopher Michel Pêcheux in the late 1960s to investigate the “deep structure” of discursive effects.</p>
      <p>Against this background, thirdly, we discuss the perspectives of today’s automation of discourse analysis. Our conclusion is that, while Big Data and Machine Learning have significantly contributed to improving some aspects of automated discourse analysis, tasks such as the definition of research questions or the delimitation of research objects, as well as the interpretation of research results, still belong to the historian – or rather the archaeologist in Foucault’s sense.</p>
    </sec>
    <sec>
      <title>1. Aspects of Foucault’s Method</title>
      <p>The method that is spelled out in <italic>Archaeology of Knowledge</italic> can be presented in view of four aspects. First, as regards the starting point of the method, there is what Foucault repeatedly calls the “dispersion” of discursive events. Thus he declares discourse to be a “vast field” “made up of the totality of all effective utterances [<italic>énoncés</italic>] (whether spoken or written) in their dispersion as events and in the occurrence that is proper to them” (<xref ref-type="bibr" rid="ref-161945">Foucault, <italic>Archaeology</italic> 26–27</xref>; translation amended).</p>
      <p>The repeated evocation of the dispersion of discourse is due to Foucault adopting a statistical (in the broadest sense) perspective on discursive events (see also <xref ref-type="bibr" rid="ref-161958">Herrmann 62–67</xref>). Instead of starting from individual historical actors (persons, authors), works, institutions, or disciplines, he places a mass of distributed discursive events (and in that sense, they are indeed strictly linguistic data) at the beginning. This perspectivization is joined by an epistemological motif still present in today’s debates about the digital humanities. When historians (or, to speak with Foucault: archaeologists) confront discourse as a set of data, they find themselves up against, according to Foucault, “linguistic sequences that . . . in sheer size, exceed the capacities of recording, memory, or reading” <xref ref-type="bibr" rid="ref-161945">(Foucault, <italic>Archaeology</italic> 27)</xref>.</p>
      <p>At the same time, this avoids misperceptions and misjudgments that come with the position of the individual reader and his or her limited (if not in principle, then in practice) capacities. They dismiss or, as Foucault puts it, “eclipse . . . that form of history that was secretly, but entirely related to the synthetic activity of the subject” (14)—we might consider this to be Foucault’s version of the end of the scholar as we knew her, proclaimed by Le Roy Ladurie.</p>
      <p>While the position sketched in <italic>The Archaeology of Knowledge</italic> is not anthropocentric, it remains a difficult question how order is to emerge again from the overwhelming quantity of discursive events. This is our second point: how, with the aid of which criteria and procedures, entities that can be studied at all are to be delineated in the sheer mass of discursive data.</p>
      <p>In his televised debate with Noam Chomsky, Foucault is clear that the entire archaeological endeavor aims at investigating comparatively circumscribed discursive sets. He explains his interest in the discourse analysis of scientific knowledge by citing the history of medicine in the late eighteenth century:</p>
      <disp-quote>
        <p>[R]ead twenty medical works, it doesn’t matter which, of the years 1770 to 1780, then twenty others from the years 1820 to 1830, and I would say, quite at random, that in forty or fifty years everything had changed; what one talked about, the way one talked about it, not just the remedies, of course, not just the maladies and their classifications, but the outlook itself. <xref ref-type="bibr" rid="ref-161950">(Foucault and Chomsky 150)</xref></p>
      </disp-quote>
      <p>The points where Foucauldian archaeology sets in, then, are breaks in discourse, fundamental changes in scientific utterances, that is, abrupt transformations in the schemata according to which words, parts of sentences, and finally entire texts are constructed in this domain: changes in “paradigms” in the linguistic sense of the term, not in Thomas Kuhn’s<xref ref-type="bibr" rid="ref-161962"/>.</p>
      <p>Foucault follows this remark up with the question: “Who was responsible for that? Who was the author of it? It is artificial, I think, to say Bichat, or even to expand a little and to say the first anatomical clinicians. It’s a matter of a <italic>collective and complex transformation</italic> of medical understanding in its practice and its rules” (<xref ref-type="bibr" rid="ref-161950">Foucault and Chomsky 150</xref>; emphasis added).</p>
      <p>At issue, then, are not individual and punctual discoveries, not individual scientists or authors but—not unlike in Kuhn—overarching changes in dominant forms of perception and procedures. Foucault’s <italic>Archaeology of Knowledge</italic> aims at describing, closely studying, and, as far as possible, explaining such collective and complex transformations on the level of discourses, that is, of actual utterances.</p>
    </sec>
    <sec>
      <title>2. Discursive Regularities</title>
      <p>The off-the-cuff remark on twenty medical books from different epochs is translated, in the <italic>Archaeology</italic>, into a complex schema that (and this is our third point) includes, besides the <italic>object</italic> of a discursive formation, the questions of <italic>style</italic>, of <italic>concepts</italic>, and of the overarching <italic>themes</italic>. We cannot discuss this schema in detail here, but we can point out that two of its aspects, style and thematic, already played an important role in the digital humanities of the 1960s. The analysis of themes was the goal, for example, of the General Inquirer developed by a team around the Harvard psychologist Philip Stone. This computerized procedure for analyzing textual content, presented in 1966, soon garnered attention among the people then in Foucault’s orbit (see <xref ref-type="bibr" rid="ref-161957">Helsloot and Hak 78</xref>).</p>
      <p>As far as “style” is concerned, it was often understood in the digital humanities of the time as “difference in frequency distribution and matrices of transition probability of a text’s linguistic units from the corresponding [units] of language as a whole” <xref ref-type="bibr" rid="ref-161969">(Müller 161)</xref>. Foucault, as noted earlier, is not interested in the question of individual authorship, which is important, if not decisive, in stylometry to this day. In the <italic>Archaeology of Knowledge</italic>, though, he <italic>is</italic> interested in the “frequency and distribution” of historical data <xref ref-type="bibr" rid="ref-161945">(Foucault, <italic>Archaeology</italic> 11)</xref>, and what draws his attention is the “distribution” of objects in a discourse, “the interplay of their differences, . . . their proximity or distance” (46).</p>
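      <p>The two quantities named in this definition of style, frequency distributions and matrices of transition probability, can be sketched in a few lines of Python. The sketch rests on assumptions that are not in the cited source: the two token sequences are invented, the function names are hypothetical, transition probabilities are estimated from adjacent word pairs only, and the “difference” between a text and the language as a whole is measured, as one possible choice among many, by the sum of absolute differences between relative frequencies.</p>
      <code language="python"><![CDATA[
```python
from collections import Counter, defaultdict


def relative_frequencies(tokens):
    """Relative frequency of each linguistic unit (here: word) in a token list."""
    total = len(tokens)
    return {word: count / total for word, count in Counter(tokens).items()}


def transition_probabilities(tokens):
    """Estimate P(next word | current word) from adjacent pairs, as a sparse matrix."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return {
        word: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for word, row in counts.items()
    }


def frequency_deviation(text_tokens, reference_tokens):
    """Sum of absolute differences between two relative frequency distributions.

    This particular distance is an assumption made for the sketch.
    """
    p = relative_frequencies(text_tokens)
    q = relative_frequencies(reference_tokens)
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in p.keys() | q.keys())


# Invented examples: `reference` stands in for "language as a whole".
reference = "the fever is a disease and the remedy is a cure".split()
text = "the fever is the fever and the fever is a sign".split()

transitions = transition_probabilities(text)
deviation = frequency_deviation(text, reference)
print(transitions["fever"], deviation)
```
]]></code>
      <p>A text compared with itself has a deviation of zero; the more its word frequencies depart from those of the reference, the larger the value. The same comparison could be applied row by row to the two transition matrices.</p>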
      <p>This brings us to his interest in the internal organization of discourse, in the rules that the utterances in a certain age and about a certain object follow. This is our fourth and final point. In <italic>The Archaeology of Knowledge</italic>, this interest takes the guise of the question whether the utterances of a discursive formation are organized <italic>in</italic> this formation or whether they might not be specifically organized <italic>by</italic> it, and whether they can be said to follow specific rules: “an order in their successive appearance, correlations in their simultaneity, assignable positions in a common space, a reciprocal functioning, linked and hierarchized transformations” (37).</p>
      <p>Foucault accordingly sets out in search of the “intrinsic regularities of discourse.” Rules, he never tires of emphasizing, are not situated behind or above discourses but “at the most ‘superficial’ level (at the level of discourse).” They are not located in the consciousness of individuals, nor in a “mentality” of the kind the <italic>Annales</italic> school was working on, “but in discourse itself” (62–63).</p>
      <p>In assuming such an immanentist position, Foucault is at the same time moving away from the separation between surface structure and deep structure operated in 1960s linguistics, most concisely by Chomsky. His position clearly is not far from the so-called distributional hypothesis, closely associated with the name of Chomsky’s teacher Zellig Harris.<xref ref-type="fn" rid="fn4">4</xref></p>
      <p>And indeed, Foucault in the <italic>Archaeology</italic> stresses the proximity of “rule” and “regularity.” For instance, he describes the entire set of rules of a given discursive practice as a “system of formation,” which he wants to be understood as “a complex group of relations” that in turn function as rules for the four entities cited earlier—object, style, concept, thematic:</p>
      <disp-quote>
        <p>By system of formation, then, I mean a complex group of relations that function as a rule: it lays down <italic>what must be related</italic>, in a particular discursive practice, for such and such an utterance to be made, for such and such a concept to be used, for such and such a strategy to be organized. (<xref ref-type="bibr" rid="ref-161945">Foucault, <italic>Archaeology</italic> 74</xref>; translation amended, emphasis added).</p>
      </disp-quote>
      <p>The question of the rules of discourse, it seems, thus dissolves in the question of the regularities of relationships between discursive elements. The normative aspect of discourse, we might say, is captured through distributions and relationships that can be determined statistically. Discursive regularity here follows from the frequency of discursive elements.</p>
    </sec>
    <sec>
      <title>3. Automating Discourse Analysis</title>
      <p>At the end of the 1960s, the philosopher and linguist Michel Pêcheux, a member of the <italic>Cercle d’épistémologie</italic> that was close to Foucault, presented a project for automating discourse analysis. Based on, in rough terms, a theory of discourse production as a “theory of the rule-governed variation of ‘deep structures,’” Pêcheux’s automated discourse analysis was concerned with going from a series of discursive “‘surface effects’” to a “‘deep structure,’” an “invisible structure which determines them” <xref ref-type="bibr" rid="ref-161957">(Helsloot and Hak 96)</xref>.</p>
      <p>In the course of its implementation, this endeavor encountered a difficulty that rather resembles the one of Foucault’s <italic>Archaeology</italic> and yet differs fundamentally from it. Foucault’s procedure to determine the regularities of certain sets of discourses consists, it seems, in filtering these sets as such out of a mass of texts within an iterative process of pattern matching and membership recognition. Pêcheux’s analysis of discourse, in contrast, can operate only via corpora defined in advance. These corpora have already been constituted; the issue then is to determine their “deep structure” or “regularity.”</p>
      <p>The dilemma that arises here, it seems to us, remains relevant—it belongs to the “theorytellings” of digital humanities today. Discourse analysis under the conditions of today’s technology is concerned with developing a system that would have to be, we might say, an algorithm for algorithm analysis, i.e., a meta-analysis algorithm. From Foucault’s perspective, this system would have to be capable of finding out not only the contents of certain sets of discourse but their regularities—beginning with the ability to filter these sets from an undefined mass of texts in a circular process of rule recognition and membership definition.</p>
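      <p>What a circular process of rule recognition and membership definition might look like can be hinted at with a deliberately crude Python sketch. It is not Foucault’s procedure, nor Pêcheux’s, and every element of it is an assumption: the five “texts” are invented word lists, the “rule” is reduced to a summed word-frequency profile, membership is decided by a cosine similarity threshold, and the names <monospace>profile</monospace> and <monospace>delimit</monospace> are hypothetical. The loop alternates between deriving the profile from the current members and redefining the members from the profile until nothing changes.</p>
      <code language="python"><![CDATA[
```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def profile(vectors):
    """The 'rule' of the current set: the summed word counts of its members."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return total


def delimit(texts, seed_indices, threshold=0.25, max_rounds=10):
    """Alternate rule recognition and membership definition until stable."""
    vectors = [vectorize(t) for t in texts]
    members = set(seed_indices)
    for _ in range(max_rounds):
        rule = profile(vectors[i] for i in members)          # rule recognition
        updated = {i for i, v in enumerate(vectors)
                   if cosine(v, rule) >= threshold} | set(seed_indices)  # membership
        if updated == members:
            break
        members = updated
    return sorted(members)


# Invented mini-corpus: three "medical" and two "economic" word lists.
texts = [
    "fever pulse remedy patient",
    "patient fever lesion tissue",
    "lesion tissue autopsy organ",
    "grain price market trade",
    "market trade coin price",
]
print(delimit(texts, seed_indices=[0]))
```
]]></code>
      <p>Starting from the first text alone, the loop admits the second text in the first round and, because the profile has shifted, the third text in the second round, although the third shares no word with the seed. The threshold and the seed remain decisions taken outside the loop, which is where the delimitation problem discussed here reappears.</p>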
      <p>In more concrete terms, the four aspects of discursive formations Foucault brings out—object, style, concepts, and thematic—would have to be discovered in a largely automated way. While style and thematic are not uncommon problems in the digital humanities and work on them has clearly progressed in recent years, it is still largely unclear how a discourse can be defined starting from a given object. The definition of objects is a core domain of scientific discourses, yet discourse analysis in Foucault’s sense is far from willing to take conceptual definitions from the individual sciences and make them the basis of its own studies.</p>
      <p>On the contrary: the productivity of Foucauldian archaeology very much derives from operating its own definitions of objects in order to open up new perspectives on the emergence of individual sciences—for example by showing how strongly the development of linguistics, biology, and economics in the seventeenth and eighteenth centuries depended on a specific yet largely implicit conception of the object ‘human being’ (see <xref ref-type="bibr" rid="ref-161949">Foucault, <italic>Order of Things</italic></xref>).</p>
      <p>It seems equally unclear how a discourse’s central concepts are to be identified if by “central concepts” we are not to mean simply the words most frequently used. There is in fact an essential difference between a scientific concept and a word—a point demonstrated not least of all by Georges Canguilhem, the philosopher and historian of science who shaped Foucault’s thinking in important ways. Using as his example the concept of “reflex” in modern physiology, Canguilhem showed that the conception and the definition of the phenomenon designated by this term do not depend on its use. In the seventeenth century, Thomas Willis defined the phenomenon of the organism’s reflexive reactions without evoking the optical analogy implied in the term, whereas Descartes spoke of “reflex” without having a precise physiological conception of the phenomenon. It is thus quite difficult to name and to find scientific concepts in a textual corpus in an automated way.</p>
    </sec>
    <sec>
      <title>4. Current Problems</title>
      <p>The challenges of implementing Foucault’s archaeological method can also be described from the perspective of current developments in machine learning, computational linguistics, and Big Data processing technology. We can see both opportunities for automating discourse analysis and some obvious limits. For tackling the task today, one option is to simplify the goal of discourse analysis altogether, thus making it less of a moving target. Note that in today’s computational linguistics (or language technology), discourse analysis is understood as “language processing beyond the sentence boundary” in order to “compute information about a text in order to supplement the results of sentence processing (e.g., when supplying a referent for a pronoun from context)” or “to combine sentence-level information to larger units (e.g., when inferring a causal relationship to hold between two sentences)” <xref ref-type="bibr" rid="ref-161974">(Stede 11)</xref>.<xref ref-type="fn" rid="fn5">5</xref></p>
      <p>In this context, the required computation is based on the identification of elementary discourse units, which are hierarchically organized and between which relations can be defined. While, on the one hand, discourse analysis becomes feasible under such an interpretation, it is, on the other hand, too weak from a humanities perspective to tackle the scope of discourse-related research questions in the field, e.g., the development of discourses in specific historical periods or geographical areas, their internal changes as well as their relation to broader cultural and/or societal transformations.</p>
      <p>Probably we have to adopt the immanentist position of Foucault, namely, to admit that discourse emerges as a function which <italic>cannot be deductively inferred</italic> from features, whatever their nature is. This is the move from linguistic structuralism to linguistic distributionalism <xref ref-type="bibr" rid="ref-161935">(Biemann)</xref>, which is prefigured in the <italic>Archaeology of Knowledge</italic>.</p>
      <p>Today, the potential of this move is illustrated by the amazing progress in research on autoregressive language models. This research has brought forth the so-called Generative Pre-trained Transformer, GPT, which, in its latest generation, is able to produce human-like text. The GPT architecture is a Transformer, that is, a deep neural network with, in recent generations, billions of parameters, which relies on the so-called attention mechanism rather than on recurrence and which is trained to predict the next word of a text, thereby modeling human language.<xref ref-type="fn" rid="fn6">6</xref></p>
      <p>Whereas Pêcheux’s model remained tied to a Chomskian quest for underlying grammars or structures,<xref ref-type="fn" rid="fn7">7</xref> GPT can be considered one of the most effective and uncompromising manifestations of the distributionalist paradigm as defined by Harris. The technology has the potential to become another option for narrowing the gap, outlined above, between the scientifically interesting and the technically feasible: given some input sequence (prompt) in the form of a clause, a sentence, or a paragraph of text, GPT outputs the most likely sequence of words, where likelihood is based on the associations (the distributional semantics) learned from the training data. When GPT is prompted with a specific topic, the generated text can be taken as a discourse on this topic, where the generation probability and the topical distance (both of which can be quantified reasonably well) serve as criteria for halting the generation process and delineating a discourse.</p>
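      <p>The principle of generating the most likely continuation and halting when the generation probability drops can be illustrated with a toy model in Python. The model below is emphatically not GPT: as a stated simplification, it is a bigram model that only counts which word follows which in an invented training text, it always picks the single most frequent follower, and it ignores topical distance altogether. The names <monospace>train_bigrams</monospace> and <monospace>generate</monospace> and the parameters <monospace>min_probability</monospace> and <monospace>max_words</monospace> are hypothetical.</p>
      <code language="python"><![CDATA[
```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count, for every word, how often each other word directly follows it."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, min_probability=0.7, max_words=20):
    """Greedily extend `prompt` with the most likely next word.

    Generation halts when the best continuation is less probable than
    `min_probability`, when no continuation is known, or at `max_words`.
    """
    output = [prompt]
    while len(output) < max_words:
        followers = counts.get(output[-1])
        if not followers:
            break
        word, count = followers.most_common(1)[0]
        if count / sum(followers.values()) < min_probability:
            break
        output.append(word)
    return output


# Invented training text, for illustration only.
training = "madness is a disease . madness is a disease . madness is a crime .".split()
counts = train_bigrams(training)

cautious = generate(counts, "madness", min_probability=0.7)
permissive = generate(counts, "madness", min_probability=0.6, max_words=5)
print(cautious, permissive)
```
]]></code>
      <p>With the stricter threshold, generation stops after “madness is a,” because the training text is divided between “disease” and “crime” at that point; with the looser threshold, the more frequent continuation is accepted. The halting criterion thus marks the place where the toy corpus itself ceases to be regular.</p>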
      <p>Though the instances of the current GPT generation still do not convince as discourse generation machines (shortcomings include a lack of long-distance consistency, incoherent argumentation, and logical flaws), there is reasonable hope that future generations will, thereby becoming a new means to study discourse phenomena. However, currently we cannot learn (much) from GPT in terms of discourse <italic>analysis</italic>, and it is an open question whether we ever will. While symbolic text generation approaches apply schemes, heuristic rules, grammars, search strategies, or planning algorithms (a machinery that can similarly be used for text analysis purposes), the text generation principles of deep neural approaches remain implicit and hidden.</p>
    </sec>
    <sec>
      <title>Discussion</title>
      <p>What impulses, then, can Foucault’s <italic>Archaeology of Knowledge</italic> provide for the further development of the digital humanities? The most important impulse today, perhaps, can be described as the transition from <italic>digital</italic> to <italic>computational humanities</italic>. We propose this term to distinguish between research focusing on digitization (digital humanities) and research focusing on algorithmic text processing with the goal of a semantic analysis (computational humanities), using machine learning and data mining methods among others. Though Foucault’s engagement with the <italic>Annales</italic> school’s use of computers remained implicit and he seems not to have seriously considered using computers for discourse analysis himself at the time, <italic>The Archaeology of Knowledge</italic> is rather close to this understanding of computational humanities: Foucault envisaged a research agenda that neither concentrates on simply digitizing the cultural heritage (in cooperation with libraries, archives, and museums) nor focuses on analyzing bibliographic (meta)data with regard to its distribution in time and space, as is being done, in programmatically interesting ways, in certain forms of “macroanalysis” <xref ref-type="bibr" rid="ref-161959">(Jockers)</xref>.</p>
      <p>In shifting attention away from discursive units that are identified by a common object or style, certain concepts, or overarching themes, toward the implicit regularity of discursive formations (and toward the notion that such formations are defined by such regularities alone), Foucault defined a problem that is very unlikely to be solved by means of the digital humanities’ tools, which apply to digitized or “born digital” documents, but instead requires a means for the unsupervised learning of meta-learning strategies, tailored to the input data (computational humanities).</p>
      <p>In our eyes, this would be one of the “mission-critical” conditions of an automatic discourse analysis: the system would have to be capable of finding out not only the contents of certain sets of discourse but their regularities—beginning with the ability to filter these sets from masses of texts within an iterative process of pattern matching and membership recognition.</p>
      <p>Since we do not see a technical solution for this problem, it seems necessary and promising to keep the human in the loop, accepting her as an indispensable part of machine-based discourse analysis, and to provide technology to tighten and improve the connection to the machine.<xref ref-type="fn" rid="fn8">8</xref> In this regard, we are developing exploratory search technology to empower humanities scholars to deal with Big Data, enabling them to ask research questions that require the analysis of hundreds of books, papers, and other data and, to repeat our second point, to bring order to the chaos of discursive dispersion <xref ref-type="bibr" rid="ref-161953">(Gollub et al.)</xref>. As we have seen, Foucault also highlighted the importance of scholarly expertise, despite his alleged ‘anti-humanism,’ in particular when it comes to defining the object constitutive of a given discourse.</p>
      <p>Besides the recent deep learning technologies, which have brought great advances in text generation and semantic analysis, computer science offers two classic paradigms for problem solving: data parallelization and task parallelization.</p>
      <p>The practical benefits and the feasibility of data parallelism in the humanities have been clearly demonstrated <xref ref-type="bibr" rid="ref-161966 ref-161961">(Michel et al.; Kozlowski et al.)</xref>, foremost by work done in the context of distant reading <xref ref-type="bibr" rid="ref-161967 ref-161968 ref-161977">(Moretti, <italic>Graphs</italic>; Moretti, <italic>Distant</italic>; Underwood)</xref>. From a problem reduction perspective, distant reading exploits data parallelism to significantly reduce the complexity of the resulting partial solutions so that subsequent analyses can be performed without computational support. Discursive regularities may be envisioned by projecting masses of text into low-dimensional “semantic spaces,” such as timelines, geographic maps, or topic networks, to be visually explored and analyzed by human experts, providing perspectives onto a corpus that are orthogonal to the reading direction of the documents.</p>
      <p>Problem reduction via task parallelism becomes effective if subtasks can be executed independently of each other and if the partial solutions for the subtasks can be combined to form a complete solution. Although the first condition is often only partially satisfied in humanities problems, we consider the second condition to be the more problematic. Task parallelization requires the outcomes of independent analyses to be combined, following a synthesis strategy, such as the celebrated map-reduce scheme known from Big Data processing. While such recombination schemes are successful for regularly structured problems, result aggregation usually fails for the type of research questions the humanities are dealing with. That is, human expertise and interpretation capabilities are currently indispensable for putting together the solution pieces.</p>
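      <p>A minimal sketch may clarify what a map-reduce scheme looks like for a regularly structured problem. The mini-corpus and the function names (<monospace>map_document</monospace>, <monospace>reduce_counts</monospace>, <monospace>term_counts_by_decade</monospace>) are hypothetical and chosen only for illustration. Each document is processed independently (map), and the partial solutions are merged by simple addition (reduce).</p>
      <code language="python">
```python
from collections import Counter
from functools import reduce

# Hypothetical mini-corpus: (year, text) pairs invented for illustration.
DOCUMENTS = [
    (1961, "madness and reason"),
    (1966, "order of things and words"),
    (1969, "archaeology of knowledge"),
    (1975, "discipline and punish"),
]


def map_document(doc):
    """Map step: one document in, one partial solution out."""
    year, text = doc
    decade = year - year % 10
    return Counter((decade, word) for word in text.split())


def reduce_counts(left, right):
    """Reduce step: merge two partial solutions by adding their counts."""
    return left + right


def term_counts_by_decade(documents):
    # The map calls do not depend on each other and could run in parallel.
    partials = map(map_document, documents)
    return reduce(reduce_counts, partials, Counter())


totals = term_counts_by_decade(DOCUMENTS)
print(totals[(1960, "and")], totals[(1970, "and")])
```
</code>
      <p>The recombination works here only because word counts are additive. The partial results of an interpretive analysis have no comparably simple merge operation, which is where the reduce step would have to be replaced by human judgment.</p>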
      <p>In the late 1960s, <italic>Annales</italic> historian Le Roy Ladurie spoke, with regard to the increasing use of the computer in the humanities, of the ‘end of the scholars.’ Some time later, Foucault, in his <italic>Archaeology of Knowledge</italic>, showed the perspectives and problems that this development entails for the humanities, especially in the realm of discourse analysis. As the question of the delimitation of discourses and the determination of their specific objects shows, the technical achievements of the last 50 years have shifted the problems in this field, but they have not solved them.</p>
      <p>Against this background, one would like to confirm Le Roy Ladurie’s statement from 1968 in modified form: ‘The historian of tomorrow will be a computational humanist or she won’t be at all.’ At the same time, however, it should be stated that, for the time being, the historian—or rather, the archaeologist in Foucault’s sense—cannot delegate important tasks to machines, e.g., the definition of research questions, the delimitation of research objects, as well as the interpretation and evaluation of research results.</p>
    </sec>
  </body>
  <back>
    <fn-group>
      <fn id="fn1">
        <label>1</label>
        <p>This paper was written as part of the DFG research project “Process-oriented Discourse Analysis”. <ext-link ext-link-type="uri" xlink:href="https://gepris.dfg.de/gepris/projekt/326264959">https://gepris.dfg.de/gepris/projekt/326264959</ext-link>. It is based on a comprehensive study on “Discourse Analysis in the Age of Intelligent Machines” by Bernhard Dotzler and Henning Schmidgen recently published in German <xref ref-type="bibr" rid="ref-161942">(Dotzler and Schmidgen)</xref>. While the three authors of the present paper share its general argument, they happily disagree on some of its details. We would like to thank Tim Gollub, Franziska Klemstein, and Johannes Hess for helpful suggestions and critical comments.</p>
      </fn>
      <fn id="fn2">
        <label>2</label>
        <p>On the use of computing machinery in the <italic>Annales</italic> school, see <xref ref-type="bibr" rid="ref-161938">Burke 53–64</xref>, and <xref ref-type="bibr" rid="ref-161965">Lemny</xref>. For contemporary descriptions, see <xref ref-type="bibr" rid="ref-161972">Price</xref> and <xref ref-type="bibr" rid="ref-161951">Furet</xref>.</p>
      </fn>
      <fn id="fn3">
        <label>3</label>
        <p>On this book, see, for example, <xref ref-type="bibr" rid="ref-161978">Webb</xref> and <xref ref-type="bibr" rid="ref-161936">Bublitz</xref>.</p>
      </fn>
      <fn id="fn4">
        <label>4</label>
        <p>Harris begins with the two entities that are central for Foucault as well: discourse (<italic>discours</italic>) on the one hand and utterance (<italic>énoncé</italic>) on the other. This and other similarities prompt Thomas <xref ref-type="bibr" rid="ref-161971">Pavel (131)</xref> to draw a parallel between the conceptual apparatuses of Foucault and distributionalists such as Harris and his students. On this point, see also <xref ref-type="bibr" rid="ref-161941">Dosse 241</xref>.</p>
      </fn>
      <fn id="fn5">
        <label>5</label>
        <p>Stede also says: “Discourse processing is the acquisition of information about a text, including assigning structural descriptions to it, so that the extraction of information from a text becomes more interesting, more fruitful, or more simple.”</p>
      </fn>
      <fn id="fn6">
        <label>6</label>
        <p>The most recent version, GPT-3, was released in June 2020 <xref ref-type="bibr" rid="ref-161970">(OpenAI)</xref>. GPT-3’s full version has a capacity of 175 billion machine learning parameters and was trained on 410 billion byte-pair-encoded tokens.</p>
      </fn>
      <fn id="fn7">
        <label>7</label>
        <p>Recall Chomsky’s understanding of grammar, as a means to decide grammaticality, sufficiently specific to generate only sentences of the respective language, but also as a system whose rules can be identified by human introspection only <xref ref-type="bibr" rid="ref-161940">(Chomsky 13–14)</xref>.</p>
      </fn>
      <fn id="fn8">
        <label>8</label>
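        <p>This is a central goal of our project on “Process-oriented Discourse Analysis”, <ext-link ext-link-type="uri" xlink:href="https://gepris.dfg.de/gepris/projekt/326264959">https://gepris.dfg.de/gepris/projekt/326264959</ext-link>.</p>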
      </fn>
    </fn-group>
    <ref-list>
      <ref id="ref-161935">
        <element-citation publication-type="book">
          <source>Structure Discovery in Natural Language</source>
          <person-group person-group-type="author">
            <name>
              <surname>Biemann</surname>
              <given-names>Chris</given-names>
            </name>
          </person-group>
          <publisher-name>Springer Berlin Heidelberg</publisher-name>
          <date>
            <year>2012</year>
          </date>
          <isbn>9783642259227</isbn>
          <pub-id pub-id-type="doi">10.1007/978-3-642-25923-4</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-3-642-25923-4">https://doi.org/10.1007/978-3-642-25923-4</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161936">
        <element-citation publication-type="book">
          <source>Foucaults Archäologie des kulturellen Unbewussten: Zum Wissensarchiv und Wissensbegehren moderner Gesellschaften</source>
          <person-group person-group-type="author">
            <name>
              <surname>Bublitz</surname>
              <given-names>Hannelore</given-names>
            </name>
          </person-group>
          <publisher-name>Campus</publisher-name>
          <date>
            <year>1999</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161937">
        <element-citation publication-type="book">
          <source>Digital_Humanities</source>
          <person-group person-group-type="author">
            <name>
              <surname>Burdick</surname>
              <given-names>Anne</given-names>
            </name>
            <name>
              <surname>Drucker</surname>
              <given-names>Johanna</given-names>
            </name>
            <name>
              <surname>Lunenfeld</surname>
              <given-names>Peter</given-names>
            </name>
            <name>
              <surname>Presner</surname>
              <given-names>Todd</given-names>
            </name>
            <name>
              <surname>Schnapp</surname>
              <given-names>Jeffrey</given-names>
            </name>
          </person-group>
          <publisher-name>MIT Press</publisher-name>
          <date>
            <year>2012</year>
          </date>
          <isbn>9780262312103</isbn>
          <pub-id pub-id-type="doi">10.7551/mitpress/9248.001.0001</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.7551/mitpress/9248.001.0001">https://doi.org/10.7551/mitpress/9248.001.0001</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161938">
        <element-citation publication-type="book">
          <source>The French Historical Revolution: The Annales School, 1929-1989</source>
          <person-group person-group-type="author">
            <name>
              <surname>Burke</surname>
              <given-names>Peter</given-names>
            </name>
          </person-group>
          <publisher-name>Polity</publisher-name>
          <date>
            <year>1990</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161939">
        <element-citation publication-type="book">
          <source>La formation du concept de réflexe aux XVIIe et XVIIIe siècles</source>
          <person-group person-group-type="author">
            <name>
              <surname>Canguilhem</surname>
              <given-names>Georges</given-names>
            </name>
          </person-group>
          <publisher-name>Vrin</publisher-name>
          <date>
            <year>1977</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161940">
        <element-citation publication-type="book">
          <source>Syntactic Structures</source>
          <person-group person-group-type="author">
            <name>
              <surname>Chomsky</surname>
              <given-names>Noam</given-names>
            </name>
          </person-group>
          <publisher-name>Mouton</publisher-name>
          <date>
            <day>31</day>
            <month>12</month>
            <year>1957</year>
          </date>
          <isbn>9783112316009</isbn>
          <pub-id pub-id-type="doi">10.1515/9783112316009</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1515/9783112316009">https://doi.org/10.1515/9783112316009</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161941">
        <element-citation publication-type="book">
          <source>History of Structuralism</source>
          <person-group person-group-type="author">
            <name>
              <surname>Dosse</surname>
              <given-names>François</given-names>
            </name>
          </person-group>
          <publisher-name>University of Minnesota Press</publisher-name>
          <date>
            <year>1998</year>
          </date>
          <volume>2</volume>
        </element-citation>
      </ref>
      <ref id="ref-161942">
        <element-citation publication-type="article-journal">
          <article-title>Foucault, digital</article-title>
          <source>Meson</source>
          <person-group person-group-type="author">
            <name>
              <surname>Dotzler</surname>
              <given-names>Bernhard</given-names>
            </name>
            <name>
              <surname>Schmidgen</surname>
              <given-names>Henning</given-names>
            </name>
          </person-group>
          <date>
            <year>2022</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161943">
        <element-citation publication-type="book">
          <source>Die epistemologischen Jahre: Philosophie und Biologie in Frankreich, 1960-1980</source>
          <person-group person-group-type="author">
            <name>
              <surname>Erdur</surname>
              <given-names>Onur</given-names>
            </name>
          </person-group>
          <publisher-name>Chronos</publisher-name>
          <date>
            <year>2018</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161944">
        <element-citation publication-type="chapter">
          <chapter-title>A synopsis of linguistic theory 1930-1955</chapter-title>
          <source>Studies in Linguistic Analysis</source>
          <person-group person-group-type="author">
            <name>
              <surname>Firth</surname>
              <given-names>John Rupert</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <collab>The Philological Society</collab>
          </person-group>
          <publisher-name>Blackwell</publisher-name>
          <date>
            <year>1962</year>
          </date>
          <fpage>1</fpage>
          <lpage>32</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161948">
        <element-citation publication-type="chapter">
          <chapter-title>Jean Hyppolite. 1907-1968</chapter-title>
          <source>Dits et écrits</source>
          <person-group person-group-type="author">
            <name>
              <surname>Foucault</surname>
              <given-names>Michel</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Defert</surname>
              <given-names>Daniel</given-names>
            </name>
            <name>
              <surname>Ewald</surname>
              <given-names>François</given-names>
            </name>
          </person-group>
          <publisher-name>Gallimard</publisher-name>
          <date>
            <year>1994</year>
          </date>
          <volume>1</volume>
          <fpage>779</fpage>
          <lpage>785</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161946">
        <element-citation publication-type="chapter">
          <chapter-title>La psychologie de 1850 à 1950</chapter-title>
          <source>Dits et écrits</source>
          <person-group person-group-type="author">
            <name>
              <surname>Foucault</surname>
              <given-names>Michel</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Defert</surname>
              <given-names>Daniel</given-names>
            </name>
            <name>
              <surname>Ewald</surname>
              <given-names>François</given-names>
            </name>
          </person-group>
          <publisher-name>Gallimard</publisher-name>
          <date>
            <year>1994</year>
          </date>
          <volume>1</volume>
          <fpage>120</fpage>
          <lpage>137</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161947">
        <element-citation publication-type="chapter">
          <chapter-title>Message ou bruit?</chapter-title>
          <source>Dits et écrits</source>
          <person-group person-group-type="author">
            <name>
              <surname>Foucault</surname>
              <given-names>Michel</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Defert</surname>
              <given-names>Daniel</given-names>
            </name>
            <name>
              <surname>Ewald</surname>
              <given-names>François</given-names>
            </name>
          </person-group>
          <publisher-name>Gallimard</publisher-name>
          <date>
            <year>1994</year>
          </date>
          <volume>1</volume>
          <fpage>557</fpage>
          <lpage>560</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161945">
        <element-citation publication-type="book">
          <source>The Archaeology of Knowledge and The Discourse on Language</source>
          <person-group person-group-type="author">
            <name>
              <surname>Foucault</surname>
              <given-names>Michel</given-names>
            </name>
          </person-group>
          <publisher-name>Pantheon</publisher-name>
          <date>
            <year>1982</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161949">
        <element-citation publication-type="book">
          <source>The Order of Things: An Archaeology of the Human Sciences</source>
          <person-group person-group-type="author">
            <name>
              <surname>Foucault</surname>
              <given-names>Michel</given-names>
            </name>
          </person-group>
          <publisher-name>Routledge</publisher-name>
          <date>
            <year>2002</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161950">
        <element-citation publication-type="chapter">
          <chapter-title>Human Nature: Justice versus Power</chapter-title>
          <source>Reflexive Water: The Basic Concerns of Mankind</source>
          <person-group person-group-type="author">
            <name>
              <surname>Foucault</surname>
              <given-names>Michel</given-names>
            </name>
            <name>
              <surname>Chomsky</surname>
              <given-names>Noam</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Elders</surname>
              <given-names>Fons</given-names>
            </name>
          </person-group>
          <publisher-name>Souvenir Press</publisher-name>
          <date>
            <year>1974</year>
          </date>
          <fpage>133</fpage>
          <lpage>197</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161951">
        <element-citation publication-type="chapter">
          <chapter-title>Le quantitatif en histoire</chapter-title>
          <source>Faire de l'histoire</source>
          <person-group person-group-type="author">
            <name>
              <surname>Furet</surname>
              <given-names>François</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Le Goff</surname>
              <given-names>Jacques</given-names>
            </name>
            <name>
              <surname>Nora</surname>
              <given-names>Pierre</given-names>
            </name>
          </person-group>
          <publisher-name>Gallimard</publisher-name>
          <date>
            <year>1974</year>
          </date>
          <volume>1</volume>
          <fpage>42</fpage>
          <lpage>61</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161952">
        <element-citation publication-type="article-journal">
          <article-title>Textocracy, or, the cybernetic logic of French theory</article-title>
          <source>History of the Human Sciences</source>
          <person-group person-group-type="author">
            <name>
              <surname>Geoghegan</surname>
              <given-names>Bernard Dionysius</given-names>
            </name>
          </person-group>
          <date>
            <month>2</month>
            <year>2020</year>
          </date>
          <volume>33</volume>
          <issue>1</issue>
          <fpage>52</fpage>
          <lpage>79</lpage>
          <issn>0952-6951</issn>
          <pub-id pub-id-type="doi">10.1177/0952695119864241</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1177/0952695119864241">https://doi.org/10.1177/0952695119864241</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161953">
        <element-citation publication-type="article-journal">
          <article-title>Exploratory Search Pipes with Scoped Facets</article-title>
          <source>2019 ACM SIGIR International Conference on Theory of Information Retrieval</source>
          <person-group person-group-type="author">
            <name>
              <surname>Gollub</surname>
              <given-names>Tim</given-names>
            </name>
            <name>
              <surname>Hutans</surname>
              <given-names>Leon</given-names>
            </name>
            <name>
              <surname>Al Jami</surname>
              <given-names>Tanveer</given-names>
            </name>
            <name>
              <surname>Stein</surname>
              <given-names>Benno</given-names>
            </name>
          </person-group>
          <date>
            <day>26</day>
            <month>9</month>
            <year>2019</year>
          </date>
          <pub-id pub-id-type="doi">10.1145/3341981.3344247</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3341981.3344247">https://doi.org/10.1145/3341981.3344247</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161955">
        <element-citation publication-type="article-journal">
          <article-title>Discourse Analysis</article-title>
          <source>Language</source>
          <person-group person-group-type="author">
            <name>
              <surname>Harris</surname>
              <given-names>Zellig S.</given-names>
            </name>
          </person-group>
          <date date-type="publication-start">
            <month>1</month>
            <year>1952</year>
          </date>
          <date date-type="publication-end">
            <month>3</month>
            <year>1952</year>
          </date>
          <volume>28</volume>
          <issue>1</issue>
          <fpage>1</fpage>
          <issn>0097-8507</issn>
          <pub-id pub-id-type="doi">10.2307/409987</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2307/409987">https://doi.org/10.2307/409987</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161956">
        <element-citation publication-type="article-journal">
          <article-title>Distributional Structure</article-title>
          <source>Word</source>
          <person-group person-group-type="author">
            <name>
              <surname>Harris</surname>
              <given-names>Zellig S.</given-names>
            </name>
          </person-group>
          <date>
            <month>8</month>
            <year>1954</year>
          </date>
          <volume>10</volume>
          <issue>2-3</issue>
          <fpage>146</fpage>
          <lpage>162</lpage>
          <issn>0043-7956</issn>
          <pub-id pub-id-type="doi">10.1080/00437956.1954.11659520</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1080/00437956.1954.11659520">https://doi.org/10.1080/00437956.1954.11659520</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161954">
        <element-citation publication-type="book">
          <source>Methods in Structural Linguistics</source>
          <person-group person-group-type="author">
            <name>
              <surname>Harris</surname>
              <given-names>Zellig S.</given-names>
            </name>
          </person-group>
          <publisher-name>The University of Chicago Press</publisher-name>
          <date>
            <year>1951</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161957">
        <element-citation publication-type="chapter">
          <source>Michel Pêcheux: Automatic Discourse Analysis</source>
          <person-group person-group-type="editor">
            <name>
              <surname>Helsloot</surname>
              <given-names>Niels</given-names>
            </name>
            <name>
              <surname>Hak</surname>
              <given-names>Tony</given-names>
            </name>
          </person-group>
          <publisher-name>Rodopi</publisher-name>
          <date>
            <year>1995</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161958">
        <element-citation publication-type="article-journal">
          <article-title>Totale Bibliothek und Schreibmaschine. Zum Begriff der Streuung in Foucaults Diskursanalyse</article-title>
          <source>Figurationen</source>
          <person-group person-group-type="author">
            <name>
              <surname>Herrmann</surname>
              <given-names>Hans-Christian von</given-names>
            </name>
          </person-group>
          <date>
            <day>1</day>
            <month>12</month>
            <year>2015</year>
          </date>
          <volume>16</volume>
          <issue>2</issue>
          <fpage>62</fpage>
          <lpage>72</lpage>
          <issn>1439-4367</issn>
          <pub-id pub-id-type="doi">10.7788/figurationen-2015-0207</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.7788/figurationen-2015-0207">https://doi.org/10.7788/figurationen-2015-0207</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161959">
        <element-citation publication-type="book">
          <source>Macroanalysis: Digital Methods and Literary History</source>
          <person-group person-group-type="author">
            <name>
              <surname>Jockers</surname>
              <given-names>Matthew L.</given-names>
            </name>
          </person-group>
          <publisher-name>University of Illinois Press</publisher-name>
          <date>
            <day>1</day>
            <month>4</month>
            <year>2013</year>
          </date>
          <isbn>9780252037528</isbn>
          <pub-id pub-id-type="doi">10.5406/illinois/9780252037528.001.0001</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5406/illinois/9780252037528.001.0001">https://doi.org/10.5406/illinois/9780252037528.001.0001</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161960">
        <element-citation publication-type="book">
          <source>Who Wrote the Book of Life? A History of the Genetic Code</source>
          <person-group person-group-type="author">
            <name>
              <surname>Kay</surname>
              <given-names>Lily E.</given-names>
            </name>
          </person-group>
          <publisher-name>Stanford University Press</publisher-name>
          <date>
            <day>1</day>
            <month>3</month>
            <year>2000</year>
          </date>
          <isbn>9781503617575</isbn>
          <pub-id pub-id-type="doi">10.1515/9781503617575</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1515/9781503617575">https://doi.org/10.1515/9781503617575</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161961">
        <element-citation publication-type="article-journal">
          <article-title>The geometry of culture: Analyzing the meanings of class through word embeddings</article-title>
          <source>American Sociological Review</source>
          <person-group person-group-type="author">
            <name>
              <surname>Kozlowski</surname>
              <given-names>Austin C.</given-names>
            </name>
            <name>
              <surname>Taddy</surname>
              <given-names>Matt</given-names>
            </name>
            <name>
              <surname>Evans</surname>
              <given-names>James A.</given-names>
            </name>
          </person-group>
          <date>
            <day>25</day>
            <month>9</month>
            <year>2019</year>
          </date>
          <volume>84</volume>
          <issue>5</issue>
          <fpage>905</fpage>
          <lpage>949</lpage>
          <issn>0003-1224</issn>
          <pub-id pub-id-type="doi">10.1177/0003122419877135</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1177/0003122419877135">https://doi.org/10.1177/0003122419877135</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161962">
        <element-citation publication-type="book">
          <source>The Structure of Scientific Revolutions</source>
          <person-group person-group-type="author">
            <name>
              <surname>Kuhn</surname>
              <given-names>Thomas S.</given-names>
            </name>
          </person-group>
          <publisher-name>The University of Chicago Press</publisher-name>
          <date>
            <year>2012</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161963">
        <element-citation publication-type="article-journal">
          <article-title>La fin des érudits: L’historien de demain sera programmeur ou ne sera pas</article-title>
          <source>Le Nouvel Observateur</source>
          <person-group person-group-type="author">
            <name>
              <surname>Le Roy Ladurie</surname>
              <given-names>Emmanuel</given-names>
            </name>
          </person-group>
          <date>
            <day>8</day>
            <month>5</month>
            <year>1968</year>
          </date>
          <fpage>2</fpage>
          <lpage>3</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161964">
        <element-citation publication-type="book">
          <source>Le territoire de l’historien</source>
          <person-group person-group-type="author">
            <name>
              <surname>Le Roy Ladurie</surname>
              <given-names>Emmanuel</given-names>
            </name>
          </person-group>
          <publisher-name>Gallimard</publisher-name>
          <date>
            <year>1973</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161965">
        <element-citation publication-type="article-journal">
          <article-title>‘L’historien de demain sera programmeur’: Emmanuel Le Roy Ladurie et les défis de la science</article-title>
          <source>L’histoire à la BnF</source>
          <person-group person-group-type="author">
            <name>
              <surname>Lemny</surname>
              <given-names>Stefan</given-names>
            </name>
          </person-group>
          <date>
            <month>12</month>
            <year>2017</year>
          </date>
          <ext-link ext-link-type="uri" xlink:href="https://histoirebnf.hypotheses.org/1505">https://histoirebnf.hypotheses.org/1505</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161966">
        <element-citation publication-type="article-journal">
          <article-title>Quantitative Analysis of Culture Using Millions of Digitized Books</article-title>
          <source>Science</source>
          <person-group person-group-type="author">
            <name>
              <surname>Michel</surname>
              <given-names>Jean-Baptiste</given-names>
            </name>
            <name>
              <surname>Shen</surname>
              <given-names>Yuan Kui</given-names>
            </name>
            <name>
              <surname>Aiden</surname>
              <given-names>Aviva Presser</given-names>
            </name>
            <name>
              <surname>Veres</surname>
              <given-names>Adrian</given-names>
            </name>
            <name>
              <surname>Gray</surname>
              <given-names>Matthew K.</given-names>
            </name>
            <name>
              <surname>Pickett</surname>
              <given-names>Joseph P.</given-names>
            </name>
            <name>
              <surname>Hoiberg</surname>
              <given-names>Dale</given-names>
            </name>
            <name>
              <surname>Clancy</surname>
              <given-names>Dan</given-names>
            </name>
            <name>
              <surname>Norvig</surname>
              <given-names>Peter</given-names>
            </name>
            <name>
              <surname>Orwant</surname>
              <given-names>Jon</given-names>
            </name>
            <name>
              <surname>Pinker</surname>
              <given-names>Steven</given-names>
            </name>
            <name>
              <surname>Nowak</surname>
              <given-names>Martin A.</given-names>
            </name>
            <name>
              <surname>Aiden</surname>
              <given-names>Erez Lieberman</given-names>
            </name>
            <collab>The Google Books Team</collab>
          </person-group>
          <date>
            <day>14</day>
            <month>1</month>
            <year>2011</year>
          </date>
          <volume>331</volume>
          <issue>6014</issue>
          <fpage>176</fpage>
          <lpage>182</lpage>
          <issn>0036-8075</issn>
          <pub-id pub-id-type="doi">10.1126/science.1199644</pub-id>
          <pub-id pub-id-type="pmid">21163965</pub-id>
          <pub-id pub-id-type="pmcid">PMC3279742</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1126/science.1199644">https://doi.org/10.1126/science.1199644</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161968">
        <element-citation publication-type="book">
          <source>Distant Reading</source>
          <person-group person-group-type="author">
            <name>
              <surname>Moretti</surname>
              <given-names>Franco</given-names>
            </name>
          </person-group>
          <publisher-name>Verso</publisher-name>
          <date>
            <year>2013</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161967">
        <element-citation publication-type="book">
          <source>Graphs, Maps, Trees: Abstract Models for a Literary History</source>
          <person-group person-group-type="author">
            <name>
              <surname>Moretti</surname>
              <given-names>Franco</given-names>
            </name>
          </person-group>
          <publisher-name>Verso</publisher-name>
          <date>
            <year>2005</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161969">
        <element-citation publication-type="chapter">
          <chapter-title>Textklassifikation und Stilanalyse: Gedanken zur automatischen Beschreibung eines Produktes und seines Produktionsprozesses</chapter-title>
          <source>Literatur und Datenverarbeitung: Ein Tagungsbericht</source>
          <person-group person-group-type="author">
            <name>
              <surname>Müller</surname>
              <given-names>Werner</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Schanze</surname>
              <given-names>Helmut</given-names>
            </name>
          </person-group>
          <publisher-name>Niemeyer</publisher-name>
          <date>
            <year>1972</year>
          </date>
          <fpage>160</fpage>
          <lpage>187</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161970">
        <element-citation publication-type="article-journal">
          <article-title>Language Models are Few-Shot Learners</article-title>
          <source>CoRR</source>
          <person-group person-group-type="author">
            <collab>OpenAI</collab>
          </person-group>
          <date>
            <year>2020</year>
          </date>
          <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2005.14165v4">https://arxiv.org/abs/2005.14165v4</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161971">
        <element-citation publication-type="book">
          <source>Le Mirage linguistique</source>
          <person-group person-group-type="author">
            <name>
              <surname>Pavel</surname>
              <given-names>Thomas</given-names>
            </name>
          </person-group>
          <publisher-name>Gallimard</publisher-name>
          <date>
            <year>1988</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161972">
        <element-citation publication-type="article-journal">
          <article-title>Recent Quantitative Work in History: A Survey of the Main Trends</article-title>
          <source>History and Theory</source>
          <person-group person-group-type="author">
            <name>
              <surname>Price</surname>
              <given-names>Jacob M.</given-names>
            </name>
          </person-group>
          <date>
            <year>1969</year>
          </date>
          <volume>9</volume>
          <fpage>1</fpage>
          <issn>0018-2656</issn>
          <pub-id pub-id-type="doi">10.2307/2504167</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2307/2504167">https://doi.org/10.2307/2504167</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161973">
        <element-citation publication-type="article-journal">
          <article-title>Contextual Correlates of Synonymy</article-title>
          <source>Communications of the ACM</source>
          <person-group person-group-type="author">
            <name>
              <surname>Rubenstein</surname>
              <given-names>Herbert</given-names>
            </name>
            <name>
              <surname>Goodenough</surname>
              <given-names>John B.</given-names>
            </name>
          </person-group>
          <date>
            <month>10</month>
            <year>1965</year>
          </date>
          <volume>8</volume>
          <issue>10</issue>
          <fpage>627</fpage>
          <lpage>633</lpage>
          <issn>0001-0782</issn>
          <pub-id pub-id-type="doi">10.1145/365628.365657</pub-id>
          <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/365628.365657">https://doi.org/10.1145/365628.365657</ext-link>
        </element-citation>
      </ref>
      <ref id="ref-161974">
        <element-citation publication-type="book">
          <source>Discourse Processing: Synthesis Lectures on Human Language Technologies</source>
          <person-group person-group-type="author">
            <name>
              <surname>Stede</surname>
              <given-names>Manfred</given-names>
            </name>
          </person-group>
          <publisher-name>Morgan &amp; Claypool</publisher-name>
          <date>
            <year>2011</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161975">
        <element-citation publication-type="chapter">
          <chapter-title>The Example: Some Historical Considerations</chapter-title>
          <source>Between the Humanities and the Digital</source>
          <person-group person-group-type="author">
            <name>
              <surname>Sterne</surname>
              <given-names>Jonathan</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name>
              <surname>Goldberg</surname>
              <given-names>David Theo</given-names>
            </name>
            <name>
              <surname>Svensson</surname>
              <given-names>Patrik</given-names>
            </name>
          </person-group>
          <publisher-name>MIT Press</publisher-name>
          <date>
            <year>2015</year>
          </date>
          <fpage>17</fpage>
          <lpage>33</lpage>
        </element-citation>
      </ref>
      <ref id="ref-161976">
        <element-citation publication-type="book">
          <source>The General Inquirer: A Computer Approach to Content Analysis</source>
          <person-group person-group-type="author">
            <name>
              <surname>Stone</surname>
              <given-names>Philip J.</given-names>
            </name>
            <name>
              <surname>Dunphy</surname>
              <given-names>Dexter</given-names>
            </name>
            <name>
              <surname>Smith</surname>
              <given-names>Marshall S.</given-names>
            </name>
            <name>
              <surname>Ogilvie</surname>
              <given-names>Daniel M.</given-names>
            </name>
          </person-group>
          <publisher-name>MIT Press</publisher-name>
          <date>
            <year>1966</year>
          </date>
        </element-citation>
      </ref>
      <ref id="ref-161977">
        <element-citation publication-type="article-journal">
          <article-title>A Genealogy of Distant Reading</article-title>
          <source>Digital Humanities Quarterly</source>
          <person-group person-group-type="author">
            <name>
              <surname>Underwood</surname>
              <given-names>Ted</given-names>
            </name>
          </person-group>
          <date>
            <year>2017</year>
          </date>
          <volume>11</volume>
          <issue>2</issue>
        </element-citation>
      </ref>
      <ref id="ref-161978">
        <element-citation publication-type="book">
          <source>Foucault’s Archaeology: Science and Transformation</source>
          <person-group person-group-type="author">
            <name>
              <surname>Webb</surname>
              <given-names>David</given-names>
            </name>
          </person-group>
          <publisher-name>Edinburgh University Press</publisher-name>
          <date>
            <year>2013</year>
          </date>
        </element-citation>
      </ref>
    </ref-list>
  </back>
</article>
