<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "https://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1-mathml3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.2" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">1832</journal-id>
      <journal-title-group>
        <journal-title>Journal of Cultural Analytics</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2371-4549</issn>
      <publisher>
        <publisher-name>Center for Digital Humanities, Princeton University</publisher-name>
      </publisher>
      <self-uri xlink:href="https://culturalanalytics.org/">Website: Journal of Cultural Analytics</self-uri>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">11882</article-id>
      <article-id pub-id-type="doi">10.22148/001c.11882</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Topic Modeling the Hàn diăn Ancient Classics (汉典古籍)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Allen</surname>
            <given-names>Colin</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Luo</surname>
            <given-names>Hongliang</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Murdock</surname>
            <given-names>Jaimie</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Pu</surname>
            <given-names>Jianghuai</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Wang</surname>
            <given-names>Xiaohong</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Zhai</surname>
            <given-names>Yanjie</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Zhao</surname>
            <given-names>Kun</given-names>
          </name>
        </contrib>
      </contrib-group>
      <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2017-10-12">
        <day>12</day>
        <month>10</month>
        <year>2017</year>
      </pub-date>
      <pub-date publication-format="electronic" date-type="collection" iso-8601-date="2021-05-03">
        <year>2017</year>
      </pub-date>
      <volume>2</volume>
      <issue seq="4">1</issue>
      <issue-title>Articles in 2017</issue-title>
      <elocation-id>11882</elocation-id>
      <permissions>
        <license license-type="open-access">
          <ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/">http://creativecommons.org/licenses/by/4.0</ali:license_ref>
          <license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0">Creative Commons Attribution License (4.0)</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
        </license>
      </permissions>
      <self-uri content-type="pdf" xlink:href="https://culturalanalytics.org/article/11882.pdf"/>
      <self-uri content-type="xml" xlink:href="https://culturalanalytics.org/article/11882.xml"/>
      <self-uri content-type="json" xlink:href="https://culturalanalytics.org/article/11882.json"/>
      <self-uri content-type="html" xlink:href="https://culturalanalytics.org/article/11882"/>
      <abstract>
        <p>There is a small but growing literature on large-scale statistical modeling of Chinese-language texts. Ouyang analyzed a corpus of over 40,000 ancient documents downloaded from multiple sources, using it to plot the temporal distributions of word frequencies and the geographic distributions of authors. Huang and Yu modeled the SongCi poetry corpus, first converting it to tonally marked pinyin to conserve poetically important pronunciation information. Nichols and colleagues reported initial modeling of the Chinese Text Project corpus in a conference paper. (Further below, we describe differences between this corpus and the Handian.) With additional collaborators, this group has now conducted two studies that are currently unpublished but under review. In the first, they apply topic models to address scholarly questions about the relationships among important texts of ancient Chinese philosophy. In the second, they use topic modeling to investigate the concepts of mind and body in ancient Chinese philosophy. Although we share similar scholarly objectives with these researchers, our approach in this paper is unique in that, for the first time, it brings the benefits of computational modeling of ancient Chinese texts to a robust public platform that is mirrored on both sides of the Pacific. Beyond serving as a useful portal to the texts, our approach foregrounds the interpretive issues surrounding topic models and makes more sophisticated exploration and analysis of interpretive questions possible for experts and novices alike.</p>
      </abstract>
      <kwd-group>
        <kwd>literature</kwd>
        <kwd>east asian</kwd>
        <kwd>topic modeling</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
