Culture is a nuanced term that is hard to define conclusively. It is largely used to refer to the totality of values, basic assumptions and life orientations, beliefs, policies, procedures, and codes of conduct. These are shared by a group of people and affect their behaviour and the way they lead their lives. Cultural Analytics (CA), as a field, examines vast amounts of cultural data (books, images, newspapers, music, literature, etc.) to derive culturally relevant insights. Various methods and techniques (natural language processing, network analysis, visualization, and data mining) are applied to cultural components so that they are useful for research in the humanities. CA can help identify the behavioural components of human cultures and provide an accurate insight into the degree to which people conform to the current or target culture. It utilizes the corpora, metadata, and tools of text and image analysis to provide meaningful insights into the subject of research.

Cultural Analytics enables scholars to conduct research on unprecedented quantitative scales, but the vast digitized Arabic corpus has largely gone untapped. Most research projects in Digital Humanities have focused on Western Europe and the Americas, and the available tools have been developed and trained mainly to handle Latin scripts. This gap between the Arabic text corpus and CA is in fact part of a wider gap between Western and non-Western CA, making this project both an exercise in catching up and a pioneering foray into uncharted waters.

This special issue is the outcome of a research project I led, titled “The Computational Study of Culture: Cultural Analytics for Modern Arab and Muslim Studies,” which was funded by the Qatar National Research Fund, National Priorities Program cycle no. 10. The project uses Cultural Analytics (CA) approaches to digitize and annotate a large archive of machine-readable Arabic-language cultural texts from the late nineteenth century to today, making these documents useful for digital humanities research. These sources include the magazines Al-Manar, Al-Risala, Al-Muqtabas and Al-Machriq. Through topic modelling, for example, the project captures a main representational paradigm framing identity formation in inter-war and post-war Egypt, with specific reference to the journalistic articles in Al-Manar and Al-Risala, the two leading and most effective platforms at the time. We approach the identitarian discursive practices as a cultural continuum, rather than as separate moments in Arabo-Islamic Egyptian history. Through a blending of digital (distant) reading and close analytical reading of articles addressing the topic, we try to nuance the almost settled readings of the two luminary figures behind these magazines in the Arab Nahda by broadening the perspective.

A corpus of 2.3 billion words comprising 9000 books and 5 newspapers and magazines has been compiled. A subset of this corpus has been annotated for morphological segmentation, part of speech tagging, named entity recognition and sentiment analysis. The annotated corpus is used for building predictive models. The approach we employed in our research is both quantitative and qualitative. At the quantitative level, we use tools and algorithms from computational linguistics, data mining, and information extraction to extract useful information from our corpus. This information is then used to build a graph that is used for network analysis. The qualitative side of the analysis makes use of the graph extracted in the first step, to gain, and provide, meaningful insights that can be used in the historical and socio-political aspect of the project.

Figure 1. The project’s NLP pipeline

Drawing on computational analysis, specifically topic modelling, we probe relevant thematic discussions on the conceptualization of race, language, culture and identity by leading Arab-Islamic intelligentsia at a foundational moment that paved the way for Arab modernity. Here, topic modelling serves as a statistical tool for handling big data that is almost impossible to process through conventional methods (in this case, thousands of pages of both Al-Manar and Al-Risala). It involves segmenting texts and revealing their underlying cultural patterns through the calculation of specific topic models, the measurement of topical associations or regularities, and the capture of coherence values, all coupled with interpretation of valence and quality through conventional close readings. It is likewise used to identify variables or features in social science models. In other words, it is a tool for amplified reading, for seeing large themes in a massive group of texts that are almost impossible to read individually. We recognize the ongoing challenge in the field of examining complex ideas like colonialism, nationalism or racism through distant and close readings, given the intrinsic difficulty of pinpointing such slippery or nuanced terms; nevertheless, the capability to generalize across substantially broader aspects of culture is crucial. We believe that – through blending the empirical and the theoretical and through the analytical lens of our close readings – such a tool can help in detecting the underlying patterns and collective frames surrounding the issues under discussion. At the least, it gives indications and thus sustains specific anticipations about the future of such discourses and their consequent realities on the ground. It is so far an optimal tool for addressing big data within a reasonable span of time.
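To make the mechanics of topic modelling concrete, the following is a minimal sketch of Latent Dirichlet Allocation fitted with a collapsed Gibbs sampler, written with the Python standard library only. It is an illustration under stated assumptions, not the project's pipeline: the function name lda_gibbs, the hyperparameter values, and the transliterated toy tokens are all invented for this example, and a real study would work on morphologically processed Arabic text with a mature library.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized).

    docs is a list of token lists. Returns (theta, top_words), where
    theta[d][k] is the estimated share of topic k in document d and
    top_words[k] lists up to three of the most frequent words in topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per topic, per document
    topic_word = [Counter() for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                        # total tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    # Resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(row[k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for row, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [w for w, c in topic_word[k].most_common() if c > 0][:3]
        for k in range(num_topics)
    ]
    return theta, top_words


# Invented toy "documents" of transliterated tokens (not drawn from the corpus).
docs = [
    ["umma", "khilafa", "islam", "umma", "khilafa"],
    ["lugha", "adab", "shir", "lugha", "adab"],
    ["umma", "islam", "khilafa", "islam"],
    ["adab", "shir", "lugha", "shir"],
]
theta, top_words = lda_gibbs(docs, num_topics=2)
```

Each row of theta is a probability distribution over topics for one document, and top_words gives a first, rough label for each topic. Interpreting those word clusters is exactly where the close reading described above comes in.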

The project seeks to bridge the Arabic part of the gap by building on the expertise of NLeSC and CLARIAH to develop a set of basic and more advanced tools for Arabic text-mining and computational analysis, to be used in our project case studies. The platform developed by this project is the first smart platform in Arabic. It hosts hundreds of thousands of materials. Being vast and smart, the project’s platform is expected to be a unique resource for researchers in the humanities and social sciences. One major advantage of the platform is that much of the current qualitative research in the humanities will turn (at least partially) quantitative. Instead of doing research on a small scale, researchers can now make use of a large dataset with unique annotation. Adding metadata to the platform is the single most important and challenging part. While a collection may be useful on its own, an annotated collection is far superior, as it lends itself easily to questions never considered before.

The project made a major contribution to comparative cultural studies that critically re-evaluate the place of the West in relation to “the East,” and vice versa. Theoretically, and by applying a transcultural lens that highlights the critical fault lines in canonical, Western-centric concepts and theories, this project constituted an attempt to intervene significantly in previously Western-dominated debates regarding a range of foundational concepts: race/ethnicity, the public sphere, social movements, citizenship, collective memory, and identity, among others. Methodologically, the project is profoundly collaborative, deploying the expertise of a number of scholars from different continents and disciplinary traditions. And empirically, this project made a major contribution to public and scholarly knowledge through collection and curation of archival material through digital methods.

This special issue explores how we can improve current performance in terms of understanding Arab cultural and social change and how we can implement behavioural changes or transformations more smoothly. The articles in this special issue use digitized printed texts, including books, journals, and printed ephemera to expand our knowledge of the key players in the printed world during a crucial period of modern Arab history. The combination of these sources will illuminate the intellectual trajectories of those who produced such texts and their transnational formal and informal networks.

This special issue is highly multidisciplinary by nature. It combines the efforts of researchers from the humanities, social science, and computational linguistics, bringing together scholars with unique expertise and skills in the field of Arabic digital humanities to expand the current state of knowledge for researchers in history, cultural studies, sociology, and the arts. The aim is to bring together professionals to discuss their ongoing project paradigms. Examples of their work will be examined for their efficiency and added value to knowledge in their fields of study related to Arab culture, in order to map out the trends and fluctuations that enable us to better understand current phenomena in Arab culture.

Culture can act as an archetypal example or pattern that provides a model for the disparate parts and brings them together to form a harmonious whole. In this way, a culture is created that is imposed by one group or community on another. The concept of culture varies, however, from an Eliotean “whole way of life” to an Arnoldean acquaintance “with the best that has been known and said in the world, and thus with the history of the human spirit.” In his book, The Machine in the Garden, Leo Marx argues the centrality of American “high” culture to the sustenance of human values.

This special issue uses the theoretical framework described above to look at the minorities in the Arab world (especially in Egypt) and the cultural and political constructions of empire. Some of the articles in this issue analyze their approaches and their arguments about the nature and functioning of hegemony and/or imperialist logic. They use computational tools to explore the role of culture in this, and how Arab Jews, for example, differ in their models and approaches. As indicated by Edward W. Said in Culture and Imperialism, the totality of race, gender, class, etc. constitutes the main constituent factors of culture, and of empire as a political and economic entity. More specifically, Said links the above-mentioned phases of culture, literature, and the intelligentsia to colonization and racism. Literary and artistic works as cultural products are linked to society in general, in the past or in the present.

Such a critical context helps in unraveling and nuancing the concepts of culture, language, and race in Arab Nahḍah discourse within the framework of collective identity discourses. This includes a discussion of the representation of ethnic minorities (particularly the Jews) as embedded in this discourse of collective Arab identity, in comparison to the discourse of race and racism. This perspective acknowledges a broader sense of racial discrimination that is not limited to essentialist prejudice or discrimination predicated on biological differences but expands the taxonomy of race to include prejudice or discrimination based on cultural differences. Indeed, such culture(religion)-based prejudice is inseparable from racism, given that cultural and color racism are sometimes intertwined when cultural identity is a marker of race (Islamic, Christian/Coptic, and Jewish). So, the discourse of racialism is embedded in cultural and economic sociology, which are as essential as the biological element itself. In his definition of the constituents of racism in Britain, Tariq Modood notes that “in the long history of racism it is nineteenth-century biologism that is the exception, and certainly Europe’s oldest racisms, anti-Semitism and Islamophobia, are culturalist.” These racial manifestations shall be discussed as engaged in Rashid Riḍā’s al-Manār and Aḥmad Ḥassan al-Zayyāt’s al-Risālah. Both magazines are selected for articles in this Special Issue due to their significance as platforms for two leading revivalist projects, Islamism and liberalism, and to their reconciliatory/hybridist approach that lent itself to conflicting ideologies in the Arab world of their time as well as today. The relevance of the works of Riḍā and al-Zayyāt is reflected in the fact that they are still invoked as influential figures by both the Salafis and the secularists, or the traditionalists and the progressive modernists, in the major discourse on Arab identity.

Here it becomes essential to explore the role of “organic” intellectuals (to borrow from Gramsci) in confronting cultural imperialism and maintaining native language and culture to counter the hegemonic culture of the empire. Such intellectuals help in unravelling this intricate web of identities against the troubling process of racialization that disconnects minorities from their cultural citizenship. A relevant example is the pervasive racial profiling of the Jews within Arab intellectual and cultural circles, which involves a convergence that blurs the lines between reality and perception. Arab Jews undergo a comparable process of racialization, which raises fundamental questions about belonging and citizenship. This racialization in Egypt, fundamentally flawed in its epistemological underpinnings, stems from a skewed interpretation of citizenship within diverse Arab societies. In this context, we assert that the concept of transcultural citizenship emerges as an empowering framework that defines belonging based on lived experiences. It defies the challenges of Islamophobia in the United States, whether driven by racialized nationalism or restrictive multiculturalism. Consequently, transculturality becomes a tool of resistance for minority subjectivity, countering the state’s hegemonic subjectification (in the Foucauldian sense).

This definition aligns with the transnational context of anthropologist Aihwa Ong’s notion of cultural citizenship, a dual process of self-realization and external shaping within the realms of power connected to the nation-state and civil society. Furthermore, it extends Sunaina Maira’s concept of “flexible citizenship,” a manifestation that responds to shifts in the institution of citizenship within nation-states and changing power dynamics on both national and global scales. Arab Jewish transcultural identity emerges organically, albeit not without critical self-examination of its limitations. It reflects Arab Jews’ complex attitude toward the Arab world, where they navigate a paradoxical relationship with their Jewish heritage while simultaneously embracing it. In essence, this narrative underscores the potential of transculturality to foster change through dialogue and conversation, acting as a potential remedy for both external colonization and local extremism. Only through these interactions can we truly envision and enact meaningful transformations.

In the pre-Nahḍah era, magazines were the locomotive of cultural change and enlightenment in an Arab world suffering from both vicious colonialism and oppressive monarchy. These magazines brought dynamism to a largely stagnant Arabo-Islamic world. They also served as a platform not only for interreligious dialogues, debates and disputes, but also for a diverse process of translating world literature. No wonder that major intellectuals of the time contributed to these magazines and added to their lively and innovative discussions. Both al-Manār and then al-Risālah played key roles in reflecting the spirit of their times and in enlivening the cultural sphere in Egypt and, to some extent, in the rest of the Arab Islamic world, given their circulation outside Egypt. Here, the cultural analytical approach, which focuses on the emerging collective patterns, is consistent with the emerging collective sense of identity observed during the recent Arab uprisings that swept the Arab world a decade ago.

Egypt was witnessing a situation of cultural upheaval triggered by the emergence of dynamic cultural journalism. Magazines were the fulcrum of change, in literary and cultural tastes, reflecting the mood of the times, and of course broadening the base of the reading public and driving the process of modernization. Magazines then, largely driven by individual projects, defied institutionalization, in the Foucauldian sense, and – by developing autonomy – served as an arena for creativity and accelerated cultural change. Both Riḍā and al-Zayyāt assumed an all-encompassing hybridist discourse that transcended the dilemmatic binaries in religion, culture and politics of the time. Riḍā, for example, did not follow any particular school of jurisprudence but was – to quote Leor Halevi in Modern Things on Trial – eclectic in his “laissez-faire Salafism.” Similarly, al-Zayyāt was an advocate of multiculturalism and literary syncretism, particularly through his translations of English and French canonical literature into Arabic.

It should be noted, however, that the two journals, like the Nahḍah project in the Arab world, which was nipped in the bud, were individual initiatives that were discontinued with the death of their owners and main editors. Here springs the felicity of the revival of their legacy through the current project. The two men, Riḍā and al-Zayyāt, dreamed of an Arabo-Islamic world united under a common roof. For Riḍā, this was initially the Ottoman Caliphate. Then, after the rise of radical and purist movements, such as Pan-Turanism and Pan-Turkism, the project of Ottoman-led Pan-Islamism gave way to a kind of Arab Pan-Islamism. Interestingly, Riḍā’s Arab Pan-Islamism was itself a nativist version of Islam in the face of a then emerging Pharaonic or Coptic nationalism, not to mention liberalism or secularism.

The article by Umar Ryad and Emad Mohamed titled, “A Topic Modelling of Muslim Religious Reform in the Colonial Age: A Computational and Digital Study of al-Manār (1898-1935)” serves as a test case illustrating the synergies between computational sciences and Islamic and Arabic studies in addressing cultural, religious, and historical inquiries pertaining to the Arab and Muslim world. The central focus revolves around exploring computational methods to systematically trace, quantify, and explicate the evolution of religious concerns within the renowned Muslim reformist journal, al-Manār (Lighthouse). This journal was published by the influential Muslim reformer Muhammad Rashīd Riḍā (1865-1935) from 1898 to 1935 in Cairo. The methodology employed encompasses both quantitative and qualitative approaches, utilizing the al-Manār-corpus. Morphological processing and topic modelling techniques are applied to analyze thematic co-occurrences of topics and lexemes relevant to Muslim thought and societies during Riḍā’s era. Through this interdisciplinary framework, the paper aims to shed light on the computational tools’ efficacy in unraveling and understanding the nuanced development of religious reformist discourse as encapsulated in the pages of al-Manār. The exploration of these methods provides a novel perspective on the intersection of computational sciences and Islamic and Arabic studies in unraveling cultural and historical dimensions.

Eid Mohamed and Talaat Mohamed, in their article titled “Racio-national Imaginary and Discursive Formation of Arabo-Islamic Identity in al-Manār and al-Risālah: An LDA Topic Modeling Study,” delve into the intricacies of culture, language, and race, which serve as pivotal elements in the discourse surrounding Pan-Islamist/Pan-Arabist national identification in Egypt during a period of profound transformation in the political and social landscape, setting the groundwork for the subsequent century. The methodology employed adopts a computational analysis approach, leveraging topic modeling to examine thematic discussions that revolve around the conceptualization of race, language, culture, and identity by prominent Arab-Muslim intelligentsia. This exploration unfolds during a foundational moment that laid the groundwork for the Arab Nahḍah or modernity. The focus of the analysis is on tracing the intellectual development within the writings of two key figures: Muḥammad Rashīd Riḍā (1865-1935), whose works featured prominently in the magazine he edited, al-Manār (The Lighthouse, 1898-1935), and Aḥmad Ḥassan al-Zayyāt (1885-1968), the editor of al-Risālah (The Message, 1933-1953), a weekly magazine, both based in Cairo, Egypt. The study reveals that both figures aimed to foster a predominantly hybridized discourse, encompassing Islamist and Arabist perspectives, as manifested in the clustered paradigms of modeled topics. Ultimately, the analysis contributes to a deeper understanding of the intellectual landscape during this transformative era in Egyptian history.

The primary objective of the article titled, “Gender-Based Conceptual Reasoning Detection for the Cultural Arabic Text” by Raheem Sarwar and Emad Mohamed, is to explore the distinctions in queries raised by individuals of different genders within a religious context. Additionally, they aim to discern whether it is feasible to forecast the popularity of answers and identify the factors contributing to answer popularity. Their methodology involves the creation of a novel dataset comprising 40,000 questions and answers, annotated with gender and popularity details sourced from online question-answering platforms. Employing advanced Arabic text pre-processing techniques, they extensively employ machine learning algorithms in their experimental studies for two main tasks: predicting asker gender and forecasting answer popularity. Their investigation also extends to thematic gender variation analysis, addressing crucial research questions that augment current knowledge. Specifically, they delve into understanding the disparities between questions posed by women and men, explore the automated classification of questions based on gender, and assess the predictability of fatwa popularity, unraveling the elements contributing to the popularity of a fatwa. The outcome of their experimental analysis showcases promising results: they achieve a 98% accuracy in predicting gender, a minimal Mean Absolute Error in predicting views (popularity), and an insightful identification of topics and their associations, elucidating their gender-related relevance. To further contribute to the research community, the authors intend to make the dataset and source code publicly accessible.

The article titled “Neither Corpus Nor Edition: Building a Pipeline to Make Data Analysis Possible on Medieval Arabic Commentary Traditions,” by Cornelis van Lit and Dirk Roorda, enriches the issue by developing a suite of Python tools specifically tailored for the efficient analysis of text reuse and intertextuality within a distinct category of medieval Arabic texts, namely commentaries available in print. Their process involves taking these printed editions, scanning them, pre-processing the images, running them through an OCR engine, cleaning the output, and organizing it into a data structure that mirrors the explicit intertextual relationships among the texts. Subsequently, they conduct extensive data analysis using these structured datasets. In the realm of digital approaches to medieval Arabic texts, existing methods have predominantly fallen into two categories. On one hand, there is the micro-level approach, commonly referred to as a ‘digital edition,’ where individual texts are digitally represented with dense annotations, typically in TEI-XML. On the other hand, there is the macro-level approach, termed a ‘digital corpus,’ comprising thousands of loosely encoded and sparsely annotated plain text files. These two approaches differ significantly in scale, with the micro-level dealing with tens of thousands of words designed primarily for human readability, and the macro-level extending to over a billion words, prioritizing machine readability. In their pursuit, the authors have aimed to establish a meso-level of digital analysis, neither strictly an edition nor a corpus. This involves dealing with a group of texts ranging from hundreds of thousands to millions of words. Striking a balance, the authors allow for a small but perceptible margin of error, incorporating light annotations geared towards machine readability. This approach maintains opportunities for visual inspection and manual correction.
In this paper, the authors elucidate the rationale behind their methodology, highlight the technical achievements it has yielded, and share the results obtained thus far.

Mai Zaki and Emad Mohamed, in their article titled “Two translations of Mahfouz’s Awlad Haratina (Children of the Alley): A computational-stylistic analysis,” embark on a comparative exploration of two English translations of Naguib Mahfouz’s contentious novel, Awlad Haratina (Children of Our Alley), employing computational-stylistic analysis. The objective is twofold: firstly, to demonstrate how quantifiable computational and distant reading techniques can unveil patterns of stylistic disparities between the two translations, and secondly, to contextualize these findings within the broader societal backdrop surrounding the English translations by Stewart (1981) and Theroux (1996) of this renowned modern Arabic novel. The outcomes of the analysis reveal distinctive linguistic patterns in each translation, underscoring variations in lexical variety and richness, sentence structure, readability level, and stylometric aspects, along with certain lexical choices. These results provide valuable insights when interpreted within the social milieu surrounding the creation of these two translations. The study contributes to a nuanced understanding of translation styles by incorporating computational methodologies, shedding light on the intricate interplay between language, culture, and the socio-cultural context in which translations emerge.
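As a rough illustration of what measures such as lexical variety and sentence structure can look like in practice, the sketch below computes a type-token ratio and a mean sentence length for two short passages. It is a hedged example only: the function name style_profile and the two sample sentences are invented for this illustration and are not taken from either translation, and the article's own feature set and tooling may differ.

```python
import re


def style_profile(text):
    """Return a few simple stylistic measures for an English text."""
    # Naive sentence split on terminal punctuation; adequate for a toy example.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    types = set(tokens)
    return {
        "tokens": len(tokens),
        "types": len(types),
        # Lexical variety: distinct word forms divided by running words.
        "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
        # Sentence structure proxy: average number of words per sentence.
        "mean_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
    }


# Invented sample passages, standing in for two renderings of the same source.
sample_a = "The alley was quiet. The alley was dark."
sample_b = "Silence filled the narrow lane. Darkness followed soon after."

profile_a = style_profile(sample_a)
profile_b = style_profile(sample_b)
```

Here sample_a repeats its vocabulary (5 distinct forms over 8 words) while sample_b does not (9 over 9), so the second scores higher on lexical variety. Because the type-token ratio falls as texts grow longer, comparisons are only meaningful between passages of similar length.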

The Special Issue concludes with an article titled, “Poet-Composer Collaborations in Egyptian Song: A Social Network Analysis Approach to Egypt’s Musical History” by Michael Frishkopf that fascinatingly underscores music as a profoundly relational art form, operating on an extensive scale where interactions among various stakeholders, including composers, poets, arrangers, conductors, performers, and producers, shape a complex network of relationships. Recognizing the multifaceted nature of music history, which evolves through these intricate connections over time, the focus is on song production, particularly the collaborations between poets and composers. While traditional music histories often concentrate on a limited number of linear narratives, neglecting the broader network, this article contends that a comprehensive understanding of musical history demands a big-data empirical approach. In the context of Arab music, the article critiques the tendency to spotlight a few prominent stars, repeating their stories without fresh empirical research. Emphasizing that many influential musical figures may not be widely recognized celebrities, the article argues that the intricate, non-linear network of musical history cannot be adequately represented through select linear narratives. It advocates for a big-data empirical approach with a focus on a large network of collaborative relationships, leveraging Social Network Analysis (SNA) tools. The proposed methodology centers on applying SNA to a substantial dataset of poet-composer collaborations, unveiling the social structure embedded in the history of song. The interpretation of this structure is then contextualized within broader socio-cultural and historical factors. Demonstrating the step-by-step application of this methodology using a comprehensive database of Egyptian songs, the article not only contributes insights into Egypt’s musical history but also proposes a model that can be adapted to other domains. 
In essence, the article argues that quantitative, algorithmic methods such as Social Network Analysis, and qualitative interpretive methods intrinsic to humanistic research, are not mutually exclusive or contradictory. Instead, they harmoniously blend, each complementing and guiding the other in the pursuit of a richer understanding of musical history.
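The core move of a Social Network Analysis of collaborations can be sketched in a few lines. The example below builds a weighted poet-composer network from a list of song credits and counts each person's distinct collaborators. It is a minimal sketch under stated assumptions: the names are placeholders, the five records are invented, and the article's actual database, measures, and software are not reproduced here.

```python
from collections import Counter, defaultdict

# Invented song credits as (poet, composer) pairs; one entry per song.
songs = [
    ("Poet A", "Composer X"),
    ("Poet A", "Composer X"),
    ("Poet A", "Composer Y"),
    ("Poet B", "Composer Y"),
    ("Poet C", "Composer Y"),
]

# Edge weight: how many songs each poet-composer pair produced together.
edge_weights = Counter(songs)

# Adjacency sets: who has worked with whom, ignoring repeat collaborations.
partners = defaultdict(set)
for poet, composer in edge_weights:
    partners[poet].add(composer)
    partners[composer].add(poet)

# Degree: number of distinct collaborators per person.
degree = {person: len(others) for person, others in partners.items()}
most_connected = max(degree, key=degree.get)
```

In this toy network "Composer Y" has the most distinct collaborators (three poets), while the pair "Poet A" and "Composer X" has the heaviest tie (two songs). Even these two simple measures separate breadth of collaboration from intensity, a distinction that linear narratives about a few stars tend to miss.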

This Special Issue on the potential and limits of Arabic Digital Humanities reflects the work of several scholars in undertaking the collection, digitization, annotation, clustering and analysis of an extensive collection of modern Arabic cultural data. In order to derive culturally relevant insights from this corpus, the issue’s articles apply various methods and techniques (natural language processing, network analysis, visualization, and data mining) tailored to the analysis of Arabic cultural texts in a manner that makes them useful for humanities research and for the technical training of Machine Learning tools. The project pursues a historical understanding of how some major themes that pertain to modern Arabic culture, such as the East-West encounter, modernism, liberalism, and the discourse about gender, race and identity, to name a few, have developed over the years, and how they play out through the prism of computational tools.


This Special Issue was made possible by NPRP grant NPRP10-0115-170163 from the Qatar National Research Fund (a member of Qatar Foundation), led by Dr. Eid Mohamed. The findings achieved herein are solely the responsibility of the authors.