
Language Across Cultures

This project is a collaboration with the University of Texas at Austin and Cornell University.


This project uses computerized text analysis tools to investigate language and discourse patterns in English and Arabic texts. Specifically, the researchers analyze discourse patterns in corpora such as newspapers, speeches, and conversations to elucidate the leadership style, personality, and social status of leaders. In addition to English and Arabic, analyses will be performed on Korean, Chinese, and other languages. We will use computational tools that automatically score texts on hundreds of measures of language and text cohesion (using Coh-Metrix), including word characteristics, syntactic complexity, lexical diversity, readability, connectives, latent semantic analysis, co-referential cohesion, mental model dimensions, and genre.
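As a rough illustration of what two of these measures capture, the sketch below computes a type-token ratio (one simple index of lexical diversity) and a connective rate per 1,000 words. It is not how Coh-Metrix computes its scores: the function names, the tiny connective list, and the English-only tokenizer are all assumptions made for this example.

```python
import re

# Hypothetical, tiny connective list for illustration only;
# it is not the inventory used by Coh-Metrix.
CONNECTIVES = {"and", "but", "because", "however", "therefore", "so"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (English only)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Lexical diversity: number of unique words divided by total words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def connective_rate(tokens):
    """Number of connectives per 1,000 words."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in CONNECTIVES)
    return 1000.0 * hits / len(tokens)


sample = "The leader spoke, and the crowd listened because the leader was trusted."
tokens = tokenize(sample)
print(type_token_ratio(tokens))  # share of distinct words in the sample
print(connective_rate(tokens))   # connectives per 1,000 words
```

A tokenizer this simple would not work for Arabic, Korean, or Chinese, which need language-specific segmentation before any such counts are meaningful.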

PI: James Pennebaker (University of Texas at Austin)

Co-PIs: Art Graesser, Danielle McNamara

Grant Name: Modeling discourse and social dynamics in authoritarian regimes

Grant Number: NSF 0904909

Funding Agency: National Science Foundation 

Dates: 2009-2012

Amount: $582,000 (Memphis allocation)

Abstract: The major goal of the project is to define and compare the ways discourse in natural language reflects social dynamics in English, Arabic, Chinese, and other languages by analyzing a wide range of documents from these languages and their associated cultures. These analyses are automated with computer tools we have developed, such as Linguistic Inquiry and Word Count (Pennebaker, Booth, & Francis, 2007) and Coh-Metrix (Graesser, McNamara, Louwerse, & Cai, 2004; Graesser & McNamara, 2011; Graesser, McNamara, & Kulikowich, 2011). Our expectation is that these analyses of language and discourse can predict socially significant states such as leadership, status, group dynamics, familiarity of group members, social cohesion, deception, and misinformation. This research is expected not only to advance the social sciences but also to answer questions that require the processing of large amounts of textual communication.

Significant Publication(s):
Graesser, A.C., & McNamara, D.S. (2011). Computational analyses of multilevel discourse comprehension. Topics in Cognitive Science, 3, 371-398.
Graesser, A.C., McNamara, D.S., & Kulikowich, J. (2011). Coh-Metrix: Providing multilevel analyses of text characteristics. Educational Researcher, 40, 223-234.
Hancock, J.T., Beaver, D.I., Chung, C.K., Frazee, J., Pennebaker, J.W., Graesser, A., & Cai, Z. (2010). Social language processing: A framework for analyzing the communication of terrorists and authoritarian regimes. Behavioral Sciences of Terrorism and Political Aggression, 2, 108-132.
Shala, L., Rus, V., & Graesser, A.C. (2010). Automated speech act classification in Arabic. Subjetividad y Procesos Cognitivos, 14, 284-292.