Linguistic Markers of Status in Food Culture: Bourdieu’s Distinction in a Menu Corpus

Peer-Reviewed By: Nathalie Cooke & Anon.
Clusters: Food
Article DOI: 10.22148/16.007 
Dataverse DOI: 10.7910/DVN/QMLCPD


Food is a core element of culture, whose link with identity and socio-economic class has made it an important area of cultural research. 1 In his ground-breaking study, Pierre Bourdieu noted that "oppositions similar in structure to those found in cultural practices also appear in eating habits." 2 His work established deep associations linking food culture, and taste more generally, with social class and other aspects of identity, demonstrating the economic and social determinants of taste and their role in representing distinctions, differences between groups.

These implicit assumptions about taste can nonetheless be difficult to tease out from social investigations, such as the careful survey analysis or explicit interviews that Bourdieu conducted in the 1960s. However, there is a more direct window onto implicit assumptions about culture: the language we use in talking about culture.  Language offers a powerful tool for observing and quantifying the sometimes unconscious way that our associations and understandings of culture reflect our social attitudes and prejudices. Studying social aspects of culture, however, requires controlling for the great number of confounds that come along with language.  It's hugely helpful therefore to focus on a single genre, allowing exploration of a single contextual domain of language, and to use sufficiently powerful data that digital methods can be used to control for other confounds.

We propose to study the reflections of Bourdieu's distinction in the language used on menus of restaurants in the United States. By focusing on menus, we remove the many possible confounds due to genre: menus offer a coherent context, one in which food producers are attempting to advertise their food to consumers, framing it in a way that reflects how they believe their consumers understand food. Menus cross socio-economic lines, offering text written by restaurants from the cheapest and most quotidian to the most expensive, luxurious, and high-status, and offering very explicit coding of price. Finally, menus are frequent, allowing us to build a corpus of menus large enough that it is possible to control for cultural effects like different ethnicities of restaurants, geographical effects like city or neighborhood, and linguistic effects like variation in numbers of words and of dishes.

A wide variety of research has drawn on the insights of Bourdieu to examine the links between status, class, and food culture. Building on this literature, we focus on four aspects in which the differences between the language used by inexpensive and expensive restaurants on their menus reflect a particular aspect of the framing of distinction.

The first is the role of authenticity. A number of scholars have studied how expensive or high-status food is portrayed as authentic. 3 Beverland and colleagues interviewed luxury winemakers and consumers, and consumers of Trappist beer brewed in Belgium and the Netherlands. Both consumers and producers link authenticity with historicity (talking about the founding of the company), with the relationship to place (in particular the concept of terroir) and with a focus on traditional, hand-crafted methods of production. 4 Johnston and Bauman looked at every article that appeared in 2004 in four upscale food magazines (Bon Appétit, Saveur, Food and Wine, and Gourmet), showing they present food as authentic by talking about locality and the regions in which food is produced, by the use of handmade simplicity rather than industrial production, and by the historicism indicated via a long tradition of manufacture. 5 Lakoff performed a careful discourse analysis comparing a menu from one expensive restaurant (Chez Panisse) with one from an inexpensive one nearby (The Oriental Restaurant), noting among other differences that the expensive menu contains extensive reference to the provenance of their food on particular farms. She argues that the references on the menu to local farms and ranches "that practice ecologically sound agriculture" allow eating a meal to be an act of "civic virtue", suggesting that this kind of authenticity is also associated with a particular aspect of morality. 6

While authenticity has mainly been considered as a property of high-status foods, recent work in our labs has also explored the role of authenticity in signaling less expensive foods. Freedman and Jurafsky investigated the advertising language on bags of potato chips and the link between language and the price of the chips. They found that advertising on expensive chips emphasizes what they called natural authenticity, by mentioning the provenance and quality of the ingredients (sea salt, Yukon Gold potatoes) and the hand processing of the ingredients (hand-rake every batch), and using the words natural and organic.  Advertising on less expensive chips, by contrast, was consistent with a different model that they called traditional authenticity, in which food was related to family members, historicity, and family tradition (old family recipe, time-honored tradition).  They suggested their results are consistent with earlier work suggesting that working class or lower-middle-class identity is likely to be based around family and tradition. Similarly, Chahuneau et al. suggested that inexpensive menus employ traditional authenticity, relating food to family, comfort, and old-fashioned tradition. 7

We therefore propose to examine menus to see if they indeed present these two aspects of authenticity.  Our hypothesis is that more expensive, higher-status restaurants will be more likely to emphasize natural authenticity and the provenance of the ingredients on the farm, while cheaper, lower-status restaurants will be more likely to emphasize traditional authenticity.

The second aspect of distinction we consider is educational capital. Since education is associated with socioeconomic status, expensive restaurants, like other advertisers of luxury products, may use "fancier" words, i.e. ones that are rarer, more morphologically complex, or drawn from high-status foreign languages. The goal of such language may be to signal status or target educated consumers, drawing consumers into believing that the product is consonant with their educational capital and flattering the consumer by expressing this shared knowledge. 8 In his 1963 Confessions of an Advertising Man David Ogilvy warns would-be ad-writers: "Don't use highfalutin language" when you're talking to a non-highfalutin audience. Freedman and Jurafsky found that advertising text on the bags of more expensive potato chips indeed tends to use words which are longer and more complex. Silverstein analyzed wine tasting notes, suggesting that aspirational elites self-consciously use rare or difficult wine-tasting jargon words ('oinoglossia') in an attempt to demonstrate their prestige simply through the very act of using elite words. 9

Our hypothesis is that more expensive, higher-status restaurants will use more complex language, while cheaper, lower-status restaurants will use simpler language.

The third aspect of distinction we consider is Bourdieu's concept of plenty in the working-class meal. In his study of French society in the 1960s, Bourdieu noted:

The working-class meal is characterized by plenty ... "Elastic" and "abundant" dishes are brought to the table - soups or sauces, pasta or potatoes (almost always included among the vegetables) - and served with a ladle or spoon, to avoid too much measuring and counting... 10

Bourdieu suggests a framing for working-class meals in which the "impression of abundance" is important, and in which food should be simultaneously cheap and "nourishing" or "filling".

Does this same framing of plenty associated with cheaper meals apply to modern restaurant meals? In a related finding, Lakoff in her comparison of menus from an expensive restaurant (Chez Panisse) versus a cheap one (The Oriental Restaurant) noted that the cheaper restaurant gave the diner vastly more choices (with a longer menu and choices between chicken or pork or shrimp, for example). 11 Her finding suggests that inexpensive restaurants may focus on diner choice as an aspect of plenty: not only does the restaurant focus on offering more food, but also on offering many more kinds of food. We therefore hypothesize that menus of cheaper restaurants will be more likely to give linguistic indicators of plenty and choice: the menu will highlight the amount of food, offering more dishes and offering a greater range of choices, while menus of expensive restaurants will be less likely to use this framing.

If so, we would expect a linguistic framing of plenty emphasized more in cheaper restaurants than in expensive ones.

The fourth aspect of distinction we consider is "implicit signaling of quality": whether expensive restaurants are likely to be less explicit in signaling the quality of their products, while cheaper or middle-priced restaurants are more explicit. Bourdieu, as Ridgeway and Fisk point out, "comments ... on the insecurity of the (middle class) petit bourgeois and nouveau riche compared to the upper class 'who ... have the privilege of not worrying about their distinction.' " 12 Liberman suggests that this idea of middle-class insecurity might explain the differences he found among menus of three restaurants (cheap, middle-priced, and expensive) in Philadelphia. He noted that the middle-priced restaurant had the wordiest menu (with a hamburger described as "Big Juicy Burger of Buck Run Farm's Grass Fed Beef on our House made Poppy Seed Bun"). Liberman argued that the elaborate language in middle-priced restaurant menus is used as an index of status, and that the extensive modification by words like juicy or peppery is a mark of status anxiety. 13 Liberman's idea is supported by research in game-theoretic models of advertising which suggests that middle-priced firms will tend to use explicit advertising and that high-priced firms will tend to use implicit or "modest" advertising, "counter-signaling", to distinguish themselves from the middle-priced firms. 14

We therefore hypothesize that explicit protestations of quality (adjectives like "real", "delicious", "crispy", "crunchy") will be more likely to appear in middle-priced or cheaper restaurants than in expensive ones.  We focus on adjectives because a wide variety of researchers have noted that adjectives ("fresh", "crispy", "fluffy", "buttery") richly populate menus and that customers are more likely to choose dishes with these words. 15

In the next sections we develop measures for these four aspects of distinction and introduce the corpus we use to evaluate them. In a later section we also introduce a historical corpus to investigate the diachronic nature of our hypotheses.


The corpus

We chose a dataset of menus large enough to investigate the full range of restaurants, from fast food to luxury restaurants, allowing the investigation of the linguistic strategies of distinction jointly while controlling for each other and many potential confounding factors like the geographical location, the type of food served at the restaurant, the total number of words, and so on.  We built our dataset by extending the corpus of Chahuneau et al. (2011), consisting of menus downloaded from the website in 2011 for restaurants in seven cities: Boston, Chicago, Los Angeles, New York, Philadelphia, San Francisco, and Washington D.C.  They randomly divided the restaurants into 80 percent for training and 20 percent reserved for evaluation and testing. All the analyses in this paper are performed on their training dataset. From this data, we used only restaurants that were characterized on Yelp as restaurants and bars; thus all delis, groceries, and caterers were removed from the dataset.

Each menu comprises a set of dishes with names, descriptions, and a price. For each restaurant, we use two metadata variables drawn from the reviewing website: the restaurant category (type of food) and the price range, a variable on a four-level scale from $ to $$$$. We use this Yelp price range ($, $$, $$$, $$$$) as our measure of restaurant price class, considering $$$$ as upper-socioeconomic-status, $$ and $$$ as middle-status, and $ as lower-status. The Yelp price range is assigned with respect to restaurant category (a pizza restaurant that is rated as $$$ will be less expensive than a French restaurant rated $$$), so we control for category whenever investigating the price range variable. Because all the data was automatically downloaded from the web, we also performed a number of error-checking and correction operations. Dozens of restaurants did not have restaurant category or price range variables on Yelp; these were coded by hand. Seven restaurant menus were removed from the study due to significant errors in price data (caused by typos due to download errors). We also removed all restaurants with missing menu data. Finally, because we focus only on food dishes, all drinks (sodas, alcohol, coffee) are removed from the lists of dishes, by using a combination of hand-labeling and regular expressions; in total 45,018 menu descriptions of drink items were removed.

The resulting dataset consists of 6511 restaurants and 591,980 dishes. Table 1 shows the summary statistics of the data. 16 There is a large variance in the number of dishes per restaurant (with a mean of 91 dishes and a standard deviation of 83). Many restaurants have few dishes (the mode is 38) but there is a large tail of restaurants with very long menus. Each dish is described with an average of 9.1 words, with each word containing on average 6.3 letters. Table 2 shows the geographical range of the data; some cities are underrepresented in the sample, and so in our main regression we add the city as a control variable.

                                        $      $$     $$$    $$$$
Total restaurants                    2203    3311     844     153
Average number of dishes              114      88      50      48
Average description length (words)    7.6      10     9.8       9

Table 1. Summary statistics over the entire menu dataset. Description length is in words. Our dataset only includes food dishes, so the original menus would have been slightly longer including drinks.

City                # Restaurants
Los Angeles                   235
New York                     2828
San Francisco                1228
Washington, D.C.              511

Table 2. Number of restaurants per city in the data set.


The hypotheses and coding

To develop a testable measure of the linguistic strategies corresponding to each aspect of distinction, we mainly measure the number of times words appear from lexicons, lists of words and phrases. Lexicons were drawn from the previous literature and also from an initial investigation of the menus, as described below. We also study summary properties like word length or sentence length.

For natural versus traditional authenticity we coded variables marking the two kinds of authenticity discussed in the Bourdieu-inspired food and menu literature.

  1. Words and phrases related to the provenance of food (natural, farmhouse, wild caught, grass fed, local, market, farmed, free range, heirloom) derived from the previous literature, 17 as well as our initial investigation of the menus.
  2. Words and phrases related to traditional authenticity drawn from Freedman and Jurafsky, as well as all mentions of family members (old fashioned, traditional, family recipe, home style, mom, mother, auntie, grandpa, uncle, daddy's)

For the concept of plenty we coded three variables indicative of Bourdieu's plenty and the related idea of the number of dishes or amount of choices the restaurant is offering the consumer:

  1. the number of dishes on the menu (we controlled for the fact that menus with more dishes also have more words by adding as a control factor the number of total words on the menu)
  2. phrases indicative of the extent of consumer choice (choice of, choose, pick, specify, your own, your way, you like, any, add, or)
  3. phrases vaguely indicative of generous portions (big, hearty, generous, huge, plenty, heaping)
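
The lexicon counts described above can be illustrated with a minimal sketch. This is not the code used in the study: the lexicons below are abbreviated versions of the lists given in the text, and the matching rule (lowercased, punctuation-insensitive, whole-word phrase matching) and the helper names `normalize`, `count_lexicon`, and `menu_counts` are assumptions made for illustration.

```python
import re

# Abbreviated lexicons for illustration; entries are taken from the lists above.
LEXICONS = {
    "natural_authenticity": ["natural", "wild caught", "grass fed", "local",
                             "heirloom", "free range"],
    "traditional_authenticity": ["old fashioned", "traditional", "family recipe",
                                 "home style", "mom"],
    "choice": ["choice of", "choose", "your way", "your own"],
    "plenty": ["big", "hearty", "generous", "huge", "plenty", "heaping"],
}


def normalize(text):
    """Lowercase the text; anything other than ASCII letters and apostrophes
    (punctuation, hyphens, digits) becomes a single space."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def count_lexicon(text, phrases):
    """Count whole-word matches of every lexicon phrase in one dish description."""
    normalized = normalize(text)
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", normalized))
        for phrase in phrases
    )


def menu_counts(descriptions):
    """Total count per lexicon over all dish descriptions of one menu."""
    return {
        name: sum(count_lexicon(d, phrases) for d in descriptions)
        for name, phrases in LEXICONS.items()
    }


menu = [
    "Bison Burger: 8 oz. Blue Star Farms, grass fed & pasture raised",
    "Homemade Tiramisu: family recipe",
    "PEI mussels: choose your style",
    "Tuna Supreme: A generous scoop of tuna",
]
print(menu_counts(menu))
```

Normalizing hyphens to spaces lets "grass-fed" and "grass fed" count as the same phrase; whether the original coding treated such variants identically is not stated in the text.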

For educational capital we measured the log of the average word length in letters of all words describing all the dishes on the menu. Word length has a very strong inverse correlation with word frequency and a positive correlation with morphological complexity; long words are thus both rare and complex. For this reason word length is the main measure of word complexity in measures of reading difficulty level like the Flesch-Kincaid readability test. 18 Word length is likely a less accurate proxy for morphological complexity on menus than it is in other genres, because of the multilingual nature of menus; for example words in Vietnamese are on average shorter than in English, while words in Italian are longer. However, controlling for the restaurant type (which generally includes ethnicity) helps somewhat in dealing with this confound.
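
As a rough illustration of this measure, the sketch below computes the log of the mean word length over all dish descriptions on one menu. The tokenization (runs of letters, including accented ones) and the use of the natural logarithm are assumptions; the text does not specify either.

```python
import math
import re


def log_mean_word_length(descriptions):
    """Log of the average word length (in letters) over all descriptions."""
    # [^\W\d_]+ matches runs of letters, including accented ones such as "crème".
    words = [w for d in descriptions for w in re.findall(r"[^\W\d_]+", d)]
    if not words:
        raise ValueError("menu contains no words")
    return math.log(sum(len(w) for w in words) / len(words))


cheap = ["sides", "fries"]
expensive = ["accompaniments", "complements"]
print(log_mean_word_length(cheap) < log_mean_word_length(expensive))
```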

For implicit status we measured the number of adjectives in the menus. We ran the Stanford Part-of-Speech Tagger 19 on all menu descriptions, producing 5 million tagged words. Many menu descriptions, especially short ones, had incorrectly tagged words, presumably because the tagger was not trained on menus. The most frequent such error was caused by the fact that taggers often interpret long sequences of capitalized words standing alone as proper names; thus the participle "steamed" in the phrase "Steamed Little Neck clams" was incorrectly marked as a proper noun because of the neighboring (correctly tagged) proper noun "Little Neck". However we found that the majority tag for most adjectives was in fact correct; cases like "steamed" were tagged correctly in the vast majority of instances (an instance of the one sense per discourse 20 rule). We therefore labeled as an adjective each instance of a word whose majority tag was as an adjective.  Following usual practice, we eliminated all words that occurred fewer than 1 time per million words (i.e., had counts less than 5).  We then hand-checked the remaining 1250 adjectives, eliminating remaining tagging errors.  The result was 1065 hand-curated adjectives.
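
The majority-tag filter can be sketched as follows. This is a hypothetical reconstruction rather than the authors' code: it assumes the tagger output is available as (word, tag) pairs, that adjective tags are the Penn Treebank tags beginning with "JJ", and that word forms are compared case-insensitively. The hand-checking step is not modeled.

```python
from collections import Counter, defaultdict


def majority_tag_adjectives(tagged_tokens, min_count=5):
    """Return the words whose most frequent tag is an adjective tag.

    tagged_tokens: iterable of (word, tag) pairs (assumed input format).
    min_count: words occurring fewer than this many times are dropped.
    """
    tag_counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        tag_counts[word.lower()][tag] += 1

    adjectives = set()
    for word, counts in tag_counts.items():
        if sum(counts.values()) < min_count:
            continue  # too rare to trust
        top_tag, _ = counts.most_common(1)[0]
        if top_tag.startswith("JJ"):  # JJ, JJR, JJS
            adjectives.add(word)
    return adjectives


tokens = (
    [("crispy", "JJ")] * 4      # correctly tagged occurrences
    + [("Crispy", "NNP")] * 2   # mis-tagged inside a capitalized dish name
    + [("clams", "NNS")] * 6
    + [("zesty", "JJ")] * 2     # below the frequency threshold
)
print(majority_tag_adjectives(tokens))
```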

We also investigated some useful subsets of these 1065 adjectives. 164 sensory adjectives (chunky, crispy, crunchy, doughy, fluffy, rich, smoky, tangy, zesty) were selected by hand-coding the 1065, and checked to include all adjectives in Zwicky and Zwicky. 176 participial adjectives were all those among the 1065 that had participial morphology (grilled, mixed, sliced, steamed, baked, smoked). We also included two classes of positive sentiment adjectives, both drawn by extending the extreme positive emotion list that was carefully constructed by Larcker and Zakolyukina. 21 55 extreme positive sentiment words were chosen by taking the 36 adjectives on their extreme positive emotion list that occurred more than 5 times in our corpus (more than 1 part per million) and that were not specifically food descriptive, and adding 19 more adjectives that were synonyms of these. 14 delicious words were taken by separating the 4 food descriptive words (delicious, delectable, scrumptious, luscious) and adding synonyms (tasty, gourmet, savory, mouthwatering, etc.). 22

Control factors: We added three control factors: the log of the total number of words describing all the dishes on the menu (used as an additional control for factors like the number of dishes), the city in which the restaurant was located, and the restaurant category, which consisted of a label from a set of 32 types of restaurants.  These were constructed by grouping the 85 relevant Yelp restaurant types into the following 32 categories, based on choosing restaurants with similar cuisines and similar price ranges:

Pizza, Chinese, Italian, Steakhouses, American (new), Japanese, Mexican, French, American (traditional), Sandwiches, Cafes, Fast food, Thai, Indian, Other Asian, Diners, Seafood, Middle Eastern, Latin American, Bars, Bakeries, Spanish, Korean, Mediterranean, Barbeque, Other European, Vegetarian, Ethiopian, Soul food, Southern and Cajun, Greek, Asian fusion

Each restaurant was assigned to exactly one of these 32 categories; restaurants listed in Yelp with multiple classes were assigned to whichever of those classes occurred most frequently in the entire dataset.
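
One way to read this assignment rule is sketched below. The input format (a mapping from a restaurant name to its list of category labels) and the way dataset-wide frequency is counted (once per listing) are assumptions; ties are broken by the order in which labels are listed.

```python
from collections import Counter


def assign_categories(restaurant_labels):
    """Assign each restaurant the one label that is most frequent dataset-wide."""
    frequency = Counter(
        label for labels in restaurant_labels.values() for label in labels
    )
    return {
        name: max(labels, key=lambda label: frequency[label])
        for name, labels in restaurant_labels.items()
    }


# Hypothetical restaurants, for illustration only.
listings = {
    "a": ["Italian", "Pizza"],
    "b": ["Pizza"],
    "c": ["Seafood", "Italian"],
    "d": ["Pizza"],
}
print(assign_categories(listings))
```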



We used a simple linear regression to test the relationship between variables and restaurant price status.  The dependent variable is the number of dollar signs.

Our main independent variables are based on counts of words (a "bag of words" model since we ignore word order and other cues to grammatical structures).  Word counts typically have a very long-tailed distribution so we apply the standard transformations to achieve a more Gaussian distribution. Thus each feature of a word count c was included as log (1+c), and length variables (length of dish product descriptions in words, and length of words in letters) were included as the log of the length (in letters or words).

Because all word count features drawn from a particular product description were collinear with the length of the product description (the more words in the description, the more chances for each type of word to appear), we next used a linear regression to remove effects of log length from each lexicon variable.  We then included the residuals from this regression (rather than the raw counts) as variables to represent the effects of each lexicon of interest.
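
The transformation and residualization steps can be sketched for a single lexicon variable as follows. This is a minimal illustration using a one-predictor least-squares fit written out by hand; the function name `residualize` and the toy numbers are hypothetical, and the study itself used R.

```python
import math


def residualize(counts, lengths):
    """Regress log(1 + count) on log(length) and return the residuals.

    counts:  lexicon word counts, one per menu
    lengths: total number of description words, one per menu
    """
    y = [math.log(1 + c) for c in counts]
    x = [math.log(n) for n in lengths]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # What is left after removing the part of the count explained by length.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


counts = [0, 3, 1, 10, 4]            # toy lexicon counts
lengths = [50, 400, 120, 900, 300]   # toy menu lengths in words
residuals = residualize(counts, lengths)
print([round(r, 3) for r in residuals])
```

By construction the residuals have mean zero and are uncorrelated with log length, so a lexicon variable entered this way cannot simply stand in for how wordy a menu is.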

Linear regression was then used to predict the restaurant price (an integer from 1-4, representing $, $$, $$$, $$$$) from the variables of interest and the set of control variables, via the lm function in R.

Because the difference between the four categories ($, $$, $$$, $$$$) may not be linear, we also ran an ordinal logit using the polr function (from the MASS package) in R, predicting the four categories ($, $$, $$$, $$$$) as ordinal categorical values from the independent variables. The results were the same as those from the linear regression, and so in the remainder of the paper we describe only the linear regression.

However, for the four different subtypes of adjectives considered below, we additionally ask whether some adjective classes were particularly associated with particular price classes (for example whether markers of traditional authenticity were especially associated with lower-middle priced ($$) restaurants). We therefore also ran binary logistic regressions (using glm in R) predicting a binary variable comparing the class of interest (in this case $$) against the other three classes ($, $$$, $$$$) and report on the role of the predictor in question as an independent variable, after controlling for all the same variables described above (restaurant category, number of dishes, number of words in different classes, etc.).



We describe the results of each aspect of distinction separately; full regressions are shown in the appendix.

Restaurant price is significantly associated with the language of natural authenticity (p < 2 x 10^-16); expensive restaurants have descriptions like "Local Albacore Tuna Nicoise: summer beans, heirloom tomatoes, yellow tomato aioli, soft boiled farm egg" or "Bison Burger: 8 oz. Blue Star Farms, grass fed & pasture raised".

Traditional authenticity is associated instead with cheaper restaurants (p = .000716), especially lower-middle-priced ($$) restaurants which offer dishes like "Homemade Tiramisu: family recipe", "Old Fashioned Beef Stew", or "Annie's Famous Pot Roast: Homemade, just like Mom's." Figure 1 shows the raw counts and confidence intervals.

To test the hypothesis (suggested from Figure 1) that it was lower-middle priced ($$) restaurants that make extra use of traditional authenticity, we also ran a logistic regression predicting class ($$ versus $, $$$, $$$$) after controlling for restaurant category, number of dishes, and number of words.  The framing of traditional authenticity indeed was more likely to be used in the menus of lower middle priced restaurants (p = 0.000303).

Figure 1. Means and 95% confidence intervals for raw values of natural versus traditional authenticity by restaurant price status. The values are counts per average dish description. Thus for example on $$$ menus, mentions of natural authenticity occurred on average 6 times in every 100 dish descriptions.

Educational capital was also associated with restaurant status. More expensive restaurants were indeed more likely (p < 2 x 10^-16) to use longer words like accompaniments, complements, magnificent, inspiration, exquisitely; cheaper restaurants used shorter forms like sides instead of accompaniments or complements. Figure 2 shows raw means and 95% confidence intervals.

Figure 2. Means and 95% confidence intervals for raw values of word length in letters by restaurant price status.

To see the association of foreign words with expensive restaurants, we examined the words most likely to occur in the $$$$ class, which we defined as those words with the highest log likelihood ratio between the counts in the $$$$ class and the summed counts in the other three classes, using the "weighted log-odds-ratio, informative Dirichlet prior" method. 23 We found that 28 of the 100 words most associated with the expensive restaurants were foreign, including 9 French words (les, de, le, fois, gras, crème, mousse, tarte, pommes), 6 Italian words (e, con, risotto, pancetta, burrata, polenta, parmigiano), and 13 Japanese words (tempura, uni, wagyu, sushi, yuzu, sashimi, miso, shabu, kobe, ponzu, wasabi, ninja, and sake). The fact that the French and Italian vocabularies are full of function words (the short grammatical words like de or con) suggests that at least some of these product descriptions are written largely or even entirely in French or Italian. In summary, more expensive restaurants had longer words and more high-status foreign words.
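
For readers unfamiliar with the method, the sketch below shows one common formulation of the weighted log-odds-ratio with an informative Dirichlet prior: each word's log-odds in the two groups are smoothed by prior pseudo-counts, and the difference is divided by an estimate of its standard deviation to give a z-score. Using the pooled counts of both groups as the prior is an assumption made here for illustration, as are the toy counts; the text does not state which prior was used.

```python
import math
from collections import Counter


def weighted_log_odds(counts_a, counts_b, prior):
    """z-scored log-odds-ratio of each word in group A versus group B.

    counts_a, counts_b: word counts in each group
    prior: pseudo-counts (must be positive for every scored word)
    """
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    prior_total = sum(prior.values())
    z_scores = {}
    for word, alpha in prior.items():
        y_a = counts_a.get(word, 0)
        y_b = counts_b.get(word, 0)
        log_odds_a = math.log((y_a + alpha) / (n_a + prior_total - y_a - alpha))
        log_odds_b = math.log((y_b + alpha) / (n_b + prior_total - y_b - alpha))
        delta = log_odds_a - log_odds_b
        variance = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
        z_scores[word] = delta / math.sqrt(variance)
    return z_scores


# Toy counts, not taken from the menu corpus.
expensive = Counter({"risotto": 8, "chicken": 10, "burrata": 2})
others = Counter({"risotto": 1, "chicken": 30, "fries": 9})
z = weighted_log_odds(expensive, others, prior=expensive + others)
for word in sorted(z, key=z.get, reverse=True):
    print(word, round(z[word], 2))
```

Dividing by the standard deviation is what keeps very rare words from dominating the ranking: a word seen twice can have a large raw log-odds difference but a small z-score.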

As expected, cheaper restaurants have more dishes on the menu (p < 5.7 x 10^-15) and are more likely to use the framing of choice and plenty. They emphasize generous portion sizes, making use of words like generous, hearty, and big (p = .0023) ("Baked Meat Lasagna: Made with our hearty meat sauce," "Tuna Supreme: A generous scoop of tuna"). Cheaper restaurants are also more likely to emphasize the extent of the diner's choice, with descriptions like "baby lamb chops, grilled to your liking" or "PEI mussels: choose your style" (p < 2 x 10^-16).

Figure 3. Means and 95% confidence intervals for raw values of choice and plenty by restaurant price status. 3b shows values of plenty for all restaurants; note the very large variance for the $$$$ restaurants; 3c shows values for the restaurants after removing all steakhouses.

Figure 3 shows raw means and variances. Note from Figure 3b that while plenty was significantly associated with cheaper restaurants, there is still enormous variance in the most expensive $$$$ class. We examined the uses of plenty framing in $$$$ restaurants and found that most of these were steakhouses; here's a sample usage from a Chicago steakhouse:

Big Shoulders: and you thought our petite porterhouse was big! this is a full forty ounces! that's eight ounces short of three pounds! are you up to the challenge?

Figure 3c shows the vastly reduced variance in a smaller version of the dataset that removed the 81 steakhouses. To further investigate this suggestion of an association between steakhouses and plenty, we also ran a linear regression predicting the log number of mentions of words suggesting plenty from the restaurant category. Seven classes of restaurants were more likely to use these framings: steakhouses (p = 2.30 x 10^-7), traditional American food (p = 4.17 x 10^-9), fast food (p < 2 x 10^-16), sandwiches (p = 6.28 x 10^-11), bars (p = 0.00048), pizza (p = 0.0044) and barbecue (p = 0.00019). Our findings thus confirm the association of plenty and choice with cheaper restaurants; we return in the discussion section to the implications of the association of plenty with steakhouses and other types.

Finally, we examined the role of adjectives. Conforming to our hypothesis, adjectives in general were negatively associated with price; cheaper restaurants used more adjectives than expensive ones (p < 6.1 x 10^-5). Figure 4 shows the raw means and variances for the four specific sub-classes of adjectives we consider.

Figure 4. Means and 95% confidence intervals for raw values of four classes of adjectives by restaurant price status. Sensory adjectives and participles are more associated with middle-priced restaurants, while adjectives of positive sentiment (either related to food or not) are associated with the cheapest restaurants.

As Figure 4 suggests, while adjectival protestations of quality are used less by expensive restaurants, different classes of adjectives are associated with different groups of cheaper restaurants. Sensory adjectives are strongly associated with lower-middle-priced restaurants (p = 0.000182, in a logistic regression comparing $$ with $, $$$, $$$$ after controlling for category, number of dishes, length and the other adjectives). We see examples like "Crisp Golden Brown Belgian waffle with Fresh Fruit". Participles are similarly associated with lower-middle-priced ($$) restaurants (same regression, p = 1.66 x 10^-8). Both positive sentiment adjectives (excellent, great, wonderful; p = 9.09 x 10^-8) and positive food adjectives (delicious, tasty, gourmet; p = 1.76 x 10^-7) are associated with the lowest-priced ($) restaurants in a logistic regression comparing $ with $$, $$$, $$$$ after controlling for category, number of dishes, length and the other adjectives.


Studying menus over time

Our study suggests a number of ways that modern menus reflect Bourdieu's ideas of distinction. But it is important to understand to what extent these factors are long-lasting trends that describe attitudes toward food culture in the United States, and to what extent they solely characterize recent food culture in 2011, the date the menus were collected.  The fact that Bourdieu's own analyses date from data he collected in the 1960s, half a century ago in France suggests that these trends have persisted for a while. Have cheaper restaurants always emphasized traditional authenticity, and expensive ones natural authenticity?

To answer this question we turn to a second dataset: the New York Public Library's Buttolph Collection, which contains over 17,000 digitized and transcribed menus from 1852-2015. 24  The menus come from a wide variety of locales, but the majority of those whose source is labeled come from New York City restaurants or from steamships. The collection is strongest around the turn of the last century, so we extracted all menus from 1892-1921. Because the dataset has enormous variation across time and space and the metadata is often lacking, our analysis here is necessarily preliminary, leaving for future investigation many questions like establishing the correct control variables (restaurant category, exact location) or of designing a proper sample. Nevertheless, we chose to do a preliminary analysis, controlling for at least some variables.  For example because the dataset contains enormous numbers of duplicate menus from the same restaurant on different days (210 menus in the year 1914 alone come just from different lunches and dinners at the Waldorf Astoria Hotel), we eliminated all cases of identical dishes from the same restaurant in the same year, resulting in 2858 distinct menus for this period. We defined a simplified 2-point pricing scale (cheap versus expensive), first using the median price of the dishes on a menu to assign a cost score to each menu, then computing an annual median of the menu costs for each year in the data, and labeling a menu in a given year "expensive" if its cost score was above the median for that year, and "cheap" otherwise.
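
The two-point pricing scale can be sketched as below. The record layout (a dict with hypothetical "year" and "prices" keys) and the toy prices are assumptions for illustration; the deduplication step is not shown.

```python
from collections import defaultdict
from statistics import median


def label_menus(menus):
    """Label each menu 'expensive' or 'cheap' relative to its year's median.

    menus: list of dicts with 'year' and 'prices' keys (hypothetical layout).
    """
    # Cost score of a menu: the median price of its dishes.
    scored = [(m["year"], median(m["prices"])) for m in menus]

    # Annual median of the menu cost scores.
    by_year = defaultdict(list)
    for year, cost in scored:
        by_year[year].append(cost)
    annual_median = {year: median(costs) for year, costs in by_year.items()}

    # Strictly above the annual median is "expensive"; everything else is "cheap".
    return [
        "expensive" if cost > annual_median[year] else "cheap"
        for year, cost in scored
    ]


menus = [
    {"year": 1900, "prices": [0.10, 0.15, 0.20]},
    {"year": 1900, "prices": [0.50, 0.75, 1.00]},
    {"year": 1900, "prices": [0.25, 0.30, 0.35]},
    {"year": 1914, "prices": [1.00, 2.00]},
]
print(label_menus(menus))
```

Comparing each menu with the median for its own year means that general price inflation across 1892-1921 does not by itself move menus from one class to the other.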

We then asked which of the aspects of distinction discussed above characterized menus from a century ago. Multinomial logistic regression was used, with the price of a menu (cheap versus expensive) as the dependent variable, and each of the textual indicators of distinction discussed above as independent variables. Only five variables were significantly associated with price. More expensive menus had longer words (p < 2 x 10^-16), and cheaper menus displayed more linguistic signs of traditional authenticity (p = 3.45 x 10^-7) and choice (p = 1.02 x 10^-6), and used more adjectives (p = 1.07 x 10^-9). One control variable was also significant: cheaper menus used more words overall (p = 0.000474). None of the other variables were significantly different between the cheap and expensive restaurants (although there were large differences in the presence of French vocabulary, which we return to below). Figure 6 shows the raw means and variances for these variables.

Figure 5. Means and 95% confidence intervals for raw values of four variables in historical menus from 1892-1921 from the New York Public Library Buttolph Collection. These four variables all pattern exactly as on modern menus. Other variables signaling distinction (number of items, plenty, sensory adjectives, natural authenticity, sentiment adjectives, delicious words) show no interaction with restaurant price in these menus.

Early examples of traditional authenticity on inexpensive menus look very much like modern menu items, with examples like "homemade sausage with hot slaw" from the 1899 menu of the Hotel Baltimore, or "old fashioned corn bread with maple syrup" from the 1900 menu of Mandel's Tea Rooms.  Customer choice also appears on cheaper menus in the previous century, such as Haan's advertising "terrapin any style" in 1900 (turtle was very popular at that time), Hazeltine's 1914 "toasted bread any kind", or the 1913 menu of Smith's Restaurant which allows the customer to "choose one dish from each column."

The results of our investigation of menus from a century ago, while certainly preliminary, suggest that at least some of the aspects of Bourdieu's distinction have been visible on restaurant menus for well over a century. The fact that lower-priced restaurants emphasized traditional authenticity suggests that the framing of inexpensive food as old-fashioned or homey has itself a long history in the US. The fact that expensive restaurants use longer words points to the early role of educational capital in marking status.

We did not find evidence that expensive restaurants were more likely to mention farms or pastures. Restaurants of that period certainly did mention farms or the way the food was raised ("Deerfoot Farm sausages", "buttermilk from Darling Farm", "special raised turkey"), but these were not more likely to be expensive restaurants. What words were then most likely to occur in the most expensive restaurants? We examined the words with the highest log likelihood ratio between the counts in the restaurants in the top price quartile and the summed counts for all other restaurant price classes, again using the "weighted log-odds-ratio, informative Dirichlet prior" method of Monroe, Colaresi and Quinn (2008). Of the 20 words ranked as most characteristic of the most expensive restaurants, 14 were French words (creme, salade, de, la, a, au, etc.). The remainder were mainly expensive products (lobster, squab, champagne, and, perhaps surprisingly to the modern reader, chicken). The high prevalence of French words on expensive menus reflects the well-known role of French language and cuisine as a signifier of high status in American cuisine of the 19th and 20th centuries.
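
For readers unfamiliar with the Monroe, Colaresi and Quinn statistic, the following is a minimal sketch of the z-scored log-odds ratio with a Dirichlet prior. The function name `weighted_log_odds` and the toy counts are invented for illustration, and the choice of prior here (the pooled counts of the two groups) is an assumption; the prior actually used in the analysis is not specified in this section.

```python
import math
from collections import Counter


def weighted_log_odds(counts_a, counts_b, prior):
    """z-scored log-odds ratio of each word in group A versus group B.

    counts_a, counts_b: word -> count in each group of menus.
    prior: word -> positive pseudo-count (the informative Dirichlet prior);
           it must contain every word to be scored.
    """
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    a0 = sum(prior.values())
    z = {}
    for w, alpha in prior.items():
        # Smoothed counts: observed count plus prior pseudo-count.
        ya = counts_a.get(w, 0) + alpha
        yb = counts_b.get(w, 0) + alpha
        # Log-odds of the word within each group, after smoothing.
        log_odds_a = math.log(ya / (n_a + a0 - ya))
        log_odds_b = math.log(yb / (n_b + a0 - yb))
        # Approximate variance of the difference in log-odds.
        variance = 1.0 / ya + 1.0 / yb
        z[w] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return z


# Toy counts, invented for illustration only.
top_quartile = Counter({"creme": 8, "salade": 6, "lobster": 5, "coffee": 4})
all_others = Counter({"homemade": 9, "coffee": 12, "lobster": 1, "pie": 7})
prior = top_quartile + all_others  # assumed prior: pooled counts

z = weighted_log_odds(top_quartile, all_others, prior)
ranked = sorted(z, key=z.get, reverse=True)  # most "expensive" words first
```

Dividing by the estimated standard deviation keeps rare words, whose log-odds are poorly estimated, from dominating the ranking, which is the main reason to prefer this statistic over a raw ratio of frequencies.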

The fact that the expensive restaurants also use fewer words and fewer adjectives suggests that the use of implicit or "modest advertising" to mark high status dates back at least 100 years. Individual words, however, have changed their association with status over time, in the kind of trickle-down of status suggested by Veblen and Simmel. 25 In the historical corpus, for example, gourmet is associated with more expensive restaurants, like the "braised lamb gourmet" from the 1917 Waldorf Astoria or the "tomato gourmet salad" from the 1920 Hotel Manhattan. In modern menus, however, the word gourmet is associated only with cheaper restaurants, and you can order a "gourmet large pepperoni" from any number of pizza delivery restaurants in the 2011 dataset. While in many uses gourmet surely retains some of its earlier sense of epicurean expertise, on menus the word has undergone semantic bleaching, leaving it mainly as a vague protestation of quality.



We investigated a large dataset of restaurant menus coded with textual, quantitative, and metadata information to understand how words on menus subtly reflect aspects of our food culture. Our results suggest that the words used in menu descriptions of American restaurants reflect the many aspects of Bourdieu's distinction.  Expensive restaurants use the language of natural authenticity, focusing on the provenance of their food, while cheaper restaurants focus on traditional authenticity, highlighting old-fashioned American dishes and moms and grandmas.  The link between the natural and pastoral with expensive menus is also consistent with a wide variety of previous research, and is presumably linked with the huge changes in American food that occurred in the 1960s and 1970s. 26 Researchers have noted a contemporaneous increase in emphasis on the pastoral in the related field of wine reviewing, and suggested that the association of the pastoral genre with the urban upper class could also be at least in part a response to the increasing role of technology and related conflicts of modernity. 27

The use of these two metaphors for authenticity acts as a way of targeting consumers, using values that appeal differently to the two groups: food origin and purity for higher socio-economic status consumers, and family and tradition for lower socio-economic status consumers. 28 Our work shows that authenticity is not a monolithic concept: it can be used to mean different things to different consumers, and firms use these different kinds of authenticity in a coherent way for product differentiation. Furthermore, at least one aspect of this framing, the association of tradition with cheaper restaurants, dates back well over a century.

Modern expensive restaurants are more likely to use longer words and to use foreign words from three high-status foreign languages: French, Italian, and Japanese. How do complex and foreign words acquire their linguistic association with high status? French of course has long been associated with high status in cuisine; Italian and Japanese are more recent high-status culinary languages. The use of more complex words is a sign of educational capital, which has long been associated with high status. Complex words are also associated with more formal genres, 29 and more formal or ceremonial language is also associated with high status or luxury. It's possible as well that these complex words are an attempt to appeal to the reader's ego: a reader who can understand these complex or foreign words is implicitly being flattered for their ability to understand the "code". Presumably, a more complex menu also requires more educated or trained wait staff to discuss the menu with customers. This association of complex words, and especially French words, with high status was visible also in menus from a century earlier.

Cheaper restaurants focus on plenty by emphasizing the size of portions (generous, Texas-sized) and the number of choices they offer, both in number of dishes and in options within those dishes. This framing is most associated with steakhouses, traditional American food, fast food, pizza, sandwiches, barbecue, and bars. Except for steakhouses, these are not fancy restaurants, but everyday places that seem to cater to the eater more concerned with value. Furthermore, these are all restaurants that serve American food, focusing on American ethnic foods (sandwiches, hamburgers, and barbecue); thus Chinese restaurants, while prevalent in the cheap ($) categories, do not seem to make use of the metaphor of plenty. Steakhouses are the exception to the generally low prices of restaurants using this metaphor. Although expensive in price, steakhouses seem to draw on this same framing of the all-American working class, suggesting we should expect to see other aspects of working class framing in steakhouses. The fact that only part of this framing of plenty (the emphasis on choice, but not the emphasis on large portions) was present a century earlier suggests a change in this aspect of distinction.

Finally, expensive restaurants generally use more implicit language, using fewer of the many kinds of adjectives used by cheaper restaurants, and this use of fewer words and adjectives was also present in menus a hundred years ago. H. Paul Grice's model of cooperative language behavior may explain how the avoidance of such modifiers sends a signal about quality. Grice proposed that language users automatically obey certain communicative maxims. 30 A rational communicator of the type described by Grice would only mention that food is crisp or fresh to fulfill a communicative goal like convincing the reader, which only makes sense if the reader doesn't already think the food is crisp and fresh. A high-status restaurant, however, wants freshness and crispness to be already presumed, and therefore the crispness should go unmentioned. No expensive restaurant would use a description like this one from a cheap restaurant: "a flavorful, colorful, and delicious salad mixture of crispy bacon bits". For an expensive restaurant, that the food is flavorful and delicious should go without saying.

To see these Gricean inferences in a qualitative way, we examined the foods most commonly associated with one of these adjectives, "real". Exactly which foods a menu claims to be "real" depends on the price. The least expensive ($) restaurants are most likely to promise "real whipped cream", "real mashed potatoes", and "real bacon". In slightly more expensive ($$) restaurants, "real" is used mainly to describe "real crab" and "real maple syrup". By contrast, "real" is barely used at all for more expensive ($$$ and $$$$) restaurants. For a pricy restaurant to call its crab "real" would be to suggest that its realness might be in question and has to be defended.

The avoidance of over-explicit adjectives by high-status restaurants is also consonant with the game-theoretic models described above, 31 and with consumer studies that find that consumers perceive excessive ads as overcompensating for problems in a low-quality product. 32 The visual counterpart of minimal language is the use of white space in spare ads, which both consumers and creative directors associated with prestige and quality, based on a link between white space and the mid-century minimalist movement in art and the "less is more" movement in architecture, all of which associated spare, clean, minimal designs with prestige and the upper class in North America. 33 Finally, this use of overly explicit descriptions by less expensive restaurants might also be modeled as a kind of overcompensation, in which a group that is anxious about its status overcompensates in the cues for that status; this kind of overcompensation is seen both in linguistics, in the hypercorrection of speakers of non-standard styles, 34 and in other social categories like masculinity. 35

While our results are generally consistent with prior literature, there are some differences.  Our finding that inexpensive restaurants are more likely to emphasize consumer choice as compared with expensive restaurants seems inconsistent with work finding that European-Americans of higher socio-economic status (SES) emphasize personal choice more than those of lower SES. 36 It is possible that choice is simply more implicit in high-SES restaurants, which may be more willing to make special versions of dishes (for example omitting particular ingredients), have waiters discuss choices orally, or put choices on a chalkboard without highlighting this fact on the written menu. Investigating this paradox is an important direction for future research.

Our study has a number of limitations, such as its focus solely on major metropolitan areas and its focus on firms solely in the United States. Strauss's (2005) comparative study of the language of television advertising shows that generic terms like oishii or umai ('delicious' or 'tasty') are very common in Japanese advertising, but generic terms like 'delicious' are less common in American or Korean advertising, which are instead more likely to discuss specific positive qualities that cause the food to be delicious. Thus the finding that vague positive words like delicious or tasty are associated with lower priced restaurants or dishes is likely specific to marketing in the United States context. Further cross-cultural research is clearly called for. Understanding the role of the socio-economic status of the consumer, and also how these restaurant meals fit into larger patterns of food purchase and consumption, also remain important directions.

Despite these caveats, our results highlight the important role that investigating the "linguistics of the everyday" should play for our understanding of culture. Quotidian aspects of life are useful windows onto culture, not just because our attitudes toward daily life reflect our implicit beliefs about identity and socio-economic class, but also because they may come pre-annotated with economic variables, as does the language of restaurant menus or food reviews. Computational techniques can thus be key in helping explore aspects of culture and society.



Category                  #       %   Category          #      %
Pizza                   730  11.20%   Middle Eastern  121  1.90%
American (new)          494   7.60%   Steakhouses      81  1.20%
American (traditional)  473   7.30%   Mediterranean    59  0.90%
Sandwiches              226   3.50%   Asian fusion     46  0.70%
Other                   200   3.10%   Other European   34  0.50%
Fast food               199   3.10%   Vegetarian       32  0.50%
Latin American          165   2.50%   Southern         13  0.20%
Coffee & Tea            132   2.00%   Soul food        13  0.20%
Other Asian             123   1.90%

Table A1. Broad restaurant categories for all restaurants in the dataset. Categories were created by hand-grouping Yelp categories by cuisine and price similarity.

Variable                   Estimate   Std. Error  t value  p-value
consumer choice           -0.595383   0.06144      -9.691  <2e-16    ***
natural authenticity       2.308947   0.159477     14.478  <2e-16    ***
traditional authenticity  -0.45395    0.134103     -3.385  0.000716  ***

Table A2. Coefficients for the linear regression predicting restaurant price status (an integer 1-4 corresponding to $, $$, $$$, $$$$) from control variables and variables of interest, using the lm function in R. The adjusted R-squared was 0.4458. All variables from the count lexicons (adjectives, plenty, etc.) are residuals after regressing out the effects of description length in words (loglength).
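
The residualization mentioned in the caption can be sketched as follows. This is a minimal illustration that assumes a simple least-squares fit of each lexicon count on the logarithm of description length; the function name `residualize` and the toy numbers are hypothetical, and the exact specification used for Table A2 may differ.

```python
import math


def residualize(values, lengths):
    """Residuals of `values` after a least-squares fit on log(length)."""
    x = [math.log(n) for n in lengths]
    mean_x = sum(x) / len(x)
    mean_y = sum(values) / len(values)
    # Ordinary least-squares slope and intercept for y = a + b * log(length).
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # What is left over is the part of the count not explained by length.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, values)]


# Toy data, invented for illustration only.
adjective_counts = [2, 5, 4, 12]
description_lengths = [10, 40, 20, 80]  # words per menu description
residuals = residualize(adjective_counts, description_lengths)
```

Entering the residuals rather than the raw counts means that a long menu does not look adjective-heavy merely because it contains more words.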

Sensory Adjectives/Adverbs: airy, aromatic, astringent, beefy, bitter, bittersweet, blazing, bloomy, bold, bright, briny, brisk, burnt, buttery, cheesiest, cheesy, chewy, chocolaty, chunky, citrusy, clean, coarse, cold, colorful, complex, cool, creamy, crisp, crispier, crisply, crispy, crumbly, crunchy, crusty, dark, darkest, delicate, dense, doughy, drier, dry, earthy, effervescent, explosive, faint, fatty, feathery, fiery, finely, fizzy, flaky, flowery, fluffy, foamy, fragrant, fresh, freshest, freshly, frosty, frothy, fruity, fudgy, funky, fuzzy, garlicky, gentle, glassy, golden, gooey, grainy, grassy, gummy, herbaceous, herbal, hot, hottest, icy, juicy, leafy, lemony, light, lighter, lightest, luscious, lush, luxurious, malty, meatiest, meaty, meltingly, mild, mildly, milky, minty, moist, numbing, nutty, oaky, peachy, peppery, perfumed, pink, piquant, plump, porky, puffy, rich, richer, richest, richly, ripe, robust, salty, saucy, sharp, sharper, sharply, shiny, silken, silky, slender, smoky, smooth, smoother, soft, soupy, sour, spicer, spicey, spicier, spicy, spongy, spreadable, stinky, strong, stronger, succulent, sultry, supple, sweet, sweetest, syrupy, tangy, tawny, tender, tenderly, thinly, toasty, velvety, verdant, vibrant, vinegary, warm, warmer, wet, winy, zesty
Positive Sentiment: amazing, appealing, awesome, beautiful, best, better, dazzling, delightful, divine, dynamite, excellent, exceptional, exciting, extraordinary, fabulous, famous, fancy, fantastic, favorite, fine, finest, gorgeous, great, greater, greatest, heavenly, incredible, incredibly, irresistible, lavish, legendary, lovely, magical, magnificent, marvelous, outrageous, outstanding, perfect, popular, sensational, spectacular, splendid, striking, stunning, sublime, superb, terrific, unforgettable, unique, wonderful
Positive Food Sentiment: appetizing, delectable, delicious, flavorful, gourmet, luscious, mouthwatering, savory, scrumptious, tastiest, tasty, toothsome, yummy
Plenty: big, bigger, biggest, bottomless, bountiful, colossal, endless, enormous, generous, generously, gigantic, ginormous, heaped, heaping, hearty, hefty, huge, largest, loaded, loads, lots, mammoth, massive, mega, oversized, overstuffed, piled, plentiful, plenty, refills, unlimited, and more, king sized, texas sized, thick cut, tons of, with more
Choice: choice, choose, any, add, or, specify, substitutions, specifications, options, pick, your way, your own, your liking, your style, your favorite, you like, you want, you request, way you, you may, select your, select from, you select, select one, select any, select or, select a, select up, select two
Traditional Authenticity: home*, traditional*, timeless, family recipe, all american, our founder, old fashioned, old school, american favorite, america's favorite, all time favorite, old favorite

Table A3.  Lexicons
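
The lexicons above mix single words, multiword phrases, and entries ending in "*". A minimal sketch of how such entries might be matched against a menu description follows. It assumes that a trailing "*" means "any word beginning with this prefix" and that matching is case-insensitive; the function names `compile_lexicon` and `count_matches` are hypothetical, and the matching procedure actually used in the study is not described in this section.

```python
import re


def compile_lexicon(entries):
    """Turn lexicon entries into one case-insensitive regular expression.

    A trailing '*' is read as a prefix wildcard (an assumption);
    multiword entries match as whole phrases.
    """
    patterns = []
    for entry in entries:
        if entry.endswith("*"):
            # Prefix entry such as 'home*': the prefix plus any word characters.
            patterns.append(re.escape(entry[:-1]) + r"\w*")
        else:
            # Word or phrase: tokens separated by any run of whitespace.
            patterns.append(r"\s+".join(re.escape(t) for t in entry.split()))
    # Try longer patterns before shorter ones.
    patterns.sort(key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(patterns) + r")\b", re.IGNORECASE)


def count_matches(lexicon_re, text):
    """Number of non-overlapping lexicon matches in a menu description."""
    return len(lexicon_re.findall(text))


# A subset of the Traditional Authenticity lexicon from Table A3.
traditional = compile_lexicon(
    ["home*", "traditional*", "timeless", "family recipe", "old fashioned"]
)
n_sausage = count_matches(traditional, "homemade sausage with hot slaw")
n_cornbread = count_matches(traditional, "old fashioned corn bread with maple syrup")
```

Under these assumptions, both historical examples quoted earlier ("homemade sausage with hot slaw" and "old fashioned corn bread with maple syrup") register one traditional-authenticity match each.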

