Analyzing Streams of Language

Cheryl Geisler, New York: Pearson, 2004

Page 38, from Chapter 3, Segmenting the Data

Units in Text

Written texts have a variety of characteristics, some associated with conventions of publication, others with conventions of typography, and still others associated with the rhetorical interactions with readers at a distance. All can serve useful purposes as units of analysis for textual data.


Page 39

Text

Perhaps the most obvious unit for analyzing textual data is the text itself. Unlike the stream of conversational data that must often be bounded in some arbitrary fashion for the purposes of analysis, written texts often have well-established boundaries. In a classroom, for example, students generally write and bind (with staples or paperclips) individual texts separately: The boundaries of individual student “papers” are seldom hard to determine. In published formats, conventions exist for separating individual texts: the chapters of an edited volume, the articles in a magazine or journal, the stories in a newspaper.

From the writer’s point of view, many phenomena occur at the level of the text: the quality of the text, the genre of the text, the implied audience for the text. From the reader’s point of view, texts also have a variety of characteristics that can be examined: their persuasiveness, their familiarity, their importance, and so on. If you are concerned with any of these or similar phenomena, from either the writer’s or the reader’s point of view, you may find the text itself a good unit of analysis with textual data.

Date of Publication

Published texts have conventional dates of publication that can be used as units of analysis. You might wish to select all texts published in a certain year. Or you might want to compare the characteristics of texts published in one year with those published in another year. If you suspect that your phenomenon shifts over time or you want to limit your selection of texts to a certain slice of the historical record, you may want to use publication date as a unit of analysis.
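As a minimal sketch of selecting and comparing texts by publication date, assuming each text is stored as a record with a year attached, the two uses described above might look like this in Python. The record layout, the titles, the years, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical corpus records; titles and years are invented for illustration.
corpus = [
    {"title": "Text A", "year": 1995},
    {"title": "Text B", "year": 1995},
    {"title": "Text C", "year": 2003},
]


def select_by_year(records, year):
    """Return only the records published in the given year."""
    return [r for r in records if r["year"] == year]


def group_by_year(records):
    """Group records by publication year so that years can be compared."""
    groups = defaultdict(list)
    for r in records:
        groups[r["year"]].append(r)
    return dict(groups)
```

Calling select_by_year(corpus, 1995) limits the data to one slice of the record, while group_by_year(corpus) sets up a year-to-year comparison.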

Publication Venue

Texts that are published as part of larger compilations (newspapers, journals, magazines, and so on) can be selected or analyzed by publication venue. You might want to select all the advertisements in Wired magazine, for example; or you might want to try to analyze the style of Time magazine. If you are concerned with a phenomenon associated with the kind of readership or affinity groups that such publication venues represent, you may want to use publication venue as your unit of analysis.

Organization

Texts that are produced within a specific organization often provide evidence of the culture of that organization. You might want to select all the texts in the archives of a specific corporation, for example, or contrast two different corporations. If you are interested in a phenomenon associated with specific cultural contexts, you may want to choose the organization as the unit of analysis.

Author

The author of a text is often an important unit of analysis. When you use author as the unit of analysis, you look not at specific texts by an author, but at a body of work


Page 40

by that author. In a class, for example, rather than looking at one paper from each student, you look at a portfolio of work produced by that student. Or, with published texts, you might want to look at all the speeches made by a given candidate.

Many phenomena of interest are associated with authorship: the author’s style; the impact of contextual and/or biographical factors on the individual; the development of the author over time. If you are interested in focusing on the phenomenon of the individual, you may want to use author as your unit of analysis.

Sentences and Paragraphs, Pages and Lines

Texts are structured by their layout with a variety of characteristics, any of which can be used as a unit of analysis. As units they can serve useful purposes as ways of selecting data when the phenomenon of interest is assumed to be regularly distributed through the text and you simply need some way of selecting part of the data.

You might choose, for instance, every third sentence, every fifth paragraph, every other page, or the first 10 lines of each section. When you are looking for a way of analyzing the distribution of meaning in a text, sentences, paragraphs, pages, and lines are not useful as units of analysis because they are arbitrarily related to meaning. But they can be quite handy as a way of making a stratified or random selection of textual data.
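The kind of mechanical selection described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: sentences are taken to end at a period, exclamation point, or question mark followed by whitespace, and paragraphs are taken to be separated by blank lines. The function names are hypothetical, and real texts (with abbreviations, quotations, and so on) would need a more careful splitter.

```python
import random
import re


def split_sentences(text):
    """Naive splitter: break after ., !, or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_paragraphs(text):
    """Treat one or more blank lines as a paragraph boundary."""
    parts = re.split(r"\n\s*\n", text.strip())
    return [p.strip() for p in parts if p.strip()]


def every_nth(units, n, start=0):
    """Systematic selection: keep every n-th unit, beginning at index start."""
    return units[start::n]


def random_selection(units, k, seed=None):
    """Random selection of k units, returned in their original order."""
    rng = random.Random(seed)
    chosen = sorted(rng.sample(range(len(units)), k))
    return [units[i] for i in chosen]
```

For every third sentence, one could call every_nth(split_sentences(text), 3, start=2), where start=2 picks the third sentence first because indexing begins at zero. Passing a seed to random_selection makes the selection repeatable.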

Sections

Longer texts are often divided into sections, each with its own label. You can use the section as the unit of analysis when you want to look at kinds of rhetorical moves that tend to happen in certain places. You might, for example, look at the opening section of research articles to examine citation patterns since these openings often contain reviews of the literature. Or, looking for the same phenomenon, you might examine all sections in which citations are made. Opening sections are also good places for looking at phenomena related to the voice of a piece, or the relationship defined between author, reader, and context.

Other times, you will want to skip opening sections and choose text from middle sections. Letters, for example, tend to have routinized openings that precede getting to the real issues. Using sections as a unit of analysis is closely related to genre issues discussed in the next section, but requires less preliminary analysis to determine the rhetorical function of a piece of text.

Genre Components

Most texts belong to families of texts we call genres. While genres are not rigid, texts in certain genres do tend to share common features and common structures. Genres represent a typified response to a typified rhetorical situation. They thus exhibit many typified features: typified moves, typified relationships to audience, typified reading patterns, typified publication venues.

You can use the whole genre as a unit of analysis—looking at all letters to the editor, for example. Or you can use specific information you have about a genre to


Page 41

select or analyze specific genre-related features—the abstracts of research articles, the response of readers to scientific articles, and so on.

Metadiscourse

Metadiscourse is the part of discourse that talks about the discourse itself. If you can imagine that a text has a primary channel in which information is conveyed, the metadiscourse forms a background channel through which the writer talks to the readers to tell them how to understand and interpret the text.

There are two primary kinds of metadiscourse. Textual metadiscourse directs the reader in understanding the text. Textual connectives such as “first,” “next,” and “however” help readers recognize how the text is organized. Illocution markers like “in summary,” “we suggest,” and “our point is” point to the kind of work the writer is trying to do. Narrators such as “according to,” “many people believe that,” and “so-and-so argues that” let readers know to whom to attribute a claim. Textual metadiscourse is directly related to the rhetorical awareness exhibited in the text, and can be used as a unit of analysis when you are concerned with rhetorical sophistication.

A second kind of metadiscourse is interpersonal, and serves to develop a relationship between writer and reader. Validity markers such as hedges (“might,” “perhaps”), emphatics (“clearly,” “obviously”), and narrators (“according to”) give the reader guidance on how much face value to give to the claim with which they are associated. Other attitude markers like “surprisingly” and “unfortunately” communicate the writer’s attitude toward the situation and invite the reader to share the same stance. Commentaries such as “as we’ll see in the following section” and “readers are invited to peruse the appendix” are more extended directions to the reader. Interpersonal metadiscourse is directly related to the degree to which a text shows evidence of audience awareness. Interpersonal metadiscourse can vary by genre, by rhetorical sophistication, and by the degree of comfort a writer has with the audience addressed. If you are interested in a phenomenon related to audience, you may wish to look at interpersonal metadiscourse as a unit of analysis.
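As a rough first pass at locating metadiscourse, one might count occurrences of the example markers named above. The Python sketch below assumes simple case-insensitive phrase matching. The marker lists contain only the examples given in this section (with “argues that” standing in for “so-and-so argues that”), the category labels and function name are hypothetical, and a word like “first” will also match uses that are not metadiscourse, so the counts would still need to be checked by a human coder.

```python
import re
from collections import Counter

# Marker lists drawn only from the examples in this section; they are
# illustrative, not a complete inventory of metadiscourse.
MARKERS = {
    "textual connective": ["first", "next", "however"],
    "illocution marker": ["in summary", "we suggest", "our point is"],
    "narrator": ["according to", "many people believe that", "argues that"],
    "hedge": ["might", "perhaps"],
    "emphatic": ["clearly", "obviously"],
    "attitude marker": ["surprisingly", "unfortunately"],
}


def count_markers(text):
    """Count whole-word, case-insensitive matches for each marker category."""
    lowered = text.lower()
    counts = Counter()
    for category, phrases in MARKERS.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    return counts
```

Calling count_markers on each text in a set gives per-category tallies that could then be compared across genres or authors.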