What is content analysis?

General references for scene setting

Daniela Crăciun
Bard College Berlin

Whether written or spoken, social and political actors interact through language. Thus,
researching social and political phenomena and processes ultimately involves analyzing
written or transcribed texts. Technological developments over the last couple of decades
mean that an ever-increasing proportion of political, social and cultural activity is recorded
as digital text. In turn, this has given political scientists new possibilities for testing and
building theories by transforming text into data.

The question then becomes: how do political scientists analyze texts to answer salient
questions about the societies we live in? The answer is: content analysis. Content analysis is
the umbrella name given to a family of research methods for systematically extracting
information from textual data for scientific purposes. As defined by Krippendorff (1980,
p. 21), the founding father of this scientific method, “content analysis is a research
technique for making replicable and valid inferences from data to their context”. That
is, text data can be used to make both descriptive and causal inferences about
political phenomena and processes around the world.

How is this done practically? Any content analysis research design involves a
number of steps:

(1) Obtaining a body of texts. In political science these can be
any kind of texts involving political actors (e.g. speeches, party manifestos, laws,
regulations, policies, parliamentary/government records, minutes, blogs, social media
posts, news, etc.). Because many documents are now digitized rather than stored only
in physical archives, web scraping tools have made this task considerably more
efficient.

(2) Preparing the texts for analysis. This is a dull but crucial task because of the
‘garbage in, garbage out’ principle. In other words, if the quality of the texts analyzed
is subpar, the results will be too. This problem is hardly specific to content analysis
or computer-assisted content analysis: all other methods of data analysis (e.g.
statistics, qualitative case analysis, ethnography) share it.

(3) Coding the texts and transforming them into data. This can be done
manually (using a coding sheet), automatically (using unsupervised methods of
computer-assisted content analysis), or through a combination of the two (using
supervised methods of computer-assisted content analysis). Natural language processing
algorithms have given rise to a variety of ways to transform text to data with the help
of computers.

(4) Interpreting and narrating the results. Once the analysis is complete,
researchers should make abductive inferences from the data to its context and report
to others what the texts “mean, refer to, entail, provoke or cause” (Krippendorff,
2004, p. 85).

How do we decide which texts to collect, how to prepare them for analysis, how to
code them and reduce their complexity, and how to interpret and report the
results? The decisions made at each step must be informed by the aim of the
research project, that is, by the central research question one is trying to
answer. The steps themselves, however, stay the same in any content analysis
research design, and researchers can think of them as a recipe to follow.
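
To make steps (2) and (3) more concrete, the following is a minimal sketch in Python of
how a small body of texts might be prepared and then coded into data. It uses a simple
dictionary-based approach, which is only one of many possible ways of coding texts and
is chosen here purely for brevity. The two example texts, the category names, the
indicator words and the function names are all hypothetical and serve only as an
illustration.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for step (1). A real project would
# load speeches, manifestos, or scraped pages here instead.
TEXTS = {
    "speech_a": "We will cut taxes and support small businesses. Taxes are too high!",
    "speech_b": "Our schools and hospitals need investment; public health matters.",
}

# Hypothetical coding dictionary: category -> indicator words.
DICTIONARY = {
    "economy": {"taxes", "businesses", "investment"},
    "welfare": {"schools", "hospitals", "health"},
}


def prepare(text):
    """Step (2): lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def code_text(tokens, dictionary):
    """Step (3): count how many tokens fall into each category."""
    counts = Counter()
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    # Report every category, including those with a count of zero.
    return {category: counts[category] for category in dictionary}


def code_corpus(texts, dictionary):
    """Turn every document into one row of category counts."""
    return {name: code_text(prepare(text), dictionary) for name, text in texts.items()}


if __name__ == "__main__":
    for name, row in code_corpus(TEXTS, DICTIONARY).items():
        print(name, row)
```

Each text becomes one row of category counts, which is the kind of text-to-data
transformation that step (4) then interprets. Which categories and indicator words are
appropriate depends entirely on the research question.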

Reference

Krippendorff, K. (1980). Content Analysis: An Introduction to Its Methodology.
Beverly Hills, CA: Sage.