Special Issue: Applied Quantitative Text Analysis

Romanian Journal of Political Science, Vol. 20, no. 2, Winter 2020

For social scientists in general, and political scientists in particular, text has always been a key resource. In a world increasingly flooded with written text, and with welcome advances in computer-assisted techniques to analyze it, researchers now have at their disposal the conceptual, analytical, and visual tools to systematically extract meaning from political text for social scientific purposes. The Romanian Journal of Political Science opens a Call for Papers for a Special Issue on Applied Quantitative Text Analysis, looking for original substantive and methodological contributions that apply various forms of quantitative text analysis to relevant political science questions. Text analysis methods of interest include the following:
  • classical content analysis methods
  • dictionary-based methods
  • classification and machine learning methods
  • scaling methods
  • topic models
  • qualitative techniques of human coding and annotation for the purpose of validating automated approaches
Papers problematizing fundamental issues of quantitative text analysis are encouraged, on topics such as inter-coder agreement, reliability, validation, accuracy, and precision. Applications of quantitative analyses to substantive political science problems are also welcome. Examples of topics include, but are not limited to, the following:
  • social media analysis of relevant political issues
  • analysis of political speeches
  • the impact of political issues as reflected in newspaper coverage
  • gradients of political ideology
  • policy positions and preferences
  • classifications of populist speeches
  • estimating voting behavior from online presence
The deadline for submission is January 31st, 2020. All submitted manuscripts will go through double-blind peer review. Decisions on accepted manuscripts will be made no later than June 2020. The Special Issue will be published in Winter 2020.

Guest Editors

Zoltán Fazekas, Iulia Cioroianu, Daniela Crăciun

About the Journal

The Romanian Journal of Political Science (PolSci) is the first peer-reviewed Romanian political science journal, edited and published twice a year by the Romanian Academic Society (SAR). The journal publishes a diverse range of political science articles, especially from fields currently under-covered, such as comparative politics, public policy, political economy, or political psychology. Papers are theory-grounded and based on solid empirical work. PolSci is in accreditation with the Social Science Citation Index. According to ISI Web of Knowledge, PolSci's 5-Year Impact Factor was 0.246.

About the Guest Editors

Zoltán Fazekas is Associate Professor of Business and Politics, with a focus on quantitative methods, in the Department of International Economics, Government and Business at the Copenhagen Business School. Zoltán holds a Ph.D. in Political Science from the University of Vienna (2012), and his research lies broadly at the intersection of political psychology, political communication, and comparative politics. He studies political attitude formation, the role and content of political coverage, and the interaction between political elites and the public. His work has been published in, among others, the American Journal of Political Science, Journal of Communication, British Journal of Political Science, and Political Psychology. His methodological interests and expertise lie in hierarchical modeling, quantitative text analysis, and computational tools for the social sciences.

Iulia Cioroianu is a Prize Fellow in the Institute for Policy Research at the University of Bath. She holds a Ph.D. in Political Science from New York University and an M.A. from Central European University. Before joining the IPR, she was a research fellow in the Q-Step Centre for Quantitative Social Sciences at the University of Exeter, and a pre-doctoral fellow in the LSE Department of Methodology. Iulia is a social data scientist who studies the effects of social media and online information exposure on political competition and polarization, using natural language processing, quantitative text analysis, machine learning, and survey experiments. Her work has been funded by IBM and the UK Economic and Social Research Council, has been published in political science and computer science journals and conference proceedings, and has been featured in Sage and NCRM podcasts and research methods videos.

Daniela Crăciun is a Lecturer at Bard College Berlin (Germany). She earned a Ph.D. in Political Science from Central European University, an Erasmus Mundus M.A. in Global Studies from the University of Leipzig (Germany), Jawaharlal Nehru University (India), and Wroclaw University (Poland), and a B.A. in Marketing with Media and Cultural Studies from Canterbury Christ Church University (UK). Daniela's research interests lie in the areas of research design, conceptualization, and content analysis. Her Ph.D. dissertation analyzed national higher education internationalization strategies from around the world, using computer-assisted text analysis to lift empirical data to a conceptual level. Daniela has been a Visiting Scholar doing research or teaching at the University of Yangon (Myanmar), the Federal University of São Carlos (Brazil), and the Center for International Higher Education at Boston College (USA). Her postdoctoral research explores issues of graduate employability.

How can I submit a presentation proposal?

The Workshop invites presentations applying various text analysis methods, including the following:

  • classical content analysis methods
  • dictionary-based methods
  • classification and machine learning methods
  • scaling methods
  • topic models
  • qualitative techniques of human coding and annotation for the purpose of validating automated approaches

Presentations problematizing fundamental issues of quantitative text analysis are also encouraged, on topics such as inter-coder agreement, reliability, validation, accuracy, and precision.

Each presentation will be allotted a 30-minute time slot, of which we plan to reserve 10 minutes for discussion. Participants are offered the opportunity to develop their presentations into manuscripts to be submitted to the Special Issue on Applied Quantitative Text Analysis, coordinated by the keynote speakers and hosted by the Romanian Journal of Political Science, Vol. 20, no. 2, Winter 2020.



  • October 30th, 2019: Extended abstract submission
  • November 15th, 2019: Final program announcement
  • November 28th-29th, 2019: Methods workshop
  • January 31st, 2020: Manuscript submission
  • June 2020: Final decision on publication

Two categories of participants[1] are welcome: presenters and attendees. Following the requests we received, attendance is also open to persons who do not present.

Contact: methods@e-uvt.ro


[1] The workshop charges a 40 EUR participation fee. PhD students can apply for fee waivers.

What is content analysis?

General references for scene setting

Daniela Crăciun
Bard College Berlin

Whether written or spoken, social and political actors interact through language. Thus,
researching any type of social or political phenomenon ultimately involves analyzing written or
transcribed texts. The technological developments of the last couple of decades have
translated into an ever-increasing proportion of political, social, and cultural activity being
recorded as digital text. In turn, this has given political scientists new possibilities for
testing and building theories by transforming text into data.

The question then becomes: how do political scientists analyze texts to answer salient
questions about the societies we live in? The answer is: content analysis. Content analysis is
the umbrella name given to a family of research methods for systematically extracting
information from textual data for scientific purposes. As defined by Krippendorff (1980,
p. 21), the founding father of this scientific method, "content analysis is a research
technique for making replicable and valid inferences from data to their context". That
is, text data can be used to make both descriptive and causal inferences about
political phenomena and processes around the world.

How is this done practically? Any content analysis research design involves a
number of steps:

(1) Obtaining a body of texts. In political science these can be
any kind of texts involving political actors (e.g. speeches, party manifestos, laws,
regulations, policies, parliamentary/government records, minutes, blogs, social media
posts, news, etc.). As many documents are now digitized, rather than stored in archives, web scraping tools have considerably improved the efficiency of conducting
this task.

(2) Preparing the texts for analysis. This is a dull but crucial task because of the
'garbage in, garbage out' principle. In other words, if the quality of the texts analyzed
is subpar, so will the results be. This is hardly an issue specific to content analysis or
computer-assisted content analysis; all other data analysis methods (e.g. statistics,
qualitative case analysis, ethnography) face the same problem.

(3) Coding the texts and transforming them into data. This can be done either
manually (using a coding sheet), automatically (using unsupervised methods of
computer-assisted content analysis), or through a combination of the two (using
supervised methods of computer-assisted content analysis). Natural language
processing algorithms have given rise to a variety of ways to transform text into
data with the help of computers.

(4) Interpreting and narrating the results. Once the analysis is complete,
researchers should make abductive inferences from the data to its context and report
to others what the texts “mean, refer to, entail, provoke or cause” (Krippendorff,
2004, p. 85).
How do we decide which texts to collect, how to prepare them for analysis, how to
code them and reduce their complexity, and how to interpret and report the results?
The decisions made at each step must be informed by the aim of the research
project; in other words, by the central research question one is trying to answer.
The steps of a content analysis research design, however, stay the same, and
researchers can think of them as a recipe to be followed.
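As an illustration, the four steps above can be sketched as a minimal dictionary-based coding pipeline. The texts and the coding dictionary here are hypothetical, chosen only to make the recipe concrete:

```python
import re
from collections import Counter

# (1) Obtain a body of texts -- here a hypothetical set of short statements.
texts = [
    "The government will cut taxes and reduce spending.",
    "We must invest in schools, hospitals and public services.",
    "Taxes on working families should be lowered immediately.",
]

# A hypothetical coding dictionary mapping categories to keywords.
dictionary = {
    "fiscal_restraint": {"cut", "taxes", "spending", "lowered"},
    "public_investment": {"invest", "schools", "hospitals", "services"},
}

def preprocess(text):
    # (2) Prepare the text: lowercase and tokenize into words.
    return re.findall(r"[a-z]+", text.lower())

def code_text(text):
    # (3) Code the text: count tokens that match each category.
    tokens = preprocess(text)
    return {cat: sum(1 for t in tokens if t in words)
            for cat, words in dictionary.items()}

# (4) Interpret: aggregate category counts over the corpus and report them.
totals = Counter()
for t in texts:
    totals.update(code_text(t))

print(dict(totals))  # -> {'fiscal_restraint': 5, 'public_investment': 4}
```

Real projects replace each step with something richer (web-scraped corpora, validated dictionaries, statistical models), but the skeleton stays the same.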


Krippendorff, K. (1980). Content Analysis: An Introduction to Its Methodology.
Beverly Hills, CA: Sage.
Krippendorff, K. (2004). Content Analysis: An Introduction to Its Methodology
(2nd ed.). Thousand Oaks, CA: Sage.

Online Exposure to Political and Ideological Content

Evidence from Surveys, Web Browsing Histories and Social Media Data

Iulia Cioroianu, University of Bath

In order to understand change (and stability) in political opinions and behaviour, it is necessary to measure the information individuals are exposed to. The internet and social media allow users to interact, collaborate, create and share information in virtual spaces and communities, and have radically changed the political information environment, including the types of content the public is exposed to as well as the exposure process itself. Individuals are faced with a wider range of options (from social and traditional media), new patterns of exposure (socially mediated and selective) and alternate modes of content production (e.g. user-generated content). This talk provides an overview of the main data collection, processing and text analysis methods which can be used to measure and analyse the political information consumed and shared in this dynamic and interconnected online environment.

The methods presented were used by the ExpoNet project team to study online information exposure over the course of the Brexit Referendum campaign. By linking three types of data (surveys, individual web browsing histories and social media data), we were able to: a. evaluate the popularity of different topics and issues during the campaign; b. examine whether online news exposure exhibits signs of segregation and selectivity by capturing exposure to both traditional news sources and news shared via social media platforms; c. examine what types of individuals are more likely to exhibit selective tendencies; d. compare the topics and ideological leanings of articles read during the referendum campaign with those of articles shared on Twitter.

The presentation provides an overview of the ways in which various methods (web scraping,
social media data collection, storage and processing, keyword and dictionary methods, cosine similarity, supervised classification and topic modelling) were combined in a large-scale research project. Moving closer to a causal identification strategy, I also present ongoing work on a web application which informs users about the ideological leaning of the articles they read and allows researchers to account for self-selection effects in information exposure.
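Of the measures named above, cosine similarity is the simplest to make concrete: two texts are represented as word-count vectors, and their similarity is the cosine of the angle between those vectors. A minimal sketch with hypothetical headlines:

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    # Represent each text as a bag-of-words count vector.
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the shared vocabulary.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)

# Hypothetical headlines: identical texts score 1.0, disjoint texts 0.0,
# partially overlapping texts fall in between.
print(cosine_similarity("brexit vote divides parliament",
                        "parliament divides over brexit vote"))
```

In practice one would work with stemmed or TF-IDF-weighted vectors rather than raw counts, but the geometry of the measure is the same.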

How can we measure meaning?

Citizen portrayals in the news media

Zoltan Fazekas, Copenhagen Business School

The media play a crucial role in the dissemination of politically relevant information, through
news articles, opinion pieces, and political talk shows, among others. The content presented on various news platforms serves as input for many citizens, potentially influencing how people learn about politics or how they perceive political realities. Accordingly, we strive to understand what the news media talk about, why, and how they present information. With technological advances and growing digital archives, quantitative text analysis tools have been used extensively to advance the study of political communication.

In this talk I will first review two quantitative text analysis approaches often used to study
political media content: sentiment analysis and topic models. The main aim is to categorize the type of insights these approaches can offer and focus on the validation requirements. After mapping what these approaches cannot tell us, I introduce word embeddings as a tool to capture a distributional theory of language. This can facilitate our attempts to retrieve the meaning of various words of interest by relying on their context and use. The talk concentrates on an applied example where I contrast the features of the methods reviewed.
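The distributional idea behind word embeddings, that a word's meaning can be recovered from the contexts it appears in, can be sketched with simple co-occurrence counts. The corpus and window size below are purely illustrative; real applications use embeddings trained on large corpora (e.g. word2vec or GloVe):

```python
import math
from collections import Counter, defaultdict

# A tiny hypothetical corpus.
corpus = [
    "migrants arrived in the country seeking work",
    "eu citizens arrived in the country seeking work",
    "ministers debated the budget in parliament",
]

# Count context words within a +/-2 token window around each target word.
window = 2
vectors = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                vectors[word][tokens[j]] += 1

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    return dot / (math.sqrt(sum(c * c for c in u.values())) *
                  math.sqrt(sum(c * c for c in v.values())))

# Words used in similar contexts receive similar vectors:
# "citizens" shares contexts with "migrants", while "ministers" does not.
print(cosine(vectors["migrants"], vectors["citizens"]))
print(cosine(vectors["migrants"], vectors["ministers"]))
```

Trained embeddings refine this basic idea with dimensionality reduction and much larger corpora, which is what makes comparing the contexts of, say, different citizen-group mentions feasible at scale.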

Analyzing the Brexit-related news coverage from 2016, the study deals with the potential
measurement of exclusionary (media) populism through systematic differences in the portrayal of citizens. While previous research has treated the outgroup as homogeneous and analyzed the broad category of migrants, we focus on a case with multiple (potential) outgroups. Both migrants and EU citizens were established as outgroups: both were portrayed in a very similar manner in the news, but very dissimilarly to U.K. citizens. These stark distinctions appear despite the lack of strong sentiment differences in the citizen portrayals and the presence of common broader political topics discussed when these mentions are made. Finally, the results bring further evidence of a convergence between tabloids and broadsheets, as differences in the degree of exclusionary media populism are negligible.