About CAT Scanner

Computer-aided text analysis is a family of techniques that bridge qualitative and quantitative research. By measuring phenomena of interest through the language used in texts, researchers can conduct quantitative analyses on rich qualitative data sources. These techniques have been used across a broad range of literatures to measure a wide variety of constructs.

CAT Scanner is the brainchild of Dr. Aaron F McKenny and Dr. Jeremy C Short, who developed the tool to provide participants in a CARMA short course on content analysis with a free alternative to the expense of paid text analytic packages, the time consumed by manual coding, and the learning curve of writing custom code. 

Peer-reviewed management research indicates that data generated using CAT Scanner are consistent with those generated by other popular text analytic packages, such as Linguistic Inquiry and Word Count (LIWC) and Diction. CAT Scanner has also been used for text analytic measurement in a number of peer-reviewed manuscripts.

All this being said, researchers should choose the tool that is most appropriate for the research question being asked and the theory being used. That isn’t always going to be CAT Scanner. Click below to see an (admittedly non-comprehensive) guide to choosing the right text analytic tool. Naturally, we hope you like our tool and find it useful. However, as advocates for rigorous text analyses in organizational science and beyond, it’s more important to us that you find and use the right tool for the job.


What is the right tool for me?






CAT Scanner Functionality

CAT Scanner includes three basic tools facilitating research using a variety of text analytic techniques.


When collecting texts from images or PDFs, garbage characters (e.g., ↑, ¤, or ¬) may be created as a result of the imprecision of many optical character recognition tools. These garbage characters reduce the validity of computer-aided text analyses by breaking up otherwise coherent words. CAT Scanner can systematically eliminate special characters from text files, resulting in clean texts and, by extension, cleaner data.
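
For readers curious what this kind of cleaning involves, the following Python sketch shows one possible approach. It is not CAT Scanner's own code. The function name `strip_special_characters` and the set of characters to keep are assumptions made for illustration. This particular allow-list would also remove accented letters, so it would need adjusting for non-English texts.

```python
import re

# Assumption: keep ASCII letters, digits, whitespace, and common punctuation.
# Everything else (e.g., stray OCR symbols) is treated as a garbage character.
_GARBAGE = re.compile(r"[^A-Za-z0-9\s.,;:!?'\"()\-]")


def strip_special_characters(text: str) -> str:
    """Remove characters outside the allowed set, then tidy the spacing."""
    cleaned = _GARBAGE.sub("", text)
    # Removing a symbol that stood between two spaces leaves a double space,
    # so collapse runs of spaces and tabs into a single space.
    return re.sub(r"[ \t]+", " ", cleaned).strip()


if __name__ == "__main__":
    print(strip_special_characters("entre¬preneur↑ial ¤ growth"))
```

Deleting the stray symbols rejoins the broken word, so the fragment is counted as a single word again in later analyses.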


When developing and validating custom dictionaries for computer-aided text analysis, it is a best practice to use a combined deductive (theory-driven) and inductive (data-driven) approach. CAT Scanner’s inductive word list generation tool identifies all words that are used three or more times in text files, facilitating the inductive aspect of dictionary development.
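
The frequency threshold described above can be sketched in a few lines of Python. This is an illustrative approximation, not CAT Scanner's implementation. The function name `inductive_word_list`, the tokenization rule, and the choice to pool counts across all texts rather than count within each file are assumptions.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally with one internal
# apostrophe (e.g., "don't"); matching is case-insensitive.
_WORD = re.compile(r"[a-z]+(?:'[a-z]+)?")


def inductive_word_list(texts, min_count=3):
    """Return, alphabetically, the words used at least min_count times."""
    counts = Counter()
    for text in texts:
        # Assumption: counts are pooled across all texts in the sample.
        counts.update(_WORD.findall(text.lower()))
    return sorted(word for word, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    sample = [
        "Risk and reward. Risk again.",
        "risk, reward, innovation",
        "reward",
    ]
    print(inductive_word_list(sample))
```

A researcher would then review the resulting candidates by hand and keep only the words that fit the construct's theoretical definition.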


CAT Scanner allows the user to use dictionaries developed by others or to develop their own custom dictionaries to measure constructs of interest.
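
In its simplest form, dictionary-based measurement counts how often the words in a dictionary appear in a text. The Python sketch below illustrates that idea under stated assumptions and does not reproduce CAT Scanner's scoring. The function name `score_text`, the example dictionary, and the per-100-words normalization are hypothetical, and the sketch matches exact single words only (no phrases, wildcards, or stemming).

```python
import re

# Assumption: same simple tokenization as a typical word-count approach.
_WORD = re.compile(r"[a-z]+(?:'[a-z]+)?")


def score_text(text, dictionary):
    """Count exact, case-insensitive matches of dictionary words in text."""
    tokens = _WORD.findall(text.lower())
    terms = {term.lower() for term in dictionary}
    hits = sum(1 for token in tokens if token in terms)
    # Normalizing by length makes short and long texts comparable.
    rate = 100 * hits / len(tokens) if tokens else 0.0
    return {"word_count": len(tokens), "hits": hits, "hits_per_100_words": rate}


if __name__ == "__main__":
    # Hypothetical two-word dictionary, for illustration only.
    print(score_text("We take bold risks and bold bets.", ["bold", "risk"]))
```

Because matching is exact, "risks" does not count toward "risk" in this sketch. A dictionary would need to list each word form explicitly, which is one reason validating the word list matters.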