Text Analysis Info

Overview of software that analyses texts and other sources of human communication


Language - linguistics

Last update: 20. April 2017



program: Catma 5 (Computer Aided Textual Markup & Analysis)
author: Jan Christoph Meister, University of Hamburg, Germany
distributor: University of Hamburg, Department of Languages, Germany
documentation: manual
download: from GitHub, including source code
operating system(s): MS-Windows, Mac OS X 10.6 or newer
description: CATMA is released under the GNU General Public License v3. The newest CATMA version is implemented as a web application. The development of CATMA was inspired by TACT (short for Textual Analysis Computing Tools), a DOS-based tool set for textual analysis created at the University of Toronto.


Langsoft Text Analysis Software

program: Langsoft Text Analysis software
author: Hristo Georgiev
distributor: Langsoft
documentation: none
download: Trial version for Windows and Linux/MS-DOS
operating system(s): MS-Windows, Linux, MS-DOS
description: Langsoft offers software for parsing, spelling, machine translation, questioning and thesauri. The parsing program handles texts in English, French and German; the spelling program also supports Italian. The machine translation program covers English and German (both directions). The thesaurus program supports English, German, French and Spanish.

Profiler Plus 5.8.4

program: Profiler Plus 5.8.4
author: Michael D. Young
distributor: Social Science Automation
documentation: none
download: Trial version (data are limited); free version for unfunded academic research. You have to create an account.
operating system(s): MS-Windows
description: A general-purpose content analysis engine designed for leadership analysis. Profiler+ searches a sentence from left to right for ordered sets of tokens (words and/or punctuation) that have been identified as indicators of a trait, of another measure of interest, or perhaps of a particular type of communication. Profiler+ examines each token in turn and queries a database to determine whether the token serves as the anchor for any target sets. If the token does serve as an anchor in one or more target sets, the program determines whether the other tokens in the set are also present in the sentence in the appropriate order. If all the tokens in a set can be matched, the indicated actions are taken; in the simplest case a code is written to a file. Any remaining target sets that have not been eliminated are ignored.
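The matching procedure described above can be illustrated with a minimal Python sketch. It is not Profiler Plus code. The target sets, the code labels and all function names are hypothetical. It assumes that the first token of a target set is its anchor, that the remaining tokens may appear with gaps as long as their order is kept, and it uses an in-memory dictionary in place of the database.

```python
from collections import defaultdict

# Hypothetical target sets: (ordered tokens, code to emit).
# Assumption: the first token of each set acts as its anchor.
TARGET_SETS = [
    (("will", "not"), "REFUSAL"),
    (("we", "must", "act"), "URGENCY"),
]


def build_anchor_index(target_sets):
    """Map each anchor token to the target sets it opens (stands in for the database)."""
    index = defaultdict(list)
    for tokens, code in target_sets:
        index[tokens[0]].append((tokens, code))
    return index


def contains_in_order(tokens, wanted):
    """True if every token in `wanted` occurs in `tokens` in the same order (gaps allowed)."""
    remaining = iter(tokens)
    return all(w in remaining for w in wanted)


def match_sentence(tokens, index):
    """Scan left to right; for each anchor, check the rest of its target sets."""
    codes = []
    for pos, token in enumerate(tokens):
        for target, code in index.get(token, []):
            if contains_in_order(tokens[pos + 1:], target[1:]):
                codes.append(code)  # simplest action: record a code
    return codes


index = build_anchor_index(TARGET_SETS)
print(match_sentence("they will simply not wait".split(), index))
```

A sentence in which the tokens occur in the wrong order, such as "we act must", yields no code, because the tokens after the anchor are only matched in sequence.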


program: Semato
author: Pierre Plante, Lucie Dumas and André Plante
distributor: University of Montreal, Canada
documentation: online
download: no longer available
operating system: runs as a web service
description: The whole web site is in French; there is no English version available. (It's Quebec.) Semato is a program that allows the use of quantitative, qualitative, and mixed models.


program: SATO 4.0
author: François Daoust
distributor: University of Montreal, Canada
documentation: manual
download: test version
operating system: DOS
description: SATO allows the annotation of multilingual documents and offers a query language for systematically locating text segments defined by the user; the production of an index; word lists sorted alphabetically or by frequency; the categorisation of words, compound words or phrases; the definition of variables to carry out multiple counts and lexicometric analyses; dictionary functions, including, where needed, devices for morphological derivation; and a readability index (Gunning).
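Two of the functions listed above, sorted word lists and the Gunning readability index, can be sketched in a few lines of Python. This is an illustration only and not SATO code. The regex tokenizer and all function names are assumptions. The index uses the standard Gunning fog formula, 0.4 * (words per sentence + percentage of complex words), and takes the counts as inputs, because deciding which words count as complex requires a syllable counter that is not shown here.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens (simple letter-run tokenizer, an assumption)."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def alphabetical_list(counts):
    """Word list sorted alphabetically."""
    return sorted(counts.items())


def frequency_list(counts):
    """Word list sorted by descending frequency; ties are broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def gunning_fog(n_words, n_sentences, n_complex_words):
    """Gunning fog index computed from pre-computed counts."""
    words_per_sentence = n_words / n_sentences
    percent_complex = 100.0 * n_complex_words / n_words
    return 0.4 * (words_per_sentence + percent_complex)


counts = word_counts("The cat saw the dog. The dog ran.")
print(alphabetical_list(counts))
print(frequency_list(counts))
print(gunning_fog(n_words=100, n_sentences=5, n_complex_words=10))
```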

T-Lab plus 2017

program: T-Lab plus 2017
author: Franco Lancia
distributor: T-lab
documentation: online, in English, Italian, French and Spanish
download: test version (multilingual) and the manual
operating system: MS-Windows
description: T-LAB software is an all-in-one set of linguistic and statistical tools for text analysis which can be used in the following research fields: co-occurrence analysis, thematic analysis, comparative analysis, and lexical tools. Available versions are in English, French, Italian, German and Spanish. Automatic lemmatization is currently available for English, French, Italian, German, Spanish and Portuguese; without automatic lemmatization, T-LAB allows the analysis of texts in any language that supports ASCII/ANSI format.
File size is limited to 30 MB; most analyses will not exceed this.