Text Analysis Info

Overview of software that analyses texts and other sources of human communication


Content - quantitative without category system

Last update: 31. October 2014

The following programs do not work with a category system. Most analyse the co-occurrences of words in the text; some perform multivariate statistical analyses such as factor analysis, cluster analysis, or multi-dimensional scaling (MDS). Others use neural networks, and some companies do not state which technique their software uses.

Catpac II

program: Catpac II
author: Joseph Woelfel
distributor: Galileo Company
operating system: MS-Windows
documentation: manual
download: no
description: none yet

Hamlet II 3.0

program: Hamlet II 3.0
author: Alan Brier
download: free for personal use
operating system: MS-Windows, Linux
documentation: manual and a tutorial as PDF files
description: The main idea of HAMLET (c) is to search a text file for words in a given vocabulary list, and to count joint frequencies within any specified context unit, or as collocations within a given span of words.
Individual word frequencies (fi), joint frequencies (fij) for pairs of words (i,j), both expressed in terms of the chosen unit of context, and the corresponding standardised joint frequencies
sij = (fij) / (fi + fj - fij)
are displayed in a similarity matrix, which can be submitted to a simple cluster analysis and multi-dimensional scaling.
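The counting described above can be illustrated with a minimal Python sketch. It is not HAMLET's own code: the function names, the choice of one sentence per context unit, and the sample sentences are assumptions made for illustration. The standardised joint frequency sij = fij / (fi + fj - fij) has the same form as the Jaccard coefficient.

```python
from itertools import combinations


def joint_frequencies(units, vocabulary):
    """Count f_i and f_ij, where each context unit is a list of tokens.

    A word (or pair of words) is counted once per unit in which it occurs.
    """
    vocab = sorted(set(vocabulary))
    f = {w: 0 for w in vocab}
    fij = {pair: 0 for pair in combinations(vocab, 2)}
    for unit in units:
        present = sorted(set(unit) & set(vocab))
        for w in present:
            f[w] += 1
        for pair in combinations(present, 2):
            fij[pair] += 1
    return f, fij


def standardised(f, fij):
    """Compute s_ij = f_ij / (f_i + f_j - f_ij) for every word pair."""
    s = {}
    for (i, j), n in fij.items():
        denom = f[i] + f[j] - n
        s[(i, j)] = n / denom if denom else 0.0
    return s


# Hypothetical sample data: four context units (here, sentences).
units = [
    "the king and the queen".split(),
    "the king spoke".split(),
    "the queen and the ghost".split(),
    "the ghost vanished".split(),
]
f, fij = joint_frequencies(units, ["king", "queen", "ghost"])
s = standardised(f, fij)
```

In this sample, "king" and "queen" each occur in two units and share one, so their standardised joint frequency is 1 / (2 + 2 - 1) = 1/3. The resulting dictionary plays the role of the similarity matrix.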
A further option allows comparison of the results of applying multi-dimensional scaling to matrices of joint frequencies derived from a number of texts, using Procrustean Individual Differences Scaling (PINDIS).
Further procedures are included to help to determine the broad characteristics of word usage in a text:

  • KWIC offers Key-Word-In-Context listings for any given word-string.
  • WORDLIST generates lists of words and frequencies.
  • COMPARE lists words common to pairs of texts, and is useful in generating vocabulary lists, including synonyms, for use in comparing a number of texts.
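The three procedures listed above can be approximated in a few lines of Python. This is only a sketch of the general techniques, not HAMLET's implementation: the function names, the simple lower-casing tokenizer, and the sample texts are assumptions, and the synonym handling mentioned for COMPARE is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def wordlist(text):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokenize(text)).most_common()


def kwic(text, keyword, width=2):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = tokenize(text)
    lines = []
    for k, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, k - width):k])
            right = " ".join(tokens[k + 1:k + 1 + width])
            lines.append((left, tok, right))
    return lines


def compare(text_a, text_b):
    """Return the sorted list of words common to both texts."""
    return sorted(set(tokenize(text_a)) & set(tokenize(text_b)))


# Hypothetical sample texts.
text_a = "To be, or not to be, that is the question."
text_b = "The rest is silence."

contexts = kwic(text_a, "be", width=2)
frequencies = wordlist(text_a)
shared = compare(text_a, text_b)
```

Here `contexts` holds one tuple per occurrence of "be" with up to two words of context on each side, `frequencies` lists each word with its count, and `shared` contains the words found in both texts.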

TextAnalyst 2.3

program: TextAnalyst 2.3
author: MicroSystems
distributor: Megaputer
download: no
operating system: MS-Windows
documentation: web demo and a white paper
description: TextAnalyst is a text processing tool capable of automated semantic analysis, summarisation, and navigation of unstructured natural language texts. In addition, TextAnalyst can help you cluster the documents in your textbase, perform semantic information retrieval, and focus your text exploration on a particular subject.


T-Lab 9.1

program: T-Lab 9.1
author: Franco Lancia
distributor: T-Lab
documentation: online in English, Italian, French and Spanish
download: demo version (multilingual, registration required) and the manual
operating system: MS-Windows
description: T-LAB is an all-in-one set of linguistic and statistical tools for text analysis, covering co-occurrence analysis, thematic analysis, comparative analysis, and lexical tools. It is available in English, French, Italian, German and Spanish versions. Automatic lemmatization is currently available for English, French, Italian, German, Spanish and Portuguese. Without automatic lemmatization, however, T-LAB can analyse texts in any language that supports the ASCII/ANSI format.
File size is limited to 30 MB, which most analyses will not exceed.