Last update: 9 May 2025

 

The programs AnyText and ATA (Ashton Text Analyzer), Eric Jonson's programs, Kura, Lexa, MicroOCP, and MonoConc were removed because their links are dead and no further information seems to be available.

aconcorde

program: aconcorde
author: Andrew Roberts
distributor: Andrew Roberts
documentation: none
download: free version for Windows, Mac OS, and Linux
operating system(s): MS-Windows, Mac OS, Linux with source provided
description: aConcorde is a multi-lingual concordance tool. Originally developed for native Arabic concordance, it possesses basic concordance functionality, as well as English and Arabic interfaces. It is written in Java, so it will run on any platform that has the Java Runtime Environment installed.

Analysis 2.94

program: Analysis 2.94
author: Giovanni Lo Conti
distributor: Giovanni Lo Conti (mc4386@openaccess.it)
documentation: none
download: free version
operating system: MS-Windows, Digital Unix, Acorn RiscOS
description: Analysis is a program that supports several types of text analysis: concordances, KWIC, KWOC, readability indexes, co-occurrences, lemmatization, sentence statistics, non-intelligent abstract, summary, meaningful and sense, incipit, explicit, and frequency. For many procedures it is possible to delimit the range or to compare the text with an electronic dictionary. It is provided with Help, Help on line, and Wimp.

Note: This program's last version is from 2001.

AntConc 4.3.1

program: AntConc 4.3.1
author: Laurence Anthony
distributor: Laurence Anthony
documentation: Tutorials including videos, text materials available in English, Japanese, Korean, Arabic, and German.
download: free version
operating system: MS-Windows, Mac OS-X, Linux
description: This is a free concordance program.

AntPConc 1.2.1

program: AntPConc
author: Laurence Anthony
distributor: Laurence Anthony
documentation: help file
download: free version for Windows, Mac OS, and Linux
operating system(s): MS-Windows, Mac OS, Linux
description: A freeware parallel corpus analysis toolkit for concordancing and text analysis using UTF-8 encoded text files. Laurence Anthony has developed a large set of related tools.

AskSam 7

program: Ask Sam 7
author: Ask Sam Software Development
distributor: Ask Sam Software Development
documentation: none
download: trial version
operating system: MS-Windows, Mac OS-X, iOS
description: AskSam is a fast information retrieval program and allows searching in E-mails and PDF-files. The new professional version allows programming (e.g. with Visual Basic).
Note: The last version was published in 2012.

Casual Conc 4.0.1

program: Casual Conc 4.0.1
author: Yasu Imao, Osaka University
distributor: Yasu Imao, Osaka University
documentation: manual
download: free version for Windows, Mac OS, and Linux
operating system(s): Mac OS 12.3 or later, different versions for older Mac OS versions are available
description: CasualConc is a concordance program that runs natively on macOS 11.3 or later. The original version (pre version 1.0) was designed for casual use (preliminary analysis or non-research purposes), so the name is CasualConc. The current version (3.0.x) is probably good enough for more extensive use. It can generate KWIC concordance lines, word clusters, collocation analysis, and word count. This program is only tested with English text just because that is the only language I can understand other than Japanese (though I heard from some people that CasualConc works ok with other European languages, such as Greek, Italian, etc.). Technically, it should be able to handle any language macOS can. If you use CasualConc with languages other than English, let me know how well it works.

CATMA 7

program: CATMA 7 (Computer Aided Textual Markup & Analysis)
author: Jan Christoph Meister, University of Hamburg, Germany
distributor: University of Hamburg, Department of Languages, Germany
documentation: Compact manual
download: from GitHub, including the source code
operating system(s): MS-Windows, Mac OS-X 10.6 or newer
description: CATMA is released under the GNU general public license v3. The newest CATMA version is implemented as a web application. The development of CATMA was inspired by TACT (short for Textual Analysis Computing Tools), a DOS based tool set for textual analysis created at Toronto University.

Clic 2.1.2

program: Clic 2.1.2
author: CLiC is a collaborative project between the University of Birmingham and the University of Nottingham. (Arts and Humanities Research Council grant reference AH/P504634/1)
distributor: Yasu Imao, Osaka University
documentation: user guide
download: none, works with a browser
operating system(s): web application
description: The CLiC web app has been developed as part of the CLiC Dickens project, which demonstrates through corpus stylistics how computer-assisted methods can be used to study literary texts and lead to new insights into how readers perceive fictional characters.

Collocate 2.0

program: Collocate 2.0
author: Michael Barlow
distributor: Athelstan
documentation: included in the test version file
download: demo not found. The demo processes data in the same manner as the full version, but the results are limited to the top 5 items.
operating system(s): MS-Windows
description: Collocate is a new software program that can be used to find collocations or terms in a corpus. There are three main components: 

  • Search for a word (phrase) within a set span (e.g. 4 words). The program lists all the collocations containing the searchword and provides frequency and/or statistical information (Log Likelihood, Mutual Information).
  • Produce an n-gram list for the corpus.
  • Extract collocations from the corpus as a whole. 
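
The first two components can be illustrated with a minimal Python sketch. It is not Collocate's implementation: the function names `collocates` and `ngrams`, the whitespace tokenizer, and the sample sentence are all assumptions made for illustration, and only raw frequencies are computed (no Log Likelihood or Mutual Information).

```python
from collections import Counter

def tokenize(text):
    # Assumption: lower-casing and whitespace splitting are enough for the demo.
    return text.lower().split()

def collocates(tokens, node, span=4):
    """Count the words found within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

def ngrams(tokens, n=2):
    """Return a frequency list of all n-grams in the token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

tokens = tokenize("the cat sat on the mat and the cat slept")
span_counts = collocates(tokens, "cat", span=1)
bigram_counts = ngrams(tokens, 2)
print(span_counts.most_common(3))
print(bigram_counts.most_common(3))
```

A real tool would add a statistical association measure on top of these counts; the sketch stops at the frequency step.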

corpkit

program: corpkit
author: Daniel McDonald
distributor: Daniel McDonald
documentation: manual
download: free version for Windows, Mac OS, and Linux
operating system(s): Windows (running Python script), Mac OS, Linux
description: corpkit is a tool for doing corpus linguistics. It does a lot of the usual things, like parsing, concordancing and keywording, but also extends their potential significantly: you can concordance by searching for combinations of lexical and grammatical features, and can do keywording of lemmas, of subcorpora compared to corpora, or of words in certain positions within clauses. Corpus interrogations can be quickly edited and visualised in complex ways, or saved and loaded within projects, or exported to formats that can be handled by other tools.

Corpus Presenter 2025

program: Corpus Presenter 2025
author: Raymond Hickey
distributor: Raymond Hickey
documentation: manual
download: full and free version
operating system(s): WinXP and newer
description: Corpus Presenter is a suite of programs designed to work with both existing corpora and any files which users might wish to examine for linguistically interesting structures. It has all the options of standard corpus software, i.e. it can generate concordances, word lists and perform a whole range of text retrieval tasks and generate reverse dictionaries of words in texts. It does not require that texts are prepared in any way, e.g. by indexing them in advance.

ELAN 6.9

program: ELAN 6.9
author: unknown
distributor: Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
documentation: manual
download: free version for Windows, Mac OS, and Linux
operating system(s): MS-Windows, Mac OS-X
description: ELAN is free video annotation software. The source code is also available. There is a special version for Apple's M-processors. An annotation can be a sentence, word or gloss, a comment, translation or a description of any feature observed in the media. Annotations can be created on multiple layers, called tiers. Tiers can be hierarchically interconnected. An annotation can either be time-aligned to the media or it can refer to other existing annotations. The content of annotations consists of Unicode text and annotation documents are stored in an XML format (EAF).

IMS Open Corpus Workbench (CWB) 3.5

program: IMS Open Corpus Workbench (CWB) 3.5
author: Stephanie Evert
distributor: IMS Stuttgart, Germany
documentation: set of manuals
download: free version for Windows, Mac OS, and Linux
operating system(s): Windows (running Python script), Mac OS, Linux
description: The IMS Open Corpus Workbench (CWB) is a collection of open-source tools for managing and querying large text corpora (up to 2 billion words) with linguistic annotations. Its central component is the flexible and efficient query processor CQP.

kwords

program: kwords
authors: Václav Horký, Pavel Vondricka, Václav Cvrcek
distributor: Václav Horký, Pavel Vondricka, Václav Cvrcek
documentation: documentation
download: none, online use
operating system(s): works with a browser
description: The KWords application is used for text analysis and identification of keywords, i.e. units that have an unexpectedly high relative frequency in the target text compared to the reference corpus. Compared to the first version, the second one has a number of improvements: it can analyse not only word forms but also lemmata and other units, it works with more than 30 languages, and it also supports keymorph analysis (see Fidler & Cvrcek 2019).
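
The idea of a keyword as a unit with unexpectedly high relative frequency can be sketched in a few lines of Python. This is not the KWords algorithm: the function name `keywords`, the `min_ratio` threshold, the add-one smoothing, and the toy texts are assumptions, and a plain ratio of relative frequencies stands in for whatever statistic the application actually uses.

```python
from collections import Counter

def keywords(target_tokens, reference_tokens, min_ratio=2.0):
    """Rank words whose relative frequency in the target exceeds the reference."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = len(target_tokens)
    n_reference = len(reference_tokens)
    scored = []
    for word, freq in target.items():
        rel_target = freq / n_target
        # Add-one smoothing (an assumption) avoids division by zero
        # for words that never occur in the reference corpus.
        rel_reference = (reference[word] + 1) / (n_reference + 1)
        ratio = rel_target / rel_reference
        if ratio >= min_ratio:
            scored.append((word, ratio))
    return sorted(scored, key=lambda item: item[1], reverse=True)

target_text = "corpus corpus corpus the of".split()
reference_text = "the of the of and the a in to it".split()
result = keywords(target_text, reference_text)
print(result)
```

In this toy example only "corpus" passes the threshold, because "the" and "of" are at least as frequent in the reference text as in the target.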

KWIC Concordance 5.0

program: KWIC Concordance 5.0
author: Satoru Tsukamoto, formerly at the College of Humanities and Sciences, English Department, Nihon University, Japan
distributor: several download sites
documentation: none
download: free
operating system(s): MS-Windows, Mac OS
description: The KWIC Concordance is a corpus analytical tool for making word frequency lists, concordances and collocation tables from electronic files. The program can handle markup schemes such as COCOA, SGML, the Helsinki corpus, the Penn-Helsinki Parsed Corpus of Middle English (Phase 1) (Phase 2) etc. The program is freeware.

Langsoft Text Analysis Software

program: Langsoft Text Analysis software
author: Hristo Georgiev
distributor: Langsoft
documentation: none
download: Trial version for Windows and Linux/MS-DOS
operating system(s): MS-Windows, Linux, MS-DOS
description: Langsoft offers software for parsing, spelling, machine translation, questioning and thesauri. The parsing program handles texts in English, French and German; the spelling program also supports Italian. The machine translation program is for English - German (both directions). English, German, French and Spanish are supported for the thesaurus program.

ParaConc

program: ParaConc
author: Michael Barlow
distributor: Michael Barlow
documentation: short paper (the link is dead)
download: demo version
operating system(s): MS-Windows, nothing is specified on the web page

Profiler Plus 5.8.4

program: Profiler Plus 5.8.4
author: Michael D. Young
distributor: Social Science Automation
documentation: none
download: free version for unfunded academic research. You have to create an account.
operating system(s): MS-Windows
description: A general purpose content analysis engine designed for leadership analysis. Profiler+ searches a sentence from left to right for ordered sets of tokens (words and/or punctuation) that have been identified as indicators of a trait, of another measure of interest or perhaps of a particular type of communication. Profiler+ examines each token in turn and queries a database to determine if the token serves as the anchor for any target sets. If the token does serve as an anchor in one or more target sets the program determines if the other tokens in the set are also present in the sentence in the appropriate order. If all the tokens in a set can be matched then the indicated actions are taken - in the simplest case a code is written to a file. Any remaining target sets that have not been eliminated are ignored.
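
The matching procedure described above can be sketched roughly as follows. This is not Profiler Plus code: the function `match_codes`, the in-memory dictionary that stands in for the database, the rule that the anchor is the first token of a target set, and the two sample target sets are all assumptions for illustration.

```python
def match_codes(tokens, target_sets):
    """Scan tokens left to right and return the codes of all matched target sets.

    `target_sets` maps an anchor token to a list of (pattern, code) pairs.
    Assumption: the anchor is the first token of each pattern, and the other
    tokens must follow it in order but need not be adjacent.
    """
    codes = []
    for i, token in enumerate(tokens):
        for pattern, code in target_sets.get(token, []):
            remaining = iter(tokens[i + 1:])
            # `p in remaining` consumes the iterator, so order is enforced.
            if all(p in remaining for p in pattern[1:]):
                codes.append(code)
    return codes

# Hypothetical target sets; real coding schemes are far larger.
target_sets = {
    "i": [(("i", "believe"), "CERTAINTY")],
    "must": [(("must", "act", "now"), "URGENCY")],
}

sentence = "i firmly believe we must act".split()
codes = match_codes(sentence, target_sets)
print(codes)
```

Here the set anchored on "must" is dropped because "now" never appears after the anchor, which mirrors the remark that unmatched target sets are ignored.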

PyXMLConc

program: PyXMLConc
author: Ingo Kl
distributor: Ingo Kl
documentation: manual
download: free version written in Python
operating system(s): Windows (running Python script), Mac OS, Linux
description: PyXMLConc is a very simple concordancer. It is supposed to be used in exploratory analysis of XML-annotated corpora. Its primary feature lies in the automatic detection of XML tags and attributes. The search/concordancing function supports regular expressions.
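
The two features mentioned here, tag detection and regular-expression concordancing, can be approximated with the Python standard library. The sketch below is not taken from PyXMLConc: the function names `tag_inventory` and `kwic`, the `width` parameter, and the sample XML are assumptions.

```python
import re
import xml.etree.ElementTree as ET

def tag_inventory(xml_string):
    """Map every tag name found in the document to the set of its attribute names."""
    root = ET.fromstring(xml_string)
    inventory = {}
    for elem in root.iter():
        inventory.setdefault(elem.tag, set()).update(elem.attrib)
    return inventory

def kwic(text, pattern, width=20):
    """Return (left context, match, right context) for every regex match."""
    lines = []
    for match in re.finditer(pattern, text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines

sample = '<text><s id="1"><w pos="DT">The</w> <w pos="NN">cat</w></s></text>'
inventory = tag_inventory(sample)
plain_text = "".join(ET.fromstring(sample).itertext())
hits = kwic(plain_text, r"c\w+", width=4)
print(inventory)
print(hits)
```

The concordance runs on the text content only; searching by tag or attribute value would need an extra filtering step over the parsed elements.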

SATO 4.5

program: SATO 4.5
author: François Daoust
distributor: University of Montreal, Canada
documentation: manual
download: test version
operating system: DOS
description: SATO allows the annotation of multilingual documents and has a query language for the systematic location of user-defined textual segments. It also provides: the production of an index; word lists sorted alphabetically or by frequency; the categorisation of words, word compounds, or phrases; the definition of variables to carry out multiple enumerations and lexicometric analyses; dictionary functions and, if necessary, devices for morphological derivation; and a readability index (Gunning).

Semato 3.0

program: Semato 3.0
author: Pierre Plante, Lucie Dumas and André Plante
distributor: University of Montreal, Canada
documentation: online
download: no longer available
operating system: runs as a web service
description: The whole web site is in French; no English version is available (c'est Québec). Semato is a program that allows the use of quantitative, qualitative, and mixed models.

SCP 5.0.9 - Simple Concordance Program

program: SCP 5.0.9 - Simple Concordance Program
author/distributor: Alan Reed
download: free software
documentation: help file as a PDF-file
operating systems: MS-Windows, Mac OS-X, Linux Ubuntu
description: This free program lets you create word lists and search natural language text files for words, phrases, and patterns. SCP is a concordance and word listing program that is able to read texts written in many languages. There are built-in alphabets for English, French, German, Greek, Russian, etc. SCP contains an alphabet editor which you can use to create alphabets for any other language.

Textalyzer

program: Textalyzer
author: Bernhard Huber
distributor: SEOScout
documentation: self-explanatory
download: none
operating system: runs on a web site
description: Textalyser is a free text analysis tool that counts words, sentences, and syllables and computes lexical density. It also computes the Gunning readability index. A small but nice tool that counts syllables correctly, at least for English, French, and German. You can cut and paste text or specify a web page.
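
For orientation, the Gunning index mentioned above is usually given as 0.4 × (average sentence length + percentage of words with three or more syllables). The Python sketch below follows that formula. It is not Textalyser's code: the vowel-group syllable counter is a rough heuristic for English only, and the function names are assumptions.

```python
import re

def count_syllables(word):
    # Heuristic assumption: each run of vowels counts as one syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def gunning_fog(text):
    """Return 0.4 * (words per sentence + percentage of complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    average_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (average_length + percent_complex)

score = gunning_fog("The cat sat. The dog ran.")
print(round(score, 2))
```

Because syllable counting is heuristic, scores from this sketch may differ from those reported by the tool.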

T-Lab 10.6

program: T-Lab 10.6
author: Franco Lancia
distributor: T-lab
documentation: in English, Italian, French and Spanish online.
download: Test version (multilingual) and also the manual.
operating system: MS-Windows
description: T-LAB software is an all-in-one set of linguistic and statistical tools for text analysis which can be used in the following research fields: co-occurrence analysis, thematic analysis, comparative analysis, and lexical tools. Available versions are in English, French, Italian, German and Spanish. Currently the automatic lemmatization is available for the following languages: English, French, Italian, German, Spanish and Portuguese; moreover, without automatic lemmatization, T-LAB allows the analysis of texts in all languages supporting ASCII/ANSI format.
File size is limited to 30 MB, which most analyses will not exceed.

Textstat 3

program: Textstat 3
author: Matthias Hüning
distributor: Matthias Hüning
documentation: tutorial
download: version 2.9 is freeware; version 3 is a beta and must be compiled by the user
operating system: MS-Windows, Mac OS-X, Linux; requires Python
description: TextSTAT is a simple programme for the analysis of texts. It reads ASCII/ANSI texts and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. The programme is distributed as freeware. Source code in Python is also available for free. The user interface is provided in English, German, Dutch, Portuguese, Spanish, Catalan, French, Italian, Galician, Finnish (Suomi), Polish, or Czech.

Wordless 3.5.0

program: Wordless 3.5.0
author: Ye Lei
distributor: Ye Lei
documentation: documentation
download: download section
operating system(s): MS-Windows, Mac OS, Linux
description: Computes frequencies of words, lexical density, dispersion and values of readability formulas. Wordless is an integrated corpus tool with multilingual support for the study of language, literature, and translation.

Wordsmith 9.0

program: WordSmith 9.0
author: Mike Scott
distributor: Mike Scott, Liverpool University
documentation: manual in English, French, and German
download: download page for different versions (shows a sample of the results only)
operating system: MS-Windows
description: WordSmith is the successor of MicroConcord.

WordWanderer

program: WordWanderer
author: Marian Dörk and Dawn Knight
distributor: Marian Dörk and Dawn Knight, Newcastle University, UK
download: source code
operating system(s): runs within a browser, seems to be written in JavaScript
description: We are experimenting with visual ways in which we can enhance people's engagement with language. By fusing the information we can obtain from corpus searches, concordance outputs and word clouds we are aiming to enable and encourage people to notice and wander through the words they read, write and speak.