The entries for AnyText, ATA (Ashton Text Analyzer), Eric Jonson's programs, Kura, Lexa, MicroOCP, and MonoConc were removed because their links are dead and no further information seems to be available. Many new programs were added.
aconcorde |
program:
aconcorde
author: Andrew Roberts
distributor: Andrew Roberts
documentation: none
download:
free version for Windows, Mac OS, and Linux
operating system(s): MS-Windows, Mac OS, Linux with source provided
description: aConcorde is a multi-lingual concordance tool. Originally developed for native Arabic concordance, it possesses basic concordance functionality, as well as English and Arabic interfaces. It is written in Java, so it will run on any platform that has the Java Runtime Environment installed.
Analysis 2.94 |
program:
Analysis 2.94
author: Giovanni Lo Conti
distributor: Giovanni Lo Conti (mc4386@openaccess.it)
documentation: none
download:
free version
operating system: MS-Windows, Digital Unix, Acorn RiscOS
description: Analysis is a program that supports several types of text analysis: concordances, KWIC, KWOC, readability indexes, co-occurrences, lemmatization, sentence statistics, non-intelligent abstract, summary, meaningful and sense, incipit, explicit, and frequency. For many procedures it is possible to delimit the range or to compare the text with an electronic dictionary. It is provided with Help, online Help, and Wimp.
Note: This program's last version is from 2001.
AMALGrAM 2.0 |
program:
AMALGrAM 2.0
authors and distributors:
Nathan Schneider
documentation:
readme instructions
download:
source code to compile yourself
operating system(s): Mac OS and other Unix systems
description: AMALGrAM (A Machine Analyzer of Lexical Groupings And Meanings) analyzes English sentences for multiword expressions (MWEs) and noun and verb supersenses. It is an English supersense tagger written in Python.
AntConc 4.3.1 |
program:
AntConc 4.3.1
author:
Laurence Anthony
distributor:
Laurence Anthony
documentation:
Tutorials including videos, text materials available in English, Japanese, Korean, Arabic, and German.
download:
free version
operating system: MS-Windows, Mac OS-X, Linux
description: This is a free concordance program.
AntPConc 1.2.1 |
program:
AntPConc 1.2.1
author:
Laurence Anthony
distributor: Laurence Anthony
documentation:
help file
download:
free version for Windows, Mac OS, and Linux
operating system(s): MS-Windows, Mac OS, Linux
description: A freeware parallel corpus analysis toolkit for concordancing and text analysis using UTF-8 encoded text files. Laurence Anthony has developed a large set of tools.
AskSam 7 |
program:
AskSam 7
author:
Ask Sam Software Development
distributor:
Ask Sam Software Development
documentation: none
download:
trial version
operating system: MS-Windows, Mac OS-X, iOS
description: AskSam is a fast information retrieval program and allows searching in E-mails and PDF-files. The new professional version allows programming (e.g. with Visual Basic).
Note: The last version was published in 2012.
Casual Conc 4.0.1 |
program:
Casual Conc 4.0.1
author:
Yasu Imao, Osaka University
distributor: Yasu Imao, Osaka University
documentation:
manual
download:
free version for Windows, Mac OS, and Linux
operating system(s): Mac OS 12.3 or later, different versions for older Mac OS versions are available
description: CasualConc is a concordance program that runs natively on macOS 11.3 or later. The original version (pre version 1.0) was designed for casual use (preliminary analysis or non-research purposes), so the name is CasualConc. The current version (3.0.x) is probably good enough for more extensive use. It can generate KWIC concordance lines, word clusters, collocation analysis, and word count. This program is only tested with English text just because that is the only language I can understand other than Japanese (though I heard from some people that CasualConc works ok with other European languages, such as Greek, Italian, etc.). Technically, it should be able to handle any language macOS can. If you use CasualConc with languages other than English, let me know how well it works.
CATMA 7 |
program:
CATMA 7 (Computer Aided Textual Markup & Analysis)
author:
Jan Christoph Meister, University of Hamburg, Germany
distributor:
University of Hamburg, Department of Languages, Germany
documentation:
Compact manual
download:
from GitHub, including source code
operating system(s): MS-Windows, Mac OS-X 10.6 or newer
description: CATMA is released under the GNU general public license v3. The newest CATMA version is implemented as a web application. The development of CATMA was inspired by TACT (short for Textual Analysis Computing Tools), a DOS based tool set for textual analysis created at Toronto University.
Clic 2.1.2 |
program:
Clic 2.1.2
author: CLiC is a collaborative project between the University of Birmingham and the University of Nottingham. (Arts and Humanities Research Council grant reference AH/P504634/1)
distributor: Yasu Imao, Osaka University
documentation:
user guide
download: none, works with a browser
operating system(s): web application
description: The CLiC web app has been developed as part of the CLiC Dickens project, which demonstrates through corpus stylistics how computer-assisted methods can be used to study literary texts and lead to new insights into how readers perceive fictional characters.
Collocate 2.0 |
program:
Collocate 2.0
author:
Michael Barlow
distributor:
Athelstan
documentation: included in the test version file
download:
demo not found. The demo processes data in the same manner as the full version, but the results are limited to the top 5 items.
operating system(s): MS-Windows
description: Collocate is a new software program that can be used to find collocations or terms in a corpus. There are three main components:
- Search for a word (or phrase) within a set span (e.g. 4 words). The program lists all the collocations containing the search word and provides frequency and/or statistical information (Log Likelihood, Mutual Information).
- produce a corpus for list, n-gram, extract collocations
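The span-based collocation search described above can be illustrated with a minimal Python sketch. This is not Collocate's implementation: the function name `collocates`, the whitespace tokenization, and the particular Mutual Information variant (observed co-occurrences divided by the count expected under independence, scaled by the window size) are assumptions made for illustration only.

```python
import math
from collections import Counter


def collocates(tokens, node, span=4):
    """Return {word: (co-occurrence count, MI score)} for words found
    within +/- `span` tokens of each occurrence of `node`.

    Hypothetical helper; the MI variant used here is
    log2(observed / expected), with
    expected = f(node) * f(word) * window_size / N.
    """
    freq = Counter(tokens)
    n = len(tokens)
    co = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Window boundaries, clipped to the text.
        lo = max(0, i - span)
        hi = min(n, i + span + 1)
        for j in range(lo, hi):
            if j != i:
                co[tokens[j]] += 1
    scores = {}
    window = 2 * span
    for word, observed in co.items():
        expected = freq[node] * freq[word] * window / n
        scores[word] = (observed, math.log2(observed / expected))
    return scores


tokens = "strong tea and strong coffee but weak tea".split()
scores = collocates(tokens, "strong", span=1)
for word, (count, mi) in sorted(scores.items()):
    print(word, count, round(mi, 2))
```

On a real corpus a minimum frequency threshold is usually applied as well, because MI tends to overrate rare words.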
Collogram 1.0.2 |
program:
Collogram 1.0.2
author and distributor:
Dongkwang Shin
documentation:
documentation (in Korean)
download:
download
operating system(s): unclear
description: The analyses of MWUs in the existing MWU programs have often been based on the repetition of ‘N-gram’ patterns rather than a specific MWU list. In comparison, ColloGram, whose name combines Collocation with N-gram or Program, bases its analysis on a Multiword Unit (MWU) list from the Corpus of Contemporary American English (COCA), which so far (1990-2015) comprises about 500 million words.
ConcGram 1.0.2 |
program:
ConcGram 1.0.2
author and distributor:
Chris Greaves, The Hong Kong Polytechnic University
documentation:
user manual
download:
download as a ZIP-file also
source code
operating system(s): MS Windows
description: ConcGramCore users can also select the desired segmentation method. A simple segmentation method separates English words by punctuation, white space, and paragraph marks. ConcGramCore also utilises the Stanford Part-of-Speech Tagger engine for more accurate segmentation and for compatibility with other languages such as Arabic and Chinese (with modifications to the code). ConcGramCore processes corpora in batch. The output is saved automatically.
CorefAnnotator 2.1.1 |
program:
CorefAnnotator 2.1.1
author and distributor: Nils Reiter
documentation: none
download:
different versions
operating system(s): written in Java (version 8 required), source code provided
description: This is an annotation tool for coreference. It's built on top of Apache's UIMA, and works with long documents and long coreference chains.
corpkit |
program:
corpkit
author:
Daniel McDonald
distributor: Daniel McDonald
documentation:
manual
download:
free version for Windows, Mac OS, and Linux
operating system(s): Windows (running Python script), Mac OS, Linux
description: corpkit is a tool for doing corpus linguistics. It does a lot of the usual things, like parsing, concordancing and keywording, but also extends their potential significantly: you can concordance by searching for combinations of lexical and grammatical features, and can do keywording of lemmas, of subcorpora compared to corpora, or of words in certain positions within clauses. Corpus interrogations can be quickly edited and visualised in complex ways, or saved and loaded within projects, or exported to formats that can be handled by other tools.
Corpus Explorer 2.0 |
program:
Corpus Explorer 2.0
author and distributor:
Jan Oliver Rüdiger
documentation:
online manual, the link to the PDF is dead
download:
download
operating system(s): MS-Windows
description: The website is in German. CE allows the processing of huge corpora, has an integrated web crawler, and generates frequencies, n-grams, KWICs, co-occurrences, phrases, distributions, etc. An SDK for .NET languages is available.
Corpus Presenter 2025 |
program:
Corpus Presenter 2025
author:
Raymond Hickey
distributor: Raymond Hickey
documentation:
manual
download:
full and free version
operating system(s): WinXP and newer
description: Corpus Presenter is a suite of programs designed to work with both existing corpora and any files which users might wish to examine for linguistically interesting structures. It has all the options of standard corpus software, i.e. it can generate concordances, word lists and perform a whole range of text retrieval tasks and generate reverse dictionaries of words in texts. It does not require that texts are prepared in any way, e.g. by indexing them in advance.
Corpus Tools |
program:
Corpus Tools
author and distributor:
Kaspar Welbers
documentation:
documentation
download:
download
operating system(s): requires R
description: The corpustools package offers various tools for analyzing text corpora. The backbone is the tCorpus R6 class, which offers features ranging from corpus management tools such as pre-processing, subsetting, Boolean (Lucene) queries and deduplication, to analysis techniques such as corpus comparison, document comparison, semantic network analysis and topic modeling. Furthermore, by using tokenized texts as the backbone, it is made easy to reconstruct texts for a qualitative analysis and/or validation of the results of computational text analysis methods (e.g., topic browsers, keyword-in-context lists, texts with highlighted segments for search results or document comparisons).
DART 3.0 |
program:
DART 3.0 (The Dialogue Annotation and Research Tool)
author: Martin Weisser
distributor: Martin Weisser
documentation:
manual
download:
MS Windows 64 bit version
operating system(s): MS Windows
description: The Dialogue Annotation and Research Tool is an annotation tool and linguistic research environment that not only makes it possible to annotate large numbers of dialogues automatically, but also provides facilities for pre- and post-editing dialogue data, as well as conducting different types of analysis on annotated and un-annotated data in order to improve the annotation process. It uses its own speech act taxonomy.
DocuScope 3.1.0 |
program:
DocuScope 3.1.0
authors and distributors:
David Kaufer and
Suguru Ishizaki
documentation:
Corpora and Rhetorically Informed Text Analysis - The diverse applications of DocuScope
download:
download
operating system(s): MS Windows, Mac OS
description: DocuScope™ is a dictionary-based text analysis and visualization tool that supports the rhetorical analysis of texts from both a quantitative and qualitative perspective using a home-grown dictionary.
ELAN 6.9 |
program:
Elan 6.9
author: unknown
distributor:
Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
documentation:
manual
download:
free version for Windows, Mac OS, and Linux
operating system(s): MS-Windows, Mac OS-X
description: ELAN is free video annotation software. The source code is also available. There is a special version for Apple's M-processors. An annotation can be a sentence, word or gloss, a comment, translation or a description of any feature observed in the media. Annotations can be created on multiple layers, called tiers. Tiers can be hierarchically interconnected. An annotation can either be time-aligned to the media or it can refer to other existing annotations. The content of annotations consists of Unicode text, and annotation documents are stored in an XML format (EAF).
IMS Open Corpus Workbench (CWB) 3.5 |
program:
IMS Open Corpus Workbench (CWB) 3.5
author:
Stephanie Evert
distributor: IMS Stuttgart, Germany
documentation:
set of manuals
download:
free version for Windows, Mac OS, and Linux
operating system(s): Windows (running Python script), Mac OS, Linux
description: The IMS Open Corpus Workbench (CWB) is a collection of open-source tools for managing and querying large text corpora (up to 2 billion words) with linguistic annotations. Its central component is the flexible and efficient query processor CQP.
kwords |
program:
kwords
authors: Václav Horký, Pavel Vondricka, Václav Cvrcek
distributor: Václav Horký, Pavel Vondricka, Václav Cvrcek
documentation:
documentation
download: none, online use
operating system(s): works with a browser
description: The KWords application is used for text analysis and identification of keywords, i.e. units that have an unexpectedly high relative frequency in the target text compared to the reference corpus. Compared to the first version, the second one has a number of improvements: it can analyse not only word forms but also lemmata and other units, it works with more than 30 languages, and it also supports keymorph analysis (see Fidler & Cvrcek 2019).
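To make the idea of keywords concrete, here is a minimal Python sketch that ranks words by how much more frequent they are in a target text than in a reference corpus. It uses the log-likelihood statistic, which is a common choice in corpus linguistics; whether and how KWords itself uses this measure is not stated here, and the function name `keywords` and the `min_freq` threshold are hypothetical.

```python
import math
from collections import Counter


def keywords(target_tokens, reference_tokens, min_freq=2):
    """Rank words that are relatively more frequent in the target text.

    Hypothetical helper using the log-likelihood statistic:
    LL = 2 * sum(observed * ln(observed / expected)).
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = len(target_tokens)
    n_reference = len(reference_tokens)
    total = n_target + n_reference
    results = []
    for word, a in target.items():
        if a < min_freq:
            continue
        b = reference.get(word, 0)
        # Expected counts if the word were equally likely in both texts.
        e_target = n_target * (a + b) / total
        e_reference = n_reference * (a + b) / total
        ll = 2 * (a * math.log(a / e_target)
                  + (b * math.log(b / e_reference) if b else 0.0))
        # Keep only words that are over-represented in the target.
        if a / n_target > b / n_reference:
            results.append((word, ll))
    return sorted(results, key=lambda item: -item[1])


target = ["whale"] * 5 + ["the"] * 5
reference = ["the"] * 50 + ["sea"] * 50
result = keywords(target, reference)
print(result)
```

In this toy example "the" has the same relative frequency in both texts and is therefore not reported, while "whale" is absent from the reference and ranks first.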
KWIC Concordance 5.0 |
program:
KWIC Concordance 5.0
author: Satoru Tsukamoto, formerly at the College of Humanities and Sciences, English Department, Nihon University, Japan
distributor: several download sites
documentation: none
download:
free
operating system(s): MS-Windows, Mac OS
description: The KWIC Concordance is a corpus analytical tool for making word frequency lists, concordances and collocation tables by using electronic files. This program offers the capability of handling markup schemes, such as COCOA, SGML, the Helsinki corpus, the Penn-Helsinki Parsed Corpus of Middle English (Phase 1) (Phase 2) etc. This is freeware software.
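For readers unfamiliar with the KWIC (keyword in context) format, the following Python sketch shows what such a concordance contains. It is a generic illustration and not the code of the KWIC Concordance program; the function name `kwic`, the `\w+` tokenization, and the context width are assumptions.

```python
import re


def kwic(text, pattern, width=3):
    """Return (left context, keyword, right context) triples for every
    token that fully matches the regular expression `pattern`."""
    tokens = re.findall(r"\w+", text)
    rx = re.compile(pattern, re.IGNORECASE)
    lines = []
    for i, tok in enumerate(tokens):
        if rx.fullmatch(tok):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


text = "The whale swam. A white whale sank the ship."
lines = kwic(text, "whale", width=2)
for left, key, right in lines:
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>10}  {key}  {right}")
```

Aligning the keyword in a fixed column is what makes recurring patterns to its left and right easy to spot.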
LancsBox X 5.0.3 |
program:
LancsBox X 5.0.3
author and distributor:
Vaclav Brezina (lead), William Platt (developer), Emil Tangham (online support).
documentation:
manual
download:
different versions
operating system(s): MS Windows, Mac OS, Linux
description: LancsBox X is a powerful tool for the analysis of language: millions and billions of words. Personal note: Everything is on one web page.
Langsoft Text Analysis Software |
program:
Langsoft Text Analysis software
author: Hristo Georgiev
distributor:
Langsoft
documentation: none
download:
Trial version for Windows and Linux/MS-DOS
operating system(s): MS-Windows, Linux, MS-DOS
description: Langsoft offers software for parsing, spelling, machine translation, questioning and thesauri. The parsing program handles texts in English, French and German, the spelling program supports Italian also. The machine translation program is for English - German (both directions). English, German, French and Spanish are supported for the thesaurus program.
MMAX 2 |
program:
MMAX 2
author and distributor:
Mark-Christoph Müller
documentation:
Quick Start guide
download:
download on GitHub
operating system(s): written in Java, OS independent
description: MMAX is an annotation tool written in Java.
Orange 3.38.1 |
program:
Orange 3.38.1
authors and distributors:
Bioinformatics Laboratory, Faculty of Computer and Information Science, University of Ljubljana, Slovenia
documentation:
Getting started and video tutorials
download:
Orange 3.38.1
operating system(s): MS Windows, Mac OS, free software
description: Orange is a comprehensive, component-based software suite for machine learning and data mining, developed at the Bioinformatics Laboratory, Faculty of Computer and Information Science, University of Ljubljana, Slovenia, together with the open source community.
ParaConc |
program:
ParaConc
author: Michael Barlow
distributor: Michael Barlow
documentation:
short paper, the link is dead
download:
demo version
operating system(s): MS-Windows, nothing is specified on the web page
Pareidoscope 0.11.0 |
program:
Pareidoscope 0.11.0
author and distributor:
Thomas Proisl
documentation:
read me file
download:
different versions
operating system(s): written in Python
description: The Pareidoscope is a collection of tools for determining the association between arbitrary linguistic structures, e.g. between words (collocations), between words and structures (collostructions) or between structures.
Praaline 0.9.0701 |
program:
Praaline 0.9.0701
author and distributor:
George Christodoulides
documentation:
none yet
download:
download on GitHub open source
operating system(s): MS Windows, Mac OS, Linux, different versions
description: Praaline is a system for managing, annotating, visualising and analysing spoken language corpora.
Profiler Plus 5.8.4 |
program:
Profiler Plus 5.8.4
author: Michael D. Young
distributor:
Social Science Automation
documentation: none
download: free version for unfunded academic research. You have to create an account.
operating system(s): MS-Windows
description: A general purpose content analysis engine designed for leadership analysis. Profiler+ searches a sentence from left to right for ordered sets of tokens (words and/or punctuation) that have been identified as indicators of a trait, of another measure of interest or perhaps of a particular type of communication. Profiler+ examines each token in turn and queries a database to determine if the token serves as the anchor for any target sets. If the token does serve as an anchor in one or more target sets the program determines if the other tokens in the set are also present in the sentence in the appropriate order. If all the tokens in a set can be matched then the indicated actions are taken - in the simplest case a code is written to a file. Any remaining target sets that have not been eliminated are ignored.
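The matching procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Profiler Plus code: it assumes that the anchor is the first token of each target set, that the remaining tokens must follow the anchor in order but need not be adjacent, and that the "action" is simply recording a code. The function name `match_target_sets` and the sample codes are hypothetical.

```python
def match_target_sets(tokens, target_sets):
    """Scan `tokens` left to right and return (position, code) pairs.

    `target_sets` is a list of (pattern, code) pairs, where `pattern`
    is a tuple of tokens and pattern[0] is assumed to be the anchor.
    """
    # Index the target sets by anchor token for quick lookup.
    by_anchor = {}
    for pattern, code in target_sets:
        by_anchor.setdefault(pattern[0], []).append((pattern, code))

    codes = []
    for i, tok in enumerate(tokens):
        for pattern, code in by_anchor.get(tok, []):
            pos = i + 1
            matched = True
            # The other tokens must occur after the anchor, in order.
            for wanted in pattern[1:]:
                try:
                    pos = tokens.index(wanted, pos) + 1
                except ValueError:
                    matched = False
                    break
            if matched:
                codes.append((i, code))
    return codes


sentence = "we will not attack them".split()
target_sets = [
    (("will", "attack"), "THREAT"),    # hypothetical code
    (("not", "retreat"), "RESOLVE"),   # hypothetical code
]
codes = match_target_sets(sentence, target_sets)
print(codes)
```

Here only the first target set matches, because "retreat" never follows "not" in the sentence.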
PyXMLConc |
program:
PyXMLConc
author: Ingo Kl
distributor: Ingo Kl
documentation:
manual
download:
free version written in Python
operating system(s): Windows (running Python script), Mac OS, Linux
description: PyXMLConc is a very simple concordancer. It is supposed to be used in exploratory analysis of XML-annotated corpora. Its primary feature lies in the automatic detection of XML tags and attributes. The search/concordancing function supports regular expressions.
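The two features mentioned here, detecting the tags and attributes present in an XML-annotated corpus and searching element text with regular expressions, can be sketched with the Python standard library. This is not PyXMLConc's own code; the function names `tag_inventory` and `search_elements` and the sample markup are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def tag_inventory(xml_string):
    """Map every tag found in the document to its sorted attribute names."""
    root = ET.fromstring(xml_string)
    inventory = defaultdict(set)
    for element in root.iter():
        inventory[element.tag].update(element.attrib)
    return {tag: sorted(attrs) for tag, attrs in inventory.items()}


def search_elements(xml_string, tag, pattern):
    """Return (attributes, text) for `tag` elements whose text matches
    the regular expression `pattern`."""
    root = ET.fromstring(xml_string)
    rx = re.compile(pattern)
    return [
        (dict(element.attrib), element.text)
        for element in root.iter(tag)
        if element.text and rx.search(element.text)
    ]


sample = '<text><w pos="NN">house</w><w pos="VB" lemma="go">went</w><s n="1"/></text>'
inventory = tag_inventory(sample)
hits = search_elements(sample, "w", r"^w")
print(inventory)
print(hits)
```

Building the inventory first lets a user see which tags and attributes are available before writing a query.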
RSTTools 3.0 |
program:
RSTTools 3.0
author and distributor:
Michael O'Donnell, WagSoft Linguistic Software
documentation:
online manual
download:
download version for MS Windows, Mac OS, and Linux
operating system(s): MS Windows, Mac OS, Linux, requires Tcl/Tk, free software
description: The RST Tool is a graphical interface for marking up the structure of text. While primarily intended to be used for Rhetorical Structure (cf. Rhetorical Structure Theory (RST): Mann & Thompson 1988), the tool also allows the mark-up of constituency-style analysis, as in the Generic Structure Potential (GSP - cf. Hasan 1984; Halliday & Hasan 1985).
SATO 4.5 |
program:
SATO 4.5
author:
François Daoust
distributor:
University of Montreal, Canada
documentation:
manual
download:
test
operating system: DOS
description: SATO allows the annotation of multilingual documents. It offers a query language for systematically locating user-defined text segments; index production; word lists sorted alphabetically or by frequency; the categorisation of words, compounds, or phrases; the definition of variables for multiple counts and lexicometric analyses; dictionary functions; if necessary, devices for morphological derivation; and a readability index (Gunning).
Semato 3.0 |
program:
Semato 3.0
author:
Pierre Plante, Lucie Dumas and André Plante
distributor:
University of Montreal, Canada
documentation:
online
download: no longer available
operating system: runs as a web service
description: The whole web site is in French, there is no English version available. (C'est Quebec..) Semato is a program that allows the use of quantitative, qualitative, and mixed models.
SCP 5.0.9 - Simple Concordance Program |
program:
SCP 5.0.9 - Simple Concordance Program
author/distributor: Alan Reed
download:
free software
documentation:
help file as a PDF-file
operating systems: MS-Windows, Mac OS-X, Linux Ubuntu
description: This free program lets you create word lists and search natural language text files for words, phrases, and patterns. SCP is a concordance and word listing program that is able to read texts written in many languages. There are built-in alphabets for English, French, German, Greek, Russian, etc. SCP contains an alphabet editor which you can use to create alphabets for any other language.
Sketch Engine |
program:
Sketch Engine
author: Lexical Computing CZ s.r.o
distributor: Lexical Computing CZ s.r.o
documentation:
user guide
download:
trial version for 30 days
operating system(s): runs using a browser
description: Sketch Engine is the ultimate tool to explore how language works. Its algorithms analyze authentic texts of billions of words (text corpora) to identify instantly what is typical in language and what is rare, unusual or emerging usage. It is also designed for text analysis or text mining applications
SPPAS |
program:
SPPAS
author and distributor:
Brigitte Bigi
documentation:
book
download:
SPPAS is a free and open source software package protected by public licenses, in French and English.
operating system(s): MS Windows, Mac OS, Linux, open source written in Python
description: SPPAS can automatically produce speech annotations from recorded speech and its orthographic transcription. Version 5 is expected to be released in late 2025 or early 2026.
stylo: R package for stylometric analyses 0.7.5 |
program:
stylo: R package for stylometric analyses 0.7.5
authors and distributors: Maciej Eder, Mike Kestemont, Jan Rybicki, Steffen Pielström
documentation:
First steps - How To
tutorial:
tutorial on Youtube
operating system(s): Stylo requires R
description: The website contains all necessary information on one page: tutorials, manuals, general information, and installation issues. This package provides a number of functions, supplemented by a GUI, to perform various analyses in the field of computational stylistics, authorship attribution, etc.
TEITOK |
program:
TEITOK
author and distributor:
Maarten Janssen
documentation:
several links to publications that use TEITOK
download:
download on GitLab
operating system(s): written in PHP/Javascript
description: TEITOK is a web-based platform for viewing, creating, and editing corpora with both rich textual mark-up and linguistic annotation, initially developed at the Centro de Linguística da Universidade de Lisboa, later at CELGA-ILTEC, and currently maintained at the ÚFAL institute of Charles University, Prague.
Textalyzer |
program:
Textalyzer
author: Bernhard Huber
distributor:
SEOScout
documentation: self explaining
download: none
operating system: runs on a web site
description: Textalyser is a free text analysis tool that counts words, sentences, and syllables and computes lexical density. It also computes the Gunning readability index. A small but nice tool that counts syllables correctly, at least for English, French, and German. You can cut and paste text or specify a web page.
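As an illustration of the Gunning readability index mentioned above, the following Python sketch computes the usual Gunning fog formula, 0.4 * (words per sentence + 100 * share of complex words), where a complex word has three or more syllables. It is not Textalyser's implementation: the syllable counter is a crude vowel-group heuristic for English, and the function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    """Gunning fog index: 0.4 * (words/sentences + 100 * complex/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))


text = "The cat sat. The extraordinary animal disappeared."
fog = gunning_fog(text)
print(round(fog, 2))
```

The vowel-group heuristic miscounts words with silent vowels (for example, a final "e"), so real tools typically add language-specific rules or a pronunciation dictionary.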
T-Lab 10.6 |
program:
T-Lab 10.6
author:
Franco Lancia
distributor:
T-lab
documentation:
in English, Italian, French and Spanish online.
download:
Test version (multilingual) and also the manual.
operating system: MS-Windows
description: T-LAB software is an all-in-one set of linguistic and statistical tools for text analysis which can be used in the following research fields: co-occurrence analysis, thematic analysis, comparative analysis, and lexical tools. Available versions are in English, French, Italian, German and Spanish. Currently the automatic lemmatization is available for the following languages: English, French, Italian, German, Spanish and Portuguese; moreover, without automatic lemmatization, T-LAB allows the analysis of texts in all languages supporting ASCII/ANSI format.
File size is limited to 30 MB; most analyses will not exceed this.
Textstat 3 |
program:
Textstat 3
author:
Matthias Hüning
distributor:
Matthias Hüning
documentation:
tutorial
download:
version 2.9 freeware, version 3 is beta and must be compiled by yourself
operating system: MS-Windows, Mac OS-X, Linux. needs Python
description: TextSTAT is a simple programme for the analysis of texts. It reads ASCII/ANSI texts and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. The programme is distributed as freeware. Source code in Python is also available for free. The user interface is provided in English, German, Dutch, Portuguese, Spanish, Catalan, French, Italian, Galician, Finnish (Suomi), Polish, or Czech.
Txm 0.8.4 |
program:
Txm 0.8.4
author:
Heiden, Serge and others
distributor: unclear
documentation:
manual
download:
free versions for different operating systems
operating system(s): Windows, Mac OS, Linux
description: All information is in French. The TXM platform combines powerful and original techniques for the analysis of structured and annotated text corpora by means of modular, open-source components.
UAM Corpus Tool 6.2j |
program:
UAM Corpus Tool 6.2j
author and distributor:
Mick O'Donnell
documentation:
manual
download:
version for MS Windows and Mac OS
operating system(s): MS Windows, Mac OS
description: The UAM CorpusTool is a state-of-the-art environment for annotation of text corpora. So, whether you are annotating a corpus as part of a linguistic study, or building a training set for use in statistical language processing, this is the tool for you.
Wmatrix 7 |
program:
Wmatrix 7
authors: P. Rayson and others
distributor: University of Lancaster, UK
tutorial:
documentation
download: none, browser application
operating system(s): runs in a browser
description: Wmatrix is a software tool for corpus analysis and comparison. It extends the keywords method to key grammatical categories and key semantic domains.
WordCruncher |
program:
WordCruncher
authors:
Brigham Young University
distributor:
Brigham Young University
documentation:
user guide
operating system(s): MS Windows, iPhone
description: WordCruncher offers phrase comparison, character usage, frequency distribution, vocabulary dispersion, and a phrase list creator.
Wordless 3.5.0 |
program:
Wordless 3.5.0
author: Ye Lei
distributor: Ye Lei
documentation:
documentation
download:
download section
operating system(s): MS-Windows, Mac OS, Linux
description: Computes frequencies of words, lexical density, dispersion and values of readability formulas. Wordless is an integrated corpus tool with multilingual support for the study of language, literature, and translation.
WordSmith 9.0 |
program:
WordSmith 9.0
author:
Mike Scott
distributor:
Mike Scott, Liverpool University
documentation:
manual in English, French, and German
download:
download page for different versions shows a sample of the results only
operating system: MS-Windows
description: WordSmith is the successor of MicroConcord.
WordStatix 1.2.0.0 |
program:
WordStatix 1.2.0.0
author: Massimo Nardello, Modena (Italy)
distributor: Massimo Nardello, Modena (Italy)
documentation:
manuals
download:
free versions for different operating systems including the source code
operating system(s): Windows (running Python script), Mac OS, Linux
description: WordStatix is free, multiplatform software for creating concordances, that is, lists of the words used within a document along with their recurrence and context. The document may be structured in chapters, numbers, or in any other way. The software can track specific words by prefix or suffix, skip meaningless words (like articles or prepositions) or numbers, create simple statistics on the recurrence of all or some words, possibly within the different sections of the document, and create three kinds of diagrams to visualize the statistical data in different ways.
WordWanderer |
program:
WordWanderer
author: Marian Dörk and Dawn Knight
distributor: Marian Dörk and Dawn Knight, Newcastle University, UK
download:
source code
operating system(s): runs within a browser, seems to be written in Javascript
description: We are experimenting with visual ways in which we can enhance people's engagement with language. By fusing the information we can obtain from corpus searches, concordance outputs and word clouds we are aiming to enable and encourage people to notice and wander through the words they read, write and speak.
Yedda |
program:
Yedda
author and distributor:
Jie Yang
documentation:
conference paper presented at ACL 2018
download:
download on GitHub
operating system(s): requires Python 3.6 or above
description: YEDDA (previously SUTDAnnotator) was developed for annotating chunks, entities, and events in text (almost all languages, including English and Chinese), symbols, and even emoji. It supports shortcut annotation, which makes annotating text by hand extremely efficient: the user only needs to select a text span and press a shortcut key, and the span is annotated automatically.