Universiteit Leiden


Tools & software

This page offers a selection of the most common tools and software for digital research.

General Directories (including TDM software & tools)

  • DiRT (Digital Research Tools)
    The DiRT Directory is a registry of digital research tools for scholarly use.
    Resources range from content management systems to music OCR, statistical analysis packages to mindmapping software.
    The DiRT Directory is supported by the Andrew W. Mellon Foundation.

  • PORT (Postgraduate Online Research Training)
    PORT is the public research training platform from the School of Advanced Study of the University of London. It contains a variety of training resources tailored toward postgraduate study in the Humanities. Most of the training resources are free.
    Tools included in the resource on Quantitative Methods are: semantic data, text mining, visualisation, linked data, cloud computing. For each tool a series of case studies has been provided alongside a tool audit. Free login.

  • TaPOR
    Text Analysis Portal for Research

Tools & software

  • Textpresso    
    Information extraction and processing package for biological and biomedical literature.
    Textpresso is part of WormBase at the California Institute of Technology and is supported by a grant from the National Human Genome Research Institute at the US National Institutes of Health.

  • GATE (General Architecture for Text Engineering)
    Developed by the University of Sheffield

  • Ontotext
    Provides tools for text mining, semantic annotation, data integration, and semantic curation

  • WMatrix
    Leiden University campus license 

Parsers

  • PDFMiner: Python PDF parser and analyzer
    Tool for extracting information from PDF documents.
    Includes a PDF converter that can transform PDF files into other text formats.

  • Stanford parser
    Statistical parser

  • Alpino
    Dependency parser for Dutch, developed in the context of the PIONIER Project Algorithms for Linguistic Processing.

A selection of the most commonly used open-source and Leiden University-licensed tools & software

Popular programming languages used for TDM

  • Python
    Widely used general-purpose programming language. Has a large standard library providing tools for data analysis and data modelling. An introduction to the basic concepts and features of the Python language and system can be found in the Python Tutorial.

  • Perl
    Includes powerful tools for processing text that make it ideal for working with HTML, XML, and all other mark-up and natural languages.

  • Sharing code on GitHub
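
As a small illustration of the kind of text processing these languages are used for in TDM, the following is a minimal sketch in Python that counts word frequencies using only the standard library. The function name `word_frequencies`, the simple letters-only tokenisation rule and the sample sentence are assumptions made for this example; they are not taken from any of the tools listed on this page.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent word tokens in text as (word, count) pairs."""
    # Lowercase the text and keep runs of the letters a-z as tokens.
    # This is a deliberately simple rule; real corpora usually need
    # language-aware tokenisation.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens).most_common(top_n)


# Hypothetical sample text, for illustration only.
sample = "Text mining turns text into data. Data mining turns data into insight."
print(word_frequencies(sample, top_n=3))
```

For larger projects, dedicated packages such as those listed elsewhere on this page are usually a better starting point than hand-written counting.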


Quantitative data analysis software
 

  • R
    For statistical computing and graphics

  • Mallet
    Java-based package for statistical natural language processing, document classification, clustering, topic modelling, information extraction, and other machine learning applications to text.

  • WinStats

  • SPSS


Qualitative data analysis software

  • ATLAS.ti
    Tool for data analysis and management.
    Tutorial by the University Library of the University of Illinois at Urbana-Champaign
    Leiden University campus license


Data cleaning


OCR


Visualization

  • Textexture
    For visualizing text as a network

  • Gephi 
    For network analysis and visualization
    Introduction to Network Visualization with Gephi by Martin Grandjean, University of Lausanne

  • QGIS

  • Tableau public
    Free-to-use version of the commercial data analysis and visualisation software Tableau Desktop. Makes interactive charts, graphs and maps from your data.

  • OpenHeatMap
    Data can be used to make static and interactive animated maps. 
    By using spreadsheets from Excel or Google Docs you can map any dataset that is linked to an array of locations such as IP addresses, street addresses and longitude and latitude coordinates.

  • Google Fusion Tables
    For making charts, maps and network graphs.