A list of text mining methods (word frequencies, machine learning, network and citation analysis, and visualizations) with definitions, associated tools, and project examples.
View the Resources & Support tab of this guide to find a list of trainings, tutorials and books that can assist with using these tools.
Links to and brief descriptions of tools for producing digital scholarship, including tools for data organization and optimization, data visualization, and GIS/mapping.
Access tools developed by the Stanford Natural Language Processing Group (NLPG), such as Stanford CoreNLP; a coreference resolution system; a high-speed, high-performance neural network dependency parser; a part-of-speech tagger; a named entity recognizer; and algorithms for processing Arabic, Chinese, French, German, and Spanish text.
The GDELT Project monitors the world's broadcast, print, and web news from nearly every country in over 100 languages. It identifies the people, locations, organizations, themes, sources, emotions, counts, quotes, images, and events driving our global society, creating a free, open platform for computing.
Nearly 14 million books from the HathiTrust Library are currently available for analysis, with various levels of immediate access. Check the HTRC tab on this guide for more information.
Lexos is a web-based tool from Wheaton College to help you explore your favorite corpus of digitized texts. Local installations for Windows, Mac, and Linux are also available.
Media Cloud is an open-source platform for studying media ecosystems. It is a joint project of the MIT Center for Civic Media and the Berkman Klein Center for Internet & Society at Harvard University.
A project by the University of Illinois Technology Services and the National Center for Supercomputing Applications (NCSA) that aims to make social media data, analytics, and visualization tools accessible to researchers and students of all levels of expertise. Create a free account to access the tools.