
Digital tools for research

Find information on digital tools to analyse and visualise data and text.

Introduction to data mining and text analysis

Data mining is the use of computational techniques to find patterns or relationships within large sets of organised or "structured" data.

Text analysis is similar to data mining, but uses large collections of text or "unstructured data" to identify patterns or connections.
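As a rough illustration of what "identifying patterns" in unstructured text can look like, the sketch below counts how many documents each word appears in and lists the words that recur. The three sample sentences and the names `documents`, `tokenize`, `doc_freq` and `recurring` are hypothetical, chosen only for this example; real projects work with far larger collections and usually rely on dedicated tools.

```python
import re
from collections import Counter

# Hypothetical sample corpus; a real project would load many documents.
documents = [
    "Copyright law protects creative works.",
    "Researchers must check copyright and licence terms.",
    "Licence terms may restrict text mining.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Count how many documents each word appears in (document frequency).
doc_freq = Counter()
for doc in documents:
    doc_freq.update(set(tokenize(doc)))

# Words that occur in two or more documents hint at shared themes.
recurring = sorted(word for word, n in doc_freq.items() if n >= 2)
print(recurring)  # ['copyright', 'licence', 'terms']
```

Counting document frequency rather than raw word totals is just one simple design choice; it stops a single long document from dominating the result.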

For more detail, see the definitions set out by the Australian Law Reform Commission.


How does Text Mining Work? (1:34 mins) by Elsevier (YouTube)

Key considerations

There are a number of factors to be aware of when conducting data mining and text analysis. 

See the ARDC guide below for an overview, and check the sections that follow for questions of Copyright, Licensing and Permissions.


Copyright

There is no Australian copyright exemption for text and data analysis, as explained in this Australian Law Reform Commission discussion paper. Even publicly accessible arrangements of datasets are still protected by copyright and may require permission for use in a text analysis or data mining project. See the ARDC guide linked above for details on copyright subsisting in data.


Licensing

Data and database publishers vary widely in the degree to which they permit text and data mining of their collections. First consult the licence shown in the LibrarySearch record for the database.

If the Show License option does not appear, or if the information does not mention data mining, contact the Library Research support team.

Websites and social media platforms have terms of service which may include clauses around data mining and text analysis. Check the website terms of service or terms of use to determine what is allowed for the site you intend to use.

The Australian Research Data Commons has two easy-to-follow flowcharts that illustrate the licensing process.


Permissions

For some data, you may need special permission from the rights holder before you analyse it.

Be aware that if you are granted permission to use data for your research, this may not extend to use for publication. It is easier to seek permission for all uses of the data upfront.

For tips on seeking permission, please see the Copyright guide section for researchers.


Ethics

Even when access is permitted, researchers performing text and data mining must respect the rights of the content owners and abide by the terms of access. Researchers also need to respect the privacy of research subjects, and be aware that data mining may reveal confidential details. Information on the responsibilities of researchers can be found on this page on Academic Integrity.

Web-based tools

Voyant

Orange

Jupyter

Leximancer

Coding tools and tool indexes

Python

R and RStudio

Visual data analysis

Tool collections and indexes

Data sources

Subscribed data sources

Access to ProQuest TDM Studio

Log in using an existing ProQuest profile account if you have one.

Log in using an existing My Research account or TDM Studio Visualization account if you've registered before.

Otherwise, create a password for your account as follows:

  • Go to the ProQuest TDM Studio home page
  • Click on the 'Log in to TDM Studio' button
  • Use the 'Forgot Your Password' option to create a password
  • Log in using your email address and your new password

Open data sources