LAC Session Type
Paper
Name
Navigating the forest with tree diagrams to visualize library research support
Description

Purpose and goals

Libraries use data visualization to display bibliographic and library collection data, often selecting a simple bar chart or line graph to represent this data. This study uses the fan-chord diagram, one of several visualizations from the tree family, to examine the research interests and journal needs of faculty, post-doctoral scholars, graduate students, and research scientists at The Ohio State University. Tree visualizations effectively illustrate branching knowledge and can display the interrelationships among grant funding, journal usage, research publications, and more. Filtered by local taxonomies, the fan-chord diagrams created for this project show that, for various disciplines, the top journals in which researchers choose to publish their results are misaligned with the journal titles they reference. This misalignment is also present for journal titles citing the original published research. Used in tandem with more common visualizations, tree diagrams provide libraries with a more holistic view of journal use, helping to answer the simple question “to what extent does our collection meet our author’s research needs?” (Morris 2022)

Design, methodology, or approach

A list of NIH (National Institutes of Health) funded projects of Ohio State University researchers was downloaded from the NIH RePORTER, filtering for FY2010 to FY2022. Associated research publications published between 2018 and 2022 were then downloaded from the same database. Lists of referenced publications and citing publications were next generated for each research publication using the sample Python script provided for the NIH iCite database API. A second Python script was then written to gather the journal titles, journal abbreviations, authors, author affiliations, publication years, and major medical subject headings (MeSH) for each PMID on the research publications, referenced publications, and citing publications lists. A third script normalized author affiliations by assigning Scopus affiliation identifiers. Using Tableau Prep, unique starting points (nodes) were then assigned to journal titles on the research publications list, and end points (nodes) were assigned to journals on the referenced publications and citing publications lists. All lists were then related in the Tableau data source window, and directory-level data for local authors was added to facilitate meaningful filtering. Last, to create curved lines between starting and ending points on the fan-chord diagrams, a simple Excel file with 100 points was added to the data model.
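
The node assignment and the 100-point file described above can be illustrated with a minimal Python sketch. It is not the project's actual scripts or Tableau Prep flow: the record layout (a `pmid` plus a list of linked PMIDs under `references`), the function names `build_edges` and `densify`, and the toy journal titles are all assumptions made for illustration.

```python
from collections import Counter


def build_edges(research_pubs, journal_by_pmid, link_field="references"):
    """Count (start journal, end journal) pairs.

    research_pubs: list of dicts holding a 'pmid' and a list of linked PMIDs
        under `link_field` (hypothetical layout, for illustration only).
    journal_by_pmid: dict mapping a PMID to its journal title.
    """
    edges = Counter()
    for pub in research_pubs:
        start = journal_by_pmid.get(pub["pmid"])
        if start is None:
            continue  # no journal metadata for this research publication
        for linked_pmid in pub.get(link_field, []):
            end = journal_by_pmid.get(linked_pmid)
            if end is not None:
                edges[(start, end)] += 1
    return edges


def densify(edges, n_points=100):
    """Cross-join every edge with n_points path positions t in [0, 1].

    This plays the role of the 100-point helper file: each edge gets one
    row per position so a curve can later be drawn through the rows.
    """
    rows = []
    for (start, end), weight in edges.items():
        for i in range(n_points):
            rows.append(
                {"start": start, "end": end, "weight": weight,
                 "t": i / (n_points - 1)}
            )
    return rows


# Toy data with placeholder titles (not real results).
pubs = [
    {"pmid": 1, "references": [10, 11]},
    {"pmid": 2, "references": [10]},
]
journals = {1: "Journal A", 2: "Journal A", 10: "Journal B", 11: "Journal C"}

edges = build_edges(pubs, journals)
rows = densify(edges)
```

Passing `link_field="cited_by"` would build the citing-publication edges in the same way, assuming the records carry a list under that key.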

Findings

The resulting visualizations provide a colorful, interactive profile of journal usage for NIH-funded researchers in each academic unit at The Ohio State University. Filters allow users to choose an academic unit to display and up to 50 branches showing the top journals Ohio State researchers use to reference papers or the top journals citing Ohio State research. The top MeSH terms assigned to the referenced papers and the citing papers display in word clouds. Text under the department name summarizes the amount of NIH funding awarded to researchers affiliated with the unit, the number of NIH projects, the number of research publications associated with the funding, and the number of journals publishing the Ohio State authored research. A bar chart shows the top journals publishing NIH-funded research authored by researchers in the selected unit, and lollipop charts show the top journals referenced by these researchers and the top journals citing this research. When filtered, the dashboards clearly indicate that, for various disciplines, the top journals publishing Ohio State researchers' results are not in alignment with the journal titles Ohio State researchers reference or the journal titles citing this research.
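
One simple way to put a number on this kind of alignment is the share of a unit's top publishing journals that also appear among its top referenced (or citing) journals. The Python sketch below is only an illustration of that idea; the study reports the misalignment visually, and the helper names `top_titles` and `overlap_share`, as well as the placeholder counts, are assumptions rather than the author's method or data.

```python
from collections import Counter


def top_titles(journal_counts, n=50):
    """Return the n most frequent journal titles, most frequent first."""
    return [title for title, _ in Counter(journal_counts).most_common(n)]


def overlap_share(publishing_top, other_top):
    """Share of top publishing journals that also appear in the other list."""
    publishing = set(publishing_top)
    if not publishing:
        return 0.0
    return len(publishing & set(other_top)) / len(publishing)


# Placeholder counts for one hypothetical academic unit (not real results).
published_in = {"Journal A": 12, "Journal B": 7, "Journal C": 3}
referenced = {"Journal D": 40, "Journal A": 25, "Journal E": 9}

share = overlap_share(top_titles(published_in, 3), top_titles(referenced, 3))
```

A share near 1 would suggest that researchers publish in the same journals they read, while a low share corresponds to the misalignment the dashboards display.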

Action & Impact

Present-day data visualization is grounded in mathematics, and the base formulas used to construct curves for fan-chord diagrams and other tree family visualizations may be applied to other, more advanced charts. Code libraries are available to create these visuals in Python and R, but these tools require time and effort to master, and additional skills and knowledge to embed interactivity. Instructions documenting the steps and calculations required to build tree and chord diagrams in Tableau are freely available, and once built, these advanced visualizations can be combined with other visualizations to present a more coherent picture. Using raw data to develop local Tableau dashboards allows academic libraries to readily share and filter data using meaningful local taxonomies.
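
As an illustration of the kind of curve math involved, the Python sketch below places nodes evenly along a fan-shaped arc and joins two of them with a quadratic Bezier curve sampled at 100 points. The abstract does not say which formula the project uses, so the Bezier form, the control point at the center of the fan, and the function names `point_on_arc` and `bezier_curve` are assumptions, not the author's Tableau calculations.

```python
import math


def point_on_arc(index, count, radius=1.0, start_deg=0.0, end_deg=180.0):
    """Place node `index` of `count` evenly along an arc of the given radius."""
    frac = 0.5 if count == 1 else index / (count - 1)
    angle = math.radians(start_deg + frac * (end_deg - start_deg))
    return (radius * math.cos(angle), radius * math.sin(angle))


def bezier_curve(p0, p2, control=(0.0, 0.0), n_points=100):
    """Sample a quadratic Bezier curve from p0 to p2.

    B(t) = (1 - t)^2 * p0 + 2 * (1 - t) * t * control + t^2 * p2,
    with t stepping evenly from 0 to 1 over n_points samples.
    """
    points = []
    for i in range(n_points):
        t = i / (n_points - 1)
        a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        x = a * p0[0] + b * control[0] + c * p2[0]
        y = a * p0[1] + b * control[1] + c * p2[1]
        points.append((x, y))
    return points


# Join the first and last of five nodes on a half-circle fan.
start = point_on_arc(0, 5)
end = point_on_arc(4, 5)
curve = bezier_curve(start, end)
```

Pulling the control point toward the center bends each line inward, which keeps crossing lines readable; moving it changes how tightly the curves bow.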

Ongoing requests from researchers for instructions on how to create tree family diagrams in Tableau, R, Python, and other tools indicate that interest in this type of visualization continues to grow. Learning the math required to construct the visualization for this project has helped the author teach others how to build it.

Practical implications or value

By using alternative approaches to present data, academic librarians can better answer, at the discipline level, whether our collections meet the needs of campus researchers. Tree family visualizations offer a mechanism to display relationships and information flows. When used in tandem with other visualizations, tree diagrams enhance the presentation of assessment results, inspiring conversation and data-informed action.
 

Keywords
Data visualization, organizational performance, open data