Let's say you're writing a paper and you need to write a literature review. Or, you've got a set of papers that form some basis of your knowledge in the field but you want more papers to read. You could go to Google Scholar and search keywords, or you could go through the references in your papers and individually sift through this large body of work by reading the abstract of each one... OR! You could use network visualization of bibliometric information to see which papers are most cited or most linked to other papers in this body of work. You could use an efficient numerical method to direct your literature review.
This blog serves as a tutorial for generating bibliometric maps using VOSViewer, which is a "software tool for constructing and visualizing bibliometric networks". Other bibliometric tools and software can be found here.
Unfortunately, creating a bibliometric map is not as easy as searching keywords or copy-pasting a link to an arXiv paper and having the software scrape the entire internet across all journals to generate a network. You, the user, have to supply the data that underpins the network visualization, which makes sense in retrospect. VOSViewer may not want to get into the business of being a publication search database, constantly scraping the internet for citation counts and links to other articles. Those services already exist, like Web of Science, PubMed, and, in this tutorial, Dimensions. So we're going to use Dimensions (a searchable database of publications) to generate the data necessary for bibliometric mapping.
Setup:
Download VOSViewer
There's not really an install process. You just extract the zip file and run the .exe file.
VOSViewer has a getting started page with a video that explains what VOSViewer is useful for, but it skips the steps for actually creating a network. From my impatient reading, the manual didn't have this step-by-step process for creating a network either.
Sign up for a Dimensions account. Dimensions will send the data to you through the email you sign up with
Generating the data that underpins your network visualization using Dimensions:
I started out with two papers I was interested in expanding from
In Dimensions, I searched for each paper using the title of the article. When you search for a paper by title, change the search setting to title, the second option. *Insert screenshot*
Click on the result which is your paper of interest.
Scroll down to the publication references and press "show all".
Export these references using the second option "export for bibliometric mapping"
Go back to the original paper page and scroll down to citations.
We're going to do the same with citations: press "show all" and then export.
These CSV files will be sent to your email. Download them from the email link and unzip them.
It's convenient to extract all the CSV files into the same folder.
Repeat this process of downloading references and citations for each of the core papers of interest.
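After a few rounds of exporting, it can be handy to check how many distinct papers you have actually collected, since the reference and citation lists of related papers tend to overlap. Here is a minimal Python sketch for that, under some assumptions: the function name is made up for this post, the "Publication ID" column name is my guess at how the Dimensions export labels its identifier column, and the skip_lines option is there in case your export has a descriptive line above the header row. Open one of your CSV files and adjust both to match.

```python
import csv
from pathlib import Path


def count_unique_records(folder, id_column="Publication ID", skip_lines=0):
    """Return (number of CSV files, number of unique ids) found in folder.

    id_column is an assumed column name; change it to whatever identifier
    column your exported files actually contain.
    skip_lines is the number of lines to skip before the header row.
    """
    seen = set()
    files = sorted(Path(folder).glob("*.csv"))
    for path in files:
        # utf-8-sig also reads plain UTF-8 and drops a byte-order mark if present
        with path.open(newline="", encoding="utf-8-sig") as handle:
            for _ in range(skip_lines):
                next(handle, None)
            for row in csv.DictReader(handle):
                value = (row.get(id_column) or "").strip()
                if value:
                    seen.add(value)
    return len(files), len(seen)
```

Calling count_unique_records("path/to/your/folder") returns the number of files and the number of unique identifiers. This is only a sanity check on your downloads; VOSViewer reads the original CSV files directly, so you don't need to merge them yourself.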
Visualizing bibliography using VOSViewer:
Open up VOSViewer by clicking on the .exe file
Choose "Create a map based on bibliographic data" (the second option), click "Next"
Choose "Read data from bibliographic database files" (the first option), click "Next"
Under the "Select files" page, tab over to "Dimensions"
Click on the ellipsis button and select the CSV files generated by Dimensions. Load them up! Click "Next"
For the "Choose type of analysis and counting method", choose "Citation" under "Type of analysis" and "Documents" under "Unit of analysis". Click "Next"
I use the default values on the next few pages ("Threshold", "Number of documents", and "Verification") before clicking "Finish"
I can imagine wanting to use a different threshold if you only want cited papers (set the threshold to 1 or above), or a different number of documents if you have many, many papers in your dataset (set the number to the maximum number of papers you want displayed in your network).
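To make the effect of those two settings concrete, here is a rough Python sketch of the idea: first drop papers below a minimum citation count, then keep at most a fixed number of the rest. The function name and the toy data are made up for illustration, and ranking by citation count is my own simplifying assumption. VOSViewer's own selection rule may differ, so treat this as a mental model rather than a description of what the software does.

```python
def select_documents(documents, min_citations=0, max_documents=None):
    """Filter (label, citations) pairs by a citation threshold,
    then keep at most max_documents of the most cited ones."""
    kept = [doc for doc in documents if doc[1] >= min_citations]
    # Most cited first; Python's sort is stable, so ties keep input order
    kept.sort(key=lambda doc: doc[1], reverse=True)
    if max_documents is not None:
        kept = kept[:max_documents]
    return kept


# Toy data, not real papers
papers = [("paper A", 0), ("paper B", 12), ("paper C", 3), ("paper D", 40)]
print(select_documents(papers, min_citations=1, max_documents=2))
```

With a threshold of 1, "paper A" is dropped for having no citations, and capping the number of documents at 2 leaves only the two most cited papers.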
Finish!
How to interpret and use the network visualization:
I’d really recommend watching VOSViewer videos directly as the creators talk a lot more about interpretation and toggling network features to visualize aspects of the network you might be interested in.
The size of the nodes depends on whether you set the network to scale them by citations or by number of links. I think it makes more sense to display the number of citations as node size, since the links are already represented by the edges in the network.
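If the difference between the two node-size options is fuzzy, this small Python sketch may help. A paper's number of links only counts connections to other papers inside your dataset, while its citation count comes from the database and includes papers you never downloaded. The function name and the edge list are hypothetical, just to illustrate how links are counted.

```python
from collections import Counter


def link_counts(edges):
    """Count how many citation links touch each paper.

    edges is a list of (citing_paper, cited_paper) pairs within the dataset.
    """
    counts = Counter()
    for citing, cited in edges:
        counts[citing] += 1
        counts[cited] += 1
    return dict(counts)


# Hypothetical links between four papers in a dataset
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A")]
print(link_counts(edges))
```

Here paper A touches three links, B and C two each, and D one. A paper with hundreds of citations in the database could still show only a handful of links if most of the papers citing it are not in your CSV files.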
From the generated network visual where I scaled nodes based on number of citations, you’ll notice that shuster (1981) is the largest node and is a connection between the sigel / enright papers I’m interested in. That leads me to conclude that 1. shuster had a lot of citations and 2. shuster may have been foundational in this line of work.
I’m also seeing some more recent works, like zhan 2021 and wang 2022 that I need to make sure I’m checking out!
Seems like there are clusters with recurring authors like jovanovic and ning that I should follow on Google Scholar, as they have a history of work in this space.
This might be a cool figure to publish in your paper in the literature review section!