Network Visualization of Characters’ Locations in Hwang Sok-yong’s Novel “Princess Bari”


The 17 main characters' location network by degrees

Partial view of my initial edge table of cities each character travels to

R code used to create a weighted edge list by aggregating matching source and target values and summing their edges

The degree report showed an average degree of 6.706
(For the full report, please click here.)
Executive Summary
Hwang Sok-yong is a celebrated South Korean novelist born in Hsinking, Manchukuo (now Changchun, China), who returned to Korea in 1945 after the end of Japanese colonization. He embarked on a literary career after publishing the short story “The Pagoda” in 1970. The story draws on his experiences as a Marine in the Vietnam War, where he was responsible for “clean-up,” or erasing proof of civilian massacres by burying the dead.
His novel “Princess Bari,” published in English in 2015, is named after the protagonist of the famous Korean myth, who is abandoned by her father for being a girl and later resurrects her dead parents with the flower of life. However, the overlap largely begins and ends with the character’s name, as the book follows a Korean woman named Bari in more recent times (starting in the ’80s). This modern Bari is born and raised in Chongjin, North Korea, where she lives with her grandmother, father, mother, her sisters Sook, Hyun, Jin, Sun, Mi, and Jung, and her dogs Hindungi and Chilsung. There is also the dear family friend Uncle Salamander, who keeps appearing throughout the book, and toward the end come the less prominent characters of Bari’s coworkers Zhou, Xiang, and Chen.
I wanted to visualize the network of the characters’ locations throughout the book. Some characters end up in or travel through the same cities, while others have no cities in common.
Results
In the book, Great-Great-Grandmother, Grandfather, Hindungi, and Grandmother live and pass away much earlier than the others. Since Grandmother lives the longest and travels the most of that group, it makes sense that she’s the connecting point between the two communities. Among the characters in the pink community, it makes sense that Father, Mother, Hyun, Mi, Bari, Jung, Chilsung, Jin, and Uncle Salamander are connected, as they are family (or close to it) and often find themselves in the same cities. Xiang and Zhou have many connections because they travel to several cities with Bari. And it’s unsurprising that Chen and Sook are somewhat isolated, as they move around little throughout the book.
Tools Used
Programming language: R
OpenRefine (for data cleaning)
Gephi (to create the network graph and analysis)
Method
1) Created an edge table in CSV format
I first went through the book and recorded each character alongside the cities they travel through and/or end up in, which produced a 67-row edge table in CSV format.
2) Cleaned data in OpenRefine
Then, I used OpenRefine, a data cleaning and transformation tool, to reshape the table so that each column is headed by a city name and lists the characters who lived in or passed through that city.
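The reshaping itself was done in OpenRefine, but the idea can be sketched in base R for readers who prefer code. This is only a rough equivalent under stated assumptions: the column names `character` and `city`, the placeholder city `CityB`, and the toy rows are all hypothetical and are not the project's actual data.

```r
# Toy long-format table: one row per (character, city) visit.
# Illustrative rows only; "CityB" is a placeholder, not a city from the book.
visits <- data.frame(
  character = c("Bari", "Grandmother", "Bari", "Xiang", "Zhou"),
  city      = c("Chongjin", "Chongjin", "CityB", "CityB", "CityB"),
  stringsAsFactors = FALSE
)

# Group the characters by city: one list element per city.
by_city <- split(visits$character, visits$city)

# Pad shorter groups with NA so every city column has the same length.
max_len <- max(lengths(by_city))
city_columns <- as.data.frame(
  lapply(by_city, function(x) c(x, rep(NA, max_len - length(x)))),
  stringsAsFactors = FALSE
)

print(city_columns)
```

Each column of `city_columns` is headed by a city and holds the characters associated with it, which mirrors the layout described above.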
3) Transformed data using R
Next, I transformed the data again in R to create a weighted edge list by aggregating matching source and target values and summing their edges. The weighted edge list was written to a new CSV file with the column headings “source,” “target,” and “type,” which the network visualization software Gephi expects for an edge list.
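The aggregation step can be sketched in base R as follows. This is a minimal illustration, not the script used for the project: the toy city groups, the placeholder cities `CityB` and `CityC`, the output file name, and the added `weight` column are all assumptions made for the example.

```r
# Toy city-to-characters groups; illustrative only, not the project's data.
by_city <- list(
  Chongjin = c("Bari", "Grandmother", "Father"),
  CityB    = c("Bari", "Father"),
  CityC    = c("Bari", "Xiang")
)

# One row per pair of characters sharing a city, each with weight 1.
pair_rows <- lapply(by_city, function(chars) {
  chars <- sort(unique(chars))          # sorted so A-B and B-A line up
  if (length(chars) < 2) return(NULL)   # a lone character forms no pair
  pairs <- t(combn(chars, 2))
  data.frame(source = pairs[, 1], target = pairs[, 2], weight = 1,
             stringsAsFactors = FALSE)
})
edges <- do.call(rbind, pair_rows)

# Aggregate matching source/target pairs and sum their weights.
weighted <- aggregate(weight ~ source + target, data = edges, FUN = sum)
weighted$type <- "Undirected"

# Hypothetical output location; adjust the path as needed.
out_path <- file.path(tempdir(), "weighted_edges.csv")
write.csv(weighted, out_path, row.names = FALSE)
```

In this toy input Bari and Father share two cities, so their edge ends up with a weight of 2, while every other pair keeps a weight of 1. Sorting each group before pairing is a design choice that keeps "Bari, Father" and "Father, Bari" from being counted as separate edges.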
4) Network visualization in Gephi
Finally, I imported this newest spreadsheet into Gephi, turned on the labels to show each character’s name, adjusted the text size so the labels would be clear without being overbearing, and chose the ForceAtlas 2 layout for a first look.
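As a quick sanity check on the degree report, the average degree of 6.706 across the 17 characters can be converted back into an edge count. The sketch below assumes the graph was treated as undirected, in which case the average degree is 2E / N; that assumption is mine and is not stated in the report.

```r
# Figures taken from the degree report and the character count above.
n_nodes    <- 17
avg_degree <- 6.706

# For an undirected graph: average degree = 2 * edges / nodes.
implied_edges <- round(avg_degree * n_nodes / 2)

# Recompute the average degree from the rounded edge count.
check_degree <- round(2 * implied_edges / n_nodes, 3)

print(implied_edges)
print(check_degree)
```

Under that assumption the report is consistent with 57 distinct edges among the 17 characters, since 2 × 57 / 17 rounds to 6.706.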