Producing an Automatic Transcript for a YouTube Video

(For the README file, code, and csv file please click here.)

Executive Summary

In July 2023, I was working on an article about intergenerational trauma in Korea due to various episodes occurring throughout history. One of the sources I refer to in my piece is the "U.S.-Korea Relations: Retrospective on the Jeju April 3 Incident, Human Rights, and Alliance" presentation at the Wilson Center think tank in Washington, DC, that took place in Dec. 2022.

Initially, I found myself manually typing out comments made by the panelists, but since the video is approximately 3 hours long - even though I only intended to pull just a few quotes - I decided typing out quotes manually was still too time-consuming. I googled and found https://github.com/jdepoix/youtube-transcript-api, which was my initial starting point for using Python code, but I tweaked the code so that a) timestamps also appeared every 30 seconds in the transcript and b) the transcript is printed to a txt file.

The final transcript can be seen in the attached file "transcript_with_timestamps.txt".

Tools Used

  • Programming language: Python

Previous
Previous

Data-Cleaning and Creating a Power BI Dashboard

Next
Next

Dashboard Featuring HDFC Listings Built with R Shiny/ShinyDashboard/Leaflet