Semester Award Granted
Spring 2025
Submission Date
May 2025
Document Type
Thesis
Degree Name
Master of Science (MS)
Thesis/Dissertation Advisor [Chair]
Xingquan Zhu
Abstract
Genres classify movies so that films with similar themes and structures can be grouped together. These categories are created by humans. When a movie is made, the script is often the first artifact in which ideas about its topic are written down and shared. The script contains a large amount of text describing the dialogue, setting, and direction of the film. Although this text carries important information about the film, its volume can present a challenge for machine learning algorithms. Studies on film classification that use text often choose a review or a synopsis rather than the script. The text in these is usually shorter and easier to train on. However, the script offers some benefits as an input for model training. Because it is produced early in the filmmaking process, the script may be the only data available, and it may be available before the film is even made. In this study, we propose a method for using movie scripts to build models that classify a movie’s genre as either “Comedy” or “Drama.” Our goal is to understand whether, and to what extent, a movie’s script can be used to predict its genre. To better handle learning in this problem space, we propose a model optimization process that uses data cleaning, feature extraction through feature selection, and performance comparison. Multiple classifiers are created using this process. Feature selection algorithms are compared using the data subsets that each one creates. The performances of the resulting data sets and classifiers are compared and discussed.
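The process outlined in the abstract (data cleaning, feature selection, then performance comparison) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name its feature selection algorithms or classifiers, so chi-squared term scoring and a multinomial Naive Bayes classifier are used here purely as stand-ins. The stopword list, the script snippets, and every function name below are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def clean(text):
    """Data cleaning: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]

def chi2_scores(docs, labels, positive):
    """Chi-squared score of each term's presence against a binary label."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (df_pos if y == positive else df_neg).update(set(tokens))
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a, b = df_pos[term], df_neg[term]      # documents containing the term
        c, d = n_pos - a, n_neg - b            # documents lacking the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Feature selection: keep the k best terms (ties broken alphabetically)."""
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])

def train_nb(docs, labels, vocab):
    """Fit a multinomial Naive Bayes model restricted to `vocab`."""
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for tokens, y in zip(docs, labels):
        counts[y].update(t for t in tokens if t in vocab)
    priors = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    totals = {c: sum(counts[c].values()) for c in classes}
    return classes, counts, priors, totals, vocab

def predict_nb(model, tokens):
    """Return the class with the highest Laplace-smoothed log-probability."""
    classes, counts, priors, totals, vocab = model
    def log_prob(c):
        lp = priors[c]
        for t in tokens:
            if t in vocab:
                lp += math.log((counts[c][t] + 1) / (totals[c] + len(vocab)))
        return lp
    return max(classes, key=log_prob)

def accuracy(model, docs, labels):
    hits = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return hits / len(labels)

# Invented toy snippets standing in for full scripts.
train = [
    ("He slips on a banana peel and the crowd laughs at the joke.", "Comedy"),
    ("She tells a silly joke and everyone laughs.", "Comedy"),
    ("The clown laughs and trips over a pie.", "Comedy"),
    ("He weeps at the funeral in the rain.", "Drama"),
    ("She weeps alone after the trial verdict.", "Drama"),
    ("The widow weeps beside the grave in silence.", "Drama"),
]
test = [
    ("The audience laughs at a silly clown.", "Comedy"),
    ("The family weeps at the grave.", "Drama"),
]

train_docs = [clean(text) for text, _ in train]
train_labels = [label for _, label in train]
test_docs = [clean(text) for text, _ in test]
test_labels = [label for _, label in test]

# Build one model on the full vocabulary and one on a selected subset,
# then compare their accuracy on the held-out snippets.
full_vocab = {t for doc in train_docs for t in doc}
selected = select_top_k(chi2_scores(train_docs, train_labels, "Comedy"), k=2)

acc_all = accuracy(train_nb(train_docs, train_labels, full_vocab), test_docs, test_labels)
acc_selected = accuracy(train_nb(train_docs, train_labels, selected), test_docs, test_labels)

print("selected terms:", sorted(selected))
print("accuracy, all features:", acc_all)
print("accuracy, selected features:", acc_selected)
```

On real scripts the vocabulary would be orders of magnitude larger than in this toy example, which is where reducing the feature set before training becomes practically important. Comparing several selection methods would amount to swapping `chi2_scores` for another scoring function and repeating the comparison.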
Recommended Citation
Cuomo, Michael Roman, "MOVIE GENRE CLASSIFICATION USING SCRIPT TEXTS" (2025). Electronic Theses and Dissertations. 80.
https://digitalcommons.fau.edu/etd_general/80