Part I, The Data: Analyzing the Indian Film Industry through Irfan’s Guftagoo Interviews

Sudev Sheth
Jan 24, 2021


Actor Naseeruddin Shah for Guftagoo

By Sudev Sheth, Nitin Rao, and Rachel Hong

The Guftagoo series was published between 2012 and 2020, and all interviews are available on YouTube. The complete series comprises 368 videos. To analyze the film industry, we shortlisted 208 relevant interviews, removing entries for politicians, academics, and activists who had nothing to say about cinema, theater, or music. Readers may access our own metadata-rich CSV used for the visualizations here.

The dataset was generated as follows. First, we compiled a list of interviews by snowballing through the Guftagoo playlist. We then created metadata categories for each entry, including date of birth, state of birth, date of death, years active, date of publication, profession, gender, and any affiliation with the National School of Drama (NSD) or the Film and Television Institute of India (FTII). Our research team filled in these categories manually. Once the list was complete, we watched every interview and assigned keywords under three categories of data: Industry, Category A, and Category B.
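For readers who want to work with the CSV programmatically, one row of the dataset could be modeled roughly as follows. This is only a sketch: the field names, types, and date format are our assumptions for illustration and may not match the actual column headers in the file, and the sample values are placeholders, not real data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InterviewRecord:
    """One row of the dataset. All field names here are hypothetical."""

    name: str
    date_of_publication: str              # assumed ISO date string of the video
    profession: str
    gender: str
    date_of_birth: Optional[str] = None
    state_of_birth: Optional[str] = None
    date_of_death: Optional[str] = None   # None if the interviewee is living
    years_active: Optional[str] = None
    nsd_affiliation: bool = False             # National School of Drama
    film_institute_affiliation: bool = False  # the film institute named above
    industry: list[str] = field(default_factory=list)
    category_a: list[str] = field(default_factory=list)
    category_b: list[str] = field(default_factory=list)


# Placeholder values only; this does not describe a real interviewee.
record = InterviewRecord(
    name="Placeholder Interviewee",
    date_of_publication="2015-01-01",
    profession="Actor",
    gender="F",
    film_institute_affiliation=True,
    industry=["Bollywood"],
)
```

Keeping the three keyword categories as lists, rather than single strings, makes it easier to count or filter individual keywords later.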

For Industry, coders selected one or more values from the following list:

Data Options for ‘Industry’ Category

For Category A, we selected one or more from the following list:

Data Options for ‘Category A’ Category

For Category B, our two coders came up with their own keywords to describe the content of each interview. A hallmark of qualitative interviews is the range and depth of topics explored, and Category B keywords gave us the most flexibility to capture this diversity without constraints.

For example, the row for the noted actress Jaya Bachchan carries the following keywords:

Industry: Bollywood/Bengali/Theater/Television

Category A: FTII/Naming/Actor/Early Childhood/Family/Business/Bangladesh/Celebritism/Politics/Language/Hindi Language/Environment/Hollywood/Retirement/Naming

Category B: Satyajit Ray/Film Education/Amitabh Bachchan/Naturalistic Acting/Politician/Tapan Sinha/Sharmila Tagore/Child Actor/Convent Education/Bengal/Kolkata/Kamini Kaushal/Scholarship/Education/Unconventional/Education/World Cinema/Casting/Gulzar/Literature/Rituparno Ghosh/Work Satisfaction/Delhi/Debut/First Film/Acting Technique/Process/Preparation/Choices/Personal Nature/Patience/Member of Parliament/Waheeda Rehman/Lachhu Maharaj/Dance/Challenges/Inhibitions/Middle Class Upbringing/Skin Show/Compromises/Sexual Harassment/Indian Values/Media Sensationalism/Critique/Film Critic/Criticism/Mainstream/Experimental/Jaya Bhaduri
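Because each category stores its keywords as a single slash-delimited string, a reader analyzing the CSV will likely need to split these fields first. The snippet below is a minimal sketch of one way to do that in Python. The helper names `split_keywords` and `unique_in_order` are our own, the dictionary keys are assumed column names, and the values are shortened excerpts of the example row above rather than the full fields.

```python
from collections import Counter


def split_keywords(field: str) -> list[str]:
    """Split a slash-delimited keyword field, trimming stray whitespace."""
    return [part.strip() for part in field.split("/") if part.strip()]


def unique_in_order(keywords: list[str]) -> list[str]:
    """Drop repeated keywords while keeping their first-seen order."""
    return list(dict.fromkeys(keywords))


# Shortened excerpts of the example row; the keys are assumed column names.
row = {
    "Industry": "Bollywood/Bengali/Theater/Television",
    "Category A": "FTII/Naming/Actor/Early Childhood/Family/Naming",
    "Category B": "Satyajit Ray/Film Education/Education/Unconventional/Education",
}

parsed = {column: split_keywords(value) for column, value in row.items()}
deduped = {column: unique_in_order(words) for column, words in parsed.items()}

# Keywords that occur more than once within a single field.
repeats = {
    column: [word for word, count in Counter(words).items() if count > 1]
    for column, words in parsed.items()
}

print(deduped["Category A"])
print(repeats)
```

Whether to drop repeated keywords depends on the analysis. A repeat may simply mean the topic came up more than once in the interview, so the sketch keeps both the raw and the deduplicated lists.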

It took six months to complete the dataset. Readers may now turn to Part II of this series to explore a few key visualizations created using Tableau.

Jump to:

Guftagoo Introduction

Part II: The Visualizations

Part III: Skin Color & Gender Stereotypes in Bollywood
