Lifelines, Race and Survival (on TV shows)

This one's been sitting around for about six months.  I figure that similar research must exist somewhere, but since I won't get around to digging into the issue in the near future, or ever, I might as well just post it.  The idea is to look at the race and/or ethnicity of the NYC pedestrians that contestants choose for help on the television show Cash Cab, where unsuspecting NYC cab passengers find themselves in the "Cash Cab" and get the chance to answer trivia questions for money on the way to their destination.

The show’s rules are basically that the questions get harder and are worth more as the ride goes on, and three strikes and you’re out (i.e. kicked out of the cab, even if you haven't reached your destination yet).  If contestants do not know the answer to a question, they can use one mobile shout-out and one street shout-out.  If they use their street shout-out, the host tells them to “choose a pedestrian who you think might know the answer.”  I became interested in whether the race, age, or appearance of a pedestrian affects which pedestrian contestants choose, and whether that choice varies with the question category.  So, I started thinking about what a study might look like to pick the question apart.  I'm not a social scientist and am not trained in the precise methodology of these things, but hopefully I get somewhat close with the outline below.

This analysis would involve watching each episode in a season of the show.  While watching each episode, the following would be recorded:

1) Question Category
2) Contestant Race/Ethnicity and Approximate Age
3) Location density
4) Pedestrian Race/Ethnicity and Approximate Age
5) Whether the pedestrian chosen got the correct answer

In drawing conclusions, one important thing to look at would be how the dollar amount at stake (which rises as the contestants progress; the pedestrians don't know how much is riding on the question) might factor into the contestant's choice of pedestrian.  So, below is a broad sketch of what I'd be interested in looking at and how the data gathering might be structured.
 
         I.     Data gathering
a.     Question Category
                                               i.     Watch every episode for one season and write down each question.  Once all the questions are collected, group them into five or six broad categories, e.g. sports, history, literature and culture, politics, science.  Or, maybe use the Trivial Pursuit categories or something similarly familiar.
        II.     Contestant Info
a.     Race
                                               i.     There will be some inaccuracies here, since there's obviously no way to retroactively get precise answers from contestants, so I’d be inclined to use broad categories like White, Black, Hispanic, Asian.  Or get politically correct and call it Caucasian, Mongoloid, etc. 
b.     Approximate Age
                                               i.     Three categories: young (under 30), middle-aged (30 to 60), and elderly (over 60)
c.     Other info.  None of this can be read off the footage, but if the show gathered the info and made it available, or if facial/individual recognition were that advanced (read: scary), a host of other factors could be addressed here, such as:
                                               i.     Has the contestant or pedestrian seen the show? 
                                             ii.     Education level of pedestrian or contestant
                                            iii.     Actual age of contestant or pedestrian
      III.     Location density
a.     This could measure how many pedestrians are in a certain type of area.  A scale might look like the following:
                                               i.     Hardly anyone around…
                                             ii.     Only the chosen person on the street, regardless of area of the city or time of day
                                            iii.     A few people in the area
                                            iv.     Any time of day, busy area of the city
      IV.     Pedestrian Info
a.     Race, Ethnicity
b.     Approximate Age
c.     This would be the biggie and carry the most weight in the statistical analysis that I'm underqualified to perform accurately: did the chosen pedestrian get the answer right?
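
As a rough sketch of how the outline above could be written down as data, here is one record per street shout-out in Python.  Everything in it is hypothetical: the names ShoutOut, AgeGroup, Density and age_group are invented for illustration, the age cutoffs are one possible reading of the three broad categories, and the sample values are placeholders rather than observations from the show.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class AgeGroup(Enum):
    YOUNG = "young"              # under 30 (cutoff is an assumption)
    MIDDLE_AGED = "middle-aged"  # 30 to 60
    ELDERLY = "elderly"          # over 60


class Density(IntEnum):
    # Ordinal codes for the location density scale, in the outline's order.
    HARDLY_ANYONE = 1
    ONLY_CHOSEN_PERSON = 2
    A_FEW_PEOPLE = 3
    BUSY_AREA = 4


def age_group(approx_age: int) -> AgeGroup:
    """Bucket an eyeballed age into the three broad categories."""
    if approx_age < 30:
        return AgeGroup.YOUNG
    if approx_age <= 60:
        return AgeGroup.MIDDLE_AGED
    return AgeGroup.ELDERLY


@dataclass
class ShoutOut:
    """One street shout-out, coded by hand while watching an episode."""
    episode: str
    question_category: str
    dollar_amount: int
    contestant_race: str
    contestant_age: AgeGroup
    density: Density
    pedestrian_race: str
    pedestrian_age: AgeGroup
    pedestrian_correct: bool


# Placeholder values only, not a real observation.
example = ShoutOut(
    episode="episode-1",
    question_category="history",
    dollar_amount=100,
    contestant_race="group_a",
    contestant_age=age_group(25),
    density=Density.A_FEW_PEOPLE,
    pedestrian_race="group_b",
    pedestrian_age=age_group(45),
    pedestrian_correct=True,
)
```

Keeping the density scale as ordered integers and the age groups as named categories is just one design choice; a spreadsheet with the same columns would work equally well.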
   
The question that remains for me is how best to analyze all of this data, present it, and find additional insight.  A related question is how to weight each piece of data by category or importance, so that the resulting numbers can be analyzed and conclusions drawn from them.  At the very least, maybe it gives you something to think about while watching this type of show.
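
One common starting point for the analysis question, offered here as an assumption rather than a settled method, would be a contingency table of question category against pedestrian group, with a chi-square statistic measuring how far the counts stray from what independence would predict.  The sketch below uses only the Python standard library; the function names crosstab and chi_square are made up for illustration, and any labels fed into it would come from the hand-coded data.

```python
from collections import Counter


def crosstab(pairs):
    """Count (row_label, col_label) pairs into a nested dict table."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    return {r: {c: counts.get((r, c), 0) for c in cols} for r in rows}


def chi_square(table):
    """Return (statistic, degrees_of_freedom) for a table from crosstab()."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_total = {r: sum(table[r].values()) for r in rows}
    col_total = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_total.values())

    stat = 0.0
    for r in rows:
        for c in cols:
            # Count expected in this cell if category and group were unrelated.
            expected = row_total[r] * col_total[c] / n
            stat += (table[r][c] - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(cols) - 1)
    return stat, dof
```

The statistic would then be compared against a chi-square distribution with the returned degrees of freedom; SciPy's scipy.stats.chi2_contingency does the whole calculation, p-value included.  With only one season of shout-outs, many cells would likely have small counts, where the chi-square approximation is unreliable, so merging categories might be necessary.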
