You'll notice in this resulting graph that I have tried to choose "common thematic" words, enabling us to have deep insight into the history of action movie naming conventions. I would like to cross-reference these results with the year the movies were created. It would be even more interesting to overlap those create dates with a historical time line of major events to see how heavily movie naming conventions are influenced by current events.
On a more technical note, I generated this graph using the pychart module, amongst more common ones. Graph and associated code to follow :)
#! /usr/bin/env python
#Copyright (C) 2007 Christopher M. Ball (chris.m.ball@gmail.com)
#Originally Posted At: <http://strainthebrain.blogspot.com>
#
#This program is free software: you can redistribute it and/or modify
#it under the terms of the GNU General Public License as published
#by the Free Software Foundation, either version 3 of the License, or
#(at your option) any later version.
#
#This program is distributed in the hope that it will be useful,
#but WITHOUT ANY WARRANTY; without even the implied warranty of
#MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
#the GNU General Public License for more details.
#
#You should have received a copy of the GNU General Public License
#along with this program. If not, see <http://www.gnu.org/licenses/>.
from pychart import *
import commands, pdb, operator
theme.use_color = True
theme.reinitialize()
theme.get_options()
#This function obtains grep counts for each of our keywords against our data file, returning them in coordinate pairs
def keyworddata():
keywords = ['love','sex','war','money','god','murder','kill','mission','journey','freedom']
keywords += ['happy','blood','steal','travel','fun','sad','power','peace','drug','friend']
keywords += ['marriage','country','patriot','tax','prison']
results = []
#For each keyword, perform a grep count, then form our keyword/value tuple
for currentWord in keywords:
ycount = commands.getstatusoutput("grep -i -c " + currentWord + " movielist.txt")
results.append((currentWord, int(ycount[1])))
return results
#Setting up the tick and axis
tick = tick_mark.Circle(size=10, fill_style=fill_style.red)
xaxis = axis.X(label="Movie Title Keyword")
yaxis = axis.Y(label="Quantity of Action Movies (23,530 total)")
#Before assigning my results from the function, I sort the resulting list of tuples by the 2nd item of each pair, descending.
plotpoints = sorted(keyworddata(), key=operator.itemgetter(1), reverse=1)
#Setting up and binding our data with graph settings
ar = area.T(x_coord = category_coord.T(plotpoints, 0), x_axis=xaxis, y_axis=yaxis, x_grid_interval=20, y_grid_interval=25, size=(1000,600), legend=None, y_range = (0, 450))
ar.add_plot(line_plot.T(data = plotpoints, tick_mark=tick))
ar.draw()
1 comments:
where's the written analysis? i thought you're going to give more info on action movie-naming conventions??
Post a Comment