Data Mining Reveals the Surprising Factors Behind Successful Movies

The 2007 comedy Evan Almighty starred Steve Carell and Morgan Freeman, two big box office stars. So in some ways, it's not surprising that the movie raked in over $100 million in revenue. By comparison, the 2001 comedy Super Troopers starred some relative unknowns and took in a measly $18.5 million.

And yet a wise investor would almost certainly choose the second rather than the first to invest in. That’s because Super Troopers cost only $3 million to make, compared with Evan Almighty’s $175 million, and produced a return on investment of more than 5, compared with -0.4 for the bigger film.
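
The article does not spell out how return on investment is calculated. The quoted figures are consistent with profit divided by cost, so the sketch below assumes that definition. The function name is illustrative, and Evan Almighty's revenue is given only as "over $100 million", so the round figure of 100 is an assumption.

```python
def return_on_investment(revenue: float, budget: float) -> float:
    """Assumed definition: profit divided by cost, (revenue - budget) / budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return (revenue - budget) / budget


# Figures in millions of dollars, taken from the article.
# 100.0 is an assumed round number for "over $100 million".
super_troopers_roi = return_on_investment(revenue=18.5, budget=3.0)
evan_almighty_roi = return_on_investment(revenue=100.0, budget=175.0)

print(f"Super Troopers: {super_troopers_roi:.2f}")
print(f"Evan Almighty: {evan_almighty_roi:.2f}")
```

Under this definition, a film that exactly recovers its budget scores 0, and a film that earns nothing scores -1.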

But how to decide in advance what to invest in? Today we get an answer of sorts thanks to the work of Michael Lash and Kang Zhao from the University of Iowa. These guys have created a database of over 100 categories of film-related information, such as the budget and revenue, the stars involved, what the film was about, and when it was released, and then used a machine-learning algorithm to discover patterns that predict profitability. And the results are surprising.

The team began by combining data from two online sources: the Internet Movie Database and BoxOfficeMojo. In this way they gathered together data on over 14,000 films and 4,000 actors, directors, and so on, focusing in particular on films released between 2000 and 2010.

Lash and Zhao then used this data to work out how experienced individual actors were, how much revenue and profit each of their movies had made, and whether they had appeared in films with other actors. They did a similar calculation for directors as well.

They also used the plot summaries on IMDb to compare the content of the films. And they worked out a return on investment for each film to get a sense of its profitability.

The task for the machine-learning algorithm was to hunt through this data looking for patterns that correlate with profitability.
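
The article does not say which algorithm Lash and Zhao used, so the following is not their method. It is only a minimal sketch of the simplest way to ask whether one feature moves with profitability: a Pearson correlation coefficient. All numbers and variable names are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Hypothetical data, one entry per film (not from the study).
director_prior_gross = [10.0, 20.0, 30.0, 40.0]  # millions of dollars
roi = [0.5, 1.0, 1.5, 2.0]

r = pearson(director_prior_gross, roi)
print(f"correlation with ROI: {r:.2f}")
```

A value near 1 means the feature rises with profitability, a value near -1 means it falls, and a value near 0 means no linear relationship.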

It turns out that the factor most strongly correlated with a film’s profitability is the average gross revenue made by the director’s previous films. In other words, directors who have generated more revenue in the past are correlated with greater profitability in future.
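
To make that factor concrete, here is one way the average gross of a director's previous films could be derived from a list of releases. The records, field layout, and function name are hypothetical, and the study's actual feature construction may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (title, director, release_year, gross_in_millions)
films = [
    ("Film A", "Director X", 2001, 50.0),
    ("Film B", "Director X", 2004, 150.0),
    ("Film C", "Director X", 2008, 100.0),
    ("Film D", "Director Y", 2005, 20.0),
]


def director_prior_average_gross(records):
    """Map each title to the mean gross of its director's earlier films.

    A director's first film has no track record, so it maps to None.
    """
    history = defaultdict(list)  # director -> grosses of films already seen
    feature = {}
    # Walk the films in release order so only earlier films count as history.
    for title, director, _year, gross in sorted(records, key=lambda f: f[2]):
        past = history[director]
        feature[title] = mean(past) if past else None
        past.append(gross)
    return feature


prior_avg = director_prior_average_gross(films)
for title, value in prior_avg.items():
    print(title, value)
```

Using only earlier releases matters here: including a film's own gross in its director's average would leak the outcome into the predictor.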

In many ways that is unsurprising. Good directors, such as Christopher Nolan, are often well known by the cinema-going public and can be a considerable draw.

However, the results throw up a significant surprise. They show that popular stars are correlated with increased revenue but not with profitability. In other words, big stars draw crowds but they don’t guarantee a profit, presumably because they cost a lot to hire in the first place.

Other factors that turn out to be important are whether the film has an R rating or is designated foreign, which presumably correlate with lower profits (although Lash and Zhao do not make this clear).

The real test, of course, is not in predicting the past but in predicting the future. If this algorithm is able to pick out potentially profitable films before they are even made, then Lash and Zhao are set to become wealthy individuals. We'll look forward to seeing how they fare.
