Introduction
On 17 September 2014 the US streaming portal Netflix Inc. launched its online service in Austria. Netflix is the leading streaming service worldwide “with more than 44 million streaming members in over 40 countries enjoying more than one billion hours of TV shows and movies per month” (Netflix 2013: 1). One of the key strengths of Netflix is its collection of Big Data, which feeds a personalised algorithm “which recommends other shows to watch based on what the user has seen” (Nippes 2014: n.s.). Critics claim that the creative process of discovering and liking a movie or TV show gets lost as a result, because the algorithm offers users the programmes it thinks they will like (cf. Leonard 2013: n.s.). With that in mind, Netflix has considerable power to promote content that it produces itself (for example “House of Cards”) as well as shows from other providers.
For this reason, in this paper I will discuss how user information and Big Data are used to manipulate users’ viewing suggestions, and I will try to show what this implies for the future of streaming portals. I will begin with a characterisation of streaming, give a short description of the origins of streaming media, explain its basic principles and describe on-demand files. After that I will briefly present streaming portals in general and then turn to one of the most successful streaming portals of recent years, Netflix Inc. I will continue with the concept of Big Data and its underlying principles, since Netflix uses Big Data to give its users movie and TV show suggestions. Finally, I will discuss how Netflix Inc. uses Big Data to manipulate users’ viewing recommendations, the consequences for “external” and “internal” content providers, and what the future of streaming portals might look like.
1. What is streaming?
Nowadays many websites use streaming media, for example online radio stations or sample-before-you-buy music retail shops like iTunes. The Internet was designed neither to transmit media in such high volume nor to serve the large number of users it has today. But because its original architects designed it in such a flexible manner, it is possible to send media online these days (cf. Mack 2002: 29).
1.1. Origins of streaming media
In 1993 the first widely used graphical browser, “Mosaic”, was released, and with it the use of the Internet grew massively. Mosaic offered a simple way to share and link together varied resources. However, the new media files were much larger than the text files used before, so people had to wait a long time to download and send them. The main problem was that users had to wait until the whole file had been downloaded; it was not possible to listen to it during the download. The reason was that a separate application was responsible for playing the music, and that application could only play a fully downloaded file. In response to this problem, streaming media was introduced in 1995 (cf. Mack 2002: 29-30).
1.2. Basic principles of streaming media
Streaming offers a new way to handle media files. Instead of waiting for the whole download, “streaming media playback occurs as the file is being transferred. The data travels across the Internet, is played back and then discarded” (Mack 2002: 30). What makes streaming media relevant for this paper is that it can be used for archived files that can be watched on demand. In other words, users can watch content whenever they prefer (cf. ibid.: 30).
In contrast to downloading, streaming happens in real time. The user clicks on a link and after a few seconds playback begins. Because the file is never stored on the user’s hard drive in this process, there are, according to Mack, also no problems with copyright violation. A further advantage of streaming is user interactivity: the user can pause, play, fast-forward or rewind a file, which is not possible with a file that needs to be downloaded first (cf. ibid.: 30-31).
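The difference between the two delivery models described above can be illustrated with a minimal sketch. It is only a toy model under stated assumptions: the function names, the chunk size and the in-memory "media file" are hypothetical and do not describe any real player or server.

```python
from typing import Iterable, Iterator, List


def serve_chunks(media: bytes, chunk_size: int) -> Iterator[bytes]:
    """Hypothetical server: hands out the media file piece by piece."""
    for start in range(0, len(media), chunk_size):
        yield media[start:start + chunk_size]


def play_download(chunks: Iterable[bytes]) -> List[str]:
    """Download model: the whole file is stored before playback can start."""
    stored = b"".join(chunks)  # the complete file is kept locally
    return [f"received {len(stored)} bytes", "playback starts"]


def play_stream(chunks: Iterable[bytes]) -> List[str]:
    """Streaming model: each chunk is played on arrival and then discarded."""
    log = []
    for number, chunk in enumerate(chunks, start=1):
        log.append(f"playing chunk {number} ({len(chunk)} bytes)")
        # nothing is stored; the chunk is dropped after this iteration
    return log


if __name__ == "__main__":
    media = bytes(10)  # stand-in for an encoded media file (assumption)
    print(play_download(serve_chunks(media, 4)))
    print(play_stream(serve_chunks(media, 4)))
```

In the streaming variant playback begins with the first chunk, whereas the download variant cannot start until all chunks have arrived.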
1.3. On-demand files
Since the streaming portal examined here, Netflix, offers on-demand media files, it is important to understand what that concept means. On-demand files have the benefit that they can be consumed at any time, by anyone and anywhere as “soon [as] files have been encoded into a format suitable for streaming and put on a server” (Mack 2002: 31). This also implies that different people can use a media file whenever they prefer, even while other people are using the same file. Streaming requires three different pieces of software:
– player: with which the users can watch and listen to the media files
– server: with which the stream is delivered to the users
– encoder: with which the audio and video material for streaming is converted to a suitable format
These pieces of software need to communicate with each other using so-called protocols and have to exchange files in specific formats (cf. ibid.: 31-34).
I will not give further technical details here, because it is not the aim of this paper to describe the streaming process itself. The basic principles provided are only meant to give an overview of what on-demand files are and how they work.
Having introduced the basic terms of streaming and on-demand files, the next chapter deals with streaming portals in general. After that I will focus on one of the most successful streaming portals, Netflix Inc., and describe its successful recommendation system.
2. What are streaming portals?
In the last ten years streaming portals, or video on-demand (VOD) portals as some people refer to them (cf. Rizzuto/Wirth 2002), have grown enormously. But whereas Rizzuto and Wirth only refer to VOD on the TV screen, development has since moved away from the traditional television set towards exclusively online VOD platforms. Jenner refers to VOD platforms as part of the era of matrix media “where viewing patterns, branding strategies, industrial structures, the way different media forms interact with each other or the various ways content is made available shift completely away from the television set” (2014: 4).
Since this paper takes Netflix as its main research object, in the next part I will briefly describe the US company Netflix Inc. I will provide an overview of the company and explain how its recommendation system works. On that basis I will later discuss how user information is used to manipulate users’ viewing suggestions.
2.1. Netflix, Inc.
In 1997 Reed Hastings and Marc Randolph founded Netflix Inc. as an online DVD rental service. Two years later they launched a subscription service which offered its subscribers unlimited rentals for a low monthly price. Just one year after that, Netflix introduced a personalised movie recommendation system, which uses user ratings to predict choices for other Netflix users. In 2007 Netflix expanded its service, started to offer online streaming and slowly moved away from the DVD subscription system (cf. Netflix 2014: n.s.). “Netflix placed 10,000 titles from its 90,000 film library on-line in ‘Watch Instantly’ mode as a free value-added service to its large base of existing Netflix customers who had to use their ID and password to watch those films” (Cunningham/Silver 2012: 33). During the next three years Netflix also partnered with several consumer electronics companies to make the service available on devices such as the PS3, the Xbox 360, TV set-top boxes, the Apple iPad, the iPhone and many more. Cunningham and Silver (2012) also point out that Netflix transformed its business model into a monthly subscription service for unlimited movies and TV shows on Watch Instantly (cf. 33-66).
Furthermore, Netflix has increasingly moved away from being purely an exhibitor of film content towards a business model which includes producing serialised drama. Netflix was one of the first VOD providers to offer original dramas to its users. In other words, Netflix extended its service from TV series and films that are also available elsewhere (shown elsewhere or already on DVD) to being a provider that shows a series or a film first. In addition, the VOD provider tries to offer high-quality production and service in order to create a brand identity. In doing so it relies on “(social) media buzz with original programming shaping the brand identity” (Jenner 2014: 7). At the same time, such providers differ from TV broadcasters, for example, “through forms of distribution, business model (assumed), viewing practices and marketing” (ibid.: 7). Additionally, Netflix bases its service on “models of individualised viewing practices and self-scheduling of TV” (ibid.: 11).
2.2. The Netflix algorithm
“Netflix is all about connecting people to the movies they love” (netflixprize.com 2009a: n.s.). To do so, Netflix introduced its movie recommendation system “Cinematch” early in the history of its service. The aim of this system is to predict whether a user “will enjoy a movie based on how much they liked or disliked other movies. We use those predictions to make personal movie recommendations based on each customer’s unique tastes” (ibid.: n.s.). In 2006 Netflix started the so-called “Netflix Prize”, offering one million US dollars for a recommendation system whose predictions were at least 10% more accurate than those of Cinematch (cf. ibid.: n.s.). More than 50,000 participants from 186 countries took part. Three years later, after more than 44,000 submissions, the winning team was chosen (cf. netflixprize.com 2009b: n.s.). The Netflix Prize made it possible to understand some details of the recommendation system that had previously been hidden from the public. At that time Netflix used a system in which customers could rate films and TV shows on a scale from 1 to 5 (5 stars “Loved It”; 4 stars “Really Liked It”; 3 stars “Liked It”; 2 stars “Didn’t Like It”; 1 star “Hated It”). There was also a sixth option named “Not Interested” in a separate box (cf. Hallinan/ Striphas 2014: 4). In fact, Netflix never implemented the winning recommendation system, because the service changed during the three years of the competition and because implementing the newly developed system would have been too costly. Nevertheless, the competition shows how important it is for Netflix to develop suitable algorithms (cf. Madrigal 2014: n.s.).
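The 10% target of the Netflix Prize referred to prediction accuracy, which the competition measured as root mean squared error (RMSE) between predicted and actual star ratings. The following sketch shows how such a comparison could be computed; the function names and all rating values are invented for illustration and are not taken from the competition data.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse: float, candidate_rmse: float) -> float:
    """Share by which the candidate lowers the baseline error (0.10 = 10%)."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


if __name__ == "__main__":
    actual = [5, 3, 1, 4]              # invented star ratings
    baseline = [4.0, 4.0, 2.5, 3.0]    # invented baseline predictions
    candidate = [4.6, 3.3, 1.8, 3.7]   # invented challenger predictions
    gain = relative_improvement(rmse(baseline, actual), rmse(candidate, actual))
    print(f"improvement over baseline: {gain:.1%}")
```

A lower RMSE means that the predicted stars are, on average, closer to the ratings users actually gave.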
These changes followed the launch of the instant streaming service in 2007, just one year after the start of the Netflix Prize. With streaming, the way people interacted with the service changed, because users now gave instant feedback while watching. Netflix began to collect this feedback data, which led to a change in the algorithm as well (cf. Amatriain/ Basilico 2012a: n.s.). A more personalised algorithm was established, which is reflected in the fact that “75% of what people watch is from some sort of recommendation” (ibid.: n.s.). Netflix claims that personalisation is of high value to its subscribers, which is why it offers it in the first place. Furthermore, the company states clearly that the recommendations have nothing to do with its business model, but that “it matches the information we have from you [the user]: your explicit taste preferences and ratings, your viewing history, or even your friends’ recommendations” (ibid.: n.s.). The aim is to “find the best possible ordering of a set of items for a member, within a specific context, in real-time” (ibid.: n.s.).
So how does the new recommendation system work? It combines two different features:
– Popularity: This is one basic principle. It can be said that “a member is most likely to watch what most others are watching” (Amatriain/Basilico 2012b: n.s.). But popularity is the opposite of personalisation, because it only offers what everyone else watches and is not tailored to the individual user’s taste. On its own it is therefore not enough to recommend movies and TV shows on Netflix.
– Predicted rating: Here Netflix uses the ratings users gave to movies they watched before (1-5 star system) to predict whether the user will like a different movie too. The disadvantage of predicted rating is that it might lead to movies that are too niche and exclude films that users would like to watch even though they would not rate them highly (cf. Amatriain/ Basilico 2012b: n.s.).
In order to overcome the downsides of relying solely on popularity or on predicted rating, Netflix combines both of them. In fact, Netflix uses far more data than just popularity and predicted rating. From its users Netflix gets data about stream plays, showing how long they watched, at what time of day they watched and which device they used. Because users can put items they want to watch next into queues, Netflix also gets data about that. Furthermore, Netflix has a lot of metadata containing information about the movies and TV shows themselves (actors, director, genre, reviews). In addition, Netflix can collect information about how users respond to the way different movies are presented and how this affects their behaviour. Netflix also gathers information about search entries, external information such as reviews, and other data like demographics or location. Recently, social data was included in the recommendation system as well. The term “social data” refers to the recommendations of connected friends. In order to manage this large amount of data Netflix uses different methods. The names of those methods (for example linear regression, elastic nets, etc.) can be found on the technical blog of Netflix, but how exactly they are combined and used to generate movie and TV show recommendations is not described (cf. Amatriain/ Basilico 2012b: n.s.). A year later Netflix gave some details about how it built a software architecture to deal with the large amount of data. It uses a combination of online, offline and nearline (intermediate between online and offline) computing. I do not want to go into detail here, but it is clear that Netflix uses machine learning algorithms that can deal with the data effectively (cf. Amatriain/ Basilico 2013: n.s.).
After all this information it becomes clear why Netflix calls itself a “data driven organization” (Amatriain/ Basilico 2012b: n.s.). This approach has been part of the company’s culture since its foundation and is called “Consumer (Data) Science”. The “main goal of our Consumer Science Approach is to innovate for members effectively” (ibid.: n.s.). For Netflix “more data availability enables better results” (ibid.: n.s.).
All the descriptions above show the importance of data collection and data processing for Netflix. In other words, one could say that Netflix collects Big Data. But what is Big Data? In the next part of the paper I will describe the term “Big Data” and its basic principles.
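How the combination of popularity and predicted rating might work in principle can be shown with a small sketch. It is not Netflix’s actual method, which the paper notes is not fully described; it simply assumes a weighted sum of the two signals, and the titles, values and weights are invented.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    title: str
    popularity: float        # share of members watching, 0..1 (invented)
    predicted_rating: float  # predicted stars for one user, 1..5 (invented)


def score(c: Candidate, w_pop: float = 0.5, w_rating: float = 0.5) -> float:
    """Weighted sum of popularity and predicted rating (assumed combination)."""
    rating_scaled = (c.predicted_rating - 1) / 4  # map 1..5 stars to 0..1
    return w_pop * c.popularity + w_rating * rating_scaled


def rank(candidates: Sequence[Candidate],
         w_pop: float = 0.5, w_rating: float = 0.5) -> List[str]:
    """Return titles ordered from highest to lowest combined score."""
    ordered = sorted(candidates,
                     key=lambda c: score(c, w_pop, w_rating),
                     reverse=True)
    return [c.title for c in ordered]


if __name__ == "__main__":
    catalogue = [
        Candidate("Blockbuster", popularity=0.9, predicted_rating=2.0),
        Candidate("Niche drama", popularity=0.1, predicted_rating=5.0),
        Candidate("Balanced pick", popularity=0.6, predicted_rating=4.0),
    ]
    print(rank(catalogue, w_pop=1.0, w_rating=0.0))  # popularity only
    print(rank(catalogue, w_pop=0.0, w_rating=1.0))  # predicted rating only
    print(rank(catalogue))                           # both combined
```

With popularity alone the blockbuster comes first, with predicted rating alone the niche title does, and the combined score favours a title that does reasonably well on both.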
3. Big Data
To begin with, it is essential to understand the term “data”. “Data can be summed up as everything that is experienced, whether it is a machine recording information from sensors, an individual taking pictures, or a cosmic event recorded by a scientist” (Ohlhorst 2013: ix). So basically everything can be regarded as data (cf. ibid.: ix).
What is Big Data then? Big Data “defines a situation in which data sets have grown to such enormous sizes that conventional information technologies can no longer effectively handle either the size of the data set or the scale and growth of data set” (Ohlhorst 2013: 1). But that is not all. Talking about Big Data also involves the question of where the data originates. In the business world Big Data also means opportunity. According to IBM, 2.5 quintillion (2.5 × 10^18) bytes of data are produced every day. To put that in an even wider context: 90% of the data in the world was created in the preceding two years. This data takes a huge variety of forms, for example data from sensors, social media sites, video platforms like YouTube or GPS signals (cf. IBM 2013: n.s.). A large number of companies have begun to analyse this data in order to extract value from it. NASA, for example, uses Big Data for aeronautical and other research (cf. Ohlhorst 2013: 1-2).
The easiest way to describe Big Data is to look at it along different dimensions. These are also known as the 4Vs of Big Data:
– Volume: As mentioned before, the amount of data is enormous, so companies have to deal with a huge amount of information. Most companies in the US have at least 100 terabytes of data stored.
– Variety: As also pointed out before, the data comes in many different forms, from social media sites, videos and music to health care data. On Twitter, for example, there are 400 million tweets a day from 200 million active users.
– Veracity: Due to the large amount of data, errors or misinterpretations can occur, so the data has to be accurate and correct. It is not surprising, then, that a lot of companies mistrust the data they collect.
– Velocity: The data also has to be analysed, which can be time-consuming and time-sensitive. When the data is too old, it might be useless (cf. IBM 2013: n.s.).
Having identified the basic principles of Big Data, it is possible to say that Netflix really does use Big Data, because these characteristics can be found in the data Netflix collects from its users.
THE NEXT PART IS FAR FROM FINISHED, BUT I AM WORKING ON IT! IT IS JUST MEANT TO GIVE AN INSIGHT INTO HOW THE FINAL PART OF MY PAPER WILL LOOK. I AM HAPPY ABOUT ANY COMMENTS OR FURTHER SUGGESTIONS!
4. Netflix Inc. and Big Data
Looking at the different types of data Netflix collects (instant feedback, popularity, user queues, metadata, external information, demographics, location), it is clear that there is a huge amount of information. The first characteristic of the 4Vs is therefore met: the volume of information on Netflix is enormous. According to GigaOM, Netflix registers 30 million plays per day, which implies an even higher number of instant feedback events (playing, pausing, rewinding, etc.), as well as four million ratings and about three million searches per day. Furthermore, as already mentioned, Netflix also collects data about geo-location and devices as well as social media data (cf. Harris 2012: n.s.). The second aspect, variety, is also met, because the information is highly diverse. Veracity is important for Netflix too, because with wrong or corrupted data its recommendation system will no longer work. Finally, velocity is met because the company analyses data continuously in order to offer its users recommendations for movies and TV shows.
4.1. Using Big Data for the prediction of the success of a movie or TV show
Take House of Cards: “Executives at the company knew it would be a hit before anyone shouted ‘action’” (Carr 2013: n.s.). Furthermore, each movie and TV show is associated with hundreds of tags. Those tags are created by professional “taggers” whose job is to watch movies and TV shows and describe them with objective tags provided by Netflix. There are 40 taggers around the world at the moment, but due to its expansion in Europe Netflix has also begun to look for taggers in the UK and Ireland (cf. Heritage 2014: n.s.). Those taggers should act “as a UK cultural consultant and highlighting UK cultural specificities and taste preferences” (ibid.: n.s.).
In the past those tags were used for the recommendation system, but nowadays Netflix also uses them to predict the success of a show (cf. Carr 2013: n.s.).
4.2. Critical perspective
“Netflix doesn’t know merely what we’re watching, but when, where and with what kind of device we’re watching” (Leonard 2013: n.s.). This quote and its implications set out the basic line of critique of Netflix and its recommendation system.
Leonard, for example, claims that the creative process of discovering a new movie or TV show will get lost, because Netflix already offers recommendations for what we might like. This implies that users will never come into contact with movies they would not have expected to like, because the data shows that it is unlikely that the user will like that sort of movie. In other words, Netflix pushes us in a specific direction with its recommendations (cf. ibid.: n.s.). The question which arises for Leonard here is “at what point do we go from being happy subscribers, to mindless puppets” (2013: n.s.).
Furthermore, the data collection by Netflix also attracts critique. If one person stops at a certain point in a movie, it makes no difference, but when hundreds of people do so, some conclusions can be drawn. Maybe this part of the movie was boring, or the scene showed something interesting worth stopping the movie for. Actually, it does not matter why users stopped at a certain point in the movie; what is certain is that they did (cf. Leonard 2013: n.s.).
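The point that a single stop means little while many stops at the same place reveal a pattern can be made concrete with a short sketch. The bucket size, the threshold and the stop times are invented assumptions; nothing here reflects how Netflix actually processes its playback data.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def stop_hotspots(stop_seconds: Iterable[int],
                  bucket_seconds: int = 60,
                  min_viewers: int = 2) -> List[Tuple[int, int]]:
    """Group stop positions into time buckets and keep the crowded ones.

    Returns (bucket start in seconds, number of viewers who stopped there),
    sorted by position in the movie.
    """
    buckets = Counter(second // bucket_seconds for second in stop_seconds)
    hotspots = [(bucket * bucket_seconds, count)
                for bucket, count in buckets.items()
                if count >= min_viewers]
    return sorted(hotspots)


if __name__ == "__main__":
    # Invented positions (in seconds) at which six viewers stopped a movie.
    stops = [65, 70, 110, 600, 1290, 1300]
    print(stop_hotspots(stops))
```

The single stop at 600 seconds disappears from the result, while the clusters in the second and twenty-second minute remain visible. The aggregation shows where viewers stopped, not why they did.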
4.3. Consequences for “external” content provider
4.4. Consequences for “internal” content provider
4.5. Implications of the future of streaming portals
Movie and TV show producers have always had some sort of data, but since the growth of streaming technology, real-time consumer data has become available (cf. Carr 2013: n.s.). But due to Big Data …
Conclusion
References
Amatriain, Xavier/ Basilico, Justin. 2012a. “Netflix Recommendations: Beyond the 5 stars (Part 1)”. http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html (accessed 09 October 2014).
Amatriain, Xavier/ Basilico, Justin. 2012b. “Netflix Recommendations: Beyond the 5 stars (Part 2)”. http://techblog.netflix.com/2012/06/netflix-recommendations-beyond-5-stars.html (accessed 09 October 2014).
Amatriain, Xavier/ Basilico, Justin. 2013. “System Architectures for Personalization and Recommendation”. http://techblog.netflix.com/2013/03/system-architectures-for.html (accessed 09 October 2014).
Carr, David. 2013. “Giving Viewers What They Want”. http://www.nytimes.com/2013/02/25/business/media/for-house-of-cards-using-big-data-to-guarantee-its-popularity.html?pagewanted=all&_r=0 (accessed 28 September 2014).
Cunningham, Stuart/ Silver, Jon. 2012. “On-line film distribution: its history and global complexion”. In: Cunningham, Stuart/ Iordanova, Dina (eds). “Digital Disruption: Cinema Moves On-Line”, 33–66.
Hallinan, Blake/ Striphas, Ted. 2014. “Recommended for you: The Netflix Prize and the production of algorithmic culture”. New Media & Society published online 23 June 2014. http://nms.sagepub.com/content/early/2014/06/23/1461444814538646 (accessed 09 October 2014).
Harris, Derrick. 2012. “Netflix analyzes a lot of data about your viewing habits”. https://gigaom.com/2012/06/14/netflix-analyzes-a-lot-of-data-about-your-viewing-habits/ (accessed 28 September 2014).
Heritage, Stuart. 2014. “Playing tag: Netflix will pay me to watch films all day. Only catch – they’re Dyer”. http://www.theguardian.com/film/filmblog/2014/jul/07/netflix-tagging-job-paid-to-watch-movies (accessed 12 October 2014).
IBM. 2013. “The Four V’s of Big Data”. http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg (accessed 10 October).
Jenner, Mareike. 2014. “Is this TVIV? On Netflix, TVIII and binge-watching”. New Media & Society published online 7 July 2014. http://nms.sagepub.com/content/early/2014/07/03/1461444814541523 (accessed 09 October 2014).
Leonard, Andrew. 2013. “How Netflix is turning viewers into puppets”. http://www.salon.com/2013/02/01/how_netflix_is_turning_viewers_into_puppets/ (accessed 26 September 2014).
Mack, Steve. 2002. “Streaming media bible”. New York (N.Y.): Hungry Minds.
Madrigal, Alexis C. 2014. “How Netflix Reverse Engineered Hollywood”. http://www.theatlantic.com/technology/archive/2014/01/how-netflix-reverse-engineered-hollywood/282679/ (accessed 09 October 2014).
Nippes, Daniel. 2014. “Netflix’s Big Data Architecture”. http://dataconomy.com/netflix-big-data-architecture/ (accessed 26 September 2014).
Netflix. 2013. “Netflix, Inc.”. http://files.shareholder.com/downloads/NFLX/3504957457x0x748407/76a245dc-3314-401c-baba-ed229ca9145a/NFLX_AR.PDF (accessed 26 September 2014).
Netflix. 2014. “A brief history of the company that revolutionised watching of movies and TV shows”. https://pr.netflix.com/WebClient/loginPageSalesNetWorksAction.do?contentGroupId=10477&contentGroup=Company+Timeline (accessed 09 October 2014).
netflixprize.com. 2009a. “The Netflix Prize Rules”. http://www.netflixprize.com/rules (accessed 09 October 2014).
netflixprize.com. 2009b. “Leaderboard”. http://www.netflixprize.com/leaderboard (accessed 09 October 2014).
Ohlhorst, Frank. 2013. “Big data analytics: turning big data into money”. N.J.: Wiley, cop.
Rizzuto, Ronald J./ Wirth, Michael O. 2002. “The Economics of Video on Demand: A Simulation Analysis”. Journal of Media Economics, 15 (3), 209-225.
Your paper is well structured and gives a detailed impression of the streaming market. Your research is very specific and detailed with regard to your chosen example, Netflix, as well. I like the way you have integrated specific quotations to describe the current situation and to clarify what data is used, and why, in order to recommend a movie, for instance. However, I think you went into too much detail at some technical points at the beginning. Other than that, I enjoyed reading your paper and wish you good luck with your further work on it!!
Interesting topic! The “House of Cards” part was especially worthwhile for my personal interest, so I would really like to get more information about it. For example, you could point out some more details like: Why was “House of Cards” as successful as it was? What were the tags? And why is not every movie company now producing movies based on those tags? Or were there other factors besides the tags? etc. But that’s just my personal interest! So, now I’m coming to your actual topic:
About the critical perspective: In my opinion, you could add the criticism that Netflix can make a movie (or a whole movie company?!) rise and fall just because of its tags or the way it presents the movie.
Some implications that came to my mind after reading the critical perspective part: How could Netflix (or other providers) address the issue that people don’t discover new genres which they don’t know but might like? I am thinking about an option like “New ideas” or “Exotic recommendations” which allows the provider to recommend some completely different movies to the user. Or maybe providers could offer the option to browse a subcategory like “unknown movies”. Furthermore, it could be helpful to have a look at other portals to discover some “best practice” techniques.
One point I have to criticize is your use of direct quotations. I counted around 20 direct quotes. That’s quite a lot! I would replace some of them with indirect quotes to avoid the danger of losing your own style of writing.
However, I really enjoyed reading and wish you every success with your final paper! 🙂