Data, Notes and Acknowledgments 

Measuring CC’s slice of the Commons

This is the second year that Creative Commons has published a State of the Commons report. CC doesn’t host or control the content in the commons, which is hosted on a myriad of platforms and is made up of every type of media and data. We see this as an iterative process, where each year we improve our process and get better results. As with any large-scale global movement, it is impossible to quantify the full impact of our work. For example, how do we capture content that is insufficiently marked, or content that is in the public domain in some jurisdictions and not in others? When CC licensed work is remixed, how do we know when one work ends and another begins?

That said, we do have a handful of valuable tools that we’ve put to good use to tell the story of the CC licensed Commons and its massive growth. Building on the baseline we established with the 2014 State of the Commons reporting format, we continue to track size, scope, content, and diversity of the Commons. We added new stats from our servers, including widespread use of our CC license buttons and views of license deeds in different languages. We also added new usage data from more platforms hosting the majority of CC licensed content on the web. We aimed for variety in representation, both by media type and domain of content, and we focused on highly trafficked platforms with CC integrations up to the technical standard that would allow us to track this data over time. We also added sections on CC’s broader impact as a steward of our global Commons. We dug a bit deeper into open education, policy, and our shared cultural heritage, aiming to measure not just the quantity of the Commons, but the incredible impact that a robust Commons can have on different regions of the world.

Still, given the breadth and reach of CC’s global diversity, this year’s report can’t fully capture the true scope of CC’s international impact. Our aim is to continue to refine our process each year, including a more in-depth, collaborative process that reflects the geographical breadth of CC content in different languages (see CC license use on the Polish web, for example). We also hope to collaborate with domain experts and content partners in open data and cultural heritage (what we like to call the GLAM sector) to include those stats, and eventually track commons growth in those fields over time. With regard to the open policy landscape, we are already collaborating with our partners in the Open Policy Network who are leading a State of Open Policy report for publication in early 2016.

Another huge undertaking on our wish list would be an effort to measure the public domain, not just the portion marked with our tools, but the entire public domain as it exists and as it is defined differently across jurisdictions. We may find that the topic merits a separate academic study and report, and look forward to outlining a best process for achieving and sharing comprehensive public domain data.

Ultimately, we want to be able to measure not just growth, but usability and vibrancy of the commons. The size of the Commons and its continued growth is most interesting when shared next to incredible stories of how the content is used by creators around the world to achieve CC’s vision and mission. This is why we have begun to ask for new metrics from platforms on how CC works are used, including data points like number of downloads, users, and site trends.

“More than 1 billion CC licensed works in the Commons as of 2015”

Google provided us with the raw data, counting all of the web pages in its cache that link to Creative Commons license deeds, which we used to make the estimates in this report. While pages may link to Creative Commons license deeds for reasons other than to license or attribute works under them, we reason that those are vastly outnumbered by pages that indicate a CC license choice without linking to the deed. We’ve supplemented Google’s data with that of several websites that each have over a million CC-licensed works but aren’t reflected in Google’s data.

Google data

Source: Google query, available at GitHub.

Platforms not included in Google’s data

| Platform name | Number of works (rounded down by millions) | Number by license | Source |
| --- | --- | --- | --- |
| Flickr | 356 million | CC BY: 67,354,310; CC BY-ND: 19,215,096; CC BY-NC-ND: 90,361,041; CC BY-NC: 45,793,028; CC BY-NC-SA: 99,330,609; CC BY-SA: 32,756,937; CC0: 372,095; PDM: 1,220,335 | https://www.flickr.com/creativecommons (6 November 2015) |
| Wikipedia (all pages in all languages) | 140 million | all CC BY-SA | http://meta.wikimedia.org/wiki/List_of_Wikipedias |
| Scribd | 50 million | unknown | Source at Scribd (same as 2014) |
| MusicBrainz | 43 million | unknown combination of CC0, CC BY-NC-SA | https://musicbrainz.org/statistics |
| Freebase | 39 million | all CC BY | https://developers.google.com/freebase/faq#how_big_is_freebase (same as 2014) |
| DeviantArt | 18 million | CC BY: 1,128,465; CC BY-SA: 840,407; CC BY-ND: 1,417,293; CC BY-NC: 619,170; CC BY-NC-SA: 1,817,106; CC BY-NC-ND: 12,270,444. Note: Due to late submission of data, these numbers are not included in the license breakdown bar graph. | Source at DeviantArt |
| Geonames | 10 million | all CC BY | http://www.geonames.org/about.html (same as 2014) |
| YouTube | 13 million | all CC BY | Source at YouTube |
| Libre.fm | 133 million | all CC BY-SA | https://librefm.wordpress.com/2015/11/13/133-million-songs-and-counting |

Google grand total 316,900,000 + total works across platforms (802 million) = 1,118,900,000 works (1.1 billion)
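The arithmetic behind this headline figure can be retraced with a short script. The following is only an illustrative sketch in Python: the variable names are our own, and the per-platform values are the rounded-down millions listed for the platforms above.

```python
# Per-platform counts in millions (rounded down), as listed above.
# The dictionary and variable names are illustrative, not part of the report.
PLATFORM_MILLIONS = {
    "Flickr": 356,
    "Wikipedia": 140,
    "Scribd": 50,
    "MusicBrainz": 43,
    "Freebase": 39,
    "DeviantArt": 18,
    "Geonames": 10,
    "YouTube": 13,
    "Libre.fm": 133,
}

GOOGLE_TOTAL = 316_900_000  # Google grand total stated in the text

# Convert the rounded millions to whole works and add Google's count.
platform_total = sum(PLATFORM_MILLIONS.values()) * 1_000_000
grand_total = GOOGLE_TOTAL + platform_total

print(f"Platforms not in Google's data: {platform_total:,}")
print(f"Estimated total works: {grand_total:,}")
```

Because each platform figure is rounded down before summing, the result errs on the low side, which is consistent with treating the total as a conservative estimate.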

We cross-referenced this data with sample size data from Bing, which also put the total number of CC licensed or public domain works at just over 1 billion. This is still a lower-bound estimate, as we can’t know the total universe of platforms not included in Google’s data.

“More people are choosing to share openly!”

The bar graph showing the breakdown by CC license and PD tool corresponds to the following numbers, which include Google data (web pages linking to current tools plus the retired PD tool) and data from platforms not cached by Google (where we know the breakdown of works by license).

Total # of works reflected in bar graph: 1,008,283,451 (excludes ~111 million works where the license was unspecified)

Breakdown by license: Google + Platforms not cached by Google = Total estimated works under that license

Total Free Culture works (CC BY + CC BY-SA + Public domain): 652,533,677. CC uses the definition of free cultural works at Freedom Defined to categorize the CC licenses. “Free Culture” licenses allow for both commercial use and adaptations. Learn more at http://creativecommons.org/freeworks.

“The public domain is growing!”

The bar graph reflects the following data by year:

Sources:

Just as in 2014, we did not include PD totals from platforms other than Flickr, which we were able to confirm were not included in Google’s data. As such, the total public domain works marked by our tools is a low estimate. For example, Europeana alone reports over 10 million works under PDM or CC0.

“In 2015, CC licensed works were viewed online 136 billion times”

Views of CC works were calculated by combining two sets of data: the number of times CC license buttons were downloaded by a browser as part of viewing a web page, and the number of Wikipedia page views. CC license buttons are hosted on CC’s servers; web pages that display our machine-readable code, such as that from our CC license chooser, reference the CC license buttons hosted on our servers, which allows us to track each time a browser downloads the button image as part of someone viewing that page. In September 2015, CC license buttons were downloaded 552,015,520 times. We multiplied this number by 11.5 to arrive at approximately 6 billion views in 2015. We combined 6 billion with the total number of page views of Wikipedia in 2015 (130 billion) to reach 136 billion views. Wikipedia pages do not use the machine-readable code referencing the CC license buttons and therefore are not included in our server data. Since we have not accounted for the many CC licensed works that neither reference the CC license buttons hosted on our servers nor appear on Wikipedia, 136 billion is a lower-bound estimate.
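The calculation described above can be retraced as follows. This is a minimal Python sketch with illustrative names of our own; the 11.5 multiplier is taken from the text as given, and rounding the button figure to whole billions before adding the Wikipedia figure is our assumption about how "6 billion" was obtained.

```python
# Figures stated in the text; constant names are illustrative only.
BUTTON_DOWNLOADS_SEPT_2015 = 552_015_520
MULTIPLIER = 11.5                      # multiplier stated in the text
WIKIPEDIA_VIEWS_2015 = 130_000_000_000

# Scale the September sample up to an annual figure.
button_views = BUTTON_DOWNLOADS_SEPT_2015 * MULTIPLIER

# Assumption: round to the nearest billion before combining.
button_views_rounded = round(button_views / 1_000_000_000) * 1_000_000_000

total_views = button_views_rounded + WIKIPEDIA_VIEWS_2015

print(f"Scaled button downloads: {button_views:,.0f}")
print(f"Rounded button views:    {button_views_rounded:,}")
print(f"Total estimated views:   {total_views:,}")
```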

“To date, the 4.0 license suite has been officially translated into 7 languages, with 3 more to be published before the new year.”

Starting with the 4.0 license suite, CC instituted its first Legal Code Translation Policy. Official translations of the 4.0 suite and CC0 are tracked at https://wiki.creativecommons.org/wiki/Legal_Tools_Translation.

“People are sharing with CC licenses in as many as 34 languages with more than 90 million views of CC’s deeds in the last ten years”

Prior to CC’s first official translation policy, CC affiliates unofficially translated license deeds to aid understanding around the world. Since CC launched the first version of its license suite in 2002, CC deed pages have been unofficially translated into at least 34 languages. The 90 million figure reflects the total number of views of these deed pages from January 1, 2005 through November 3, 2015, with 2005 marking the earliest period where tracking with Google Analytics is available for our servers. For those interested in a specific breakdown of deed pageviews by language, we have included that data here. For simplicity, language categories with an asterisk may include variations on that language, e.g., Chinese includes simplified and traditional Chinese.

Views of deed pages grouped by language (unofficial translations):

  1. Arabic: 28,438 views
  2. Belarusian: 7,686 views
  3. Catalan: 12,298 views
  4. Chinese*: 1,070,159 views
  5. Croatian: 331,111 views
  6. Czech: 333,413 views
  7. Danish: 30,735 views
  8. Dutch: 315,019 views
  9. English*: 67,155,975 views
  10. Esperanto: 19,984 views
  11. Finnish: 115,246 views
  12. French*: 2,204,878 views
  13. Galician: 10,760 views
  14. German*: 1,700,918 views
  15. Greek: 201,277 views
  16. Hungarian: 170,549 views
  17. Indonesian: 14,604 views
  18. Italian: 1,411,242 views
  19. Japanese: 1,143,936 views
  20. Korean: 6,337,118 views
  21. Latvian: 4,611 views
  22. Lithuanian: 6,918 views
  23. Malay: 67,611 views
  24. Maori: 769 views
  25. Norwegian: 93,737 views
  26. Persian (Farsi): 6,917 views
  27. Polish: 351,859 views
  28. Portuguese: 2,148,746 views
  29. Romanian: 63,000 views
  30. Russian: 42,274 views
  31. Spanish*: 4,163,811 views
  32. Swedish: 129,838 views
  33. Turkish: 5,381 views
  34. Ukrainian: 15,372 views
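
As a cross-check, the per-language counts above can be summed to confirm that they are consistent with the "more than 90 million views" headline once rounded. The snippet below is a small Python sketch; the dictionary name is our own, and the values are copied from the list above.

```python
# Deed page views by language, copied from the list above.
DEED_VIEWS = {
    "Arabic": 28_438, "Belarusian": 7_686, "Catalan": 12_298,
    "Chinese": 1_070_159, "Croatian": 331_111, "Czech": 333_413,
    "Danish": 30_735, "Dutch": 315_019, "English": 67_155_975,
    "Esperanto": 19_984, "Finnish": 115_246, "French": 2_204_878,
    "Galician": 10_760, "German": 1_700_918, "Greek": 201_277,
    "Hungarian": 170_549, "Indonesian": 14_604, "Italian": 1_411_242,
    "Japanese": 1_143_936, "Korean": 6_337_118, "Latvian": 4_611,
    "Lithuanian": 6_918, "Malay": 67_611, "Maori": 769,
    "Norwegian": 93_737, "Persian (Farsi)": 6_917, "Polish": 351_859,
    "Portuguese": 2_148_746, "Romanian": 63_000, "Russian": 42_274,
    "Spanish": 4_163_811, "Swedish": 129_838, "Turkish": 5_381,
    "Ukrainian": 15_372,
}

total_deed_views = sum(DEED_VIEWS.values())
english_share = DEED_VIEWS["English"] / total_deed_views

print(f"Languages: {len(DEED_VIEWS)}")
print(f"Total deed views: {total_deed_views:,}")
print(f"English share of views: {english_share:.1%}")
```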

“From research to cute cat photos, the Commons offers a treasure trove of content.”

Breakdown of works by media type includes data from 16 platforms plus the Directory of Open Access Journals (DOAJ). Specific totals by platform below.

“Audio tracks: 4 million”

“Images (Photos, Artworks): 391 million”

“Videos: 18.4 million”

“Texts (Articles, Stories, Documents): 46.9 million”

“Open Educational Resources: 76,000”

“Research (Journal Articles): 1.4 million”

“Other (Multimedia, 3D): 23,000”

CC is everywhere: Millions of websites use CC licenses, including major platforms like Wikipedia and Flickr and smaller websites like your grandma’s blog.

Breakdown of works by platform includes data from 16 platforms and the Directory of Open Access Journals (DOAJ). Specific totals by platform below, with additional usage data from some platforms.

“Flickr: 356 million photos”

“Wikipedia: 35.9 million articles”

“Wikimedia Commons: 21.6 million media files”

“Europeana: 20.9 million digital objects”

“YouTube: 13 million videos”

“Vimeo: 5 million videos”

“Internet Archive: 2 million files”

“Bandcamp: 1.95 million tracks”

“500px: 661,000 photos”

“Jamendo: 496,000 tracks”

“PLOS: 140,000 articles”

“Total Open Access articles under CC BY: 675,000; under any CC license: 1.3 million”

“Free Music Archive: 86,000 tracks”

“Boundless: 49,000 open educational resources”

“Tribe of Noise: 29,000 tracks”

“Skills Commons: 24,000 career training materials”

“MIT OpenCourseWare: 2,300 courses”

Countries with Open Education policies

Countries listed have legislation, policies, or funder mandates at the national, provincial/state, or institutional level that lead to the creation, increased use, or support for improving OER. CC relied on our international open education partners to notify us of existing open education policies in their countries. For specific policy information, see the OER Policy Registry.

Total $ disbursed via policies to date:

“Open Textbooks have saved students $174 million to date, with an additional $53 million projected through academic year 2015/16”

Our open education partners collectively reported cost savings of $174,448,941 USD after replacing proprietary textbooks and materials with open textbooks licensed under CC. These savings are to date, inclusive of the 2015 fall term. These same partners projected collective cost savings of $53,427,667 USD for the 2015-2016 academic year.

Data was collected via an open call to the global open education community. All respondents were from North America. All respondents, projects, reported savings, and sources are listed below. Due to the diversity of the open education space, savings were calculated per individual project. We only included a project’s cost savings in the total if a methodology was provided. We also avoided duplication by verifying each project’s savings by year.

Reported savings to date in 2015

Project

Reported savings (to date unless specified)

Projected savings for AY 2015/16

Method used to calculate savings

Source

Georgia Affordable Learning Initiative - Georgia Highlands College

$1,970,000

$1,000,000

Textbook savings calculated by looking at the actual number of students enrolled in each class, and then using the cost for the non-OER textbooks from the previous semester. Example: for ARTS 113, the previous non-OER text cost was $117.50. Had the 315 students enrolled in the class this fall had to purchase the previous text, the total cost would have been $37,012.50. The OER used in this class this semester is free, so we consider $37,012.50 to be the savings to our students. In total over the last three years 21,536 students have not had to purchase a textbook for a savings of $1.97 million.

Source: College staff obtained data from faculty using OER; https://campustechnology.com/articles/2015/11/20/u-georgia-nears-2-million-mark-in-oer-savings.aspx

California Affordable Learning Solutions Initiative

$0

$1,000,000

No reported savings to date. Projected savings will be captured by an online reporting tool used by CSU campuses.

Source: California State University, Office of the Chancellor.

Alberta OER Project

$0

$11,512

Rough estimate: 10 courses @ 30 students @ $50 CAD per student = $15 000 CAD

Source: Rory McGreal, Athabasca University

B.C. Open Textbook Project

$703,648

$538,192

Method described in detail at http://open.bccampus.ca/2015/09/10/more-bc-open-textbook-stats/

Source: BCcampus

CK-12: District El Paso ISD, TX

$200,000

No method described, but reported by news source.

Source: http://archive.elpasotimes.com/news/ci_26690335/episd-switches-e-books-high-school-science/

CK-12: District Tullahoma City Schools

$490,000

$215,000

Average social studies text price was approximately $80 per text. In accomplishing the transition to CK-12 flexbooks we were able to save the cost of $80 per text and reallocate those funds to support our teachers who served as flexbook curators and to purchase chromebooks. So for students these textbooks, formerly $80 per text, are now $0 per text.

Source: CK-12 Foundation

College of the Canyons: OER Degree Pathway

$750,000 for academic year 2014/15

$800,000

Total sections using OER in lieu of commercial textbook X average enrollment per section X cost of new textbook. In years past we used $100 as an average cost. The amount saved is now based on the actual retail cost of the new textbook for each specific course in our local college bookstore.

Source: College of the Canyons

Introductory Statistics/Collaborative Statistics: by Openstax College/Illowsky & Dean

$3,000,000 (for De Anza College only)

95+ sections/year using text → 95 * 40 = 3800 students/year minimum. 3800 students * 8.5 years = 32,300 students. 32,300 * average cost (new, used, self-selling) of $100 for another text = $3,230,000 if ALL students have a text. Some years, 100-105 sections used the text (110 sections per year). The 5 distance learning sections/year have 60 students per section.

Source: Barbara Illowsky, PhD, Dean of Basic Skills & Open Educational Resources, CCC Online Ed Initiative (OEI). Savings are professor’s personal conservative estimate for De Anza College.

Lumen Learning

$1,614,335 for Fall 2015

We have assumed the textbook cost was $100 per course, and have calculated savings at $95 per course after our $5 support fee.

Source: Lumen Learning

Maricopa Millions: OER Project

$4,584,000

Savings calculated using class size of 20 and textbook cost of $100 for the “No Cost/Low Cost” sections of the 50 highest enrollment courses and all developmental education courses. We use the following to calculate the savings: Top 50 enrollment courses identified as “Low cost/No cost - less than $40” (mostly OER). 20 students per section. $100 savings per student (books, online homework, etc)

Source: Maricopa Community Colleges funds; https://www2.maricopa.edu/welcome-to-the-maricopa-millions-oer-project

Montgomery College: OER Project

$0

$78,640

Track using fields: instructor, course, OER type, reference, students FY2016, cost of former resource, cost savings (# of students times former resource cost)

Source: Montgomery College

Open Learning Initiative

$15,264,110 for academic year 2014/15

$4,000,000

Calculated using SPARC methodologies (100 dollars per enrollment average price), less cumulative service fees. In the 2014-2015 year we saw 48k enrollments academically (this does not count independent OLI use). However, some portion of those enrollments did pay a service fee (always less than the textbook), totaling 232,190. So using 100 dollars, 4,800,000 - 232,190 = 4,567,810.

Source: OLI

Open Oregon

$0

$227,175

Open Oregon calculated these numbers by asking grantees to estimate max/min cost of previous textbooks, as well as estimated # of students enrolled per year.

Source: Open Oregon; http://openoregon.org/2015-oer-special-project-grants-awarded/

Open Textbook Network (University of Minnesota)

$1,500,000

We are using an average per student savings estimate based on research done by national experts (David Wiley and Nicole Allen). They examined several research studies on open textbook savings and determined that $100 student savings was a reliable average. It takes several variables into consideration, including how many students buy used, rent, don’t buy, etc. As a network, we agreed on the $100/student method. Our members verified adoption of the book by faculty and then verified enrollments. We then apply the $100/student.

Source: OTN obtained the $1.5 million in student savings reports from nine early members.

OpenIntro

$2,000,000

$1,000,000

Based on feedback from students, about half of college students buy the paperback version of the textbook while the other half exclusively use the free PDF. The paperbacks are about $10, so we estimate savings per student to average about $100. Through Oct 7th, 2015, we sold 14,838. Subtracting out about 500 as books we’ve directly purchased for teachers and a couple thousand that are associated with one of the co-author’s Coursera courses, we think $2 million is a conservative estimate. We also average around 15,000 to 20,000 textbook PDF downloads per month, though we don’t know all the groups that are making use of the textbook. We don’t include Coursera savings / value in our current estimates. We also do not estimate savings from our other free and open-source resources, e.g. in time savings for instructors who then do not need to develop their own resources from scratch.

Source: OpenIntro; with paperback figures from sales through CreateSpace

OpenStax College

$53,600,000

$25,800,000

We use a composite number of $98.57 to estimate student savings (our number discounts for used books, rentals, e-books etc). Current number of students 261,000, number of schools 2,000

Source: OpenStax College manually verifies every faculty adoption

Tacoma Community College: OER Project

$1,000,000

To date, we have been meticulously calculating the exact savings by keeping track of the price of the book that was replaced, but we are contemplating moving to a revised methodology that takes into account other factors, such as a new course that uses OER from the beginning.

Source: Tacoma Community College OER Project

Virginia Tech: University Libraries Open Education Initiative

$0

$72,000

Savings are calculated for zero-cost textbook replacement for 900 students given conservative assumptions regarding what students otherwise would pay: 10% go without textbook access, 10% buy the old (2011) edition used, 50% rent ($56.93), 10% buy a new paperback ($72.78), and 20% buy a used hardcover textbook from the campus bookstore ($215.60).

Source: University Libraries, Virginia Tech

University of Maryland University College

$5,167,748

$15,000,000

UMUC’s department of Learning Design and Solutions compiled the list of courses and the type of resource(s) used in each, as well as the previous price for textbooks in those courses. Using Fall, Spring, and Summer 2014 enrollment data, which I disaggregated by course, I estimated the cost-savings based on the price of each course’s previous textbook times that course’s enrollment for the aforementioned term. The estimated cost-savings thus assumes that all UMUC students would have purchased the textbook.

Source: Dr. Katrice Hawthorne, Professor at University of Maryland University College

Virginia Community Colleges: Zx23 Project

$1,454,100

$3,685,148

We count adoptions of OER through direct communication with Zx23 Project grantees (# of course sections taught X students enrolled in each section X $100 avg textbook cost). We combine data on these adoptions with data from tangential NVCC projects.

Source: Virginia Community College System & Lumen Learning, Inc.
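
Several of the methods reported above share the same basic shape: number of students multiplied by the price of the replaced textbook, sometimes less a fee. The sketch below retraces four of the worked figures quoted by respondents. It is written in Python for illustration only; the function name `enrollment_savings` is hypothetical and not something the projects themselves use.

```python
def enrollment_savings(students: int, price_per_student: float) -> float:
    """Savings under the simple 'students x replaced textbook price' method."""
    return students * price_per_student


# Georgia Highlands College, ARTS 113: 315 students, previous text at $117.50.
arts_113 = enrollment_savings(315, 117.50)

# De Anza College estimate: 95 sections x 40 students, over 8.5 years, at $100.
students_per_year = 95 * 40
students_total = int(students_per_year * 8.5)
de_anza = enrollment_savings(students_total, 100)

# Open Learning Initiative: 48k enrollments at $100, less service fees paid.
oli = enrollment_savings(48_000, 100) - 232_190

# Alberta OER Project rough estimate, in Canadian dollars.
alberta_cad = 10 * 30 * 50

print(f"ARTS 113: ${arts_113:,.2f}")
print(f"De Anza:  ${de_anza:,.0f}")
print(f"OLI:      ${oli:,.0f}")
print(f"Alberta:  ${alberta_cad:,} CAD")
```

Note that these are the respondents' own illustrative calculations; the reported totals for each project may be rounded or based on additional data.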

Reported savings included from the 2014 State of the Commons report

| Project | Reported savings in 2013 |
| --- | --- |
| College of the Canyons | $556,000 |
| Flat World Knowledge | $47,947,100 |
| Lumen Learning | $1,300,000 |
| OER-Based General Education Certificate Program at NOVA | $122,000 |
| Open Course Library | $5,711,400 |
| Orange Grove Texts Plus | $530,000 |
| Precalculus by Carl Stitz & Jeff Zeager | $1,235,400 |
| Siyavula | $23,500,000 |
| Center for Computer-Assisted Legal Instruction: eLangdell Press | $67,500 |
| The Virginia Community College System | $52,000 |
| Utah State Office of Education | $87,500 |
| “Z Degree” Tidewater Community College | $42,100 |
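
The headline totals of $174,448,941 in savings to date and $53,427,667 in projected savings can be reproduced from the two tables above. The following Python sketch is illustrative only. It rests on one assumption of ours: where a project lists a single dollar figure, we treat it as reported savings to date and record the projection as 0. Project names are shortened for readability.

```python
# (project, reported savings to date, projected savings AY 2015/16), in USD.
# Assumption: a project with a single figure has it counted as "reported".
SAVINGS_2015 = [
    ("Georgia Highlands College", 1_970_000, 1_000_000),
    ("California Affordable Learning Solutions", 0, 1_000_000),
    ("Alberta OER Project", 0, 11_512),
    ("B.C. Open Textbook Project", 703_648, 538_192),
    ("CK-12: El Paso ISD", 200_000, 0),
    ("CK-12: Tullahoma City Schools", 490_000, 215_000),
    ("College of the Canyons", 750_000, 800_000),
    ("Introductory/Collaborative Statistics", 3_000_000, 0),
    ("Lumen Learning", 1_614_335, 0),
    ("Maricopa Millions", 4_584_000, 0),
    ("Montgomery College", 0, 78_640),
    ("Open Learning Initiative", 15_264_110, 4_000_000),
    ("Open Oregon", 0, 227_175),
    ("Open Textbook Network", 1_500_000, 0),
    ("OpenIntro", 2_000_000, 1_000_000),
    ("OpenStax College", 53_600_000, 25_800_000),
    ("Tacoma Community College", 1_000_000, 0),
    ("Virginia Tech", 0, 72_000),
    ("University of Maryland University College", 5_167_748, 15_000_000),
    ("Virginia Community Colleges: Zx23", 1_454_100, 3_685_148),
]

# Savings carried over from the 2014 State of the Commons report.
SAVINGS_2014_REPORT = [
    556_000, 47_947_100, 1_300_000, 122_000, 5_711_400, 530_000,
    1_235_400, 23_500_000, 67_500, 52_000, 87_500, 42_100,
]

reported_2015 = sum(reported for _, reported, _ in SAVINGS_2015)
projected = sum(proj for _, _, proj in SAVINGS_2015)
savings_to_date = reported_2015 + sum(SAVINGS_2014_REPORT)

print(f"Savings to date:   ${savings_to_date:,}")
print(f"Projected savings: ${projected:,}")
```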

“The Ford Foundation, Bill & Melinda Gates Foundation, Vancouver Foundation, and Wikimedia Foundation”

“Together with The William and Flora Hewlett Foundation, a longstanding open policy leader, these foundations collectively made grants of approximately $1.9 billion in 2015”

To arrive at this figure, we added the annual grantmaking figures from the most recent year available (2013) as listed on http://foundationcenter.org/. We assume those foundations will continue to make similar amounts of grants in the near future. The Gates Foundation policy applies only to research funding—it is not foundation-wide like the others. So we used the estimate of $900M/year quoted by Science.

Hewlett Foundation

Ford Foundation

Gates Foundation

Vancouver Foundation

Acknowledgments

Thank you to Google for providing us with the foundational data upon which this report was built. Thanks especially to:

Agnes Toth, Google
Brendan Hickey, Google
Paul Haahr, Google
Erin Simon, Google

We would also like to acknowledge the following individuals and organizations for their contributions. This report would not be possible without them.

Platform data

Alexis Rossi, Internet Archive
Ariel Diaz, Boundless
Benjamin Glatstein, Microsoft
Brandon Muramatsu, MIT Office of Digital Learning
Cheyenne Hohman, Free Music Archive
Danielle Ward, DeviantArt
David Knutson, PLOS
Donna Okubo, PLOS
Ed Harrison, Boundless
Guillaume Paumier, Wikimedia Foundation
Hessel van Oorschot, Tribe of Noise
Jake Johnson, Internet Archive
Jennifer Elias, Bandcamp
Juliet Barbara, Wikimedia Foundation
Kimberly Potvin, Flickr
Leo Lipsztein, YouTube
Martin Guerber, Jamendo
Matt Lee, Libre.fm
Matt McLernon, YouTube
Neil P. Quinn, Wikimedia Foundation
Nuno Silva, 500px
Paul Keller, Kennisland
Richard Lumadue, Skills Commons/California State University, Office of the Chancellor
Sarah Agudo, Medium
Tilman Bayer, Wikimedia Foundation
Yvonne Ng, MIT Office of Digital Learning

Open education data

Amanda Coolidge, B.C. Open Textbook Project
Amy Hofer, OpenOregon
Andrew Christopher, Institute of Museum and Library Services
Anita Walz, University Libraries, Virginia Tech
Barbara Illowsky, CCC Online Ed Initiative (OEI)
Buddy Muse, Montgomery College “Open Educational Resources”
Christie Fierro, Tacoma Community College OER Project
David Diez, OpenIntro
David Harris, OpenStax College
David Wiley, Lumen Learning
Elijah Scott, Affordable Learning Georgia Initiative, Georgia Highlands College
Gerry Hanley, California State University, Office of the Chancellor
Hetav Sanghavi, CK-12 Foundation
James Glapa-Grossklag, College of the Canyons
Jennryn Wetzler, U.S. Department of State
Kamil Śliwowski, CC Poland
Karen Vignare, University of Maryland University College
Katrice Hawthorne, University of Maryland University College
Kelsey Wiens, CC South Africa
Kim Thanos, Lumen Learning
Konstantin D. A. SCHELLER, European Commission
Lorna Campbell, University of Edinburgh
Marion Kelt, Glasgow Caledonian University
Nate Angell, Lumen Learning
Neeru Khosla, CK-12 Foundation
Norman Bier, Open Learning Initiative
Ovidiu Voicu, Foundation for an Open Society Romania
Paul Golisch, Maricopa Millions OER Project
Renva Watterson, Affordable Learning Georgia Initiative, Georgia Highlands College
Ricardo FERREIRA, European Commission
Richard Sebastian, Virginia Community College System
Rory McGreal, Athabasca University
Sara Trettin, U.S. Department of State
Sarah Cohen, Open Textbook Network at University of Minnesota

Attributions

Bassel Khartabil drawing from http://freebassel.org/ and in the public domain, thanks to CC0.

Icons used with permission via subscription from the Noun Project. Courtesy these creators: Picture By Hoang Loi, VN; Book By Creative Stall, PK; Document By Melvin Salas, CR; Atomic By Geoffrey Joe, GB; Headphones By Molly Bramlet, US; Video-Player By Cédric Villain, FR; Book By Bryn Taylor, GB; 3D Glasses By Luis Rodrigues, PT; Cat By Rajha Surya, IN; Cat By Richard Zeid, US; Grandmother By Alberto Miranda, ES; Heart Bills By Till Teenck, DE; Public Domain; Book By Mike Ashley, AU; Gift By Stefan Parnarov, BG; Frame By Kari Svangstu, NO; Share By AJ Annunziata; Polaroid By Joe Mortell, GB; Waxing Crescent By hunotika. Icons originally licensed CC BY.