Videolectures Ingredients that can
make Analytics Effective
Marco Ronchetti
Dipartimento di Ingegneria e Scienza dell’Informazione
Università degli Studi di Trento
Trento, IT-38050, Italy
[email protected]
Abstract
Videolectures over the Internet started at the turn of the century
and became more and more popular, until they recently gained wide
attention in the form of Massive Open On-Line Courses (MOOCs).
Although videolecture usage data have always been important, in
the case of MOOCs they are vital for the success of the initiative.
In the present paper, we suggest that some (already available)
tools for the extraction of semantic information from the video
should be used, as they may vastly improve the meaningfulness of
the information extracted from videolecture analytics.
Author Keywords
Videolectures, effective analytics, semantics, MOOCs
ACM Classification Keywords
K.3.1 [Computers and Education]: Computer Uses in
Education - Computer-managed instruction (CMI).
© 2013 for the individual papers by the papers' authors.
Copying permitted only for private and academic purposes.
This volume is published and copyrighted by its editors.
WAVe 2013 workshop at LAK’13, April 8, 2013, Leuven, Belgium.
The idea of using videos of recorded lectures for teaching on a
massive scale goes back to the attempts to use TV as an educational
medium. Television introduced some educational programmes (and
later channels), but only on rare occasions were they a success.
One such case was the Italian TV show “Non è mai troppo tardi”
(It’s never too late), which from 1960 to 1968 helped more than a
million illiterate people obtain a primary school certificate
(probably one of the most successful examples of TV-based distance
education ever, and a sort of early MOOC, Massive Open On-line
Course, even though the “line” was not the Internet). Even before
that, there were instructional movies, used for instance to
demonstrate scientific experiments that were too complex or too
lengthy to be performed in a school laboratory. Today, too, there
are educational TV channels, such as Teachers TV, a digital channel
for everyone who works in schools. Teachers TV’s programmes cover
every subject in the curriculum, all key stages and every
professional teaching role. It can be accessed on digital cable and
satellite (and more recently also via the Internet).
In the seventies, the use of VHS cassettes made it possible for the
first time to attempt transforming videos into an “on demand”
resource for satisfying educational needs, but again the effort had
only a marginal impact on the education mainstream.
At the end of the 80’s, a system that implemented a rather
mechanical process of individualized instruction was patented [1].
Part of the system consisted in the ability to use some ad-hoc
hardware to play movies. Only in the nineties did PCs have
sufficient power and memory space to be considered as tools for
reproducing videos and multimedia in general. At the turn of the
millennium, the increased network bandwidth and the power of mobile
devices (laptops first, and then tablets and smartphones) allowed
distributing videos over the Internet, which ultimately delivered
today’s capability to use video instruction anywhere and at any
time. Since the early experiments [3, 13], a lot of research has
been done in the field of Internet-delivered videolectures (for a
review see [9, 10]).
It then took about 15 years for these videolectures to pass from
the work of the pioneers to the pages of the New York Times [8].
They spread progressively more widely, with a first boost given
(around 2005) by the Apple iTunes-U initiative, which also allowed
extracting some usage data from the logs, see e.g. [2]. Along the
path, for a few years (starting again from 2005) the podcasting
variant was a fashionable approach. Only recently did MOOCs finally
make it into the official dictionaries: the MOOC entry in the
English Wikipedia dates from July 2011. A history of MOOCs in 2012,
the year of the boom, is reported in a post by Audrey Watters.
MOOC Numbers
Figures such as “1.7 million students for Coursera” or “ratio
students to professor 150,000:1 in Udacity” [8] are certainly
impressive; however, in spite of their popularity, there is little
data on MOOCs. Stories of success and failure are often anecdotal.
Some statistics are available from MOOC platforms such as Coursera,
Udacity and MITx, and they are puzzling.
The first MIT MOOC (MITx 6.002x: Circuits and Electronics)
attracted 154,763 registrants. However, only 45% (69,221 people)
looked at the first problem set, and of them only 26,349 earned at
least one point (17% of the enrolled): we can consider these as the
ones who showed a real interest, rather than mere curiosity.
The number halved by the midterm assignment (13,569 people looked
at it while it was still open, and 9,318 people got a passing score
on the midterm, i.e. 6% of the enrolled).
In the end, after completing 14 weeks of study, 7,157 people earned
the first certificate (4.6% of the enrolled, i.e. 27% of those who
showed a real interest). In spite of the gigantic drop, having more
than seven thousand students pass a course is a massive achievement
indeed.
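The percentages quoted for the MITx course can be re-derived from the raw counts. The following is a minimal sketch that uses only the figures given above; the stage labels and the function names (`share`, `funnel_report`) are illustrative choices, not part of any MOOC platform.

```python
# Enrollment funnel for MITx 6.002x, built only from the counts quoted in the text.
FUNNEL = [
    ("registered", 154_763),
    ("looked at first problem set", 69_221),
    ("earned at least one point", 26_349),
    ("looked at midterm", 13_569),
    ("passed midterm", 9_318),
    ("earned certificate", 7_157),
]


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def funnel_report(stages):
    """For each stage, return (label, count, % of enrolled, % of previous stage)."""
    enrolled = stages[0][1]
    previous = enrolled
    rows = []
    for label, count in stages:
        rows.append((label, count, share(count, enrolled), share(count, previous)))
        previous = count
    return rows


if __name__ == "__main__":
    for label, count, of_enrolled, of_previous in funnel_report(FUNNEL):
        print(f"{label:<30} {count:>8,} "
              f"{of_enrolled:5.1f}% of enrolled, {of_previous:5.1f}% of previous stage")
```

The "% of previous stage" column makes it visible where the largest relative losses occur, which the percentages of the enrolled alone tend to hide.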
The numbers for Coursera’s Social Network Analysis class are less
encouraging. Out of the 61,285 students registered, 1,303 (2%)
earned a certificate, and only 107 earned "the programming (i.e.
with distinction) version of the certificate” (0.17%).
MOOC questions and challenges
The statistics raise several questions. The most compelling one is
probably “why aren’t a large number of students finishing the
course?”. This question may be difficult to answer, but answers to
other questions can be obtained by monitoring the users’ behaviour
and gathering statistics and analytics. Examples of such questions
are the following:
Where do the students come from?
Which videos are most popular, and which ones attract little interest?
Are students actually watching the videos on the assigned dates?
Are viewers watching all the way through?
At what point in the lecture, if any, do viewers stop watching?
Are the students watching the videos by the assigned deadlines?
Do the videos generate active user …
Do students edit, share, download the …
Are there any portions of the videos that are being watched repeatedly?
The interpretation of the statistics may however not be easy.
Knowing that the sequence in lecture N between times t1 and t2 is
often reviewed is not by itself a meaningful cue. What is there? To
find out, we need to view the fragment ourselves. When the
potentially interesting sections or points are many, this may be a
very time-consuming task. The problem arises from the lack of
semantic information.
Some help may come from a fine-grained structure of the material.
For instance, if “lectures” are broken into small pieces (20
minutes) as in the case of Khan Academy, or even smaller ones
(10-minute fragments, as in certain Coursera cases), it is likely
that each unit has well-defined semantics. If instead a lecture is
recorded in class, and hence follows time constraints that are
dictated by logistics rather than by content, things are much more
difficult.
In these cases, substantial help may come from certain ingredients
that we claim to be important for videolectures:
multiple (parallel) cognitive channels,
semantic marking.
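To make concrete what a purely time-based answer to the question “Are there any portions of the videos that are being watched repeatedly?” looks like, here is a minimal sketch. It assumes a hypothetical log format in which each record is `(user_id, start_second, end_second)` for one continuous stretch of playback; real platforms log playback differently, and the function names are illustrative only.

```python
from collections import Counter, defaultdict


def rewatch_counts(events, duration):
    """For each second of the video, count the users who played it more than once.

    events: iterable of (user_id, start_sec, end_sec), end exclusive (assumed format).
    duration: video length in seconds.
    """
    per_user = defaultdict(Counter)
    for user, start, end in events:
        for t in range(max(0, start), min(end, duration)):
            per_user[user][t] += 1
    counts = [0] * duration
    for views in per_user.values():
        for t, n in views.items():
            if n > 1:
                counts[t] += 1
    return counts


def hot_intervals(counts, min_users):
    """Return half-open (t1, t2) intervals rewatched by at least min_users users."""
    intervals = []
    start = None
    for t, n in enumerate(counts):
        if n >= min_users and start is None:
            start = t
        elif n < min_users and start is not None:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, len(counts)))
    return intervals


# Invented sample data: three viewers of a 60-second clip.
SAMPLE_EVENTS = [
    ("u1", 0, 60), ("u1", 20, 40),   # u1 replays seconds 20..39
    ("u2", 0, 50), ("u2", 25, 45),   # u2 replays seconds 25..44
    ("u3", 0, 60),                   # u3 watches once, no replay
]

if __name__ == "__main__":
    counts = rewatch_counts(SAMPLE_EVENTS, 60)
    print(hot_intervals(counts, min_users=2))
```

The result is only a list of time intervals: it says when students rewatch, not what they rewatch, which is exactly the limitation that semantic information is meant to address.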
Videolecture enhancements that may (also)
help analytics
The ingredients we mentioned are not really new, as some people
have been using them for years in the context of videolectures as
tools for improving the user experience. For instance, semantic
annotation has been used for facilitating lecture navigation (see
e.g. [11]), and transcripts have helped searching a videolecture
(see later). However, in the light of analytics they take on a new
dimension. Let us briefly examine them.
The first component we mentioned is multiple cognitive channels.
Typically, on-line lectures in MOOCs focus on exactly two channels:
they are either video + audio, slides + audio (the so-called
webcasts), or computer screen + audio (as e.g. in the case of the
Khan Academy). There are even lectures based on audio alone
(podcasts), even though they were mostly used before the success of
the MOOC term.
In contrast, even the snubbed frontal lectures in class are based
on a richer paradigm. The teacher uses the blackboard and
PowerPoint slides, may project his/her computer screen, and at the
same time students see gestures and facial expressions. It is quite
possible to reproduce such an environment even in on-line lectures.
A variety of authoring systems allow using (at least) two visual
channels in parallel (e.g. slides + video), making the on-line
lecture richer. While Moreno and Mayer [7] suggested that the
presence of multiple cognitive channels brings a negative “split
attention” effect, Glowalla [6], a German instructional
psychologist, reported that learners show better concentration with
lectures showing both video and slides, while the audio + slides
version is perceived as more boring. Data obtained by other
investigators [4] confirm the better efficacy brought by the
presence of video as an additional cognitive channel. We believe
MOOCs should adopt such a rich communication paradigm, and not rely
on the poorer paradigm based on a single video channel (+ audio).
This choice would help introduce the second ingredient: semantic
marking. Having e.g. slide transitions makes it very easy to
associate metadata with specific portions of a video. When a
teacher presents a slide, what is s/he talking about? Most likely,
we find the answer in the slide title. If slide transition timing
and slide content are captured while recording the video, it
becomes extremely easy to tag the video with semantic annotations.
Questions like the ones we have mentioned, e.g. “Are there any
portions of the videos that are being watched repeatedly?”, may now
have a significantly more interesting answer than “at time nn:nn”:
the answer might rather be something like “the fragment discussing
Kepler's third law”. The power of analytics is suddenly vastly
increased, precisely because of the availability of semantic
metadata. And the important point is that such metadata, a resource
that is notoriously difficult and costly to obtain, are
automatically generated!
Along the same lines, the availability of (synchronized) audio
transcripts allows associating meaningful information with the
timeline. A few years ago, we [5] successfully experimented with
Automatic Speech Recognition tools to enrich videos with
synchronized transcripts that allowed students to perform searches
within on-line videolectures. This technique would of course also
allow mapping any data coming from analytics onto the content,
without the need for visual inspection of video fragments. Natural
language processing (NLP) tools could be used to extract additional
semantic information from a specific video fragment.
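The mapping described above, from a time interval flagged by analytics to slide titles and transcript text, can be sketched as follows. This is only an illustration under stated assumptions: the data layout (sorted `(start_second, title)` pairs for slides, `(start, end, text)` triples for transcript segments), the function names and the sample lecture are all hypothetical and do not reproduce the format of any specific authoring or speech recognition tool.

```python
from bisect import bisect_right


def slides_for_interval(slides, t1, t2):
    """Titles of the slides on screen at some point during [t1, t2).

    slides: list of (start_sec, title) sorted by start_sec; each slide is
    assumed to stay on screen until the next one starts.
    """
    starts = [start for start, _ in slides]
    # Index of the slide already on screen at t1 (or the first slide).
    first = max(bisect_right(starts, t1) - 1, 0)
    titles = []
    for start, title in slides[first:]:
        if start >= t2:
            break
        titles.append(title)
    return titles


def transcript_for_interval(transcript, t1, t2):
    """Concatenate the transcript segments that overlap [t1, t2).

    transcript: list of (start_sec, end_sec, text) in chronological order.
    """
    return " ".join(text for start, end, text in transcript
                    if start < t2 and end > t1)


# Invented sample lecture, for illustration only.
SLIDES = [
    (0, "Introduction"),
    (120, "Kepler's first law"),
    (300, "Kepler's second law"),
    (480, "Kepler's third law"),
    (700, "Summary"),
]
TRANSCRIPT = [
    (478, 490, "now the third law"),
    (490, 530, "the square of the period"),
    (530, 600, "is proportional to the cube of the semi-major axis"),
]

if __name__ == "__main__":
    t1, t2 = 500, 560  # an interval reported by analytics as often reviewed
    print(slides_for_interval(SLIDES, t1, t2))
    print(transcript_for_interval(TRANSCRIPT, t1, t2))
```

With such a lookup, an interval that analytics reports only as a pair of timestamps can be presented together with the title of the slide shown and the words spoken, so that nobody has to open the video to learn what the fragment is about.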
Finally, we mention in passing that the possibility for students to
annotate video lectures would be yet another valuable source of
information. Again, this would be a case of a feature that was
originally designed to achieve a particular goal (such as e.g.
growing a sense of community around a set of videolectures), and
that would acquire additional value in the context of the usage
analysis that is typical of analytics tools. This would be true for
the extra information that NLP tools could mine from the notes, but
in addition to that, data regarding annotation would per se be an
extra source that could be mined (e.g. to find correlations with
the difficulty or interest of a particular video portion).
MOOCs may be just an ephemeral fashion, or might revolutionize the
future landscape of higher education: only time will tell. In this
short paper we advocated the need for them to embrace a richer
cognitive paradigm, and to be enriched by metadata associated with
video fragments. The availability of such metadata, which should be
automatically extracted, provides important hints that make the
information extracted by videolecture analytics much more
meaningful.
References
[1] A. Louis Abrahamson, Frederick F. Hantline, Milton G. Fabert,
Michael J. Robson, Robert J. Knapp. (1989). “Electronic classroom
system enabling interactive self-paced learning”, US Patent 5002491.
[2] A. Defranceschi, M. Ronchetti. (2011). "Videolectures in a
traditional mathematics course on iTunes U: usage analysis" in
Proceedings of EDMEDIA 2011 - World Conference on Educational
Multimedia, Hypermedia and Telecommunications 2011, Chesapeake, VA:
AACE, 2011, p. 720-727.
[3] M.H. Hayes. (1998). Some approaches to Internet distance
learning with streaming media, Second IEEE Workshop on Multimedia
Signal Processing, Redondo Beach, CA, USA.
[4] A. Fey. (2002). Audio vs. Video: Hilft Sehen beim Lernen?
Vergleich zwischen einer audio-visuellen und auditiven virtuellen
Vorlesungen. Lernforschung, 30. Jhg (4):331-338 (in German).
[5] A. Fogarolli, G. Riccardi, M. Ronchetti. (2007). “NEEDLE:
Searching information in a collection of video-lectures” in
Proceedings of World Conference on Educational Multimedia,
Hypermedia and Telecommunications ED-MEDIA 2007, Norfolk (Va):
AACE, 2007, p. 1450-1459.
[6] U. Glowalla. (2004). Utility and Usability von E-Learning am
Beispiel von Lecture-on-demand Anwendungen. In Entwerfen und
Gestalten, 2004 (in German).
[7] R. Moreno, & R. E. Mayer. (2000). A Learner-Centred Approach to
Multimedia Explanations: Deriving Instructional Design Principles
from Cognitive Theory, Interactive Multimedia Electronic Journal of
Computer-Enhanced Learning (2).
[8] L. Pappano. (2012). “The year of the MOOC”, The New York Times,
Nov 2, 2012.
[9] M. Ronchetti. (2011). "Perspectives of the Application of Video
Streaming to Education" in Ce Zhu, Yuenan Li, Xiamu Niu (eds.),
Streaming Media Architectures, Techniques, and Applications: Recent
Advances, Hershey PA, USA: Information Science Reference, IGI
Global, 2011, p. 411-428.
[10] M. Ronchetti. (2011). "Video-Lectures over Internet: The
Impact on Education" in G. Magoulas (ed.), E-Infrastructures and
Technologies for Lifelong Learning: Next Generation Environments,
New York: IGI Global, 2011, p. 253-270.
[11] M. Ronchetti. (2012). "LODE: Interactive demonstration of an
open source system for authoring video-lectures" in Interactive
Collaborative Learning (ICL), 2012 15th International Conference
on, Los Alamitos, USA: Computer Society Press of the IEEE, 2012,
p. 1-5.
[12] D. McKinney, J.L. Dyck, & E.S. Luber. (2009). iTunes
University and the classroom: Can podcasts replace Professors?
Computers & Education 52, 617-623.
[13] F. Tobagi. (1995). Distance learning with digital video.
Multimedia, IEEE vol. 2 (1) pp. 90-93.