Cross Cultural Multimedia Computing

by Shlomo Dubnov, Kevin Burns, Yasushi Kiyoki

Publisher : Springer International Publishing

Author : Shlomo Dubnov, Kevin Burns, Yasushi Kiyoki

ISBN : 9783319428710

Year : 2016

Language: en

SPRINGER BRIEFS IN COMPUTER SCIENCE

Shlomo Dubnov
Kevin Burns
Yasushi Kiyoki

Cross-Cultural Multimedia Computing
Semantic and Aesthetic Modeling

SpringerBriefs in Computer Science
Series editors
Stan Zdonik, Brown University, Providence, Rhode Island, USA
Shashi Shekhar, University of Minnesota, Minneapolis, Minnesota, USA
Jonathan Katz, University of Maryland, College Park, Maryland, USA
Xindong Wu, University of Vermont, Burlington, Vermont, USA
Lakhmi C. Jain, University of South Australia, Adelaide, South Australia, Australia
David Padua, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
Xuemin (Sherman) Shen, University of Waterloo, Waterloo, Ontario, Canada
Borko Furht, Florida Atlantic University, Boca Raton, Florida, USA
V.S. Subrahmanian, University of Maryland, College Park, Maryland, USA
Martial Hebert, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Katsushi Ikeuchi, University of Tokyo, Tokyo, Japan
Bruno Siciliano, Università di Napoli Federico II, Napoli, Italy
Sushil Jajodia, George Mason University, Fairfax, Virginia, USA
Newton Lee, Newton Lee Laboratories, LLC, Tujunga, California, USA

More information about this series at http://www.springer.com/series/10028

Shlomo Dubnov
University of California in San Diego
La Jolla, CA
USA

Yasushi Kiyoki
Keio University
Minato, Tokyo
Japan

Kevin Burns
The MITRE Corporation
Bedford, MA
USA

ISSN 2191-5768
ISSN 2191-5776 (electronic)
SpringerBriefs in Computer Science
ISBN 978-3-319-42871-0
ISBN 978-3-319-42873-4 (eBook)
DOI 10.1007/978-3-319-42873-4
Library of Congress Control Number: 2016946332
© The Author(s) 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Contents

1 A ‘Kansei’ Multimedia and Semantic Computing System for Cross-Cultural Communication
1.1 Introduction
1.2 The Mathematical Model of Meaning (MMM)
1.3 Cross-Cultural Computing System for Music
1.3.1 System Architecture
1.3.2 Impression-Based Metadata Extraction for a Cross-Cultural Music Environment
1.4 An Applied Model of MMM to Automatic Media-Decoration
1.4.1 Basic Semantic Spaces and a Media-Transmission Space
1.4.2 Basic Functions for Media Decoration
1.5 Media Design with “Automatic Decorative Multimedia Creation”
1.5.1 Music Decoration with Images
1.5.2 Color-Based Impression Analysis for Video and Decoration with “Kansei” Information
1.6 Cross-Cultural Computing System for Images
1.7 Conclusion
References

2 Cross-Cultural Aesthetics: Analyses and Experiments in Verbal and Visual Arts
2.1 A Theoretical Framework for Computing Aesthetics
2.1.1 Birkhoff and Bense
2.1.2 IR and EVE′
2.1.3 Models of Memory
2.2 A Computational Model of Verbal Aesthetics
2.2.1 Haiku Humor
2.2.2 Serious Semantics
2.2.3 Amusing Advertisements
2.3 An Experimental Study of Visual Aesthetics
2.3.1 Abstract Artworks
2.3.2 Personal Preferences
2.3.3 Cultural Comparison
2.4 The Fundamental Challenge of Computing Semantics
References

3 Information Sensibility as a Cultural Characteristic: Tuning to Sound Details for Aesthetic Experience
3.1 Introduction
3.1.1 Information Dynamics
3.2 Information Seeking as an Aesthetic Perception
3.2.1 Information Dynamics and Music Cognition
3.2.2 Musical Information: Structure Versus Meaning
3.2.3 Paradigmatic Analysis Revisited
3.3 The Variable Markov Oracle (VMO) Model: Capturing the Past to Predict the Future
3.3.1 Motif Discovery
3.3.2 The Oracle Structure
3.3.3 Model Selection
3.4 Neutral, Aesthetic and Poietic: Tuning Acoustic Sensibility in Order to Maximize the Information Rate
3.4.1 Choice of Experimental Repertoire
3.4.2 Analysis Method
3.5 Results and Discussion
3.5.1 Multi-level Listening
3.6 Disclaimer
3.7 Conclusion
References

About the Authors

Shlomo Dubnov is a Professor of Music and Computer Science at UCSD. He
holds a Ph.D. in Computer Science from Hebrew University, an M.Sc. in EE from the Israel
Institute of Technology (Technion), and a B.Mus. in Music Composition from the Rubin
Academy in Jerusalem. He served as a researcher at the Institute for Research and
Coordination in Acoustics and Music (IRCAM) at the Centre Pompidou, Paris, and was
a visiting professor at Keio University, Japan, and at the Computer Science Laboratory
(LaBRI) at the University of Bordeaux, France. His work on computational modeling
of style and computer audition has led to the development of several computer music
programs for improvisation and machine understanding of music. He currently
directs the Center for Research in Entertainment and Learning (CREL) at UCSD’s
Qualcomm Institute (Calit2) and serves as a lead editor of the ACM Computers in
Entertainment journal.
Kevin Burns is a Cognitive Scientist researching human perceptions, decisions,
emotions, and interactions between humans and machines—with applications to
national security, industrial safety, organizational efficiency, and personal artistry.
The objective across these diverse domains is to measure and model how people
think and feel, as a basis for building machines that can help humans manage risk
and enjoy life. Kevin is employed by The MITRE Corporation and holds engineering degrees from the Massachusetts Institute of Technology.
Yasushi Kiyoki received his B.E., M.E., and Ph.D. degrees in Electrical
Engineering from Keio University in 1978, 1980, and 1983, respectively. From
1984 to 1996, he was with the Institute of Information Sciences and Electronics,
University of Tsukuba, as Assistant Professor and then Associate Professor. In 1990 and
1991, he was a visiting researcher at the University of California at Irvine. Since
1996, he has been with the Department of Environment and Information Studies at
Keio University, where he has been Professor since 1998. Since 2011, he has been chair
and coordinator of the “Global Environmental System Leader Program (GESL)” at
Keio University. His research addresses semantic computing, environmental
engineering, multimedia database systems, and knowledge base systems. His
original semantic model is the “Mathematical Model of Meaning (MMM),” and he has
more than 100 publications related to MMM. He serves as the editor-in-chief
of Information Modelling and Knowledge Bases (IOS Press). He has also served as
program chair for several international conferences, such as the International
Conferences on Information Modelling and Knowledge Bases (2004 to present). He
was a keynote speaker at the 7th IEEE International Conference on Semantic
Computing in September 2013, with a talk titled “A Kansei: Multimedia Computing
System for Environmental Analysis and Cross-Cultural Communication.”

Short Description of the Book

The ability to communicate cultural codes in multimedia depends on their meaning
and beauty, as perceived by different audiences around the globe. In this book we
describe ongoing research on computational modeling of visual, musical, and
textual contents in terms of identifying and mapping their semantic and aesthetic
representations across different cultures. The underlying psychology of
sense-making is quantified through analysis of aesthetics in terms of organizational
and structural aspects of the contents that influence an audience’s formation of
expectations for future signals, violations of these expectations, and explanations
of their meaning. Complexity-accuracy trade-offs in sound representation are further used to develop new computational methods that capture poietic and aesthetic
aspects in music communication. Experimental studies are reported that try to
characterize preferences for complexity in abstract, classical, and traditional art and
music across samples of Western and Far Eastern cultures. These experiments
illustrate how aesthetics can be computed in terms of semantic and information
measures, highlighting commonalities and uncovering differences in aesthetic
preferences across cultures and individuals.

Introduction

Art is the imposing of a pattern on experience, and our
aesthetic enjoyment is recognition of the pattern.
―Alfred North Whitehead

This book represents a collective effort to define the role of computing in cultural
context, allowing closer understanding of differences and similarities in aesthetic
expressions and sensibilities between different cultures, as detected by the machine.
This goal cannot rely purely on exploitation of existing computational tools.
Cross-cultural multimedia computing is a field of research where problems of
meaning or semantics are applied to cultural data, such as images, music, video, and
text, with comparative analysis performed between different cultures. Reporting
state-of-the-art research in this domain spans various disciplines, including but not
limited to areas such as semantic computing, psychology, information theory,
computer music, semiotics, and more. The development of such cultural computational tools has practical implications: culture-specific impression-based metadata
can be extracted and mapped to similar impressions from another culture, and associations between sounds and images can be used to create automatic media decoration
models that depend on their cultural context, among other applications. But most importantly,
such research sheds new light on our understanding of the human aesthetic faculty,
a concept referring to the human ability to perceive elegance or beauty, and possibly
even wit and humor, encompassed to some extent by the Western term “Aesthetics”
and the Japanese term “Kansei,” which links psychological sensibilities and emotions
to aspects of product design. The Kansei approach, described in the first chapter,
assumes that an artistic artifact can be described in a certain vector space, which is
defined by semantic expressions (words) that represent impressions and imaginations evoked by the object in the viewer or listener. Extending the vector space
approach to include aspects of expectation and surprise requires tracking changes
in the qualities of the artistic object over time. In the second chapter of this book a
novel EVE′ (Expectation, Violation, Explanation) model is applied to textual and
visual expression, emphasizing the sequential nature of human thought processes and
artistic decision-making. Moreover, it is argued that generic object properties, such
as complexity and order, can be used to characterize an aesthetic effect, though in
ways that are different from those proposed early on in the field of computational
aesthetics. This chapter also suggests, and shows experimental evidence for, the existence of universals in the perception of aesthetic properties across cultures.
Using the tools and terminology of communication and information theory, the
properties of the object or artistic artifact, and the properties of the receiver (the viewer
or listener), are inevitably linked to each other. Such a relation seems to suggest that
aesthetic appreciation can be communicated based on commonalities in the cognitive
processing of abstract structures shared by humans across cultures. It is important to
realize that these commonalities do not preclude the development of different forms
of cultural expression, or the development of personal preferences towards certain types
of artistic expression through a process of learning or enculturation. The question of
relations between units and levels of structure, and their relation to aesthetic
appreciation, is brought up in the third chapter in the context of an analysis of sound
examples from Western and Far Eastern musical cultures. Borrowing from the semiotic
analysis of text, the question of the “poietic” or compositional aspects of a work of art
that are employed in its creation, versus the ability to perceive its “aesthetics,” is
addressed through musical information dynamics analysis. It is shown that attending
to different levels of sound detail results in very different structures and cultural
paradigms between Western and Far Eastern music. The results seem to support an
intuitive notion that expressive intonation and more nuanced aspects of instrumental sound production are of significance in Far Eastern traditions, while schematic and pronounced structural elements are more dominant in Western classical
music. The theories, applications, and experimental results presented in this book
suggest that communication across cultures is amenable to computational analysis,
though our formal understanding of cultural phenomena is only in its
beginnings.

Chapter 1

A ‘Kansei’ Multimedia and Semantic Computing System for Cross-Cultural Communication

Abstract In the design of multimedia computing systems, one of the most
important issues is how to search and analyze media data (images, music, movies
and documents) according to users’ impressions and contexts. This paper presents the
“Kansei-Multimedia Computing System” for realizing international and
cross-cultural research environments, as a new platform for multimedia computing
systems. We introduce a “Kansei” and semantic associative search method based on
the “Mathematical Model of Meaning (MMM)”. The concept of “Kansei” covers
several meanings related to sensitive recognition, such as “emotion”, “impression”, “human senses”, “feelings”, “sensitivity”, “psychological reaction” and “physiological
reaction”. MMM realizes “Kansei” processing and semantic associative search for
media data according to users’ impressions and contexts. The model is applied to
compute semantic correlations between keywords, images, music, movies and
documents dynamically in a context-dependent way. The system based on MMM
realizes (1) “Kansei” image and music search and analysis for cooperative creation
and manipulation of multimedia objects and (2) cross-cultural communication
with music and image databases.

1.1 Introduction

The rapid progress of multimedia technology has enabled large-scale media
data transfer and resource accumulation around the world. Cross-cultural computing
is becoming an important issue in global societies and communities connected on a
worldwide scale. The innovative integration of large-scale multimedia data
management and cross-cultural computing will lead to new cross-cultural environments in our society.
We have designed the “Kansei-Multimedia Computing System” to realize
automatic media decoration, with dynamic sub-media data selection for representing
main-media as decorative multimedia. The aim of this method is to create “automatic decorative-media art” with “semantic associative computing” [1].

© The Author(s) 2016
S. Dubnov et al., Cross-Cultural Multimedia Computing,
SpringerBriefs in Computer Science, DOI 10.1007/978-3-319-42873-4_1

The field of “Kansei” information was originally introduced under the word “aesthetics” by Baumgarten in 1750. Baumgarten’s aesthetics was established and then succeeded by Kant with his ideological aesthetics [2, 3]. In the research
field of multimedia database systems, it is becoming important to deal with
“Kansei” information for defining and extracting media data according to the impressions and senses of individual users.

1.2 The Mathematical Model of Meaning (MMM)

In this section, the outline of our semantic associative search method based on the
Mathematical Model of Meaning (MMM) is briefly reviewed. This model has been
presented in [4, 5] in detail.
The overview of the MMM is expressed as follows:
(1) A set of m words is given, and each word is characterized by n features. That
is, an m by n matrix is given as the data matrix.
(2) The correlation matrix with respect to the n features is constructed. Then,
the eigenvalue decomposition of the correlation matrix is computed and the
eigenvectors are normalized. The orthogonal semantic space is created as the
span of the eigenvectors which correspond to nonzero eigenvalues.
(3) Images and context words are characterized by using the specific features
(words) and representing them as vectors.
(4) The images and context words are mapped into the orthogonal semantic space
by computing the Fourier expansion for the vectors.
(5) A set of all the projections from the orthogonal semantic space to the invariant
subspaces (eigen spaces) is defined. Each subspace represents a phase of
meaning, and it corresponds to a context or situation.
(6) A subspace of the orthogonal semantic space is selected according to the
user’s impression or the image’s contents, which are given as a context represented by a sequence of words.
(7) The most correlated image to the given context is extracted in the selected
subspace by selecting and applying one of the metrics defined in the semantic
space.
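
The steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the threshold rule used to select a subspace, and the weighted inner product used as the metric are all hypothetical choices made here. The “Fourier expansion” of step (4) is read as taking inner products with the orthonormal eigenvectors, and the correlation matrix is taken as the unnormalized product of the transposed data matrix with itself.

```python
import numpy as np


def build_semantic_space(data, rel_tol=1e-10):
    """Steps (1)-(2): m x n word-by-feature matrix -> orthonormal axes (n x v)."""
    corr = data.T @ data                      # n x n matrix over the features
    eigvals, eigvecs = np.linalg.eigh(corr)   # symmetric input; unit-norm eigenvectors
    keep = eigvals > rel_tol * eigvals.max()  # drop (numerically) zero eigenvalues
    return eigvecs[:, keep]


def to_semantic_space(axes, feature_vec):
    """Steps (3)-(4): expansion coefficients = inner products with each axis."""
    return axes.T @ np.asarray(feature_vec, dtype=float)


def select_subspace(axes, context_vecs, threshold=0.5):
    """Steps (5)-(6): keep the axes on which the context words weigh most."""
    center = sum(to_semantic_space(axes, c) for c in context_vecs)
    peak = np.max(np.abs(center))
    if peak == 0:
        raise ValueError("context has no component in the semantic space")
    weights = center / peak                   # scale so the largest |weight| is 1
    mask = np.abs(weights) > threshold        # hypothetical selection rule
    return mask, weights


def rank_by_context(axes, items, context_vecs, threshold=0.5):
    """Step (7): score each item inside the selected subspace, best first."""
    mask, weights = select_subspace(axes, context_vecs, threshold)
    scores = {
        name: float(np.sum(weights[mask] * to_semantic_space(axes, vec)[mask]))
        for name, vec in items.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_by_context(axes, items, context_vecs)` with `items` as a dictionary of feature vectors returns (name, score) pairs ordered from most to least correlated with the context. Because the score multiplies each coefficient by the context weight on the same axis, it does not depend on the arbitrary sign that an eigensolver assigns to each eigenvector.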
The advantages and original points of the MMM are as follows:
(1) The semantic associative search based on semantic computation for words is
realized by a mathematical approach. This media search method surpasses the
search methods which use pattern matching for associative search. Users can
use their own words for representing impression and data contents for media
retrieval, and do not need to know how the metadata of media data of retrieval
candidates are characterized in databases.

(2) Dynamic context recognition is realized using a mathematical foundation. The
context recognition can be used for obtaining multimedia information by
giving the user’s impression and the contents of the information as a context.
A semantic space is created as a space for representing various contexts which
correspond to its subspaces. A context is recognized by the computation for
selecting a subspace.
Several information retrieval methods that use an orthogonal space created
by mathematical procedures such as SVD (Singular Value Decomposition) have been
proposed. The MMM is essentially different from those SVD-based methods.
The essential difference is that our model provides the important function
of semantic projection, which realizes the dynamic recognition of the context. That
is, the context-dependent interpretation is dynamically performed by computing the
distance between different media data, information resources and words. The
context-dependency is realized by dynamically selecting a subspace from the entire
orthogonal semantic space, according to a context. In MMM, the number of phases
of contexts is almost infinite (currently approximately 2^2000 in the general English-word space and
2^130 in the color-image space). For semantic associative computations of “Kansei” information in MMM, we have constructed several actual
semantic spaces, such as the general English-word space in approximately 2000 dimensions,
the color-image space in 130 dimensions, and the music space in 8
dimensions in the current implementations.
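
To give a sense of scale for the number of contexts, the short sketch below assumes that each context corresponds to choosing which of the n orthogonal axes to keep, so that an n-dimensional space admits 2^n candidate subspaces. That reading is an assumption made here for illustration, and the function name is hypothetical.

```python
def count_contexts(dimensions: int) -> int:
    """Number of axis subsets of an n-dimensional orthogonal space (2^n)."""
    return 2 ** dimensions


# Dimension counts quoted in the text for the three semantic spaces.
for name, n in [("music", 8), ("color-image", 130), ("English-word", 2000)]:
    digits = len(str(count_contexts(n)))
    print(f"{name}: 2^{n} has {digits} decimal digits")
```

Even the 8-dimensional music space yields 256 candidate subspaces under this reading, and the count for the English-word space runs to several hundred decimal digits, which is why exhaustive enumeration of contexts is not an option and a subspace has to be selected dynamically from the context words.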

1.3 Cross-Cultural Computing System for Music

This section introduces a cross-cultural computing system for music, which is
realized by applying MMM to “cultural-music resources.” We have designed this
system to promote cross-cultural understanding and communication by using cultural music. The system consists of music analysis, search and visualization functions: (1) a culture-dependent semantic metadata extraction function, which extracts
both musical elements (e.g., key, pitch, tempo) and impression metadata (e.g., sad,
happy, dreamy) corresponding to the properties of each music-culture, (2) a
cross-cultural computing function to represent differences and similarities among
various music-cultures, and (3) an easy-to-use interface function designed to
help users join the music database creation process. The significant feature of
our cross-cultural computing system is its multimedia database technology, which applies
“Kansei” impressions to compute cultural differences. The system extracts
features of music-cultures and expresses culture-dependent impressions by interpreting cultural-music pieces in the semantic music-space, making it possible to
compare cultural differences and similarities in terms of impressions among various
cultural music resources.
The important objective of this system is to evoke impressions and imaginations
including the cultural diversity by representing various impression-based responses
