Industrial Marketing Management 38 (2009) 714–731

Marketing Intelligent Systems for consumer behaviour modelling by a descriptive induction approach based on Genetic Fuzzy Systems

Francisco J. Martínez-López a,⁎, Jorge Casillas b,1

a Department of Marketing, Business Faculty, University of Granada, Granada, E-18071, Spain

b Department of Computer Science and Artificial Intelligence, Computer and Telecommunication Engineering School, University of Granada, Granada, E-18071, Spain

Article info

Article history:
Received 2 March 2007
Received in revised form 26 December 2007
Accepted 12 February 2008
Available online 14 April 2008

Keywords:
Marketing modelling
Management support
Analytical method
Knowledge discovery
Genetic Fuzzy Systems
Methodology

Abstract

In its introduction this paper discusses why marketing professionals do not make satisfactory use of the marketing models posed by academics in their studies. The main body of this research is characterised by the proposal of a brand new and complete methodology for knowledge discovery in databases (KDD), to be applied in marketing causal modelling and with utilities to be used as a marketing management decision support tool. Such methodology is based on Genetic Fuzzy Systems, a specific hybridization of artificial intelligence methods, highly suited to the research problem we face. The use of KDD methodologies based on intelligent systems like this can be considered as an avant-garde evolution, exponent nowadays of the so-called knowledge-based Marketing Management Support Systems; we name them as Marketing Intelligent Systems. The most important questions to the KDD process–i.e. pre-processing; machine learning and post-processing–are discussed in depth and solved. After its theoretical presentation, we empirically experiment with it, using a consumer behaviour model of reference. In this part of the paper, we try to offer an overall perspective of how it works. The valuation of its performance and utility is very positive.

© 2008 Elsevier Inc. All rights reserved.

⁎ Corresponding author. Tel.: +34 958 242350.

E-mail addresses: fjmlopez@ugr.es (F.J. Martínez-López), casillas@decsai.ugr.es (J. Casillas).

1 Tel.: +34 958 240804.

0019-8501/$ – see front matter © 2008 Elsevier Inc. All rights reserved.

doi:10.1016/j.indmarman.2008.02.003

1. Introduction

Firms operate in markets that are increasingly “turbulent” and “volatile.” How to deal with this turbulence and survive in these hypercompetitive conditions has become a strategic question (Agarwal, Shankar, & Tiwari, 2007; Christopher, 2000). Consequently, the idea of the achievement and support of a sustainable competitive advantage gave rise, in the nineties, to another focused on its continuous development (D'Aveni, 1994), which is more realistic these days. One of the main implications of this reformed strategic approach is a search for new suitable market opportunities. Of course, such opportunities need to be correctly identified and addressed by firms. This premise justifies the transcendental relevance recently given to the creation and management of knowledge about markets (Drejer, 2004). In this vein, the marketing function of companies and, most especially, their Marketing Management Support Systems (MkMSS) plays a notable role in this task, as they must contribute to the reduction of the uncertainty related to the firms' markets of reference. As we know, this question does not only imply having access to good marketing databases. On the contrary, the key question is having the necessary level of knowledge to take the right decisions (Campbell, 2003; Lin, Su, & Chien, 2006). The analytical capabilities of MkMSS are more critical than ever to provide this support to marketing managers' decision making, in order to give useful and valuable information about market behaviour. Specifically, we highlight the following: models and methods of analysis.

It is expected that MkMSS will improve their performance in the near future, taking advantage of the synergies caused by the integration of modelling estimation techniques based on classic econometrics with new methods and systems based on artificial intelligence (Gatignon, 2000; Van Bruggen & Wierenga, 2000). The adoption of these new methods represents a worthwhile opportunity to improve the efficiency of the marketing managers' decision making and consequently, if well applied, the accuracy of marketing strategies (Li, Kinman, Duan, & Edwards, 2000).

The paper we present here focuses on the exploration and analysis of the suitability of certain brand new methods based on knowledge discovery in databases (KDD) to be applied in marketing modelling. Specifically, our main aim is twofold: first, we aim to make a modest contribution to the methods used in consumer behaviour modelling. In any case, this is the marketing field we have focused on to develop and experiment our methodology, though it also applies to marketing causal modelling, in general, as well as to other Science and Social Sciences fields that work with similar causal models.

We propose a complete knowledge discovery methodology, whose main questions are shown in this paper, to extract useful patterns of information with a descriptive rule induction approach based on Genetic Fuzzy Systems; this is a novel hybridization of methods belonging to the field of artificial intelligence, highly appropriate for the marketing problem we face. With this purpose, we have had to

give solutions, adapted from our academic field, to the diverse questions related to the main stages of the KDD process; i.e. data preparation, data mining, and knowledge interpretation. Moreover, an important characteristic of our methodology is that it has been designed under the base there is a causal model of reference; a consumer behaviour model in our case. In other words, the knowledge discovery process is guided by a prior theoretic structure that defines the elements (variables) of the model. Hence, this machine learning approach is not only interesting for practitioners, but also for academic researchers' purposes.

To address these questions, the paper is structured as follows. Section 2 reflects on the suitability of evolving our marketing modelling methods towards a growing importation and use of artificial intelligence methods to support professional and academic marketing problems. Section 3 presents an overview and justification of the artificial intelligence tools employed (fuzzy rules, genetic algorithms, etc.). Section 4 illustrates with some examples the behaviour of the proposed KDD tools. Section 5 shows the methodological proposal in detail. Next, in Section 6 we experiment with the methodology, show some significant results and dedicate a brief closing part to illustrate both the intrinsic and complementary advantages of our fuzzy modelling-based method. Section 7 discusses the main contributions of our research, reflecting on the academic and managerial implications. Finally, in Section 8 we comment on some research limitations and opportunities (our future research agenda).

2. Background and starting reflections

Is there a gap between what marketing modellers offer and what marketing managers demand? If marketing modelling had got to a stage of maturity, as Leeflang and Wittink (2000) argue, one would expect to find a significant use of academic models among marketing practitioners. Notwithstanding, it seems that marketing managers rarely apply them (Roberts, 2000; Wind and Lilien, 1993; Winer, 2000). It is essential that we academics meditate on this. Maybe, the answer is much less complex than we would primarily expect.

We think that the efforts of marketing academics are not productive in terms of the managerial applications of their models. This is not due to deficiencies in the theoretic aspects that support the models' structure, but due to a lack of involvement by not offering useful methods of analysis that allow the models' users (marketing managers) to “play” with these models to support their decisions. This is what has guided our research, hence the gist of this paper.

The academics may be too focused on testing hypotheses and validating models and theories without paying enough attention to what our “customers”–the marketing managers, users of our scientific production–need. Indeed, marketing modellers cannot afford to fall into marketing myopia! In this regard, we should not forget that the main purpose of our research efforts ought to be the contribution to the development of our field, and this necessarily implies looking after the practical applicability of our models, too.

Therefore, how can we strengthen the utility of our models to achieve a better explanation of markets, thus better matching them to marketing managers' needs? Research efforts can be addressed to the improvement of three main areas of interest in marketing modelling (Roberts, 2000): theoretic aspects defining the models; understanding of managers' (users) needs, hence the framework of application of models; and refinement of the statistical tools (i.e. techniques and methods in general) applied to estimate the models. The pursuit of these improvement guidelines is not too distant from what Little (1970, p. B-483) asked of researchers a few decades ago when building models to support marketing managers' decision making:

Although the results of using a model may sometimes be personal to the manager […] the researcher still has the responsibilities of a scientist in that he should offer the manager the best information he can for making the model conform to reality in structure, parameterization, and behaviour.

Consequently, it seems clear that modellers should be driven by the requirements of models users (i.e. demand-side), instead of by a supply-side orientation (Gatignon, 2000). This practice is expected to improve the use of the academic models among the practitioners (Roberts, 2000). In this sense, a firm focused on consumption markets with access not only to more representative models of real systems being modelled but also to more powerful methods of analysis to extract knowledge from huge databases and able to simulate with models ought to improve its competitiveness and competitive advantage (Van Bruggen & Wierenga, 2000). This is a premise that has significantly conditioned the evolution of MkMSS from the early 80s, specifically with the appearance of the Marketing Decision Support Systems, until now (Li et al., 2000; Talvinen, 1995; Wierenga & Van Bruggen, 1997, 2000).

The late 80s saw the increasing use of diverse methods from Computer Science and Artificial Intelligence to the detriment of those from the Operational Research and, especially, the econometrics and statistics fields. This tendency has increasingly intensified in the last two decades (Bucklin, Lehmann, & Little, 1998; Eliashberg & Lilien, 1993; Leeflang & Wittink, 2000; Leeflang, Wittink, Wedel, & Naert, 2000; Li, Davies, Edwards, Kinman, & Duan, 2002).

This evolution in the methods used in marketing modelling has not been accidental. In this sense, Lilien, Kotler, and Moorthy (1992) noted that this tendency was to be expected as modellers and users needed techniques that were more flexible, powerful and robust, capable of providing greater and improved information with respect to the real systems being modelled. Of course, this implies a greater adaptation to both the characteristics of current databases–i.e. huge, imprecise, with data gathered in formats of a different nature (numerical, categorical, linguistic, etc.)–and the type of decision problems to be supported by such models. Under these circumstances, it seems an evolution of the marketing modelling methods towards systems based on artificial intelligence is only logical (Shim et al., 2002; Wedel, Kamakura, & Böckenholt, 2000), which justifies the growing predominance of the knowledge-based MkMSS in the last two decades (Wierenga & Van Bruggen, 2000).

In sum, MkMSS clearly tend to be based on knowledge discovery methods that make use of diverse artificial intelligence methods to be applied during the machine learning process; e.g.: evolutionary algorithms, fuzzy logic, artificial neural networks, rules induction, decision trees, etc. Specifically, it is expected that the use of artificial intelligent methods in the MkMSS framework will evolve towards the use of intelligent systems based on the hybridization of these techniques (Carlsson & Turban, 2002; Shim et al., 2002). We like to call them as Marketing Intelligent Systems. It might be the inexorable fate of marketing modelling methods. This fact, which is more evident from a professional perspective–i.e. under the framework of application of the MkMSS–, has still to take hold in academic studies.

3. Knowledge extraction based on fuzzy rules and genetic algorithms

3.1. The KDD process

In general terms, KDD is a recent research field belonging to artificial intelligence whose main aim is the identification of new, potentially useful, and understandable patterns in data (Fayyad, Piatetsky-Shapiro, Smyth, & Uthurusamy, 1996). Furthermore, KDD implies the development of a process compounded by several stages that allow the conversion of low-level data into high-level knowledge (Mitra, 2002). Though KDD is synthetically viewed as a three-stage process–i.e. pre-processing, data mining and post-processing–(Freitas, 2002), we believe that, for our academic field, it is more interesting to present it within a wider structure. Specifically, we prefer the following five-stage process

(Cabena, Hadjinian, Stadler, Verhees, & Zanasi, 1998; Han & Kamber, 2001): (1) identification and problem delimitation; (2) data preparation (pre-processing); (3) data mining (machine learning); (4) analysis, evaluation and interpretation of results; and (5) presentation, assimilation and use of knowledge. It is important to highlight that the success of such process, applied to solve or support the resolution of a particular problem of information in marketing, depends on the suitable development of every stage. The reader will be more conscious of this question when observing the lengths we go in order to explain how to prepare marketing data (pre-processing) or how to analyse the output (knowledge) of the data mining stage (post-processing).

3.2. Knowledge representation by fuzzy rules

Nowadays, one of the most successful tools for the development of descriptive models is fuzzy modelling (Lindskog, 1997). This is an approach used to model a system making use of a descriptive language based on fuzzy logic with fuzzy predicates. The way to express fuzzy predicates is by means of IF–THEN rules, as in the following example:

IF Age_of_Consumer is Young and Purchasing_Power is Very_High THEN Trend_To_Buy_Sports_Cars is High

These rules set logical relationships among variables of a system by using qualitative values. Such a representation mode easily matches the humans' way of reasoning. Hence, the performance of both the analysis and interpretation steps of the modelling process improves thanks to the true behaviour of a system that is more effectively revealed. Notwithstanding, it should be noted that though human reasoning may deal without difficulty with terms like high or young, when this issue is tackled by means of an automatic process its treatment is more complex.

To work properly with these kinds of qualitative valuations, linguistic variables (Zadeh, 1975a,b, 1976) based on both Fuzzy Sets Theory and Fuzzy Logic (Zadeh, 1965) are used, so the previously exemplified rule is known as a fuzzy rule. The use of fuzzy logic provides several benefits, such as a higher generality, expressive power, ability to model real problems and, last but not least, a methodology to exploit tolerance in the face of imprecision.

For example, we can consider the linguistic variable Age_of_Consumer, which could take in the linguistic terms (values) teenager, young, adult, and old. These linguistic terms (also known as labels) are mathematically expressed by simple functions that return the membership degree (with a real value between 0 and 1) to each fuzzy set. Therefore, instead of considering that a consumer could be 100% young or 100% adult, with fuzzy sets we can say that the consumer belongs to the set of young people with one degree and also to the set of adults with another degree. So, the boundaries between sets are fuzzy instead of crisp, thus providing a powerful linguistic expression and a gradual transition of the membership to the different fuzzy sets.

Fig. 1 represents an example of how the age of a person can be expressed by fuzzy sets. In this figure, we could say that a person of 21 years of age belongs to the fuzzy set labelled young to a degree of 0.55 (colloquially speaking, 55%), while a 27-year-old belongs entirely to the fuzzy set young, or a 37-year-old belongs to young people to a 0.3 degree and also to adult people to a degree of 0.7. If we used classical (crisp) sets and fixed the boundary between young and adult at 35 years of age, a person aged 34.9 years would be considered 100% young while another aged 35.1 years would not be young to any degree.

Fig. 1. Illustrative example of the linguistic variable age, composed of the linguistic terms teenager, young, adult and old, and their corresponding fuzzy sets. A 37-year-old has a membership degree 0.3 to young and 0.7 to adult.

Fuzzy rules can be considered a useful representation of knowledge to discover intrinsic relationships contained in a database (Freitas, 2002). Thus, by means of fuzzy rules we can represent the relationship existing among different variables, thus deducing the patterns contained in the data examined. Useful patterns allow us to do non-trivial predictions about new data. There are two extremes to express a pattern: black boxes, whose internal behaviour is incomprehensible; and white boxes, whose construction reveals the pattern structure. The difference lies in whether the patterns generated are represented by a structure that is easy to examine and which can be used to reason and to inform further decisions. In other words, when the patterns are structured in a comprehensible way, they will be able to help explain something about the data. The trouble with KDD, the interpretability-accuracy trade-off, is also being tackled in current fuzzy modelling (Casillas et al., 2003a,b) and will be considered by our proposal.

The use of fuzzy rules when developing the knowledge discovery process has some advantages, which are (Freitas, 2002; Dubois, Prade, & Sudkamp, 2005): they allow us to deal with uncertain data; they adequately consider multi-variable relationships; results are easily understandable by humans; additional information is easily added by an expert; the accuracy degrees can be easily adapted to the needs of the problem, and the process can be highly automatic with low human intervention.

Therefore, we will use fuzzy logic as a tool to structure the information of a consumer behaviour model in a clear and intelligible way that is close to that of the human being. Fuzzy logic methods are expected to offer benefits to marketing decision makers when integrated with current MkMSS (Metaxiotis, Psarras, & Samouilidis, 2004). The fuzzy system will allow us to represent adequately the interdependence of variables and the non-linear relationships that could exist between them.

3.3. Multiobjective genetic algorithms

In the previous section, we introduced the proposed representation of knowledge based on fuzzy rules. However, we also need an algorithm to automatically extract a set of fuzzy rules with good properties. In this paper, we propose the use of a genetic algorithm. The main reasons for using it instead of other well-known machine learning techniques are the following. Firstly, since there are usually contradictory objectives to be optimised in KDD (such as accuracy and interpretability, or support and confidence), we perform multiobjective optimisation. It is one of the most promising issues and one of the main distinguishing features of genetic algorithms compared to other techniques. Furthermore, we consider a flexible representation of fuzzy rules that can be developed

properly by genetic algorithms. This flexible representation improves the description capability of the fuzzy rule, an important issue in KDD.

Genetic algorithms demonstrated good results for management and marketing applications, thus arousing the interest of researchers and practitioners in the nineties (Hurley, Moutinho, & Stephens, 1995; Nissen, 1995). However, one of the novelties of this paper for marketing is that, in this instance, fuzzy logic and genetic algorithms will not be applied separately to tackle a particular marketing problem, but in cooperation. In the following, genetic algorithms and multiobjective optimisation are briefly introduced.

3.3.1. Genetic algorithms

Genetic algorithms are general-purpose search algorithms that use principles inspired by natural population genetics to evolve solutions to problems. The basic principles of genetic algorithms were first laid down rigorously by Holland (1975) and are well described in many texts (e.g.: Goldberg, 1989; Michalewicz, 1996).

The basic idea is to maintain a population (i.e., a set) of knowledge structures that evolves over time through a process of competition and controlled variation. Each structure in the population represents a candidate solution to the specific problem and has an associated fitness to determine which structures are used to form new ones in the process of competition. The new individuals are created using genetic operators such as crossover and mutation. Genetic algorithms have had a great measure of success in search and optimisation problems. The main reason for this success is their ability to exploit accumulative information about an initially unknown search space in order to bias subsequent search into useful subspaces, i.e., their robustness. This is their key feature, especially in large, complex and poorly understood search spaces, where the classical search tools (enumerative, heuristic, etc.) are inappropriate, offering a valid approach to problems requiring efficient and effective search.

A genetic algorithm starts with a population of randomly generated solutions, chromosomes, and advances towards better solutions by applying genetic operators, modelled on the genetic processes occurring in nature. As previously mentioned, in these algorithms we maintain a population of solutions (in our case, fuzzy rules) for a given problem; this population undergoes evolution in a form of natural selection. In each generation, relatively good solutions reproduce to give offspring that replace the relatively bad solutions, which die. An evaluation or fitness function plays the role of the environment to distinguish between good and bad solutions. The process of evolving from the current population to the next one constitutes one generation in the execution of a genetic algorithm.

Although there are many possible variants of the basic genetic algorithm, the fundamental underlying mechanism involves three operations (Goldberg, 1989):

(1) evaluation of individual fitness,
(2) formation of a gene pool (intermediate population), and
(3) crossover and mutation.

Fig. 2 shows the structure of a simple GA.

Fig. 2. Structure of a genetic algorithm.

A fitness function must be devised for each problem to be solved. Given a particular chromosome (i.e. a solution), the fitness function returns a numerical value that is supposed to be proportional to the utility or adaptation of the solution represented by this chromosome. In our case, we will consider two different measures to assess the quality of a solution (fuzzy rule): support and confidence.

There are a number of ways to do selection. We might view the population as a mapping on a roulette wheel, where each individual is represented by a space that proportionally corresponds to its fitness. By repeatedly spinning the roulette wheel, individuals are chosen using stochastic sampling with replacement to fill the intermediate population. Another possibility, called binary tournament, consists in doing a number of tournaments equal to the size of the population. In each tournament, two chromosomes of the old population are chosen at random, and the best one according to fitness is included in the new population. We will employ this second approach in our proposal.

After selection has been carried out, the construction of the intermediate population is completed and crossover and mutation can occur. The crossover operator combines the features of two parent structures to form two similar offspring. Classically, it is applied at a random position with a probability of performance, the crossover probability. The mutation operator arbitrarily alters one or more components of a selected structure so as to increase the structural variability of the population. Each position of each solution vector in the population undergoes a random change according to a probability defined by a mutation rate, the mutation probability.

Fig. 6 in Section 4 illustrates graphically the use of a genetic algorithm to extract fuzzy rules from available data in the marketing problem we are dealing with in this paper.

3.3.2. Multiobjective optimisation

Many real-world problems involve simultaneous optimisation of multiple objectives. In principle, multiobjective optimisation is very different from single-objective optimisation. The second case attempts to obtain the best solution; i.e. the global minimum or the global maximum depending on the problem. However, in the case of multiple objectives, there may not be a single solution that is better than the rest with respect to all objectives.

In a typical multiobjective optimisation problem, there is a set of solutions that are superior to the rest of the solutions in the search space when all the objectives are considered, but which are inferior to other solutions in the space occupied only by some of them. These solutions are known as non-dominated solutions (Chankong & Haimes, 1983), while the rest of the solutions are known as dominated solutions. Since none of the solutions in the non-dominated set is worse in all the objectives than the other ones, all of them are acceptable solutions.

Mathematically, the concept of Pareto-optimality² or non-dominance is defined as follows. Let us consider, without loss of generality, a multiobjective maximization problem with m parameters (decision variables) and n objectives:

$$\text{Maximise} \quad f(x) = \big(f_1(x), f_2(x), \ldots, f_n(x)\big)$$

with $x = (x_1, x_2, \ldots, x_m) \in X$. A decision vector $a \in X$ dominates $b \in X$ (noted as $a \preceq b$) if, and only if:

$$\forall i \in \{1, \ldots, n\} \mid f_i(a) \geq f_i(b) \quad \text{and} \quad \exists j \in \{1, \ldots, n\} \mid f_j(a) > f_j(b).$$

Any vector that is not dominated by any other is said to be Pareto-optimal or non-dominated. These concepts are depicted graphically in Fig. 3.

² The concept Pareto optimality is an important notion in neoclassical economics. It is named after the French–Italian economist Vilfredo Pareto (1848, Paris–1923, Geneva).

Fig. 3. Example of multiobjective optimisation.
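The dominance condition of Section 3.3.2 translates almost directly into code. The following is a minimal sketch, assuming every objective is to be maximised; the function names and the (support, confidence) values are hypothetical illustrations and are not taken from the paper's experiments.

```python
from typing import List, Sequence, Tuple


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """Return True if objective vector fa dominates fb (maximisation).

    fa dominates fb when it is at least as good in every objective
    and strictly better in at least one of them.
    """
    if len(fa) != len(fb):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(x >= y for x, y in zip(fa, fb))
    strictly_better = any(x > y for x, y in zip(fa, fb))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (support, confidence) pairs for five candidate rules.
rules = [(0.10, 0.95), (0.30, 0.80), (0.25, 0.70), (0.50, 0.40), (0.45, 0.35)]
front = non_dominated(rules)
print(front)
```

In this toy set, (0.25, 0.70) is dominated by (0.30, 0.80) and (0.45, 0.35) by (0.50, 0.40), so three rules remain; none of them is better than the others in both objectives at once. The pairwise filter is quadratic in the number of points, which is acceptable for a sketch but not a claim about how the paper's algorithm maintains its front.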

Thanks to the use of a solution population, genetic algorithms can simultaneously search for many Pareto-optimum solutions. For this reason, genetic algorithms have been recognised as possibly being well suited to multiobjective optimisation (Coello, Van Veldhuizen, & Lamont, 2002). Furthermore, the ability to handle complex problems, involving features such as discontinuities, multimodality, disjoint feasible spaces, and noisy function evaluations, reinforces the potential effectiveness of genetic algorithms in multiobjective search and optimisation. Generally, the multiobjective approaches only differ from the rest of the genetic algorithms in the fitness function and/or in the selection operator.
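As a concrete reference point for the operators described in Section 3.3.1, the following is a minimal sketch of a single-objective genetic algorithm with binary tournament selection, one-point crossover and per-position mutation. The bit-string encoding, the toy fitness (number of ones), the parameter values and all function names are illustrative assumptions; they are not the rule encoding or the support and confidence measures used in this paper.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[int]
Fitness = Callable[[Chromosome], float]


def binary_tournament(pop: List[Chromosome], fitness: Fitness,
                      rng: random.Random) -> List[Chromosome]:
    """Run len(pop) tournaments; each draws two individuals and keeps the fitter."""
    pool = []
    for _ in range(len(pop)):
        a, b = rng.sample(pop, 2)
        pool.append(list(a if fitness(a) >= fitness(b) else b))
    return pool


def one_point_crossover(p1: Chromosome, p2: Chromosome,
                        rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Swap the tails of two parents at a random cut position."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mutate(chrom: Chromosome, p_mut: float, rng: random.Random) -> Chromosome:
    """Flip each bit independently with probability p_mut."""
    return [1 - g if rng.random() < p_mut else g for g in chrom]


def run_ga(fitness: Fitness, n_genes: int = 20, pop_size: int = 30,
           generations: int = 40, p_cross: float = 0.7,
           p_mut: float = 0.02, seed: int = 0) -> Chromosome:
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pool = binary_tournament(pop, fitness, rng)   # intermediate population
        nxt: List[Chromosome] = []
        for i in range(0, pop_size - 1, 2):
            c1, c2 = pool[i], pool[i + 1]
            if rng.random() < p_cross:                # crossover probability
                c1, c2 = one_point_crossover(c1, c2, rng)
            nxt += [mutate(c1, p_mut, rng), mutate(c2, p_mut, rng)]
        if len(nxt) < pop_size:                       # odd population size
            nxt.append(mutate(pool[-1], p_mut, rng))
        pop = nxt
    return max(pop, key=fitness)


best = run_ga(fitness=sum)  # toy fitness: count the ones in the bit string
print(sum(best), best)
```

A multiobjective variant would, as noted above, keep the crossover and mutation steps and change how individuals are compared in the fitness evaluation or the selection step, for instance by comparing them through Pareto dominance.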

4. An illustrative example on how to extract knowledge from data to analyse consumer behaviour

This section serves as a bridge between the technical concepts introduced in the previous section and the modelling methodology proposed in the next one. To introduce the reader to the methodology, we present a toy problem (with a few variables and a small data set) that illustrates the basic behaviour and the power of the proposed KDD process: extracting useful knowledge from data to better understand the existing relationships between variables. Some parts of the process have been intentionally simplified with the aim of focusing on the most relevant aspects. The rigorous description of the proposal can be found in Section 5, while Section 6 describes in detail the experimental results on a real-world problem.

To illustrate the proposed use of KDD, we will consider a simple measurement (causal) model depicted in Fig. 4(a), compounded by three

construct or latent variables (depicted by circles), two exogenous and one endogenous: (1) fashion consciousness, (2) conservatism, and

(3) hedonism; extracted from MacLean and Gray (1998). Likewise, imagine that the three constructs have been measured by means of several

seven-point interval scales (e.g. Likert-type and differential semantic scales). Finally, Fig. 4(b) shows an example of a data set available for this

problem, which consists of three variables, each made up of a set of values. There are just four cases (e.g., questionnaires), which are not realistic at

Fig. 4. Example of a simple measurement (causal) model–extracted from (MacLean & Gray, 1998)–and a data set from four hypothetical consumers' responses.

F.J. Martínez-López, J. Casillas / Industrial Marketing Management 38 (2009) 714–731

719

Fig. 5. Example of transformation of a seven-point Likert-type scale into a fuzzy semantic. According to that, the membership degree of 5 to the fuzzy set associated to the linguistic

term Medium is 0.67, while the membership degree of 6 is 0.33.

all–i.e. think that a consumer database usually has hundreds or even thousands of collected individuals' responses–, though it is useful for our

illustrative purpose.

The first step we perform is to transform the interval scale into fuzzy semantics. This allows us to use linguistic terms to describe the different items by means of linguistic variables. We can consider the following three membership functions to describe the terms Low, Medium, and High:

\[
\mu_{\mathrm{Low}}(x) =
\begin{cases}
\dfrac{4 - x}{3} & \text{if } 1 \le x \le 4 \\
0 & \text{otherwise}
\end{cases}
\qquad
\mu_{\mathrm{Medium}}(x) =
\begin{cases}
\dfrac{x - 1}{3} & \text{if } 1 \le x \le 4 \\
\dfrac{7 - x}{3} & \text{if } 4 < x \le 7 \\
0 & \text{otherwise}
\end{cases}
\qquad
\mu_{\mathrm{High}}(x) =
\begin{cases}
\dfrac{x - 4}{3} & \text{if } 4 \le x \le 7 \\
0 & \text{otherwise}
\end{cases}
\]

A graphical representation of these membership functions is depicted in Fig. 5.
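As a minimal sketch, the three triangular membership functions can be written directly in Python. The function names are illustrative, and the breakpoints (1, 4 and 7) are assumed from the three-triangle partition of the seven-point scale described for Fig. 5.

```python
def mu_low(x: float) -> float:
    """Low: full membership at 1, decreasing linearly to 0 at 4."""
    return (4 - x) / 3 if 1 <= x <= 4 else 0.0


def mu_medium(x: float) -> float:
    """Medium: rises from 0 at 1 to 1 at 4, then falls back to 0 at 7."""
    if 1 <= x <= 4:
        return (x - 1) / 3
    if 4 < x <= 7:
        return (7 - x) / 3
    return 0.0


def mu_high(x: float) -> float:
    """High: rises linearly from 0 at 4 to full membership at 7."""
    return (x - 4) / 3 if 4 <= x <= 7 else 0.0


# Membership degrees of every point of the seven-point scale.
for point in range(1, 8):
    degrees = (mu_low(point), mu_medium(point), mu_high(point))
    print(point, [round(d, 2) for d in degrees])
```

For the scale points 5 and 6 this gives Medium degrees of 0.67 and 0.33, the values quoted in the caption of Fig. 5.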

Once we have defined the variables in terms of fuzzy sets, we can use fuzzy rules to express relationships (i.e., patterns) among the variables (refer to Section 3.2 for a description of these kinds of rules). To do that, we will take the two exogenous variables as antecedents and the endogenous one as consequent in this example.

These fuzzy rules can represent many different relationships among the variables; however, not all of them will match the existing data exactly. Therefore, we need some measures to assess the quality of each rule with respect to the data. These measures can be considered a kind of statistical computation. In this paper, we will consider two important values: support and confidence. On the one hand, support (a real value in [0,1]) indicates the degree to which the rule represents the cases of the data set. For example, a support of 0.25 can be understood as the rule covering 25% of the available cases. We are interested in obtaining fuzzy rules with a support as high as possible, since the rule will then be more general and will represent a larger portion of the sample. On the other hand, confidence (also a real value in [0,1]) indicates how accurate the fuzzy rule is. Since the fuzzy rule predicts a relationship between the antecedent and the consequent, we need to know the degree to which such a prediction holds in the available data set. For example, if a fuzzy rule has a confidence of 0.9, we can say that, according to the available data, the fuzzy rule is 90% true. Of course, we are interested in obtaining fuzzy rules with a high degree of confidence.

As one can imagine, support and confidence are two conflicting criteria: the higher the degree of representation, the more difficult it is to express the relationships among variables accurately. One fuzzy rule will be clearly preferable to another if the former has higher values of both support and confidence.

In the following, we show some examples of fuzzy rules and the computation of the corresponding support and confidence values from the data set of Fig. 4(b).

R1: If Fashion_Consciousness is LOW and Conservatism is MEDIUM then Hedonism is MEDIUM

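For the first case:

\[
\begin{aligned}
\mu_{\mathrm{Low}}\big(\vec{x}_1^{(1)}\big) &= \max\{\mu_{\mathrm{Low}}(1), \mu_{\mathrm{Low}}(2), \mu_{\mathrm{Low}}(1)\} = \max\{1, 0.67, 1\} = 1 \\
\mu_{\mathrm{Medium}}\big(\vec{x}_2^{(1)}\big) &= \max\{\mu_{\mathrm{Medium}}(5), \mu_{\mathrm{Medium}}(6)\} = \max\{0.67, 0.33\} = 0.67 \\
\mu_{A^{(1)}}\big(x^{(1)}\big) &= \min\big\{\mu_{\mathrm{Low}}\big(\vec{x}_1^{(1)}\big), \mu_{\mathrm{Medium}}\big(\vec{x}_2^{(1)}\big)\big\} = \min\{1, 0.67\} = 0.67 \\
\mu_{\mathrm{Medium}}\big(\vec{y}^{(1)}\big) &= \max\{\mu_{\mathrm{Medium}}(1), \mu_{\mathrm{Medium}}(2)\} = \max\{0, 0.33\} = 0.33
\end{aligned}
\]

For the remaining cases:

\[
\begin{aligned}
\mu_{\mathrm{Low}}\big(\vec{x}_1^{(2)}\big) &= 0; & \mu_{\mathrm{Medium}}\big(\vec{x}_2^{(2)}\big) &= 0.33; & \mu_{A^{(1)}}\big(x^{(2)}\big) &= 0; & \mu_{\mathrm{Medium}}\big(\vec{y}^{(2)}\big) &= 0.33 \\
\mu_{\mathrm{Low}}\big(\vec{x}_1^{(3)}\big) &= 0; & \mu_{\mathrm{Medium}}\big(\vec{x}_2^{(3)}\big) &= 0.33; & \mu_{A^{(1)}}\big(x^{(3)}\big) &= 0; & \mu_{\mathrm{Medium}}\big(\vec{y}^{(3)}\big) &= 0 \\
\mu_{\mathrm{Low}}\big(\vec{x}_1^{(4)}\big) &= 0; & \mu_{\mathrm{Medium}}\big(\vec{x}_2^{(4)}\big) &= 0.67; & \mu_{A^{(1)}}\big(x^{(4)}\big) &= 0; & \mu_{\mathrm{Medium}}\big(\vec{y}^{(4)}\big) &= 0.67
\end{aligned}
\]

\[
\mathrm{Support}(R_1) = \frac{1}{4} \sum_{e=1}^{4} \mu_{A^{(1)}}\big(x^{(e)}\big) \cdot \mu_{B^{(1)}}\big(\vec{y}^{(e)}\big) = \frac{0.67 \cdot 0.33 + 0 + 0 + 0}{4} = 0.05556
\]

\[
\mathrm{Confidence}(R_1) = \frac{\sum_{e=1}^{4} \mu_{A^{(1)}}\big(x^{(e)}\big) \cdot \max\big\{1 - \mu_{A^{(1)}}\big(x^{(e)}\big), \mu_{B^{(1)}}\big(\vec{y}^{(e)}\big)\big\}}{\sum_{e=1}^{4} \mu_{A^{(1)}}\big(x^{(e)}\big)} = \frac{0.67 \cdot \max\{1 - 0.67, 0.33\} + 0 + 0 + 0}{0.67 + 0 + 0 + 0} = 0.33333
\]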
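The computation for R1 can be reproduced with a short sketch. Only the item values of the first case (1, 2, 1 for fashion consciousness; 5, 6 for conservatism; 1, 2 for hedonism) can be read off the equations, so for cases 2 to 4 the sketch starts from the membership degrees quoted in the text rather than from raw items. The function names and the zero returned for a rule that covers no case are assumptions made for illustration.

```python
from typing import Callable, Sequence


def mu_low(x: float) -> float:
    return (4 - x) / 3 if 1 <= x <= 4 else 0.0


def mu_medium(x: float) -> float:
    if 1 <= x <= 4:
        return (x - 1) / 3
    if 4 < x <= 7:
        return (7 - x) / 3
    return 0.0


def multi_item(mu: Callable[[float], float], items: Sequence[float]) -> float:
    """Multi-item fuzzification: maximum membership degree over the items."""
    return max(mu(v) for v in items)


def support(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Mean over the cases of (antecedent degree * consequent degree)."""
    return sum(a * b for a, b in zip(antecedent, consequent)) / len(antecedent)


def confidence(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Antecedent-weighted mean of max{1 - a, b}."""
    covered = sum(antecedent)
    if covered == 0:
        return 0.0  # assumption: a rule that covers no case gets confidence 0
    weighted = sum(a * max(1 - a, b) for a, b in zip(antecedent, consequent))
    return weighted / covered


# Case 1, computed from the item values that appear in the R1 equations.
ant_1 = min(multi_item(mu_low, [1, 2, 1]), multi_item(mu_medium, [5, 6]))
con_1 = multi_item(mu_medium, [1, 2])

# Cases 2-4: degrees as quoted in the text (0.33 and 0.67 taken as 1/3 and 2/3).
antecedent = [ant_1, 0.0, 0.0, 0.0]
consequent = [con_1, 1 / 3, 0.0, 2 / 3]

print(round(support(antecedent, consequent), 5))
print(round(confidence(antecedent, consequent), 5))
```

Working with exact thirds instead of the rounded 0.33 and 0.67 is what yields the five-decimal values 0.05556 and 0.33333 reported for R1.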


R2: If Fashion_Consciousness is MEDIUM and Conservatism is MEDIUM then Hedonism is MEDIUM

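\[
\begin{aligned}
\mu_{\mathrm{Medium}}\big(\vec{x}_1^{(1)}\big) &= 0.33; & \mu_{A^{(2)}}\big(x^{(1)}\big) &= 0.33 \\
\mu_{\mathrm{Medium}}\big(\vec{x}_1^{(2)}\big) &= 0; & \mu_{A^{(2)}}\big(x^{(2)}\big) &= 0 \\
\mu_{\mathrm{Medium}}\big(\vec{x}_1^{(3)}\big) &= 0; & \mu_{A^{(2)}}\big(x^{(3)}\big) &= 0 \\
\mu_{\mathrm{Medium}}\big(\vec{x}_1^{(4)}\big) &= 1; & \mu_{A^{(2)}}\big(x^{(4)}\big) &= 0.67
\end{aligned}
\]

\[
\mathrm{Support}(R_2) = \frac{0.33 \cdot 0.33 + 0 + 0 + 0.67 \cdot 0.67}{4} = 0.13889
\]

\[
\mathrm{Confidence}(R_2) = \frac{0.33 \cdot \max\{1 - 0.33, 0.33\} + 0 + 0 + 0.67 \cdot \max\{1 - 0.67, 0.67\}}{0.33 + 0 + 0 + 0.67} = 0.44445.
\]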

As we can observe, using the linguistic term "medium" for the fashion consciousness variable instead of "low" (as in rule R1) allows us to cover the data set better and, at the same time, to improve the accuracy of the rule.

R3: If Fashion_Consciousness is MEDIUM and Conservatism is {LOW or MEDIUM} then Hedonism is MEDIUM

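\[
\begin{aligned}
\mu_{\mathrm{Low\ or\ Medium}}\big(\vec{x}_2^{(1)}\big) &= \min\big\{1, \mu_{\mathrm{Low}}\big(\vec{x}_2^{(1)}\big) + \mu_{\mathrm{Medium}}\big(\vec{x}_2^{(1)}\big)\big\} = 0.67; & \mu_{A^{(3)}}\big(x^{(1)}\big) &= 0.33 \\
\mu_{\mathrm{Low\ or\ Medium}}\big(\vec{x}_2^{(2)}\big) &= 1; & \mu_{A^{(3)}}\big(x^{(2)}\big) &= 0.33 \\
\mu_{\mathrm{Low\ or\ Medium}}\big(\vec{x}_2^{(3)}\big) &= 1; & \mu_{A^{(3)}}\big(x^{(3)}\big) &= 0.33 \\
\mu_{\mathrm{Low\ or\ Medium}}\big(\vec{x}_2^{(4)}\big) &= 1; & \mu_{A^{(3)}}\big(x^{(4)}\big) &= 0.67
\end{aligned}
\]

\[
\mathrm{Support}(R_3) = 0.16667
\]

\[
\mathrm{Confidence}(R_3) = 0.66667.
\]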
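The disjunction {LOW or MEDIUM} used in R3 is evaluated with the bounded sum. The sketch below applies it to the conservatism items of the first case (5 and 6, as read from the R1 equations). Following the formula written above, it aggregates the items of each term with the maximum first and then adds the two degrees; the function names are illustrative.

```python
from typing import Callable, Sequence


def mu_low(x: float) -> float:
    return (4 - x) / 3 if 1 <= x <= 4 else 0.0


def mu_medium(x: float) -> float:
    if 1 <= x <= 4:
        return (x - 1) / 3
    if 4 < x <= 7:
        return (7 - x) / 3
    return 0.0


def multi_item(mu: Callable[[float], float], items: Sequence[float]) -> float:
    """Maximum membership degree over the items of one variable."""
    return max(mu(v) for v in items)


def bounded_sum(degrees: Sequence[float]) -> float:
    """Bounded-sum T-conorm: the sum of the degrees, capped at 1."""
    return min(1.0, sum(degrees))


conservatism_case_1 = [5, 6]
low_or_medium = bounded_sum([
    multi_item(mu_low, conservatism_case_1),
    multi_item(mu_medium, conservatism_case_1),
])
print(round(low_or_medium, 2))
```

Because neither item belongs to Low at all, the disjunction here simply returns the Medium degree; the cap at 1 only matters when the summed degrees would exceed it.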

This third rule includes two linguistic terms in the variable conservatism. As a result, the support is higher, since the rule covers the data set to a higher degree than rule R2 (this is to be expected, as R3 is more general than R2). Moreover, the confidence is also improved, so this third rule is clearly better than the previous ones.

Having shown some examples of fuzzy rules and how to compute their associated support and confidence values from a data set, we now illustrate a simplified version of how the data mining process works. Fig. 6 depicts a scheme of the behaviour of a genetic algorithm when revealing fuzzy rules from data. The genetic algorithm, as explained in Section 3.3.1, optimises the population generation by generation; in our case the population is a set of different fuzzy rules, i.e., patterns. To analyse alternative fuzzy rules, new ones are generated from the existing ones by applying the crossover and mutation operators. The genetic algorithm encodes the rules in a format that is easily tractable by a computer, in this case a binary representation.

In the example of Fig. 6, the mutation takes a solution from the current population and applies a slight alteration; in this case, it changes the linguistic term used in the first variable from "low" to "medium." The newly generated rule is included in the next population since its corresponding values of support and confidence are better. In another example, the crossover takes two solutions and combines them by generating a new rule that contains the linguistic terms considered in each parent rule. This new rule, better than its parents, is included in the new population.
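The binary representation and the two operators can be sketched as follows. The bit layout (one bit per linguistic term, in the order Low, Medium, High) follows the coding scheme of Section 5.4.4. The individuals are the antecedents of rules R1 to R3 from this section plus one hypothetical parent, and are not necessarily those drawn in Fig. 6. The OR-style combination is a simplification assumed here to mirror the toy description; the operator specified in Section 5.4.7 is a two-point crossover.

```python
from typing import List

TERMS = ("LOW", "MEDIUM", "HIGH")
VARIABLES = ("Fashion_Consciousness", "Conservatism")

Antecedent = List[List[int]]  # one list of bits per input variable


def describe(antecedent: Antecedent) -> str:
    """Turn the bit strings back into a readable antecedent."""
    parts = []
    for name, bits in zip(VARIABLES, antecedent):
        used = [term for term, bit in zip(TERMS, bits) if bit == 1]
        label = used[0] if len(used) == 1 else "{" + " or ".join(used) + "}"
        parts.append(f"{name} is {label}")
    return " and ".join(parts)


def shift(antecedent: Antecedent, var: int, src: int, dst: int) -> Antecedent:
    """Mutation sketch: switch off term src of variable var and switch on dst."""
    child = [list(bits) for bits in antecedent]
    child[var][src] = 0
    child[var][dst] = 1
    return child


def combine(parent_a: Antecedent, parent_b: Antecedent) -> Antecedent:
    """Crossover sketch (assumed): the child keeps every term used by a parent."""
    return [[a | b for a, b in zip(va, vb)] for va, vb in zip(parent_a, parent_b)]


r1 = [[1, 0, 0], [0, 1, 0]]          # antecedent of R1
r2 = shift(r1, var=0, src=0, dst=1)  # "low" -> "medium" in the first variable
other = [[0, 1, 0], [1, 0, 0]]       # hypothetical second parent
r3 = combine(r2, other)              # antecedent of R3

for rule in (r1, r2, r3):
    print(describe(rule))
```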

5. A marketing intelligent system for consumer behaviour analysis

This section introduces the process by which we propose performing knowledge discovery related to consumers by means of fuzzy rules. Basically, it consists of preparing the data and fixing the scheme we follow to represent the knowledge existing in the data. Once these aspects are defined, a machine learning method is used to automatically extract interesting fuzzy rules. Finally, a post-processing stage is carried out. All these questions are now presented in detail.

5.1. Data gathering

The first step is to collect the data related to the variables defining the proposed theoretical model of consumer behaviour. In this sense, as has been done traditionally in Marketing Science in particular, and in Social Sciences in general, data are obtained by means of a questionnaire. This questionnaire gathers the measures for the set of constituent elements of the model.

5.2. Data processing

Next, it is necessary to adapt the collected data to a scheme easily tractable by fuzzy rule learning methods. Thus, at first, attention should be paid to how modellers face and develop the measurement process of the elements/variables contained in complex behavioural models. In this respect, reflections about the measurement of such variables, with a special focus on those usually known as theoretical constructs (i.e. unobserved variables), should be made. Consequently, we think that time should be spent analysing the adaptation of the fuzzy rule-based KDD to the latter case, inasmuch as its treatment seems to be the more controversial.

Previously, it could be said that the measuring streams for these latent variables in consumer modelling were classified into two groups, depending on whether they held that these constructs could or could not be perfectly measured by means of observed variables (indicators): the operational definition philosophy and the partial interpretation philosophy, respectively. The latter approach to measurement, currently predominant in the marketing modelling discipline, recognises the impossibility of measuring theoretical constructs perfectly by means of indicators, so it proposes the joint consideration of multiple indicators of the underlying construct (imperfect when considered individually, though reliable when considered together) to obtain valid measures (Steenkamp & Baumgartner, 2000).

Therefore, our methodological approach should be aware of this question when adapting the data (observed variables) to a fuzzy rule learning method. Notwithstanding, we would like to highlight that our method does not have any problem with processing elements of a model for which we have just a single variable or indicator associated to each of them, even when they have been measured by varied measurement scales. The problem, and hence the challenge to face, arises when there are multiple variables related to the measurement of a particular element of the model. Some intuitive solutions and aprioristic analyses of the internal consistency of the multi-item scales associated to such elements have been proposed, with the aim of keeping just


Fig. 6. A simpliﬁed example of the behaviour of a genetic algorithm when extracting knowledge in form of fuzzy rules from the data set available in Fig. 4(b).

one indicator (the best) per construct (see: Casillas, Martínez-López, & Martínez, 2004). The weakness of these approaches is that the data must be transformed, so relevant information may be lost.

We propose a solution based on a more sophisticated process that allows working with the original format without any pre-processing stage (Martínez-López & Casillas, 2007): the multi-item fuzzification. Thus, a T-conorm operator (e.g., maximum), traditionally used in fuzzy logic to develop the union of fuzzy sets, can be applied to aggregate the partial information given by each item. Since it is not a pre-processing of the data but a component of the machine learning design, the details of that treatment of the items are described in Section 5.4.2.

5.3. Representation and inclusion of the marketing expert's knowledge

Several issues should be tackled at this step of our methodological proposal: the set of variables/constructs to be processed, the transformation of the marketing scales used for measuring such variables into fuzzy semantics, the relations among constructs (i.e. the causal model), and the sets of fuzzy rules to be generated. All of them are based on the expert's capability to express his or her knowledge in a humanly understandable format by means of fuzzy logic.

5.3.1. Fuzzy semantics from expert knowledge

Once the marketing modeller has finally determined both the elements of the model and the observed variables associated to each one (i.e. the measurement model), the original marketing scales used for measuring those observed variables should be transformed into linguistic terms (fuzzy semantics). This is necessary for the later derivation of fuzzy rules. This question implies treating the application of fuzzy set theory to measurement in marketing. In this regard, as far as we know, Viswanathan, Bergen, Dutta, and Childers (1996) were the first to research this question by proposing a methodology for scale development in marketing. In any case, as this is not the central theme of this paper, we are not going to treat this issue in depth, though it is thoroughly analysed in the research that supports this study.

Several marketing scale types can be used to measure the variables associated to the constituent elements of a consumer behaviour model. With the aim of focusing the problem, we take Stevens (1946, 1959) as a base to summarize them in four categories with regard to their level of measurement, i.e. nominal, ordinal, interval and ratio. Considering those types, a transformation into fuzzy semantics is meaningful for the majority, with the exception of variables measured by means of a nominal scale, where the nature of categories defining the scale


are purely deterministic. In general terms, this transformation should be carried out taking into account two main questions:

a) The number of linguistic terms to be used, which determines the granularity (the scale sensitivity) of a certain fuzzy variable, must be defined. When more terms are used, the analysis of relations among variables is more accurate, but also more complex. Consequently, the marketing modeller should take time to think about what the most convenient degree of sensitivity is in the fuzzy scales used in his/her study. Three or five linguistic terms (fuzzy sets) seem good options.

b) The membership function type and shapes defining the behaviour of a certain fuzzy variable should also be defined. Such behaviour can be broadly treated considering the use of linear vs. non-linear membership functions to characterise the fuzzy sets. Thus, trapezoidal and triangular functions can be used to obtain a linear behaviour, while Gaussian functions can be used for a non-linear one.

We are now going to focus on those marketing scales mainly used for measuring the observed variables related to the elements (theoretical constructs) of a particular marketing model; i.e. Likert-type and semantic differential scales. Firstly, we have considered that it is more appropriate to use linear functions, inasmuch as it facilitates the later interpretation of relations. Second, we believe that a transformation into a triangular function is more convenient if the special characteristics of these marketing scales are considered: scale valuations are punctual. Then, when the membership degree of a certain linguistic term is equal to one, such a term should be associated to a point of the scale. In this regard, this choice has also been justified in the marketing context, with the argument that trapezoidal functions facilitate the later process of fuzzy inference (Li et al., 2002).

To sum up, Fig. 5 shows an example based on the transformation of a seven-point rating scale into a three-triangular fuzzy semantic, with the three linguistic terms (Low, Medium, and High) represented by the corresponding fuzzy sets characterised by the three membership functions shown in Section 4.

5.3.2. Input/output linguistic variables from expert knowledge

Once the causal model has been fixed by the marketing expert, fuzzy rules are used to relate input (antecedents) with output (consequents) variables. Obviously, the theoretic relations defining the model can be directly used to define the IF–THEN structures by considering the dependences shown among the variables. Thus, we obtain a set of fuzzy rules for each considered consequent (i.e. endogenous element of the model) and its respective set of antecedents. Several examples of fuzzy rules from the model included in Fig. 4(a) can be found in Section 4.

5.4. Machine learning (data mining process)

5.4.1. Fuzzy rule structure

In data mining, it is crucial to use a learning process that preserves a high degree of interpretability. To do that, we can opt for a compact description such as the disjunctive normal form. This kind of fuzzy rule structure has the following form (González & Pérez, 1998):

R: IF X1 is Ã1 and … and Xn is Ãn THEN Y1 is B1 and … and Ym is Bm

with each input variable $X_i$, $i \in \{1, \dots, n\}$, taking as a value a set of linguistic terms $\tilde{A}_i = \{A_{i1} \text{ or } \dots \text{ or } A_{in_i}\}$, whose members are joined by a disjunctive (T-conorm) operator, while the output variables $Y_j$, $j \in \{1, \dots, m\}$, remain usual linguistic variables with single labels associated. We use the bounded sum as T-conorm in this paper:

\[
\mu_{\tilde{A}_i}(x) = \min\left\{1, \sum_{k=1}^{n_i} \mu_{A_{ik}}(x)\right\}.
\]

This structure uses a more compact description that improves interpretability. Moreover, the structure naturally allows for the absence of some input variables in each rule (simply by making $\tilde{A}_i$ the whole set of linguistic terms available).

5.4.2. Multi-item fuzzification

In order to properly consider the set of items available for each input/output variable (as discussed in Section 5.2), we propose an extension of the membership degree computation, the so-called multi-item fuzzification. The process is based on a union of the partial information provided by each item. Given $X_i$ and $Y_j$ measured by the vectors of items $\vec{x}_i = (x_1^{(i)}, \dots, x_{h_i}^{(i)}, \dots, x_{p_i}^{(i)})$ and $\vec{y}_j = (y_1^{(j)}, \dots, y_{t_j}^{(j)}, \dots, y_{q_j}^{(j)})$, respectively, the fuzzy propositions "$X_i$ is $\tilde{A}_i$" and "$Y_j$ is $B_j$" are respectively interpreted as follows:

\[
\mu_{\tilde{A}_i}(\vec{x}_i) = \max_{h_i = 1}^{p_i} \mu_{\tilde{A}_i}\big(x_{h_i}^{(i)}\big)
\quad \text{and} \quad
\mu_{B_j}(\vec{y}_j) = \max_{t_j = 1}^{q_j} \mu_{B_j}\big(y_{t_j}^{(j)}\big).
\]

Therefore, the T-conorm of maximum is considered to interpret the disjunction of items.

5.4.3. Discovery process

In order to perform descriptive induction we will apply a method with some similarities to subgroup discovery, widely used in learning classification rules (Lavrac, Cestnik, Gamberger, & Flach, 2004), where the property of interest is the class associated to the consequent variable. This technique seeks to group the set of data into different subgroups, including in each of them the example set by the corresponding consequent, and to discover a set of rules representing this subgroup. In that case, the most usual approach involves running the algorithm once for each subset of examples holding the property fixed for the consequent.

Instead of that, our algorithm considers the subgroup division according to the fuzzy set used in the consequent; therefore, the subsets of examples can overlap. Moreover, we propose performing a simultaneous subgroup discovery where niches of fuzzy rules, in accordance with the consequent, are formed and optimised in parallel to generate a final set of suboptimal solutions in each subgroup. To perform this process, as explained in the following sections, we vary the concept of multiobjective dominance and we design the genetic operators to act only on the antecedent part.

5.4.4. Coding scheme

Each individual of the population represents a fuzzy rule. The rule is encoded by a binary string for the antecedent part and an integer coding scheme for the consequent part. The antecedent part has a size equal to the sum of the number of linguistic terms used in each input variable. The allele '1' means that the corresponding linguistic term is used in the corresponding variable. The consequent part has a size equal to the number of output variables. In that part, each gene contains the index of the linguistic term used for the corresponding output variable.

For example, assuming we have three linguistic terms (S [small], M [medium], and L [large]) for each input/output variable, the fuzzy rule [IF X1 is S and X2 is {M or L} THEN Y is M] is encoded as [100|011||2].

5.4.5. Objective functions

We consider the two criteria most often used to assess the quality of association rules (Dubois et al., 2005): support and confidence. In Section 4, the reader can see some examples of how these measures are computed.

(1) Support: This objective function measures the representation degree of the corresponding fuzzy rule among the available data. It is computed as the mean covering degree of the rule over the data. As covering, we consider the conjunction of the


membership degrees of both antecedent and consequent variables. Therefore, the support measure (for maximization) of the fuzzy rule R: A ⇒ B is defined as follows:

\[
\mathrm{Sup}(R) = \frac{1}{N} \sum_{e=1}^{N} \mu_{A}\big(x^{(e)}\big) \cdot \mu_{B}\big(\vec{y}^{(e)}\big)
\]

with $N$ being the data set size, $x^{(e)} = (\vec{x}_1^{(e)}, \dots, \vec{x}_n^{(e)})$ and $\vec{y}^{(e)}$ the $e$th input/output multi-item data instance, and

\[
\mu_{A}\big(x^{(e)}\big) = \min_{i \in \{1, \dots, n\}} \mu_{\tilde{A}_i}\big(\vec{x}_i^{(e)}\big)
\]

the covering degree of the antecedent of the rule R for each example (i.e., the T-norm minimum is considered to interpret the connective 'and' of the fuzzy rule). As shown, the product T-norm is used to join antecedent and consequent. Note that we use the multi-item fuzzification described in Section 5.4.2 to compute $\mu_{\tilde{A}_i}(\vec{x}_i^{(e)})$ and $\mu_{B}(\vec{y}^{(e)})$.

(2) Confidence: This second objective measures the reliability of the relation between antecedent and consequent described by the analysed fuzzy rule. We have used a confidence measure that avoids the accumulation of low cardinalities (Dubois et al., 2005). It is computed (for maximization) as follows:

\[
\mathrm{Conf}(R) = \frac{\sum_{e=1}^{N} \mu_{A}\big(x^{(e)}\big) \cdot \max\big\{1 - \mu_{A}\big(x^{(e)}\big), \mu_{B}\big(\vec{y}^{(e)}\big)\big\}}{\sum_{e=1}^{N} \mu_{A}\big(x^{(e)}\big)}.
\]

Therefore, the Dienes' S-implication, $I(a,b) = \max\{1 - a, b\}$, is used. Note that this implication operator is a fuzzy interpretation of the classical interpretation $p \Rightarrow q \equiv \neg p \vee q$ used in Boolean logic, where the negation is interpreted as $1 - a$ and the disjunction as $\max\{a, b\}$. Multi-item fuzzification is again considered.

5.4.6. Evolutionary scheme

We consider a generational approach with the multiobjective elitist replacement strategy of NSGA-II (Deb, Pratap, Agarwal, & Meyarivan, 2002). Crowding distance in the objective function space is used. Binary tournament selection based on the non-domination rank (or the crowding distance when both solutions belong to the same front) is applied.

To perform simultaneous subgroup discovery properly, we need to redefine the dominance concept. Thus, one solution (fuzzy rule) dominates another when, besides being better or equal in all the objectives and better in at least one of them, it has the same consequent as the other rule. In that way, rules with different consequents do not dominate one another, thus inducing the algorithm to form a search niche (Pareto set) for each considered consequent (subgroup).

5.4.7. Genetic operators

The initial population is built by defining the same number of groups (with the same size) as the consequents considered. In each of them, the chromosomes are generated by fixing the consequent and by randomly defining a simple antecedent in which each variable is assigned only one linguistic term. The two genetic operators (crossover and mutation) act only on the antecedent part. This allows the algorithm to keep a constant size for each subgroup.

The crossover operator randomly chooses two cross points (in the antecedent) and exchanges the central string of the two selected parents. If all the linguistic terms of a variable are set to '0' after crossover, a linguistic term used in the parents is randomly chosen and set to '1'. It is interesting to note that no constraints are imposed on selecting the parents, so the crossover can be applied to parents with different consequents (i.e., belonging to different subgroups). This allows migrations between niches, thus improving the search process.

The mutation operator randomly selects an input variable of the fuzzy rule encoded in the chromosome and one of the three following possibilities is applied: expansion, which flips to '1' a gene of the selected variable; contraction, which flips to '0' a gene of the selected variable; or shift, which flips to '0' a gene of the variable and flips to '1' the gene immediately before or after it. The selection of one of these mechanisms is made randomly among the available choices (e.g., contraction cannot be applied if only one gene of the selected variable has the allele '1'). Note that it is always possible to perform at least one of these options.

6. Experimentation and knowledge interpretation

6.1. Marketing model and data source used for the experimentation

Regarding other published marketing-related studies that have

Industrial Marketing Management

Marketing Intelligent Systems for consumer behaviour modelling by a descriptive induction approach based on Genetic Fuzzy Systems

Francisco J. Martínez-López a,⁎, Jorge Casillas b,1

a Department of Marketing, Business Faculty, University of Granada, Granada, E-18071, Spain

b Department of Computer Science and Artificial Intelligence, Computer and Telecommunication Engineering School, University of Granada, Granada, E-18071, Spain

⁎ Corresponding author. Tel.: +34 958 242350.

E-mail addresses: fjmlopez@ugr.es (F.J. Martínez-López), casillas@decsai.ugr.es (J. Casillas).

1 Tel.: +34 958 240804.

Article info

Article history: Received 2 March 2007; Received in revised form 26 December 2007; Accepted 12 February 2008; Available online 14 April 2008

Keywords: Marketing modelling; Management support; Analytical method; Knowledge discovery; Genetic Fuzzy Systems; Methodology

Abstract

In its introduction this paper discusses why marketing professionals do not make satisfactory use of the marketing models posed by academics in their studies. The main body of this research is characterised by the proposal of a brand new and complete methodology for knowledge discovery in databases (KDD), to be applied in marketing causal modelling and with utilities to be used as a marketing management decision support tool. Such methodology is based on Genetic Fuzzy Systems, a specific hybridization of artificial intelligence methods, highly suited to the research problem we face. The use of KDD methodologies based on intelligent systems like this can be considered as an avant-garde evolution, exponent nowadays of the so-called knowledge-based Marketing Management Support Systems; we name them Marketing Intelligent Systems. The most important questions of the KDD process (i.e. pre-processing, machine learning and post-processing) are discussed in depth and solved. After its theoretical presentation, we empirically experiment with it, using a consumer behaviour model of reference. In this part of the paper, we try to offer an overall perspective of how it works. The valuation of its performance and utility is very positive.

© 2008 Elsevier Inc. All rights reserved.

0019-8501/$ – see front matter © 2008 Elsevier Inc. All rights reserved.

doi:10.1016/j.indmarman.2008.02.003

1. Introduction

Firms operate in markets that are increasingly "turbulent" and "volatile." How to deal with this turbulence and survive in these hypercompetitive conditions has become a strategic question (Agarwal, Shankar, & Tiwari, 2007; Christopher, 2000). Consequently, the idea of the achievement and support of a sustainable competitive advantage gave rise, in the nineties, to another focused on its continuous development (D'Aveni, 1994), which is more realistic these days. One of the main implications of this reformed strategic approach is a search for new suitable market opportunities. Of course, such opportunities need to be correctly identified and addressed by firms. This premise justifies the transcendental relevance recently given to the creation and management of knowledge about markets (Drejer, 2004). In this vein, the marketing function of companies and, most especially, their Marketing Management Support Systems (MkMSS) play a notable role in this task, as they must contribute to the reduction of the uncertainty related to the firms' markets of reference. As we know, this question does not only imply having access to good marketing databases. On the contrary, the key question is having the necessary level of knowledge to take the right decisions (Campbell, 2003; Lin, Su, & Chien, 2006). The analytical capabilities of MkMSS are more critical than ever to provide this support to marketing managers' decision making, in order to give useful and valuable information about market behaviour. Specifically, we highlight the following: models and methods of analysis.

It is expected that MkMSS will improve their performance in the near future, taking advantage of the synergies caused by the integration of modelling estimation techniques based on classic econometrics with new methods and systems based on artificial intelligence (Gatignon, 2000; Van Bruggen & Wierenga, 2000). The adoption of these new methods represents a worthwhile opportunity to improve the efficiency of the marketing managers' decision making and consequently, if well applied, the accuracy of marketing strategies (Li, Kinman, Duan, & Edwards, 2000).

The paper we present here focuses on the exploration and analysis of the suitability of certain brand new methods based on knowledge discovery in databases (KDD) to be applied in marketing modelling. Specifically, our main aim is twofold: first, we aim to make a modest contribution to the methods used in consumer behaviour modelling. In any case, this is the marketing field we have focused on to develop and test our methodology, though it also applies to marketing causal modelling, in general, as well as to other Science and Social Sciences fields that work with similar causal models.

We propose a complete knowledge discovery methodology, whose main questions are shown in this paper, to extract useful patterns of information with a descriptive rule induction approach based on Genetic Fuzzy Systems; this is a novel hybridization of methods belonging to the field of artificial intelligence, highly appropriate for the marketing problem we face. With this purpose, we have had to

F.J. Martínez-López, J. Casillas / Industrial Marketing Management 38 (2009) 714–731

715

give solutions, adapted from our academic field, to the diverse questions related to the main stages of the KDD process; i.e. data preparation, data mining, and knowledge interpretation. Moreover, an important characteristic of our methodology is that it has been designed on the basis that there is a causal model of reference; a consumer behaviour model in our case. In other words, the knowledge discovery process is guided by a prior theoretic structure that defines the elements (variables) of the model. Hence, this machine learning approach is not only interesting for practitioners, but also for academic researchers' purposes.

To address these questions, the paper is structured as follows. Section 2 reflects on the suitability of evolving our marketing modelling methods towards a growing adoption and use of artificial intelligence methods to support professional and academic marketing problems. Section 3 presents an overview and justification of the artificial intelligence tools employed (fuzzy rules, genetic algorithms, etc.). Section 4 illustrates with some examples the behaviour of the proposed KDD tools. Section 5 shows the methodological proposal in detail. Next, in Section 6 we experiment with the methodology, show some significant results and dedicate a brief closing part to illustrate both the intrinsic and complementary advantages of our fuzzy modelling-based method. Section 7 discusses the main contributions of our research, reflecting on the academic and managerial implications. Finally, in Section 8 we comment on some research limitations and opportunities (our future research agenda).

2. Background and starting reflections

Is there a gap between what marketing modellers offer and what marketing managers demand? If marketing modelling had reached a stage of maturity, as Leeflang and Wittink (2000) argue, one would expect to find a significant use of academic models among marketing practitioners. Nevertheless, it seems that marketing managers rarely apply them (Roberts, 2000; Wind and Lilien, 1993; Winer, 2000). It is essential that we academics reflect on this. Maybe the answer is much less complex than we would initially expect.

We think that the efforts of marketing academics are not productive in terms of the managerial applications of their models. This is not due to deficiencies in the theoretic aspects that support the models' structure, but due to a lack of involvement by not offering useful methods of analysis that allow the models' users (marketing managers) to “play” with these models to support their decisions. This is what has guided our research, hence the gist of this paper.

The academics may be too focused on testing hypotheses and validating models and theories without paying enough attention to what our “customers” (the marketing managers, users of our scientific production) need. Indeed, marketing modellers cannot afford to fall into marketing myopia! In this regard, we should not forget that the main purpose of our research efforts ought to be the contribution to the development of our field, and this necessarily implies looking after the practical applicability of our models, too.

Therefore, how can we strengthen the utility of our models to achieve a better explanation of markets, thus better matching them to marketing managers' needs? Research efforts can be addressed to the improvement of three main areas of interest in marketing modelling (Roberts, 2000): theoretic aspects defining the models; understanding of managers' (users) needs, hence the framework of application of models; and refinement of the statistical tools (i.e. techniques and methods in general) applied to estimate the models. The pursuit of these improvement guidelines is not too distant from what Little (1970, p. B-483) asked of researchers a few decades ago when building models to support marketing managers' decision making:

Although the results of using a model may sometimes be personal to the manager […] the researcher still has the responsibilities of a scientist in that he should offer the manager the best information he can for making the model conform to reality in structure, parameterization, and behaviour.

Consequently, it seems clear that modellers should be driven by the requirements of model users (i.e. demand-side), instead of by a supply-side orientation (Gatignon, 2000). This practice is expected to improve the use of the academic models among the practitioners (Roberts, 2000). In this sense, a firm focused on consumption markets with access not only to more representative models of real systems being modelled but also to more powerful methods of analysis to extract knowledge from huge databases and able to simulate with models ought to improve its competitiveness and competitive advantage (Van Bruggen & Wierenga, 2000). This is a premise that has significantly conditioned the evolution of MkMSS from the early 80s, specifically with the appearance of the Marketing Decision Support Systems, until now (Li et al., 2000; Talvinen, 1995; Wierenga & Van Bruggen, 1997, 2000).

The late 80s saw the increasing use of diverse methods from Computer Science and Artificial Intelligence to the detriment of those from Operational Research and, especially, the econometrics and statistics fields. This tendency has increasingly intensified in the last two decades (Bucklin, Lehmann, & Little, 1998; Eliashberg & Lilien, 1993; Leeflang & Wittink, 2000; Leeflang, Wittink, Wedel, & Naert, 2000; Li, Davies, Edwards, Kinman, & Duan, 2002).

This evolution in the methods used in marketing modelling has not been accidental. In this sense, Lilien, Kotler, and Moorthy (1992) noted that this tendency was to be expected as modellers and users needed techniques that were more flexible, powerful and robust, capable of providing greater and improved information with respect to the real systems being modelled. Of course, this implies a greater adaptation to both the characteristics of current databases (i.e. huge, imprecise, with data gathered in formats of a different nature: numerical, categorical, linguistic, etc.) and the type of decision problems to be supported by such models. Under these circumstances, an evolution of the marketing modelling methods towards systems based on artificial intelligence seems only logical (Shim et al., 2002; Wedel, Kamakura, & Böckenholt, 2000), which justifies the growing predominance of the knowledge-based MkMSS in the last two decades (Wierenga & Van Bruggen, 2000).

In sum, MkMSS clearly tend to be based on knowledge discovery methods that make use of diverse artificial intelligence methods to be applied during the machine learning process; e.g.: evolutionary algorithms, fuzzy logic, artificial neural networks, rule induction, decision trees, etc. Specifically, it is expected that the use of artificial intelligence methods in the MkMSS framework will evolve towards the use of intelligent systems based on the hybridization of these techniques (Carlsson & Turban, 2002; Shim et al., 2002). We like to call them Marketing Intelligent Systems. It might be the inexorable fate of marketing modelling methods. This fact, which is more evident from a professional perspective (i.e. under the framework of application of the MkMSS), has still to take hold in academic studies.

3. Knowledge extraction based on fuzzy rules and genetic algorithms

3.1. The KDD process

In general terms, KDD is a recent research field belonging to artificial intelligence whose main aim is the identification of new, potentially useful, and understandable patterns in data (Fayyad, Piatetsky-Shapiro, Smyth, & Uthurusamy, 1996). Furthermore, KDD implies the development of a process composed of several stages that allow the conversion of low-level data into high-level knowledge (Mitra, 2002). Though KDD is synthetically viewed as a three-stage process, i.e. pre-processing, data mining and post-processing (Freitas, 2002), we believe that, for our academic field, it is more interesting to present it within a wider structure. Specifically, we prefer the following five-stage process


(Cabena, Hadjinian, Stadler, Verhees, & Zanasi, 1998; Han & Kamber, 2001): (1) identification and problem delimitation; (2) data preparation (pre-processing); (3) data mining (machine learning); (4) analysis, evaluation and interpretation of results; and (5) presentation, assimilation and use of knowledge. It is important to highlight that the success of such a process, applied to solve or support the resolution of a particular problem of information in marketing, depends on the suitable development of every stage. The reader will be more conscious of this question when observing the lengths we go to in order to explain how to prepare marketing data (pre-processing) or how to analyse the output (knowledge) of the data mining stage (post-processing).

3.2. Knowledge representation by fuzzy rules

Nowadays, one of the most successful tools for the development of descriptive models is fuzzy modelling (Lindskog, 1997). This is an approach used to model a system making use of a descriptive language based on fuzzy logic with fuzzy predicates. The way to express fuzzy predicates is by means of IF–THEN rules, as in the following example:

IF Age_of_Consumer is Young and Purchasing_Power is Very_High

THEN Trend_To_Buy_Sports_Cars is High

These rules set logical relationships among variables of a system by using qualitative values. Such a representation mode easily matches the human way of reasoning. Hence, the performance of both the analysis and interpretation steps of the modelling process improves, because the true behaviour of the system is revealed more effectively. Nevertheless, it should be noted that though human reasoning may deal without difficulty with terms like high or young, when this issue is tackled by means of an automatic process its treatment is more complex.

To work properly with these kinds of qualitative valuations, linguistic variables (Zadeh, 1975a,b, 1976) based on both Fuzzy Sets Theory and Fuzzy Logic (Zadeh, 1965) are used, so the previously exemplified rule is known as a fuzzy rule. The use of fuzzy logic provides several benefits, such as a higher generality, expressive power, ability to model real problems and, last but not least, a methodology to exploit tolerance in the face of imprecision.

For example, we can consider the linguistic variable Age_of_Consumer, which could take the linguistic terms (values) teenager, young, adult, and old. These linguistic terms (also known as labels) are mathematically expressed by simple functions that return the membership degree (a real value between 0 and 1) to each fuzzy set. Therefore, instead of considering that a consumer could be 100% young or 100% adult, with fuzzy sets we can say that the consumer belongs to the set of young people with one degree and also to the set of adults with another degree. So, the boundaries between sets are fuzzy instead of crisp, thus providing a powerful linguistic expression and a gradual transition of the membership to the different fuzzy sets.

Fig. 1 represents an example of how the age of a person can be expressed by fuzzy sets. In this figure, we could say that a person of 21 years of age belongs to the fuzzy set labelled young to a degree of 0.55 (colloquially speaking, 55%), while a 27-year-old belongs entirely to the fuzzy set young, and a 37-year-old belongs to young people to a degree of 0.3 and also to adult people to a degree of 0.7. If we used classical (crisp) sets and fixed the boundary between young and adult at 35 years of age, a person aged 34.9 years would be considered 100% young while another aged 35.1 years would not be young to any degree.

Fig. 1. Illustrative example of the linguistic variable age, composed of the linguistic terms teenager, young, adult and old, and their corresponding fuzzy sets. A 37-year-old has a membership degree 0.3 to young and 0.7 to adult.

Fuzzy rules can be considered a useful representation of knowledge to discover intrinsic relationships contained in a database (Freitas, 2002). By means of fuzzy rules we can represent the relationship existing among different variables, thus deducing the patterns contained in the data examined. Useful patterns allow us to make non-trivial predictions about new data. There are two extremes to express a pattern: black boxes, whose internal behaviour is incomprehensible; and white boxes, whose construction reveals the pattern structure. The difference lies in whether the patterns generated are represented by a structure that is easy to examine and which can be used to reason and to inform further decisions. In other words, when the patterns are structured in a comprehensible way, they will be able to help explain something about the data. A well-known difficulty in KDD, the interpretability-accuracy trade-off, is also being tackled in current fuzzy modelling (Casillas et al., 2003a,b) and will be considered by our proposal.

The use of fuzzy rules when developing the knowledge discovery process has some advantages, which are (Freitas, 2002; Dubois, Prade, & Sudkamp, 2005): they allow us to deal with uncertain data; they adequately consider multi-variable relationships; results are easily understandable by humans; additional information is easily added by an expert; the accuracy degrees can be easily adapted to the needs of the problem; and the process can be highly automatic with low human intervention.

Therefore, we will use fuzzy logic as a tool to structure the information of a consumer behaviour model in a clear and intelligible way that is close to that of the human being. Fuzzy logic methods are expected to offer benefits to marketing decision makers when integrated with current MkMSS (Metaxiotis, Psarras, & Samouilidis, 2004). The fuzzy system will allow us to represent adequately the interdependence of variables and the non-linear relationships that could exist between them.

3.3. Multiobjective genetic algorithms

In the previous section, we introduced the proposed representation of knowledge based on fuzzy rules. However, we also need an algorithm to automatically extract a set of fuzzy rules with good properties. In this paper, we propose the use of a genetic algorithm. The main reasons for using it instead of other well-known machine learning techniques are the following. Firstly, since there are usually contradictory objectives to be optimised in KDD (such as accuracy and interpretability, or support and confidence), we perform multiobjective optimisation. It is one of the most promising issues and one of the main distinguishing features of genetic algorithms compared to other techniques. Furthermore, we consider a flexible representation of fuzzy rules that can be developed


properly by genetic algorithms. This flexible representation improves the description capability of the fuzzy rule, an important issue in KDD.

Genetic algorithms demonstrated good results for management and marketing applications, thus arousing the interest of researchers and practitioners in the nineties (Hurley, Moutinho, & Stephens, 1995; Nissen, 1995). However, one of the novelties of this paper for marketing is that, in this instance, fuzzy logic and genetic algorithms will not be applied separately to tackle a particular marketing problem, but in cooperation. In the following, genetic algorithms and multiobjective optimisation are briefly introduced.

3.3.1. Genetic algorithms

Genetic algorithms are general-purpose search algorithms that use principles inspired by natural population genetics to evolve solutions to problems. The basic principles of genetic algorithms were first laid down rigorously by Holland (1975) and are well described in many texts (e.g.: Goldberg, 1989; Michalewicz, 1996).

The basic idea is to maintain a population (i.e., a set) of knowledge structures that evolves over time through a process of competition and controlled variation. Each structure in the population represents a candidate solution to the specific problem and has an associated fitness to determine which structures are used to form new ones in the process of competition. The new individuals are created using genetic operators such as crossover and mutation. Genetic algorithms have had a great measure of success in search and optimisation problems. The main reason for this success is their ability to exploit accumulative information about an initially unknown search space in order to bias subsequent search into useful subspaces, i.e., their robustness. This is their key feature, especially in large, complex and poorly understood search spaces, where the classical search tools (enumerative, heuristic, etc.) are inappropriate, offering a valid approach to problems requiring efficient and effective search.

A genetic algorithm starts with a population of randomly generated solutions, chromosomes, and advances towards better solutions by applying genetic operators, modelled on the genetic processes occurring in nature. As previously mentioned, in these algorithms we maintain a population of solutions (in our case, fuzzy rules) for a given problem; this population undergoes evolution in a form of natural selection. In each generation, relatively good solutions reproduce to give offspring that replace the relatively bad solutions, which die. An evaluation or fitness function plays the role of the environment to distinguish between good and bad solutions. The process of evolving from the current population to the next one constitutes one generation in the execution of a genetic algorithm.

Although there are many possible variants of the basic genetic algorithm, the fundamental underlying mechanism involves three operations (Goldberg, 1989):

(1) evaluation of individual fitness,

(2) formation of a gene pool (intermediate population), and

(3) crossover and mutation.

Fig. 2 shows the structure of a simple GA.

Fig. 2. Structure of a genetic algorithm.

A fitness function must be devised for each problem to be solved. Given a particular chromosome (i.e. a solution), the fitness function returns a numerical value that is supposed to be proportional to the utility or adaptation of the solution represented by this chromosome. In our case, we will consider two different measures to assess the quality of a solution (fuzzy rule): support and confidence.

There are a number of ways to do selection. We might view the population as a mapping on a roulette wheel, where each individual is represented by a space that proportionally corresponds to its fitness. By repeatedly spinning the roulette wheel, individuals are chosen using stochastic sampling with replacement to fill the intermediate population. Another possibility, called binary tournament, consists of running a number of tournaments equal to the size of the population. In each tournament, two chromosomes of the old population are chosen at random, and the best one according to fitness is included in the new population. We will employ this second approach in our proposal.

After selection has been carried out, the construction of the intermediate population is completed and crossover and mutation can occur. The crossover operator combines the features of two parent structures to form two similar offspring. Classically, it is applied at a random position with a given probability, the crossover probability. The mutation operator arbitrarily alters one or more components of a selected structure so as to increase the structural variability of the population. Each position of each solution vector in the population undergoes a random change according to a probability defined by a mutation rate, the mutation probability.

Fig. 6 in Section 4 illustrates graphically the use of a genetic algorithm to extract fuzzy rules from available data in the marketing problem we are dealing with in this paper.

3.3.2. Multiobjective optimisation

Many real-world problems involve simultaneous optimisation of multiple objectives. In principle, multiobjective optimisation is very different from single-objective optimisation. The latter attempts to obtain the best solution; i.e. the global minimum or the global maximum depending on the problem. However, in the case of multiple objectives, there may not be a single solution that is better than the rest with respect to all objectives.

In a typical multiobjective optimisation problem, there is a set of solutions that are superior to the rest of the solutions in the search space when all the objectives are considered, but which are inferior to other solutions in the space in one or more objectives. These solutions are known as non-dominated solutions (Chankong & Haimes, 1983), while the rest of the solutions are known as dominated solutions. Since none of the solutions in the non-dominated set is worse in all the objectives than the other ones, all of them are acceptable solutions.

Mathematically, the concept of Pareto-optimality² or non-dominance is defined as follows. Let us consider, without loss of generality, a multiobjective maximization problem with m parameters (decision variables) and n objectives:

\[ \text{Maximise } f(x) = (f_1(x), f_2(x), \ldots, f_n(x)) \]

with x = (x1,x2,…,xm)∈X. A decision vector a∈X dominates b∈X (noted as a ⪯b) if, and only if:

\[ \forall i \in \{1, \ldots, n\} : f_i(a) \ge f_i(b) \quad \text{and} \quad \exists j \in \{1, \ldots, n\} : f_j(a) > f_j(b). \]

Any vector that is not dominated by any other is said to be Pareto-optimal or non-dominated. These concepts are depicted graphically in Fig. 3.

2 The concept Pareto optimality is an important notion in neoclassical economics. It is named after the French–Italian economist Vilfredo Pareto (1848, Paris–1923, Geneva).


Fig. 3. Example of multiobjective optimisation.
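The dominance relation of Section 3.3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all objectives are maximised, as in the definition, and the function names and the (support, confidence) pairs are hypothetical values chosen for the example, not results from the paper.

```python
from typing import List, Sequence, Tuple


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """True if objective vector fa dominates fb when every objective is maximised."""
    no_worse_anywhere = all(x >= y for x, y in zip(fa, fb))
    strictly_better_somewhere = any(x > y for x, y in zip(fa, fb))
    return no_worse_anywhere and strictly_better_somewhere


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (the Pareto set)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (support, confidence) pairs for five candidate rules.
rules = [(0.05, 0.33), (0.14, 0.44), (0.17, 0.67), (0.10, 0.90), (0.02, 0.95)]
front = non_dominated(rules)
print(front)
```

In this toy set the first two rules are dominated by (0.17, 0.67), while the last three trade support against confidence and are therefore all retained.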

Thanks to the use of a solution population, genetic algorithms can simultaneously search for many Pareto-optimum solutions. For this reason, genetic algorithms have been recognised as possibly being well suited to multiobjective optimisation (Coello, Van Veldhuizen, & Lamont, 2002). Furthermore, the ability to handle complex problems, involving features such as discontinuities, multimodality, disjoint feasible spaces, and noisy function evaluations, reinforces the potential effectiveness of genetic algorithms in multiobjective search and optimisation. Generally, the multiobjective approaches only differ from the rest of the genetic algorithms in the fitness function and/or in the selection operator.
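As a rough illustration of the simple (single-objective) genetic algorithm outlined in Section 3.3.1, the sketch below combines binary tournament selection, one-point crossover and bit-flip mutation. It is not the algorithm proposed in this paper: the bit-string chromosomes, the toy fitness (the number of ones in the string), the parameter values and all function names are assumptions made for the example.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[int]


def binary_tournament(pop: List[Chromosome], fitness: Callable, rng: random.Random) -> Chromosome:
    """Pick two chromosomes at random and return the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def one_point_crossover(p1: Chromosome, p2: Chromosome, rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Swap the tails of two parents at a random cut position."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mutate(c: Chromosome, pm: float, rng: random.Random) -> Chromosome:
    """Flip each gene independently with mutation probability pm."""
    return [1 - g if rng.random() < pm else g for g in c]


def run_ga(fitness: Callable, n_genes: int = 20, pop_size: int = 30,
           generations: int = 40, pc: float = 0.6, pm: float = 0.02, seed: int = 0) -> Chromosome:
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation and formation of the intermediate population.
        pool = [binary_tournament(pop, fitness, rng) for _ in range(pop_size)]
        nxt: List[Chromosome] = []
        for i in range(0, pop_size, 2):
            p1, p2 = pool[i], pool[(i + 1) % pop_size]
            # Crossover is applied with probability pc; otherwise parents are copied.
            if rng.random() < pc:
                c1, c2 = one_point_crossover(p1, p2, rng)
            else:
                c1, c2 = p1[:], p2[:]
            nxt += [mutate(c1, pm, rng), mutate(c2, pm, rng)]
        pop = nxt[:pop_size]
    return max(pop, key=fitness)


best = run_ga(fitness=sum)  # toy fitness: count of ones in the chromosome
print(best, sum(best))
```

A multiobjective variant would, as noted above, differ mainly in how fitness is assigned and how individuals are selected.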

4. An illustrative example on how to extract knowledge from data to analyse consumer behaviour

This section serves as a bridge between the technical concepts included in the previous section and the modelling methodology proposed in the next one. To introduce the reader to the methodology, we present a toy problem (with a few variables and a small data set) that illustrates the basic behaviour and the potential of the proposed KDD process for extracting knowledge that helps to understand the existing relationships between variables. Some parts of the process have been intentionally simplified with the aim of focusing on the most relevant aspects. The rigorous description of the proposal can be found in Section 5, while Section 6 amply describes the experimental results in a real-world problem.

To illustrate the proposed use of KDD, we will consider the simple measurement (causal) model depicted in Fig. 4(a), composed of three constructs or latent variables (depicted by circles), two exogenous and one endogenous: (1) fashion consciousness, (2) conservatism, and (3) hedonism; extracted from MacLean and Gray (1998). Likewise, imagine that the three constructs have been measured by means of several seven-point interval scales (e.g. Likert-type and semantic differential scales). Finally, Fig. 4(b) shows an example of a data set available for this problem, which consists of three variables, each made up of a set of values. There are just four cases (e.g., questionnaires), which is not realistic at all (a consumer database usually contains hundreds or even thousands of individual responses), though it is useful for our illustrative purpose.

Fig. 4. Example of a simple measurement (causal) model, extracted from MacLean and Gray (1998), and a data set from four hypothetical consumers' responses.

The first step we perform is to transform the interval scale into fuzzy semantics. This allows us to use linguistic terms to describe the different items by means of linguistic variables. We can consider the following three membership functions to describe the terms Low, Medium, and High:

\[
\mu_{Low}(x) =
\begin{cases}
\dfrac{4 - x}{3} & \text{if } 1 \le x \le 4 \\
0 & \text{otherwise}
\end{cases}
\]

\[
\mu_{Medium}(x) =
\begin{cases}
\dfrac{x - 1}{3} & \text{if } 1 \le x \le 4 \\
\dfrac{7 - x}{3} & \text{if } 4 < x \le 7 \\
0 & \text{otherwise}
\end{cases}
\]

\[
\mu_{High}(x) =
\begin{cases}
\dfrac{x - 4}{3} & \text{if } 4 \le x \le 7 \\
0 & \text{otherwise}
\end{cases}
\]

A graphical representation of these membership functions is depicted in Fig. 5.

Fig. 5. Example of transformation of a seven-point Likert-type scale into a fuzzy semantic. According to that, the membership degree of 5 to the fuzzy set associated to the linguistic term Medium is 0.67, while the membership degree of 6 is 0.33.
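The transformation of the seven-point scale into the terms Low, Medium and High can be sketched as follows. This is a minimal illustration that assumes triangular membership functions on the interval [1, 7] with the peak of Medium at 4, which is consistent with the degrees quoted in the caption of Fig. 5 (0.67 for a score of 5 and 0.33 for a score of 6); the exact shape of High and the function names are assumptions.

```python
def mu_low(x: float) -> float:
    """Membership of score x in Low: 1 at x = 1, falling linearly to 0 at x = 4."""
    return (4 - x) / 3 if 1 <= x <= 4 else 0.0


def mu_medium(x: float) -> float:
    """Membership of score x in Medium: rises from 1 to 4, falls from 4 to 7."""
    if 1 <= x <= 4:
        return (x - 1) / 3
    if 4 < x <= 7:
        return (7 - x) / 3
    return 0.0


def mu_high(x: float) -> float:
    """Membership of score x in High (assumed mirror image of Low)."""
    return (x - 4) / 3 if 4 <= x <= 7 else 0.0


# Membership degrees of each point of the seven-point scale.
for score in range(1, 8):
    print(score, round(mu_low(score), 2), round(mu_medium(score), 2), round(mu_high(score), 2))
```

With these definitions the three degrees of any score between 1 and 7 add up to 1, so each answer is shared between at most two neighbouring terms.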

Once we have defined the variables in terms of fuzzy sets, we can use fuzzy rules to express relationships (i.e., patterns) among the variables (refer to Section 3.2 for a description of these kinds of rules). To do that, we will consider the two exogenous variables as antecedents and the endogenous one as the consequent in this example.

These fuzzy rules can represent many different relationships among the variables; however, not all of them will match the existing data exactly. Therefore, we need some measures to assess the quality of each rule with respect to the data. These measures can be considered a kind of statistical computation. In this paper, we will consider two important values: support and confidence. On the one hand, support (whose real value is in [0,1]) gives us an idea of the degree to which the rule represents the cases of the data set. For example, a support of 0.25 could be understood as the rule covering 25% of the available cases. We are interested in obtaining fuzzy rules with a support as high as possible, since the rule will be more general and will represent a larger portion of the sample. On the other hand, confidence (whose real value is also in [0,1]) indicates how accurate the fuzzy rule is. Since the fuzzy rule predicts a relationship between the antecedent and the consequent, we need to know the degree to which such a prediction holds in the available data set. For example, if a fuzzy rule has a confidence of 0.9, we can say that, according to the available data, the fuzzy rule is 90% true. Of course, we are interested in obtaining fuzzy rules with a high degree of confidence.

As one can imagine, support and confidence are two conflicting features: the higher the degree of representation, the more difficult it is to express the relationships among variables accurately. One fuzzy rule will be clearly preferable to another if the former has higher values of both support and confidence.

In the following, we will show some examples of fuzzy rules and the computation of the corresponding support and conﬁdence values from

the data set of Fig. 4(b).

R1: If Fashion_Consciousness is LOW and Conservatism is MEDIUM then Hedonism is MEDIUM

Case 1:

\[ \mu_{Low}(\vec{x}_1^{(1)}) = \max\{\mu_{Low}(1), \mu_{Low}(2), \mu_{Low}(1)\} = \max\{1, 0.67, 1\} = 1 \]

\[ \mu_{Medium}(\vec{x}_2^{(1)}) = \max\{\mu_{Medium}(5), \mu_{Medium}(6)\} = \max\{0.67, 0.33\} = 0.67 \]

\[ \mu_{A^{(1)}}(\vec{x}^{(1)}) = \min\{\mu_{Low}(\vec{x}_1^{(1)}), \mu_{Medium}(\vec{x}_2^{(1)})\} = \min\{1, 0.67\} = 0.67 \]

\[ \mu_{Medium}(\vec{y}^{(1)}) = \max\{\mu_{Medium}(1), \mu_{Medium}(2)\} = \max\{0, 0.33\} = 0.33 \]

Case 2:

\[ \mu_{Low}(\vec{x}_1^{(2)}) = 0; \quad \mu_{Medium}(\vec{x}_2^{(2)}) = 0.33; \quad \mu_{A^{(1)}}(\vec{x}^{(2)}) = 0; \quad \mu_{Medium}(\vec{y}^{(2)}) = 0.33 \]

Case 3:

\[ \mu_{Low}(\vec{x}_1^{(3)}) = 0; \quad \mu_{Medium}(\vec{x}_2^{(3)}) = 0.33; \quad \mu_{A^{(1)}}(\vec{x}^{(3)}) = 0; \quad \mu_{Medium}(\vec{y}^{(3)}) = 0 \]

Case 4:

\[ \mu_{Low}(\vec{x}_1^{(4)}) = 0; \quad \mu_{Medium}(\vec{x}_2^{(4)}) = 0.67; \quad \mu_{A^{(1)}}(\vec{x}^{(4)}) = 0; \quad \mu_{Medium}(\vec{y}^{(4)}) = 0.67 \]

\[ Support(R_1) = \frac{1}{4} \sum_{e=1}^{4} \mu_{A^{(1)}}(\vec{x}^{(e)}) \cdot \mu_{B^{(1)}}(\vec{y}^{(e)}) = \frac{0.67 \cdot 0.33 + 0 + 0 + 0}{4} = 0.05556 \]

\[ Confidence(R_1) = \frac{\sum_{e=1}^{4} \mu_{A^{(1)}}(\vec{x}^{(e)}) \cdot \max\{1 - \mu_{A^{(1)}}(\vec{x}^{(e)}), \mu_{B^{(1)}}(\vec{y}^{(e)})\}}{\sum_{e=1}^{4} \mu_{A^{(1)}}(\vec{x}^{(e)})} = \frac{0.67 \cdot \max\{1 - 0.67, 0.33\} + 0 + 0 + 0}{0.67 + 0 + 0 + 0} = 0.33333 \]
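The support and confidence of rule R1 can be checked with a short script. The sketch below assumes the definitions used in the worked example: support is the mean over the cases of the product of the antecedent and consequent degrees, and confidence weights max{1 − antecedent, consequent} by the antecedent degree. The function names are hypothetical, the rounded degrees 0.67 and 0.33 are taken as 2/3 and 1/3, and returning 0 when no case matches the antecedent is a convention of this sketch.

```python
from typing import Sequence


def support(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Mean over all cases of (antecedent degree * consequent degree)."""
    return sum(a * c for a, c in zip(antecedent, consequent)) / len(antecedent)


def confidence(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Antecedent-weighted average of max{1 - antecedent, consequent}."""
    total = sum(antecedent)
    if total == 0:
        return 0.0  # assumption of this sketch: an unmatched rule gets confidence 0
    return sum(a * max(1 - a, c) for a, c in zip(antecedent, consequent)) / total


# Per-case degrees for R1 as listed in the worked example (four cases).
r1_antecedent = [2 / 3, 0.0, 0.0, 0.0]
r1_consequent = [1 / 3, 1 / 3, 0.0, 2 / 3]

print(round(support(r1_antecedent, r1_consequent), 5))     # 0.05556
print(round(confidence(r1_antecedent, r1_consequent), 5))  # 0.33333
```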


R2: If Fashion_Consciousness is MEDIUM and Conservatism is MEDIUM then Hedonism is MEDIUM

\[ \mu_{Medium}(\vec{x}_1^{(1)}) = 0.33; \quad \mu_{A^{(2)}}(\vec{x}^{(1)}) = 0.33 \qquad \mu_{Medium}(\vec{x}_1^{(2)}) = 0; \quad \mu_{A^{(2)}}(\vec{x}^{(2)}) = 0 \]

\[ \mu_{Medium}(\vec{x}_1^{(3)}) = 0; \quad \mu_{A^{(2)}}(\vec{x}^{(3)}) = 0 \qquad \mu_{Medium}(\vec{x}_1^{(4)}) = 1; \quad \mu_{A^{(2)}}(\vec{x}^{(4)}) = 0.67 \]

\[ Support(R_2) = \frac{0.33 \cdot 0.33 + 0 + 0 + 0.67 \cdot 0.67}{4} = 0.13889 \]

\[ Confidence(R_2) = \frac{0.33 \cdot \max\{1 - 0.33, 0.33\} + 0 + 0 + 0.67 \cdot \max\{1 - 0.67, 0.67\}}{0.33 + 0 + 0 + 0.67} = 0.44445. \]

As we can observe, using the linguistic term “medium” for the fashion consciousness variable instead of “low” (as in rule R1) allows us to cover the data set better and, at the same time, to improve the accuracy of the rule.

R3: If Fashion_Consciousness is MEDIUM and Conservatism is {LOW or MEDIUM} then Hedonism is MEDIUM

\[ \mu_{Low\ or\ Medium}(\vec{x}_2^{(1)}) = \min\{1, \mu_{Low}(\vec{x}_2^{(1)}) + \mu_{Medium}(\vec{x}_2^{(1)})\} = 0.67; \quad \mu_{A^{(3)}}(\vec{x}^{(1)}) = 0.33 \]

\[ \mu_{Low\ or\ Medium}(\vec{x}_2^{(2)}) = 1; \quad \mu_{A^{(3)}}(\vec{x}^{(2)}) = 0.33 \]

\[ \mu_{Low\ or\ Medium}(\vec{x}_2^{(3)}) = 1; \quad \mu_{A^{(3)}}(\vec{x}^{(3)}) = 0.33 \]

\[ \mu_{Low\ or\ Medium}(\vec{x}_2^{(4)}) = 1; \quad \mu_{A^{(3)}}(\vec{x}^{(4)}) = 0.67 \]

\[ Support(R_3) = 0.16667 \]

\[ Confidence(R_3) = 0.66667. \]
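The way an antecedent is matched against one case can also be sketched in code. The example below uses the item scores of the first case that appear in the worked computation for R1 (fashion consciousness items 1, 2, 1 and conservatism items 5, 6) and assumes the operators used there: the maximum over the items of a variable, the bounded sum min{1, ·} for terms joined by "or", and the minimum across variables. The membership functions are the assumed triangular ones on the scale 1 to 7, and all identifiers are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

# Assumed triangular terms, valid for scores between 1 and 7.
TERMS = {
    "low": lambda x: max(0.0, (4 - x) / 3),
    "medium": lambda x: max(0.0, min((x - 1) / 3, (7 - x) / 3)),
    "high": lambda x: max(0.0, (x - 4) / 3),
}


def item_degree(term: str, items: Iterable[float]) -> float:
    """Degree of a term for a multi-item variable: maximum over its items."""
    return max(TERMS[term](v) for v in items)


def variable_degree(terms: Sequence[str], items: Sequence[float]) -> float:
    """Terms joined by 'or' are combined with the bounded sum min{1, sum}."""
    return min(1.0, sum(item_degree(t, items) for t in terms))


def antecedent_degree(conditions: List[Tuple[Sequence[str], Sequence[float]]]) -> float:
    """Conditions on different variables are combined with the minimum."""
    return min(variable_degree(terms, items) for terms, items in conditions)


# Case 1 of the worked example.
fashion, conservatism = (1, 2, 1), (5, 6)

r1 = antecedent_degree([(["low"], fashion), (["medium"], conservatism)])
r2 = antecedent_degree([(["medium"], fashion), (["medium"], conservatism)])
r3 = antecedent_degree([(["medium"], fashion), (["low", "medium"], conservatism)])
print(round(r1, 2), round(r2, 2), round(r3, 2))  # 0.67 0.33 0.33
```

For this case the degrees agree with the values listed for the antecedents of R1, R2 and R3.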

This third rule includes two linguistic terms in the variable conservatism. Doing that, the support is higher since we can cover the data set to a

higher degree compared to rule R2 (it is obvious since R3 is more general than R2). Moreover, the conﬁdence is also improved, so this third rule is

clearly better than the previous ones.

Having shown some examples of fuzzy rules and how to compute their associated support and confidence values from a data set, we now illustrate, in simplified form, how the data mining process works. Fig. 6 depicts a scheme of the behaviour of a genetic algorithm used to reveal fuzzy rules from data. The genetic algorithm, as explained in Section 3.3.1, optimises the population generation by generation; in our case, the population is a set of different fuzzy rules, i.e., patterns. To analyse alternative fuzzy rules, new ones are generated from the existing ones by applying the crossover and mutation operators. The genetic algorithm encodes the rules in a format that is easily tractable in a computer, in this case by using a binary representation. In the example of Fig. 6, the mutation takes a solution from the current population and applies a slight alteration; in this case, it changes the linguistic term used in the first variable from “low” to “medium.” The newly generated rule is included in the next population since its corresponding values of support and confidence are better. In another example, the crossover takes two solutions and combines them by generating a new rule that contains the linguistic terms considered in each parent rule. This new rule, better than its parents, is included in the new population.
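The binary representation and the two operators described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the names `TERMS`, `encode`, `decode`, `shift` and `two_point_crossover` are hypothetical, only the antecedent part is encoded, three linguistic terms per variable are assumed, and the mutation position and cross points are passed explicitly (instead of being drawn at random) so that the example is reproducible. Rule R1 is assumed to differ from R2 only in the term of the first variable.

```python
from typing import List, Set, Tuple

TERMS = ("low", "medium", "high")  # assumed three-term fuzzy partition


def encode(antecedent: List[Set[str]]) -> List[int]:
    """One bit per (variable, term); 1 means the term is used in the rule."""
    return [int(term in used) for used in antecedent for term in TERMS]


def decode(bits: List[int]) -> List[Set[str]]:
    """Inverse of encode: recover the set of terms used by each variable."""
    n = len(TERMS)
    return [{TERMS[k] for k in range(n) if bits[start + k]}
            for start in range(0, len(bits), n)]


def shift(bits: List[int], pos: int, step: int) -> List[int]:
    """Switch off the active gene at `pos` and switch on its neighbour."""
    n = len(TERMS)
    target = pos + step
    if bits[pos] != 1 or not 0 <= target < len(bits) or target // n != pos // n:
        raise ValueError("shift must move an active term within its variable")
    child = list(bits)
    child[pos], child[target] = 0, 1
    return child


def two_point_crossover(a: List[int], b: List[int], i: int, j: int
                        ) -> Tuple[List[int], List[int]]:
    """Exchange the central segment [i, j) of the two parents."""
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


r1 = encode([{"low"}, {"medium"}])       # assumed antecedent of R1
r2 = shift(r1, pos=0, step=1)            # first variable: low -> medium
other = encode([{"medium"}, {"low"}])    # hypothetical second parent
child, _ = two_point_crossover(r2, other, 3, 4)

print(r2)     # [0, 1, 0, 0, 1, 0]
print(child)  # [0, 1, 0, 1, 1, 0]
```

Here `child` decodes to "Fashion_Consciousness is MEDIUM and Conservatism is {LOW or MEDIUM}", the antecedent of R3. The second offspring of this crossover leaves the conservatism variable without any term, so a full implementation would need a repair step for such cases.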

5. A marketing intelligent system for consumer behaviour analysis

This section introduces the process we propose for performing knowledge discovery related to consumers by fuzzy rules. Basically, it consists of preparing the data and of fixing the scheme we follow to represent the knowledge existing in the data. Once these aspects are defined, a machine learning method is used to automatically extract interesting fuzzy rules. Finally, a post-processing stage is carried out. All these questions are now presented in detail.

5.1. Data gathering

The first step is to collect the data related to the variables defining the theoretical model of the consumer behaviour proposed. In this sense, as has been done traditionally in Marketing Science in particular, and in Social Sciences in general, data is obtained by means of a questionnaire. This questionnaire gathers the measures for the set of constituent elements of the model.

5.2. Data processing

Next, it is necessary to adapt the collected data to a scheme easily tractable by fuzzy rule learning methods. Thus, at first, attention should be paid to how modellers face and develop the measurement process of the elements/variables contained in the complex behavioural models. In this respect, reflections about the measurement of such variables, with a special focus on those usually known as theoretical constructs (i.e. unobserved variables), should be made. Consequently, we think that time should be spent analysing the adaptation of the fuzzy rule-based KDD to the latter case, inasmuch as its treatment seems to be the more controversial.

The measurement approaches for these latent variables in consumer modelling have been classified into two groups, depending on whether they hold that these constructs can or cannot be perfectly measured by means of observed variables (indicators): the operational definition philosophy and the partial interpretation philosophy, respectively. The latter approach, currently predominant in the marketing modelling discipline, recognises the impossibility of doing perfect measurements of theoretical constructs by means of indicators, so it poses joint consideration of multiple indicators (imperfect when considered individually, though reliable when considered together) of the underlying construct to obtain valid measures (Steenkamp & Baumgartner, 2000).

Therefore, our methodological approach should be aware of this question when adapting the data (observed variables) to a fuzzy rule learning method. Notwithstanding, we would like to highlight that our method does not have any problem with processing elements of a model for which we have just a single variable or indicator associated to each of them, even when they have been measured by varied measurement scales. The problem comes, hence the challenge to face, when there are multiple variables related to the measurement of a particular element of the model. Some intuitive solutions and aprioristic analyses of the internal consistency of the multi-item scales associated to such elements have been proposed, with the aim of keeping just

F.J. Martínez-López, J. Casillas / Industrial Marketing Management 38 (2009) 714–731


Fig. 6. A simplified example of the behaviour of a genetic algorithm when extracting knowledge in form of fuzzy rules from the data set available in Fig. 4(b).

one indicator (the best) per construct (see: Casillas, Martínez-López, & Martínez, 2004). The weakness of these approaches is that the data must be transformed, so relevant information may be lost.

We propose a solution based on a more sophisticated process that allows working with the original format without any pre-processing stage (Martínez-López & Casillas, 2007): the multi-item fuzzification. Thus, a T-conorm operator (e.g., maximum), traditionally used in fuzzy logic to develop the union of fuzzy sets, can be applied to aggregate the partial information given by each item. Since it is not pre-processing data but a component of the machine learning design, the details of that treatment of the items are described in Section 5.4.2.

5.3. Representation and inclusion of the marketing expert's knowledge

Several issues should be tackled at this step of our methodological proposal: the set of variables/constructs to be processed, the transformation of the marketing scales used for measuring such variables into fuzzy semantic, the relations among constructs (i.e. the causal model), and the fuzzy rules' sets to be generated. All of them are based on the expert's capability to express his knowledge in a humanly understandable format by fuzzy logic.

5.3.1. Fuzzy semantics from expert knowledge

Once the marketing modeller has finally determined both the elements of the model and the observed variables associated to each one (i.e. the measurement model), a transformation into linguistic terms (fuzzy semantic) of the original marketing scales used for measuring those observed variables should be done. This is necessary for the derivation of fuzzy rules later. This question implies treating the application of the fuzzy set theory to the measurement in marketing. In this regard, as far as we know, Viswanathan, Bergen, Dutta, and Childers (1996) were the ones who first researched this question by proposing a methodology for the scale development in marketing. In any case, as this is not the central theme of this paper, we are not going to treat this issue in depth, though it is thoroughly analysed in the research that supports this study.

Several marketing scale types can be used to measure the variables associated to the constituent elements of a consumer behaviour model. With the aim of focusing the problem, we take Stevens (1946, 1959) as a base to summarize them in four categories with regard to their level of measurement, i.e. nominal, ordinal, interval and ratio. Considering those types, a transformation into fuzzy semantic is meaningful for the majority with the exception of variables measured by means of a nominal scale, where the nature of categories defining the scale


are purely deterministic. In general terms, this transformation should be practiced taking into account two main questions:

a) The number of linguistic terms to be used, which determines the granularity (the scale sensitivity) of a certain fuzzy variable, must be defined. As more terms are used, the analysis of relations among variables becomes more accurate, but also more complex. Consequently, the marketing modeller should take time to think about what the most convenient degree of sensitivity is in the fuzzy scales used in his/her study. Three or five linguistic terms (fuzzy sets) seem good options.

b) The membership function type and shapes defining the behaviour of a certain fuzzy variable should be also defined. Such behaviour can be broadly treated considering the use of linear vs. non-linear membership functions to characterise the fuzzy sets. Thus, trapezoidal and triangular functions can be used to obtain a linear behaviour, while Gaussian functions can be used for a non-linear one.

We are now going to focus on those marketing scales mainly used for measuring the observed variables related to the elements (theoretical constructs) of a particular marketing model; i.e.: Likert-type and semantic differential. Firstly, we have considered that it is more appropriate to use linear functions, inasmuch as it facilitates the interpretation of relations later. Second, we believe that a transformation into a triangular function is more convenient if special characteristics of these marketing scales are considered; scale valuations are punctual. Then, when the membership degree of a certain linguistic term is equal to one, such a term should be associated to a point of the scale. In this regard, this choice has also been justified in the marketing context, with the argument that trapezoidal functions facilitate the later process of fuzzy inference (Li et al., 2002).

To sum up, Fig. 5 shows an example based on the transformation of a seven-point rating scale into a three-triangular fuzzy semantic, with the three linguistic terms (Low, Medium, and High) represented by the corresponding fuzzy sets characterised by the three membership functions shown in Section 4.

5.3.2. Input/output linguistic variables from expert knowledge

Once the causal model has been fixed by the marketing expert, fuzzy rules are used to relate input (antecedents) with output (consequents) variables. Obviously, the theoretic relations defining the model can be directly used to define the IF-THEN structures by considering the dependences shown among the variables. Thus, we obtain a set of fuzzy rules for each considered consequent (i.e. endogenous element of the model) and its respective set of antecedents. Several examples of fuzzy rules from the model included in Fig. 4(a) can be found in Section 4.

5.4. Machine learning (data mining process)

5.4.1. Fuzzy rule structure

In data mining, it is crucial to use a learning process with a high degree of interpretability preservation. To do that, we can opt for using a compact description such as the disjunctive normal form. This kind of fuzzy rule structure has the following form (González & Pérez, 1998):

R: IF $X_1$ is $\tilde{A}_1$ and … and $X_n$ is $\tilde{A}_n$ THEN $Y_1$ is $B_1$ and … and $Y_m$ is $B_m$

with each input variable $X_i$, $i \in \{1, \ldots, n\}$, taking as a value a set of linguistic terms $\tilde{A}_i = \{A_{i1} \text{ or } \ldots \text{ or } A_{in_i}\}$, whose members are joined by a disjunctive (T-conorm) operator, while the output variables $Y_j$, $j \in \{1, \ldots, m\}$, remain usual linguistic variables with single labels associated. We use the bounded sum as T-conorm in this paper:

\[
\mu_{\tilde{A}_i}(x) = \min\left\{1,\ \sum_{k=1}^{n_i} \mu_{A_{ik}}(x)\right\}.
\]

This structure uses a more compact description that improves interpretability. Moreover, the structure is a natural support to allow for the absence of some input variables in each rule (simply by making $\tilde{A}_i$ the whole set of linguistic terms available).

5.4.2. Multi-item fuzzification

In order to properly consider the set of items available for each input/output variable (as discussed in Section 5.2), we propose an extension of the membership degree computation, the so-called multi-item fuzzification. The process is based on a union of the partial information provided by each item. Given $X_i$ and $Y_j$ measured by the vectors of items $\vec{x}_i = (x_1^{(i)}, \ldots, x_{h_i}^{(i)}, \ldots, x_{p_i}^{(i)})$ and $\vec{y}_j = (y_1^{(j)}, \ldots, y_{t_j}^{(j)}, \ldots, y_{q_j}^{(j)})$, respectively, the fuzzy propositions “$X_i$ is $\tilde{A}_i$” and “$Y_j$ is $B_j$” are respectively interpreted as follows:

\[
\mu_{\tilde{A}_i}(\vec{x}_i) = \max_{h_i = 1}^{p_i} \mu_{\tilde{A}_i}\big(x_{h_i}^{(i)}\big)
\quad \text{and} \quad
\mu_{B_j}(\vec{y}_j) = \max_{t_j = 1}^{q_j} \mu_{B_j}\big(y_{t_j}^{(j)}\big).
\]

Therefore, the T-conorm of maximum is considered to interpret the disjunction of items.

5.4.3. Discovery process

In order to perform descriptive induction we will apply a method with some similarities to subgroup discovery, widely used in learning classification rules (Lavrac, Cestnik, Gamberger, & Flach, 2004), where the interest property is the class associated to the consequent variable. Therefore, this technique seeks to group the set of data into different subgroups, including in each of them the example set given by the corresponding consequent, and to discover a set of rules representing this subgroup. In that case, the most usual approach involves running the algorithm once for each subset of examples holding the property fixed for the consequent.

Instead of that, our algorithm considers the subgroup division according to the fuzzy set used in the consequent; therefore, the subsets of examples can overlap. Moreover, we propose performing a simultaneous subgroup discovery where niches of fuzzy rules, in accordance with the consequent, are formed and optimised in parallel to generate a final set of suboptimal solutions in each subgroup. To perform this process, as explained in the following sections, we vary the concept of multiobjective dominance and we design the genetic operators for acting only on the antecedent part.

5.4.4. Coding scheme

Each individual of the population represents a fuzzy rule. The rule is encoded by a binary string for the antecedent part and an integer coding scheme for the consequent part. The antecedent part has a size equal to the sum of the number of linguistic terms used in each input variable. The allele ‘1’ means that the corresponding linguistic term is used in the corresponding variable. The consequent part has a size equal to the number of output variables. In that part, each gene contains the index of the linguistic term used for the corresponding output variable.

For example, assuming we have three linguistic terms (S [small], M [medium], and L [large]) for each input/output variable, the fuzzy rule [IF X1 is S and X2 is {M or L} THEN Y is M] is encoded as [100|011||2].

5.4.5. Objective functions

We consider the two criteria most often used to assess the quality of association rules (Dubois et al., 2005): support and confidence. In Section 4, the reader can see some examples of how these measures are computed.

(1) Support: This objective function measures the representation degree of the corresponding fuzzy rule among the available data. It is computed as the mean covering degree of the rule for each data. As covering, we consider the conjunction of the


membership degrees of both antecedent and consequent variables. Therefore, the support measure (for maximization) of the fuzzy rule R: A ⇒ B is defined as follows:

\[
\mathrm{Sup}(R) = \frac{1}{N} \sum_{e=1}^{N} \mu_A\big(\vec{x}^{(e)}\big) \cdot \mu_B\big(\vec{y}^{(e)}\big)
\]

with $N$ being the data set size, $\vec{x}^{(e)} = (\vec{x}_1^{(e)}, \ldots, \vec{x}_n^{(e)})$ and $\vec{y}^{(e)}$ the $e$th input/output multi-item data instance, and

\[
\mu_A\big(\vec{x}^{(e)}\big) = \min_{i \in \{1, \ldots, n\}} \mu_{\tilde{A}_i}\big(\vec{x}_i^{(e)}\big)
\]

the covering degree of the antecedent of the rule R for each example (i.e., the T-norm minimum is considered to interpret the connective ‘and’ of the fuzzy rule). As shown, the T-norm of the product is used to join antecedent and consequent. Note that we use the multi-item fuzzification described in Section 5.4.2 to compute $\mu_{\tilde{A}_i}(\vec{x}_i^{(e)})$ and $\mu_B(\vec{y}^{(e)})$.

(2) Confidence: This second objective measures the reliability of the relation between antecedent and consequent described by the analysed fuzzy rule. We have used a confidence measure that avoids the accumulation of low cardinalities (Dubois et al., 2005). It is computed (for maximization) as follows:

\[
\mathrm{Conf}(R) = \frac{\sum_{e=1}^{N} \mu_A\big(\vec{x}^{(e)}\big) \cdot \max\Big\{1 - \mu_A\big(\vec{x}^{(e)}\big),\ \mu_B\big(\vec{y}^{(e)}\big)\Big\}}{\sum_{e=1}^{N} \mu_A\big(\vec{x}^{(e)}\big)}.
\]

Therefore, the Dienes' S-implication, $I(a, b) = \max\{1 - a, b\}$, is used. Note that this implication operator is a fuzzy interpretation of the classical interpretation $p \Rightarrow q \equiv \neg p \lor q$ used in Boolean logic, where the negation is interpreted as $1 - a$ and the disjunction as $\max\{a, b\}$. Multi-item fuzzification is again considered.

5.4.6. Evolutionary scheme

We consider a generational approach with the multiobjective elitist replacement strategy of NSGA-II (Deb, Pratap, Agarwal, & Meyarivan, 2002). Crowding distance in the objective function space is used. Binary tournament selection based on the non-domination rank (or the crowding distance when both solutions belong to the same front) is applied.

To perform simultaneous subgroup discovery properly, we need to redefine the dominance concept. Thus, one solution (fuzzy rule) dominates another when, besides being better or equal in all the objectives and better in at least one of them, it has the same consequent as the other rule. In that way, rules with different consequents do not dominate one another, thus inducing the algorithm to form a search niche (Pareto set) for each considered consequent (subgroup).

5.4.7. Genetic operators

The initial population is built by defining as many groups (all of the same size) as consequents are considered. In each of them, the chromosomes are generated by fixing the consequent and by randomly defining a simple antecedent in which each variable is assigned only one linguistic term. The two genetic operators (crossover and mutation) act only on the antecedent part. This allows the algorithm to keep a constant size for each subgroup.

The crossover operator randomly chooses two cross points (in the antecedent) and exchanges the central string of the two selected parents. If all the linguistic terms of a variable are set off after crossover, a linguistic term used in the parents is randomly chosen and set to ‘1’. It is interesting to note that no constraints are imposed on selecting the parents, so the crossover can be applied to parents with different consequents (i.e., belonging to different subgroups). It allows migrations between niches, thus improving the search process.

The mutation operator randomly selects an input variable of the fuzzy rule encoded in the chromosome and one of the three following possibilities is applied: expansion, which flips to ‘1’ a gene of the selected variable; contraction, which flips to ‘0’ a gene of the selected variable; or shift, which flips to ‘0’ a gene of the variable and flips to ‘1’ the gene immediately before or after it. The selection of one of these mechanisms is made randomly among the available choices (e.g., contraction cannot be applied if only one gene of the selected variable has the allele ‘1’). Note that it is always possible to perform at least one of these options.

6. Experimentation and knowledge interpretation

6.1. Marketing model and data source used for the experimentation

Regarding other published marketing-related studies that have
