
Applied Data Mining for Business and Industry [Giudici Figini] Wiley 2nd Ed 2009

Document Description
Applied Data Mining for Business and Industry, Second Edition PAOLO GIUDICI Department of Economics, University of Pavia, Italy SILVIA FIGINI Faculty of Economics, University of Pavia, Italy
Content Preview

Applied Data Mining
for Business and Industry
Second Edition
PAOLO GIUDICI
Department of Economics, University of Pavia, Italy
SILVIA FIGINI
Faculty of Economics, University of Pavia, Italy
A John Wiley and Sons, Ltd., Publication



This edition first published 2009
© 2009 John Wiley & Sons Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for
permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the
Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK
Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and
product names used in this book are trade names, service marks, trademarks or registered trademarks of their
respective owners. The publisher is not associated with any product or vendor mentioned in this book. This
publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is
sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice
or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Giudici, Paolo.
Applied data mining for business and industry / Paolo Giudici, Silvia Figini. - 2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-05886-2 (cloth) - ISBN 978-0-470-05887-9 (pbk.)
1. Data mining. 2. Business-Data processing. 3. Commercial statistics. I. Figini, Silvia. II. Title.
QA76.9.D343G75 2009
005.74068--dc22
2009008334
A catalogue record for this book is available from the British Library
ISBN: 978-0-470-05886-2 (Hbk)
ISBN: 978-0-470-05887-9 (Pbk)
Typeset in 10/12 Times-Roman by Laserwords Private Limited, Chennai, India
Printed and bound in Great Britain by TJ International, Padstow, Cornwall, UK


Document Outline

  • Applied Data Mining for Business and Industry
    • Contents
    • 1 Introduction
    • Part I Methodology
      • 2 Organisation of the data
        • 2.1 Statistical units and statistical variables
        • 2.2 Data matrices and their transformations
        • 2.3 Complex data structures
        • 2.4 Summary
      • 3 Summary statistics
        • 3.1 Univariate exploratory analysis
          • 3.1.1 Measures of location
          • 3.1.2 Measures of variability
          • 3.1.3 Measures of heterogeneity
          • 3.1.4 Measures of concentration
          • 3.1.5 Measures of asymmetry
          • 3.1.6 Measures of kurtosis
        • 3.2 Bivariate exploratory analysis of quantitative data
        • 3.3 Multivariate exploratory analysis of quantitative data
        • 3.4 Multivariate exploratory analysis of qualitative data
          • 3.4.1 Independence and association
          • 3.4.2 Distance measures
          • 3.4.3 Dependency measures
          • 3.4.4 Model-based measures
        • 3.5 Reduction of dimensionality
          • 3.5.1 Interpretation of the principal components
        • 3.6 Further reading
      • 4 Model specification
        • 4.1 Measures of distance
          • 4.1.1 Euclidean distance
          • 4.1.2 Similarity measures
          • 4.1.3 Multidimensional scaling
        • 4.2 Cluster analysis
          • 4.2.1 Hierarchical methods
          • 4.2.2 Evaluation of hierarchical methods
          • 4.2.3 Non-hierarchical methods
        • 4.3 Linear regression
          • 4.3.1 Bivariate linear regression
          • 4.3.2 Properties of the residuals
          • 4.3.3 Goodness of fit
          • 4.3.4 Multiple linear regression
        • 4.4 Logistic regression
          • 4.4.1 Interpretation of logistic regression
          • 4.4.2 Discriminant analysis
        • 4.5 Tree models
          • 4.5.1 Division criteria
          • 4.5.2 Pruning
        • 4.6 Neural networks
          • 4.6.1 Architecture of a neural network
          • 4.6.2 The multilayer perceptron
          • 4.6.3 Kohonen networks
        • 4.7 Nearest-neighbour models
        • 4.8 Local models
          • 4.8.1 Association rules
          • 4.8.2 Retrieval by content
        • 4.9 Uncertainty measures and inference
          • 4.9.1 Probability
          • 4.9.2 Statistical models
          • 4.9.3 Statistical inference
        • 4.10 Non-parametric modelling
        • 4.11 The normal linear model
          • 4.11.1 Main inferential results
        • 4.12 Generalised linear models
          • 4.12.1 The exponential family
          • 4.12.2 Definition of generalised linear models
          • 4.12.3 The logistic regression model
        • 4.13 Log-linear models
          • 4.13.1 Construction of a log-linear model
          • 4.13.2 Interpretation of a log-linear model
          • 4.13.3 Graphical log-linear models
          • 4.13.4 Log-linear model comparison
        • 4.14 Graphical models
          • 4.14.1 Symmetric graphical models
          • 4.14.2 Recursive graphical models
          • 4.14.3 Graphical models and neural networks
        • 4.15 Survival analysis models
        • 4.16 Further reading
      • 5 Model evaluation
        • 5.1 Criteria based on statistical tests
          • 5.1.1 Distance between statistical models
          • 5.1.2 Discrepancy of a statistical model
          • 5.1.3 Kullback–Leibler discrepancy
        • 5.2 Criteria based on scoring functions
        • 5.3 Bayesian criteria
        • 5.4 Computational criteria
        • 5.5 Criteria based on loss functions
        • 5.6 Further reading
    • Part II Business case studies
      • 6 Describing website visitors
        • 6.1 Objectives of the analysis
        • 6.2 Description of the data
        • 6.3 Exploratory analysis
        • 6.4 Model building
          • 6.4.1 Cluster analysis
          • 6.4.2 Kohonen networks
        • 6.5 Model comparison
        • 6.6 Summary report
      • 7 Market basket analysis
        • 7.1 Objectives of the analysis
        • 7.2 Description of the data
        • 7.3 Exploratory data analysis
        • 7.4 Model building
          • 7.4.1 Log-linear models
          • 7.4.2 Association rules
        • 7.5 Model comparison
        • 7.6 Summary report
      • 8 Describing customer satisfaction
        • 8.1 Objectives of the analysis
        • 8.2 Description of the data
        • 8.3 Exploratory data analysis
        • 8.4 Model building
        • 8.5 Summary
      • 9 Predicting credit risk of small businesses
        • 9.1 Objectives of the analysis
        • 9.2 Description of the data
        • 9.3 Exploratory data analysis
        • 9.4 Model building
        • 9.5 Model comparison
        • 9.6 Summary report
      • 10 Predicting e-learning student performance
        • 10.1 Objectives of the analysis
        • 10.2 Description of the data
        • 10.3 Exploratory data analysis
        • 10.4 Model specification
        • 10.5 Model comparison
        • 10.6 Summary report
      • 11 Predicting customer lifetime value
        • 11.1 Objectives of the analysis
        • 11.2 Description of the data
        • 11.3 Exploratory data analysis
        • 11.4 Model specification
        • 11.5 Model comparison
        • 11.6 Summary report
      • 12 Operational risk management
        • 12.1 Context and objectives of the analysis
        • 12.2 Exploratory data analysis
        • 12.3 Model building
        • 12.4 Model comparison
        • 12.5 Summary conclusions
    • References
    • Index
