Contents
List of Examples
List of Figures
List of Tables
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Related Documentation
Conventions
What's New in Oracle Data Mining?
Oracle Database 11g Release 2 (11.2.0.3) Oracle Data Mining
Oracle Database 11g Release 2 (11.2.0.2) Oracle Data Mining
Oracle Database 11g Release 1 (11.1) Oracle Data Mining
Part I Introductions
1
What Is Data Mining?
What Is Data Mining?
Automatic Discovery
Prediction
Grouping
Actionable Information
Data Mining and Statistics
Data Mining and OLAP
Data Mining and Data Warehousing
What Can Data Mining Do and Not Do?
Asking the Right Questions
Understanding Your Data
The Data Mining Process
Problem Definition
Data Gathering and Preparation
Model Building and Evaluation
Knowledge Deployment
2
Introducing Oracle Data Mining
Data Mining in the Database Kernel
Data Mining in Oracle Exadata
Data Mining Functions
Supervised Data Mining
Supervised Learning: Testing
Supervised Learning: Scoring
Unsupervised Data Mining
Unsupervised Learning: Scoring
Oracle Data Mining Functions
Data Mining Algorithms
Oracle Data Mining Supervised Algorithms
Oracle Data Mining Unsupervised Algorithms
Data Preparation
Automatic Data Preparation
Model Transparency
How Do I Use Oracle Data Mining?
Oracle Data Miner
PL/SQL Packages
SQL Scoring Functions
R-ODM
PMML Import
Oracle Spreadsheet Add-In for Predictive Analytics
Java API
Where Do I Find Information About Oracle Data Mining?
Oracle Data Mining Resources on the Oracle Technology Network
Oracle Data Mining and Oracle Database Analytics
3
Introducing Oracle Predictive Analytics
About Predictive Analytics
Predictive Analytics and Data Mining
How Does it Work?
Predictive Analytics Operations
Oracle Spreadsheet Add-In for Predictive Analytics
DBMS_PREDICTIVE_ANALYTICS
Example: PREDICT
Behind the Scenes
EXPLAIN
PREDICT
Accuracy
PROFILE
Part II Mining Functions
4
Regression
About Regression
How Does Regression Work?
Linear Regression
Multivariate Linear Regression
Regression Coefficients
Nonlinear Regression
Multivariate Nonlinear Regression
Confidence Bounds
Testing a Regression Model
Regression Statistics
Root Mean Squared Error
Mean Absolute Error
Regression Algorithms
5
Classification
About Classification
Testing a Classification Model
Confusion Matrix
Lift
Lift Statistics
Receiver Operating Characteristic (ROC)
The ROC Curve
Area Under the Curve
ROC and Model Bias
ROC Statistics
Biasing a Classification Model
Costs
Costs Versus Accuracy
Positive and Negative Classes
Assigning Costs and Benefits
Priors
Classification Algorithms
6
Anomaly Detection
About Anomaly Detection
One-Class Classification
Anomaly Detection for Single-Class Data
Anomaly Detection for Finding Outliers
Anomaly Detection Algorithm
7
Clustering
About Clustering
How are Clusters Computed?
Scoring New Data
Hierarchical Clustering
Rules
Support and Confidence
Evaluating a Clustering Model
Clustering Algorithms
8
Association
About Association
Association Rules
Market-Basket Analysis
Association Rules and eCommerce
Transactional Data
Association Algorithm
9
Feature Selection and Extraction
Finding the Best Attributes
About Feature Selection and Attribute Importance
Attribute Importance and Scoring
About Feature Extraction
Feature Extraction and Scoring
Algorithms for Attribute Importance and Feature Extraction
Part III Algorithms
10
Apriori
About Apriori
Association Rules and Frequent Itemsets
Antecedent and Consequent
Confidence
Data Preparation for Apriori
Native Transactional Data and Star Schemas
Items and Collections
Sparse Data
Calculating Association Rules
Itemsets
Frequent Itemsets
Example: Calculating Rules from Frequent Itemsets
Evaluating Association Rules
Support
Confidence
Lift
11
Decision Tree
About Decision Tree
Decision Tree Rules
Confidence and Support
Advantages of Decision Trees
XML for Decision Tree Models
Growing a Decision Tree
Splitting
Cost Matrix
Preventing Over-Fitting
Tuning the Decision Tree Algorithm
Data Preparation for Decision Tree
12
Generalized Linear Models
About Generalized Linear Models
GLM in Oracle Data Mining
Interpretability and Transparency
Wide Data
Confidence Bounds
Ridge Regression
Build Settings for Ridge Regression
Ridge and Confidence Bounds
Ridge and Variance Inflation Factor for Linear Regression
Ridge and Data Preparation
Tuning and Diagnostics for GLM
Build Settings
Diagnostics
Coefficient Statistics
Global Model Statistics
Row Diagnostics
Data Preparation for GLM
Data Preparation for Linear Regression
Data Preparation for Logistic Regression
Missing Values
Linear Regression
Coefficient Statistics for Linear Regression
Global Model Statistics for Linear Regression
Row Diagnostics for Linear Regression
Logistic Regression
Reference Class
Class Weights
Coefficient Statistics for Logistic Regression
Global Model Statistics for Logistic Regression
Row Diagnostics for Logistic Regression
13
k-Means
About k-Means
Oracle Data Mining Enhanced k-Means
Centroid
Scoring
Tuning the k-Means Algorithm
Data Preparation for k-Means
14
Minimum Description Length
About MDL
Compression and Entropy
Values of a Random Variable: Statistical Distribution
Values of a Random Variable: Significant Predictors
Total Entropy
Model Size
Model Selection
The MDL Metric
Data Preparation for MDL
15
Naive Bayes
About Naive Bayes
Advantages of Naive Bayes
Tuning a Naive Bayes Model
Data Preparation for Naive Bayes
16
Non-Negative Matrix Factorization
About NMF
Matrix Factorization
Scoring with NMF
Text Mining with NMF
Tuning the NMF Algorithm
Data Preparation for NMF
17
O-Cluster
About O-Cluster
Partitioning Strategy
Partitioning Numerical Attributes
Partitioning Categorical Attributes
Active Sampling
Process Flow
Scoring
Tuning the O-Cluster Algorithm
Data Preparation for O-Cluster
User-Specified Data Preparation for O-Cluster
18
Support Vector Machines
About Support Vector Machines
Advantages of SVM
Advantages of SVM in Oracle Data Mining
Usability
Scalability
Kernel-Based Learning
Active Learning
Tuning an SVM Model
Data Preparation for SVM
Normalization
SVM and Automatic Data Preparation
SVM Classification
Class Weights
One-Class SVM
SVM Regression
Part IV Data Preparation
19
Automatic and Embedded Data Preparation
Overview
The Case Table
Data Type Conversion
Date Data
Text Transformation
Business and Domain-Sensitive Transformations
Automatic Data Preparation
Binning
Normalization
Outlier Treatment
Transformations With Automatic Data Preparation
Embedded Data Preparation
Transformations Lists and Automatic Data Preparation
Oracle Data Mining Transformation Routines
Binning Routines
Normalization Routines
Routines for Outlier Treatment
Transparency
Internal Transformations
Part V Mining Unstructured Data
20
Text Mining
About Unstructured Data
How Oracle Data Mining Supports Unstructured Data
Mixed Data
Text Data Types
Text Mining Algorithms
Text Classification
Multi-Class Document Classification
Multi-Target Document Classification
Document Classification Algorithms
Text Clustering
Text Feature Extraction
Text Association
Text Attribute Importance
Preparing Text for Mining
Oracle Data Mining and Oracle Text
Glossary
Index