7 The Data Mining Sample Programs

You can learn a great deal about the Oracle Data Mining APIs from the Data Mining sample programs. The programs illustrate typical approaches to data preparation, algorithm selection, algorithm tuning, testing, and scoring. Each program creates a mining model in the database. All the programs include extensive inline comments to help you understand the code.

See Also:

Oracle Data Mining User's Guide for information about the Data Mining APIs

Note:

The Oracle Data Mining Java API is deprecated in this release.

The Java sample programs are still shipped, but Oracle recommends that you not use the Oracle Data Mining Java API in new applications. Support for deprecated features is for backward compatibility only.

Installation and Setup

The Data Mining sample programs are installed with Oracle Database Examples. They are also available for download from the Oracle Technology Network.

The programs require access to a database that includes the sample schemas. Before you can run the programs, you must run two configuration scripts to configure the data and assign the required privileges to your user ID.

Install the Sample Programs

Follow these steps to install the sample programs:

  1. Install Oracle Database with the sample schemas, or obtain access to a database that includes the sample schemas.

    • If you followed the instructions in "Install Oracle Database", the sample schemas are installed automatically in the starter database. Be sure to unlock the SH schema.

    • If the database does not include the sample schemas, you can install them manually or by using Oracle Database Configuration Assistant. See Oracle Database Sample Schemas for instructions.

  2. Determine whether or not Database Examples was installed with Oracle Database. Database Examples provides a set of sample programs that illustrate numerous features of Oracle Database, including Oracle Data Mining. The programs are loaded into the RDBMS/demo subdirectory of Oracle home.

    If Database Examples was not installed, you can perform the installation by following the instructions in "Optionally Install Oracle Database Examples". Alternatively, you can download the Data Mining sample programs from the Oracle Technology Network.

    http://www.oracle.com/technetwork/database/options/odm/index.html
    

Run the Configuration Scripts

Follow these steps to configure the sample data and grant the necessary privileges to your data mining user ID.

  1. Log in to SQL*Plus with system privileges.

        Enter user-name: sys / as sysdba
        Enter password: password
    
  2. If you do not have a user ID for your data mining activities, you can create one by following the instructions in "Example: Create a Database User in SQL*Plus".

  3. Run dmshgrants.sql to grant data mining privileges and SH access to your user ID. Several tables in SH are used by the Data Mining sample programs. Specify the data mining user name as the parameter. Specify the full path of the Oracle home directory.

     @ ORACLE_HOME\RDBMS\demo\dmshgrants dmuser
    
  4. Now connect to the database as the Data Mining user.

    CONNECT dmuser
    Enter password: password
    
  5. Run dmsh.sql to populate the schema of the Data Mining user with tables, views, and other objects needed by the sample programs. Specify the full path of the Oracle home directory.

    @ ORACLE_HOME\RDBMS\demo\dmsh
    COMMIT;
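
After both scripts have run, a few read-only queries can confirm that the setup took effect. The following is a minimal sketch, run as the data mining user. It assumes that dmshgrants.sql grants system privileges whose names contain MINING MODEL and that dmsh.sql creates the views described later in this chapter; adjust the filters if your release differs.

```sql
-- Run as the data mining user (dmuser in this chapter).

-- SESSION_PRIVS lists the system privileges active in the current session.
-- Assumption: the data mining privileges contain the words MINING MODEL.
SELECT privilege
  FROM session_privs
 WHERE privilege LIKE '%MINING MODEL%'
 ORDER BY privilege;

-- The views created by dmsh.sql should appear in USER_VIEWS.
SELECT view_name
  FROM user_views
 WHERE view_name LIKE 'MINING_DATA%'
    OR view_name = 'MARKET_BASKET_V'
 ORDER BY view_name;

-- A direct read confirms that the user can select from an SH table.
SELECT COUNT(*) AS sh_customers
  FROM sh.customers;
```

If the last query fails with a missing table or insufficient privileges error, rerun dmshgrants.sql as a privileged user and check that the SH schema is unlocked.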
    

Locate the Sample Programs

This section explains how to locate the sample programs if they were installed with Database Examples.

To locate the PL/SQL programs, navigate to the parent directory and search for the files that start with dm and end with .sql.

For example, if Database Examples was installed in Oracle home C:\app\demotest\product\11.2.0\db_1\, then navigate to C:\app\demotest\product\11.2.0\db_1\RDBMS\demo\ and use Windows Search to find the files named dm*.sql. Windows Search returns the list of Data Mining PL/SQL programs, as shown in Figure 7-1.

Figure 7-1 The Data Mining Sample PL/SQL Programs

Note:

The files listed in Figure 7-1 include all the Data Mining PL/SQL programs. However, one of the files, dmhpdemo.sql, is not a Data Mining program.

Use Windows Search to find the files named dm*.java in the same directory. Windows Search returns the Data Mining Java programs, as shown in Figure 7-2.

Figure 7-2 The Data Mining Sample Java Programs

Run the Sample Programs

You can run the sample programs as many times as you wish. The programs clean up the results of the previous run before executing the current run.

While the program is running, it displays the program code and the program output.
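
The cleanup that makes the programs rerunnable can be sketched as follows. This is an illustration of the pattern, not a copy of any shipped program: the model name NB_SKETCH_MODEL and the table name NB_SKETCH_SETTINGS are hypothetical, and the exception handlers simply ignore the error raised when the object does not exist yet.

```sql
-- Drop the model left by a previous run, if there is one.
BEGIN
  DBMS_DATA_MINING.DROP_MODEL('NB_SKETCH_MODEL');   -- hypothetical model name
EXCEPTION
  WHEN OTHERS THEN
    NULL;  -- nothing to drop on the first run
END;
/

-- Drop a supporting table left by a previous run, if there is one.
BEGIN
  EXECUTE IMMEDIATE 'DROP TABLE nb_sketch_settings';  -- hypothetical table name
EXCEPTION
  WHEN OTHERS THEN
    NULL;  -- table did not exist
END;
/
```

Swallowing every exception keeps the sketch short; a production script would normally test for the specific "does not exist" error instead.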

Run the PL/SQL Sample Programs

To run the PL/SQL programs:

  1. Start SQL*Plus and log in as the Data Mining user.

        Enter user-name: dmuser
        Enter password: password
    
  2. Run the program by specifying an at sign (@) followed by the fully qualified path of the program. In the following example, replace ORACLE_HOME with the path of the Oracle home directory.

    SQL>@ ORACLE_HOME\RDBMS\demo\dmnbdemo
    

    This example executes the program dmnbdemo.sql, which creates a Naive Bayes model.
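
Because a program echoes its code and output to the screen, it can be convenient to capture a run in a file. The following SQL*Plus sketch does this with SPOOL. The log file name dmnbdemo.log is an arbitrary choice, and ORACLE_HOME is a placeholder for the Oracle home path, as in the example above.

```sql
-- Capture everything the program displays in a log file.
-- dmnbdemo.log is an arbitrary name, written to the current directory.
SPOOL dmnbdemo.log

-- Replace ORACLE_HOME with the path of the Oracle home directory.
@ ORACLE_HOME\RDBMS\demo\dmnbdemo

-- Stop capturing and close the log file.
SPOOL OFF
```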

Prepare to Run the Java Programs

Before you can run the Java programs, you must set up your Java environment and compile the programs.

  1. Check that the version of Java you are using is 1.5 or higher. You can execute the following in a command window to check the version of Java.

    >java -version
    
  2. Add ORACLE_HOME\jdk\bin\ to your PATH variable before the paths of any other Java versions. ORACLE_HOME is the full path to the Oracle home directory.

  3. Add the following Data Mining JAR files to your Windows CLASSPATH:

                ORACLE_HOME\RDBMS\jlib\jdm.jar
                ORACLE_HOME\RDBMS\jlib\ojdm_api.jar
                ORACLE_HOME\RDBMS\jlib\xdb.jar
                ORACLE_HOME\jdbc\lib\ojdbc5.jar
                ORACLE_HOME\oc4j\j2ee\home\lib\connector.jar
                ORACLE_HOME\jlib\orai18n.jar   
                ORACLE_HOME\jlib\orai18n-mapping.jar
                ORACLE_HOME\lib\xmlparserv2.jar
    
  4. Compile the programs listed in Figure 7-2. To use the JAVAC executable, open a command window and go to \RDBMS\demo in Oracle home.

    >javac program_name.java
    

    For example:

    >javac dmnbdemo.java
    

    If JAVAC is not found, then check the value of the PATH variable.

Run the Java Programs

You can run the Java programs from the operating system prompt with a command like this:

>java program_name host_name:port_number:database_identifier user password

List the Sample Models

The mining models created by the sample programs can be viewed with a query like the one shown in Example 7-1.

Example 7-1 Sample Data Mining Models

SQL> SELECT model_name, mining_function, algorithm FROM user_mining_models
            ORDER BY model_name;
 
MODEL_NAME                     MINING_FUNCTION                ALGORITHM
------------------------------ ------------------------------ ------------------------------
ABNMODEL_JDM                   CLASSIFICATION                 ADAPTIVE_BAYES_NETWORK
ABN_SH_CLAS_SAMPLE             CLASSIFICATION                 ADAPTIVE_BAYES_NETWORK
AIMODEL_JDM                    ATTRIBUTE_IMPORTANCE           MINIMUM_DESCRIPTION_LENGTH
AI_SH_SAMPLE                   ATTRIBUTE_IMPORTANCE           MINIMUM_DESCRIPTION_LENGTH
APMODEL_JDM                    CLASSIFICATION                 NAIVE_BAYES
ARMODEL_JDM                    ASSOCIATION_RULES              APRIORI_ASSOCIATION_RULES
AR_SH_SAMPLE                   ASSOCIATION_RULES              APRIORI_ASSOCIATION_RULES
AR_SH_SAMPLE_STR_XNAL          ASSOCIATION_RULES              APRIORI_ASSOCIATION_RULES
AR_SH_SAMPLE_XNAL_SVAL         ASSOCIATION_RULES              APRIORI_ASSOCIATION_RULES
DT_SH_CLAS_SAMPLE              CLASSIFICATION                 DECISION_TREE
GLMCMODEL_JDM                  CLASSIFICATION                 GENERALIZED_LINEAR_MODEL
GLMC_SH_CLAS_SAMPLE            CLASSIFICATION                 GENERALIZED_LINEAR_MODEL
GLMRMODEL_JDM                  REGRESSION                     GENERALIZED_LINEAR_MODEL
GLMR_SH_REGR_SAMPLE            REGRESSION                     GENERALIZED_LINEAR_MODEL
KMMODEL_JDM                    CLUSTERING                     KMEANS
KM_SH_CLUS_SAMPLE              CLUSTERING                     KMEANS
NBEXPIMPMODEL_JDM              CLASSIFICATION                 NAIVE_BAYES
NBMODEL_JDM                    CLASSIFICATION                 NAIVE_BAYES
NB_SH_CLAS_SAMPLE              CLASSIFICATION                 NAIVE_BAYES
NMFMODEL_JDM                   FEATURE_EXTRACTION             NONNEGATIVE_MATRIX_FACTOR
NMF_SH_SAMPLE                  FEATURE_EXTRACTION             NONNEGATIVE_MATRIX_FACTOR
OCMODEL_JDM                    CLUSTERING                     O_CLUSTER
OC_SH_CLUS_SAMPLE              CLUSTERING                     O_CLUSTER
SVMCMODEL_JDM                  CLASSIFICATION                 SUPPORT_VECTOR_MACHINES
SVMC_SH_CLAS_SAMPLE            CLASSIFICATION                 SUPPORT_VECTOR_MACHINES
SVMOMODEL_JDM                  CLASSIFICATION                 SUPPORT_VECTOR_MACHINES
SVMO_SH_CLAS_SAMPLE            CLASSIFICATION                 SUPPORT_VECTOR_MACHINES
SVMRMODEL_JDM                  REGRESSION                     SUPPORT_VECTOR_MACHINES
SVMR_SH_REGR_SAMPLE            REGRESSION                     SUPPORT_VECTOR_MACHINES
TREEMODEL_JDM                  CLASSIFICATION                 DECISION_TREE
TXTNMFMODEL_JDM                FEATURE_EXTRACTION             NONNEGATIVE_MATRIX_FACTOR
TXTSVMMODEL_JDM                CLASSIFICATION                 SUPPORT_VECTOR_MACHINES
T_NMF_SAMPLE                   FEATURE_EXTRACTION             NONNEGATIVE_MATRIX_FACTOR
T_SVM_CLAS_SAMPLE              CLASSIFICATION                 SUPPORT_VECTOR_MACHINES

The model names distinguish the models created by the Java programs from those created by the PL/SQL programs. The models created by the Java programs have "_JDM" appended to the name.
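
The naming convention can be used to summarize the models by origin. The following sketch relies only on the "_JDM" suffix described above and on the USER_MINING_MODELS columns used in Example 7-1; the column alias CREATED_BY is introduced here for illustration.

```sql
-- Count the sample models by origin and mining function.
-- A model is treated as Java-built when its name ends in _JDM.
SELECT CASE
         WHEN SUBSTR(model_name, -4) = '_JDM' THEN 'Java'
         ELSE 'PL/SQL'
       END AS created_by,
       mining_function,
       COUNT(*) AS models
  FROM user_mining_models
 GROUP BY CASE
            WHEN SUBSTR(model_name, -4) = '_JDM' THEN 'Java'
            ELSE 'PL/SQL'
          END,
          mining_function
 ORDER BY created_by, mining_function;
```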

The PL/SQL Programs

The PL/SQL sample programs illustrate the use of the DBMS_DATA_MINING package for creating models and the DBMS_DATA_MINING_TRANSFORM package for performing transformations on the mining data.
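
As a rough orientation before reading the programs, the following sketch shows the general shape of a model build with DBMS_DATA_MINING.CREATE_MODEL. It is not taken from the sample programs. The names NB_SKETCH_SETTINGS and NB_SKETCH_MODEL are hypothetical, and the sketch assumes that the MINING_DATA_BUILD_V view created by dmsh.sql exists in the current schema. Automatic Data Preparation is switched on so that no explicit DBMS_DATA_MINING_TRANSFORM calls are needed.

```sql
-- Settings table that tells CREATE_MODEL which algorithm to use.
-- The table name is hypothetical; the two columns are the ones
-- CREATE_MODEL expects in a settings table.
CREATE TABLE nb_sketch_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000));

BEGIN
  -- Choose Naive Bayes as the classification algorithm.
  INSERT INTO nb_sketch_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ALGO_NAME, DBMS_DATA_MINING.ALGO_NAIVE_BAYES);

  -- Let the database prepare the data automatically.
  INSERT INTO nb_sketch_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.PREP_AUTO, DBMS_DATA_MINING.PREP_AUTO_ON);

  COMMIT;

  -- Build the model: CUST_ID identifies each case,
  -- AFFINITY_CARD is the target to predict.
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'NB_SKETCH_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'MINING_DATA_BUILD_V',
    case_id_column_name => 'CUST_ID',
    target_column_name  => 'AFFINITY_CARD',
    settings_table_name => 'NB_SKETCH_SETTINGS');
END;
/
```

The sketch fails on a second run because the table and model already exist; the sample programs avoid this by cleaning up the results of the previous run first.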

PL/SQL Programs: Algorithms

The PL/SQL programs are presented by algorithm in Table 7-1.

Table 7-1 Algorithms in PL/SQL Sample Programs

Program File      Algorithm                             Mining Function or Task
----------------  ------------------------------------  ------------------------------
dmaidemo.sql      Minimum Description Length            Attribute Importance
dmardemo.sql      Apriori                               Association
dmdtdemo.sql      Decision Tree                         Classification
dmdtxvlddemo.sql  Decision Tree (cross validation)      Classification
dmglcdem.sql      Binary Logistic Regression (GLM)      Classification
dmglrdem.sql      Multivariate Linear Regression (GLM)  Regression
dmkmdemo.sql      k-Means                               Clustering
dmnbdemo.sql      Naive Bayes                           Classification
dmnmdemo.sql      Non-Negative Matrix Factorization     Feature Extraction
dmocdemo.sql      O-Cluster                             Clustering
dmsvcdem.sql      Support Vector Machine                Classification
dmsvodem.sql      Support Vector Machine                Anomaly Detection
dmsvrdem.sql      Support Vector Machine                Regression
dmtxtfe.sql       Term extraction using Oracle Text     Text transformation for mining
dmtxtnmf.sql      Non-Negative Matrix Factorization     Text mining using NMF
dmtxtsvm.sql      Support Vector Machine                Text mining using SVM


PL/SQL Programs: Mining Functions

The PL/SQL sample programs are presented by mining function in Table 7-2. For detailed descriptions of the sample programs, see the comments in the source code.

Table 7-2 Mining Functions of PL/SQL Sample Programs

Mining Function Description

Classification

The classification programs demonstrate various preprocessing techniques and perform the following steps:

  • Build a classification model using training data

  • Display model details and settings

  • Test the model by applying the model on the test data

  • Compute test metrics, such as confusion matrix, lift, and ROC

  • Apply the model on the scoring data

  • Present apply results

  • Present ranked apply results, influenced by a cost matrix

dmnbdemo.sql illustrates Naive Bayes.

dmdtdemo.sql illustrates Decision Tree.

dmsvcdem.sql illustrates SVM classification.

dmglcdem.sql illustrates GLM classification (binary logistic regression).

The dmdtxvlddemo.sql program demonstrates cross-validation techniques for decision tree-based classification. With minor modifications, this program can be used to perform cross-validation with other algorithms.

Regression

dmsvrdem.sql uses different test metrics, but otherwise performs most of the same steps used in the classification programs. Selected attributes of the input data are preprocessed (normalized).

NOTE: dmsvrdem.sql illustrates the use of Automatic Data Preparation.

dmglrdem.sql illustrates GLM regression (multivariate linear regression).

Anomaly Detection

dmsvodem.sql illustrates one-class SVM.

Association

dmardemo.sql builds an association model and presents frequent itemsets and association rules as output.

Clustering

dmkmdemo.sql (k-Means) and dmocdemo.sql (O-Cluster) build clustering models and present cluster details, such as rules, centroid, and histogram for each cluster as output. The models are scored, and the probabilities associated with each cluster are returned as output. Selected attributes of the input data are preprocessed.

NOTE: dmkmdemo.sql illustrates the use of Automatic Data Preparation.

Feature extraction

dmnmdemo.sql builds a feature extraction model and presents model details as the output. The model is scored, and each feature ID is associated with a probability. Selected attributes of the input data are preprocessed (normalized).

Attribute importance

dmaidemo.sql builds an attribute importance model and presents a list of important attributes as the output of model details. Selected attributes of the input data are preprocessed (binned).
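
The test and apply steps listed for the classification programs can be approximated with the SQL scoring functions. The following sketch uses the NB_SH_CLAS_SAMPLE model from Example 7-1 and the MINING_DATA_TEST_V and MINING_DATA_APPLY_V views created by dmsh.sql. It assumes that the model can score these views directly, that is, that any data preparation is embedded in the model. If the model was built on separately prepared data, the same preparation must be applied to the views first. The column aliases are introduced here for illustration.

```sql
-- Test: a simple confusion matrix that compares the actual target
-- value with the value predicted by the model.
SELECT affinity_card                         AS actual_value,
       PREDICTION(nb_sh_clas_sample USING *) AS predicted_value,
       COUNT(*)                              AS cases
  FROM mining_data_test_v
 GROUP BY affinity_card,
          PREDICTION(nb_sh_clas_sample USING *)
 ORDER BY actual_value, predicted_value;

-- Apply: the ten customers in the scoring data that the model
-- considers most likely to have AFFINITY_CARD = 1.
SELECT cust_id, prob_1
  FROM (SELECT cust_id,
               PREDICTION_PROBABILITY(nb_sh_clas_sample, 1 USING *) AS prob_1
          FROM mining_data_apply_v
         ORDER BY prob_1 DESC)
 WHERE ROWNUM <= 10;
```

The sample programs compute further metrics, such as lift and ROC, and rank results with a cost matrix; see the comments in the source code for those steps.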


PL/SQL Text Mining Programs

Three sample programs illustrate the process of text mining using PL/SQL. One program illustrates the preprocessing that is required to transform the text for mining. The other two programs build models that use the transformed text.

The PL/SQL sample text mining programs are:

  • dmtxtfe.sql — Illustrates the process of term extraction that prepares text for mining

  • dmtxtnmf.sql — Creates a text mining model using the Non-Negative Matrix Factorization algorithm

  • dmtxtsvm.sql — Creates a text mining model using SVM classification

The Java Programs

The Java demos illustrate the features of the Oracle Data Mining Java API, which implements Oracle-specific extensions to the Java Data Mining (JDM) 1.0.1.1 standard. The sample Java programs demonstrate all the Data Mining algorithms as well as data transformation techniques, predictive analytics, export/import, and text mining.

Note:

The Oracle Data Mining Java API is deprecated in this release.

Oracle recommends that you not use deprecated features in new applications. Support for deprecated features is for backward compatibility only.

Java Programs: Algorithms

The Java programs are presented by algorithm in Table 7-3.

Table 7-3 Algorithms in Java Sample Programs

Program File       Algorithm                             Mining Function or Task
-----------------  ------------------------------------  -----------------------------------
dmaidemo.java      Minimum Description Length            Attribute importance
dmapplydemo.java   Naive Bayes                           Scoring methods
dmardemo.java      Apriori                               Association
dmexpimpdemo.java  export/import                         Model Export/Import
dmglcdemo.java     Binary Logistic Regression (GLM)      Classification
dmglrdemo.java     Multivariate Linear Regression (GLM)  Regression
dmkmdemo.java      k-Means                               Clustering
dmnbdemo.java      Naive Bayes                           Classification
dmnmdemo.java      Non-Negative Matrix Factorization     Feature extraction
dmocdemo.java      O-Cluster                             Clustering
dmpademo.java      Automated predict and explain         Predictive Analytics
dmsvcdemo.java     Support Vector Machine                Classification
dmsvodemo.java     Support Vector Machine (one class)    Classification
dmsvrdemo.java     Support Vector Machine                Regression
dmtreedemo.java    Decision Tree                         Classification
dmtxtnmfdemo.java  Non-Negative Matrix Factorization     Text mining with NMF
dmtxtsvmdemo.java  Support Vector Machine                Text mining with SVM classification
dmxfdemo.java      Binning, clipping, and normalization  Data Transformations


Java Programs: Mining Functions

The Java sample programs are presented by mining function in Table 7-4. For detailed descriptions of the sample programs, see the comments in the source code.

Table 7-4 Mining Functions of the Java Sample Programs

Mining Function or Task Description

Classification

The classification programs demonstrate various preprocessing techniques and perform the following steps:

  • Build a classification model using training data

  • Display model details and settings

  • Test the model by applying the model on the test data

  • Compute test metrics, such as confusion matrix, lift, and ROC

  • Apply the model on the scoring data

  • Present apply results

  • Present ranked apply results, influenced by a cost matrix

The dmapplydemo.java program demonstrates several ways of applying a Naive Bayes model.

dmglcdemo.java illustrates GLM classification (binary logistic regression).

Regression

dmsvrdemo.java uses different test metrics, but otherwise performs most of the same steps used in the classification programs. Selected attributes of the input data are preprocessed (normalized).

dmglrdemo.java illustrates GLM regression (multivariate linear regression).

Association

dmardemo.java builds an association model and presents frequent itemsets and association rules as output. Selected attributes of the input data are preprocessed (binned).

Clustering

dmkmdemo.java (k-Means) and dmocdemo.java (O-Cluster) build clustering models and present cluster details, such as rules, centroid, and histogram for each cluster as output. The models are scored, and the probabilities associated with each cluster are returned as output. Selected attributes of the input data are preprocessed (normalized).

Feature extraction

dmnmdemo.java builds a feature extraction model and presents model details as the output. The model is scored, and each feature ID is associated with a probability. Selected attributes of the input data are preprocessed (normalized).

Attribute importance

dmaidemo.java builds an attribute importance model and presents a list of important attributes as the output of model details. Selected attributes of the input data are preprocessed (binned).

Data transformations

dmxfdemo.java demonstrates binning, clipping, and normalization transformations.

Predictive Analytics

dmpademo.java demonstrates PREDICT, EXPLAIN, and PROFILE functions.

Model import/export

dmexpimpdemo.java builds a Naive Bayes model, exports it to a dump file, then imports it from the dump file.
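
For readers who want to try the predictive analytics operations without the deprecated Java API, the database also exposes them through the DBMS_PREDICTIVE_ANALYTICS PL/SQL package. The following sketch is not part of the sample programs. It runs EXPLAIN against the MINING_DATA_BUILD_V view created by dmsh.sql; the result table name PA_EXPLAIN_SKETCH is hypothetical, and the table must not exist before the call because EXPLAIN creates it.

```sql
-- Rank the attributes of the build data by how well they explain
-- the AFFINITY_CARD column. EXPLAIN creates the result table.
BEGIN
  DBMS_PREDICTIVE_ANALYTICS.EXPLAIN(
    data_table_name     => 'MINING_DATA_BUILD_V',
    explain_column_name => 'AFFINITY_CARD',
    result_table_name   => 'PA_EXPLAIN_SKETCH');  -- hypothetical name
END;
/

-- List the attributes from most to least explanatory.
SELECT attribute_name, explanatory_value, rank
  FROM pa_explain_sketch
 ORDER BY rank;
```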


Java Text Mining Programs

Two Java programs illustrate the process of text mining. One builds a feature extraction model; the other builds a classification model.

The Java text mining programs both use the OraTextTransform interface to transform the text for mining. The programs are as follows:

  • dmtxtnmfdemo.java — Creates a text mining model using the Non-Negative Matrix Factorization algorithm

  • dmtxtsvmdemo.java — Creates a text mining model using SVM classification

The Sample Data

The dmsh.sql script creates views, tables, and indexes in the user's schema. The views define columns of customer data from tables in the SH schema. This data is used by the Data Mining sample programs. The tables reference the same columns in SH, but they include an extra COMMENTS column for text mining. The indexes are used to extract terms from the text in the COMMENTS column and build a nested table column.

Customer Data for Data Mining

Views in the data mining user's schema define columns of data from the CUSTOMERS, SALES, PRODUCTS, COUNTRIES, and SUPPLEMENTARY_DEMOGRAPHICS tables in the SH schema. You can list these views with the following SQL statements.

SQL>CONNECT dmuser
Enter password: password
SQL>SELECT view_name FROM user_views;

The views are listed in Table 7-5.

Table 7-5 Views Used by the Data Mining Sample Programs

View Name                Description
-----------------------  -----------------------------------------------
MINING_DATA_APPLY_STR_V  Scoring data for O-Cluster
MINING_DATA_BUILD_STR_V  Training data for O-Cluster
MINING_DATA_APPLY_V      Scoring data for data mining (not text mining)
MINING_DATA_BUILD_V      Training data for data mining (not text mining)
MINING_DATA_TEST_V       Test data for data mining (not text mining)
MARKET_BASKET_V          Data for association rules
MINING_DATA_ONE_CLASS_V  Data for one-class SVM


You can see the references to tables in SH by listing the view definitions. The definition of the view MINING_DATA_BUILD_V is shown as follows.

SQL> set long 1000000
SQL> set longc 100000
SQL> set pagesize 100
SQL> SELECT text FROM all_views WHERE 
    owner='DMUSER' AND view_name='MINING_DATA_BUILD_V';

      SELECT a.CUST_ID, a.CUST_GENDER, 2003-a.CUST_YEAR_OF_BIRTH AGE, 
             a.CUST_MARITAL_STATUS, c.COUNTRY_NAME, a.CUST_INCOME_LEVEL,
             b.EDUCATION, b.OCCUPATION, b.HOUSEHOLD_SIZE, b.YRS_RESIDENCE,
             b.AFFINITY_CARD, b.BULK_PACK_DISKETTES, b.FLAT_PANEL_MONITOR,
             b.HOME_THEATER_PACKAGE, b.BOOKKEEPING_APPLICATION, 
             b.PRINTER_SUPPLIES, b.Y_BOX_GAMES, b.OS_DOC_SET_KANJI 
       FROM  sh.customers a, 
             sh.supplementary_demographics b, 
             sh.countries c 
       WHERE a.CUST_ID = b.CUST_ID AND a.country_id = c.country_id 
             AND a.cust_id between 101501 and 103000 

The views are used to build, test, and score the sample models. Each view has a CUST_ID column, which is the case ID, and an AFFINITY_CARD column, which is the target used by the predictive models. Most of the views provide data for 1500 customers (1500 rows). The view used by the One-Class SVM model has data for 940 customers.
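
The row counts and the role of the case ID and target columns can be checked directly. The following sketch queries MINING_DATA_BUILD_V, using the CUST_ID and AFFINITY_CARD columns that appear in the view definition; the column aliases are introduced here for illustration.

```sql
-- The number of rows should equal the number of distinct case IDs,
-- because each customer appears once in the build data.
SELECT COUNT(*)                AS row_count,
       COUNT(DISTINCT cust_id) AS distinct_cases
  FROM mining_data_build_v;

-- Distribution of the target that the predictive models learn.
SELECT affinity_card, COUNT(*) AS customers
  FROM mining_data_build_v
 GROUP BY affinity_card
 ORDER BY affinity_card;
```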

The columns of training data in the MINING_DATA_BUILD_V view are as follows.

SQL> DESCRIBE mining_data_build_v

CUST_ID                    NOT NULL            NUMBER
CUST_GENDER                NOT NULL            CHAR(1)
AGE                                            NUMBER
CUST_MARITAL_STATUS                            VARCHAR2(20)
COUNTRY_NAME               NOT NULL            VARCHAR2(40)
CUST_INCOME_LEVEL                              VARCHAR2(30)
EDUCATION                                      VARCHAR2(21)
OCCUPATION                                     VARCHAR2(21)
HOUSEHOLD_SIZE                                 VARCHAR2(21)
YRS_RESIDENCE                                  NUMBER
AFFINITY_CARD                                  NUMBER(10)
BULK_PACK_DISKETTES                            NUMBER(10)
FLAT_PANEL_MONITOR                             NUMBER(10)
HOME_THEATER_PACKAGE                           NUMBER(10)
BOOKKEEPING_APPLICATION                        NUMBER(10)
PRINTER_SUPPLIES                               NUMBER(10)
Y_BOX_GAMES                                    NUMBER(10)
OS_DOC_SET_KANJI                               NUMBER(10)

Market Basket Data for Association Rules

The association demos use the MARKET_BASKET_V data set, which includes columns of products from the PRODUCTS table and the CUST_ID column from the CUSTOMERS table in SH. The columns of the MARKET_BASKET_V view are as follows.

SQL> DESCRIBE market_basket_v

CUST_ID                     NOT NULL         NUMBER
EXTENSION_CABLE                              NUMBER
FLAT_PANEL_MONITOR                           NUMBER
CD_RW_HIGH_SPEED_5_PACK                      NUMBER
ENVOY_256MB_40GB                             NUMBER
ENVOY_AMBASSADOR                             NUMBER
EXTERNAL_8X_CD_ROM                           NUMBER
KEYBOARD_WRIST_REST                          NUMBER
SM26273_BLACK_INK_CARTRIDGE                  NUMBER
MOUSE_PAD                                    NUMBER
MULTIMEDIA_SPEAKERS_3INCH                    NUMBER
OS_DOC_SET_ENGLISH                           NUMBER
SIMM_16MB_PCMCIAII_CARD                      NUMBER
STANDARD_MOUSE                               NUMBER 

Customer Data for Text Mining

The text mining demos use the same customer data from tables in SH, but they include either an extra text column or a collection type column. The collection type is a nested table of type DM_NESTED_NUMERICALS.

Tip:

The process of extracting terms from a text column into a nested table column is described in Oracle Data Mining User's Guide.

You can list these tables with the following SQL statements.

SQL>CONNECT dmuser
Enter password: password
SQL>SELECT table_name FROM user_tables WHERE table_name LIKE '%MINING%';

The text mining tables are listed in Table 7-6.

Table 7-6 Tables Used by the Text Mining Sample Programs

Table Name                Description
------------------------  --------------------------------------------------------
MINING_APPLY_NESTED_TEXT  Apply table with COMMENTS column as DM_NESTED_NUMERICALS
MINING_BUILD_NESTED_TEXT  Build table with COMMENTS column as DM_NESTED_NUMERICALS
MINING_TEST_NESTED_TEXT   Test table with COMMENTS column as DM_NESTED_NUMERICALS
MINING_APPLY_TEXT         Apply table with COMMENTS column as VARCHAR2(4000)
MINING_BUILD_TEXT         Build table with COMMENTS column as VARCHAR2(4000)
MINING_TEST_TEXT          Test table with COMMENTS column as VARCHAR2(4000)


In the MINING_BUILD_TEXT, MINING_TEST_TEXT, and MINING_APPLY_TEXT tables, the COMMENTS column is of type VARCHAR2(4000).

SQL> DESCRIBE mining_build_text
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 CUST_ID                                   NOT NULL NUMBER
 CUST_GENDER                               NOT NULL CHAR(1)
 AGE                                                NUMBER
 CUST_MARITAL_STATUS                                VARCHAR2(20)
 COUNTRY_NAME                              NOT NULL VARCHAR2(40)
 CUST_INCOME_LEVEL                                  VARCHAR2(30)
 EDUCATION                                          VARCHAR2(21)
 OCCUPATION                                         VARCHAR2(21)
 HOUSEHOLD_SIZE                                     VARCHAR2(21)
 YRS_RESIDENCE                                      NUMBER
 AFFINITY_CARD                                      NUMBER(10)
 BULK_PACK_DISKETTES                                NUMBER(10)
 FLAT_PANEL_MONITOR                                 NUMBER(10)
 HOME_THEATER_PACKAGE                               NUMBER(10)
 BOOKKEEPING_APPLICATION                            NUMBER(10)
 PRINTER_SUPPLIES                                   NUMBER(10)
 Y_BOX_GAMES                                        NUMBER(10)
 OS_DOC_SET_KANJI                                   NUMBER(10)
 COMMENTS                                           VARCHAR2(4000)

In the MINING_*_NESTED_TEXT tables, the COMMENTS column is of type DM_NESTED_NUMERICALS.

SQL> DESCRIBE mining_build_nested_text
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 CUST_ID                                   NOT NULL NUMBER
 CUST_GENDER                               NOT NULL CHAR(1)
 AGE                                                NUMBER
 CUST_MARITAL_STATUS                                VARCHAR2(20)
 COUNTRY_NAME                              NOT NULL VARCHAR2(40)
 CUST_INCOME_LEVEL                                  VARCHAR2(30)
 EDUCATION                                          VARCHAR2(21)
 OCCUPATION                                         VARCHAR2(21)
 HOUSEHOLD_SIZE                                     VARCHAR2(21)
 YRS_RESIDENCE                                      NUMBER
 AFFINITY_CARD                                      NUMBER(10)
 BULK_PACK_DISKETTES                                NUMBER(10)
 FLAT_PANEL_MONITOR                                 NUMBER(10)
 HOME_THEATER_PACKAGE                               NUMBER(10)
 BOOKKEEPING_APPLICATION                            NUMBER(10)
 PRINTER_SUPPLIES                                   NUMBER(10)
 Y_BOX_GAMES                                        NUMBER(10)
 OS_DOC_SET_KANJI                                   NUMBER(10)
 COMMENTS                                           DM_NESTED_NUMERICALS
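
To see what the nested COMMENTS column holds, the collection can be unnested with the TABLE operator. The following sketch assumes that the MINING_BUILD_NESTED_TEXT table has been populated and that each element of DM_NESTED_NUMERICALS exposes ATTRIBUTE_NAME and VALUE attributes. It lists the extracted terms for a single customer, chosen here arbitrarily as the lowest CUST_ID.

```sql
-- Unnest the COMMENTS collection for one customer.
-- Each element pairs an extracted term with a numeric value.
SELECT t.cust_id,
       c.attribute_name,
       c.value
  FROM mining_build_nested_text t,
       TABLE(t.comments) c
 WHERE t.cust_id = (SELECT MIN(cust_id) FROM mining_build_nested_text)
 ORDER BY c.value DESC;
```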