Note: Expression indexing is available only in Oracle Database Enterprise Edition. |
You can define an index on a column storing expressions to quickly find expressions that evaluate to true for a data item. This is most helpful when a large expression set is evaluated for a data item. The SQL EVALUATE
operator determines whether or not to use the index based on its access cost. Expression Filter uses the indextype, EXFSYS.EXPFILTER
, to create and maintain indexes.
If an Expression column is not indexed, the SQL EVALUATE
operator builds a dynamic query for each expression stored in the column and executes it using the values passed in as the data item.
This chapter describes the basic approach to indexing including index representation (Section 12.3), index processing (Section 12.4), and user commands for creating and tuning indexes (Section 12.6).
Expressions in a large expression set tend to have certain commonalities in their predicates. An Expression Filter index, defined on an expression set, groups predicates by their commonalities to reduce processing costs. For example, in the case of two predicates with a common left-hand side, such as Year=1998
and Year=1999
, in most cases, the falseness or trueness of one predicate can be determined based on the outcome of the other predicate. The left-hand side of a predicate includes arithmetic expressions containing one or more elementary attributes and user-defined functions, for example, HORSEPOWER(model, year)
. An operator and a constant on the right-hand side (RHS) completes the predicate, for example, HORSEPOWER(model, year)>=150
.
An Expression Filter index defined on a set of expressions takes advantage of the logical relationships among multiple predicates by grouping them based on the commonality of their left-hand sides. These left-hand sides are arithmetic expressions that consist of one or more elementary attributes and user-defined functions, for example, HORSEPOWER(model,year)
.
The predicates that can be indexed with the Expression Filter indexing mechanism include any predicate with a constant on the right-hand side that uses one of the following predicate operators: =
, !=
, >
, <
, >=
, <=
, BETWEEN
, IS NULL
, IS NOT
NULL
, LIKE
, and NVL
.
The predicates that cannot be indexed are preserved in their original form and they are evaluated by value substitution in the last stage of expression evaluation. Some of the predicates that cannot be indexed include:
Predicates with a variable on the right-hand side.
IN
list predicates.
LIKE
predicates with a leading wild-card character.
Duplicate predicates in an expression with the same left-hand side. At most, two predicates with a duplicate left-hand side, for example Year>1995 and Year<2000,
can be indexed if the index is configured for BETWEEN
operators. A predicate with a BETWEEN
operator is treated as two predicates with binary operators, one with the '>='
operator and another with the '<='
operator. See the section about EXF$INDEXOPER for more information about the BETWEEN
operator.
The Expression Filter index uses persistent database objects internally to maintain the index information for an expression set. A relational table called the predicate table captures the grouping information for all the predicates in an expression set. Typically, the predicate table contains one row for each expression in the expression set. However, an expression containing one or more disjunctions (two simple expressions joined by OR
) is converted into a disjunctive-normal form (disjunction of conjunctions), and each disjunction in this normal form is treated as a separate expression with the same identifier as the original expression. The predicate table contains one row for each such disjunction.
The Expression Filter index can be tuned for better performance by identifying the most-common left-hand sides of the predicates (or discriminating predicate groups) in the expression set. The owner of the expression set (or the table storing expressions) can identify the predicate's left-hand sides or automate this process by collecting statistics on the expression set. For each common left-hand side, a predicate group is formed with all the corresponding predicates in the expression set. For example, if predicates with Model
, Price,
and HorsePower(Model, Year)
attributes are common in the expression set, three predicate groups are formed for these attributes. The predicate table captures the predicate grouping information, as shown in Figure 12-1.
For each predicate group, the predicate table has two columns: one to store the operator of the predicate and the other to store the constant on the right-hand side of the predicate. For a predicate in an expression, Expression Filter stores its operator and the right-hand side constant under the corresponding columns of the predicate group. The predicates that do not fall into one of the preconfigured groups are preserved in their original form and Expression Filter stores them in a VARCHAR2
column of the predicate table as sparse predicates. (For the example in Figure 12-1, the predicates on Mileage
and Year
fall in this category.) Expression Filter implicitly treats the predicates with IN
lists and the predicates with a varying right-hand side (not a constant) as sparse predicates. Expression Filter creates native indexes on the predicate table as described in Section 12.4.
To evaluate a data item for a set of expressions, Expression Filter computes the left-hand side of each predicate group in the data item and compares its value with the corresponding constants stored in the predicate table using an appropriate operator. For example, using the predicate table, if HORSEPOWER('TAURUS',2001)
returns 153, then the predicates satisfying this value are those interested in horsepower equal to 153 or those interested in horsepower greater than a value that is below 153, and so on. If the operators and right-hand side constants of the previous group are stored in the G3_OP
and G3_RHS
columns of the predicate table (in Figure 12-1), then the following query on the predicate table identifies the rows that satisfy this group of predicates:
SELECT Rid FROM predicate_table WHERE G3_OP = '=' AND G3_RHS = :rhs_val or G3_OP = '>' AND G3_RHS < :rhs_val or ... -- where :rhs_val is the value from the computation of the left-hand side --
Expression Filter uses similar techniques for less than (<
), greater than or equal to (>=
), less than or equal to (<=
), not equal to (!=
, <>
), LIKE
, IS NULL
, and IS NOT NULL
predicates. Predicates with the BETWEEN
operator are divided into two predicates with greater than or equal to and less than or equal to operators. Expression Filter can configure duplicate predicate groups for a left-hand side if it frequently appears more than once in a single expression, for example, Year >= 1996 and Year <= 2000
.
The WHERE
clause (shown in the previous query) is repeated for each predicate group in the predicate table, and the predicate groups are all joined by conjunctions. When the complete query (shown in the following example) is issued on the predicate table, it returns the row identifiers for the expressions that evaluate to true with all the predicates in the preconfigured groups. For these resulting expressions, Expression Filter evaluates the corresponding sparse predicates that are stored in the predicate table using dynamic queries to determine if an expression is true for a particular data item.
SELECT Rid, Sparse_predicate FROM predicate_table WHERE --- predicates in group 1 (G1_OP IS NULL OR --- no predicate involving this LHS ((:g1_val IS NOT NULL AND (G1_OP = '=' AND G1_RHS = :g1_val or G1_OP = '>' AND G1_RHS < :g1_val or G1_OP = '<' AND G1_RHS > :g1_val or ...) or (:g1_val IS NULL AND G1_OP = 'IS NULL'))) AND --- predicates in group 2 (G2_OP IS NULL OR ((:g2_val IS NOT NULL AND (G2_OP = '=' AND G2_RHS = :g2_val or G2_OP = '>' AND G2_RHS < :g2_val or G2_OP = '<' AND G2_RHS > :g2_val or ...) or (:g2_val IS NULL AND G2_OP = 'IS NULL'))) AND ...
For efficient execution of the predicate table query (shown previously), Expression Filter creates concatenated bitmap indexes on the {Operator, RHS constant}
columns of selected groups. These groups are identified either by user specification or from the statistics about the frequency of the predicates (belonging to a group) in the expression set. With the indexes defined on preconfigured predicate groups, Expression Filter divides the predicates from an expression set into three classes:
Indexed predicates: Predicates that belong to a subset of the preconfigured predicate groups that are identified as most discriminating. Expression Filter creates bitmap indexes for these predicate groups; thus, these predicates are also called indexed predicates. The previous query performs range scans on the corresponding index to evaluate all the predicates in a group and returns the expressions that evaluate to true with just that predicate. Similar scans are performed on the bitmap indexes of other indexed predicates, and the results from these index scans are combined using BITMAP AND
operations to determine all the expressions that evaluate to true with all the indexed predicates. This enables multiple predicate groups to be filtered simultaneously using one or more bitmap indexes.
Stored predicates: Predicates that belong to groups that are not indexed. Expression Filter captures these predicates in the corresponding {Operator,
RHS constant}
columns of the predicate table, with no bitmap indexes defined on them. For all the expressions that evaluate to true with the indexed predicates, the previous query compares the values of the left-hand sides of these predicate groups with those stored in the predicate table. Although Expression Filter creates bitmap indexes for a selected number of groups, the optimizer may choose not to use one or more indexes based on their access cost. Those groups are treated as stored predicate groups. The query issued on the predicate table remains unchanged for a different choice of indexes.
Sparse predicates: Predicates that do not belong to any of the preconfigured predicate groups. For expressions that evaluate to true for all the predicates in the indexed and stored groups, Expression Filter evaluates last sparse predicates (if any). If the expressions with sparse predicates evaluate to true, Expression Filter considers them true for the data item.
Optionally, you can specify the common operators that appear with predicates on the left-hand side and reduce the number of range scans performed on the bitmap index. See EXF$INDEXOPER
for more information. In the previous example, the Model
attribute commonly appears in equality predicates, and you can configure the Expression Filter index to check only for equality predicates while processing the indexed predicate groups. Expression Filter processes and evaluates sparse predicates along with any other form of predicate on the Model
attribute at the same time.
Once the predicate groups for an expression set are determined, the structure of the predicate table and the query to be issued on the predicate table are fixed. The choice of indexed or stored predicate groups does not change the query. As part of Expression Filter index creation, the predicate table query is determined and a function is dynamically generated for this query. Expression Filter uses the same query (with bind variables) for any data item passed in for the expression set evaluation. This ensures that the predicate table query is compiled once and reused for evaluating any number of data items.
The cost of evaluating a predicate in an expression set depends on the group to which it belongs. The index for an expression set can be tuned by identifying the appropriate predicate groups as the index parameters.
The steps involved in evaluating the predicates in an indexed predicate group are:
One-time computation of the left-hand side of the predicate group
One or more range scans on the bitmap indexes using the computed value
The steps involved in evaluating the predicates in a stored predicate group are:
One-time computation of the left-hand side of the predicate group
Comparison of the computed value with the operators and the right-hand side constants of all the predicates remaining in the working set (after filtering, based on indexed predicates)
The steps involved in evaluating the predicates in a sparse predicate group are:
Parse the subexpression representing the sparse predicates for all the expressions remaining in the working set.
Evaluate the subexpression through substitution of data values (using a dynamic query).
Creating an Index from Default Parameters
In a schema, an attribute set can be used for one or more expression sets, and you can configure the predicate groups for these expression sets by associating the default index parameters with the attribute set. The (discriminating) predicate groups can be chosen with the knowledge of commonly occurring left-hand sides and their selectivity for the expected data.
The following command uses the DBMS_EXPFIL.DEFAULT_INDEX_PARAMETERS
procedure to configure default index parameters with the Car4Sale
attribute set:
BEGIN DBMS_EXPFIL.DEFAULT_INDEX_PARAMETERS('Car4Sale', exf$attribute_list ( exf$attribute (attr_name => 'Model', --- LHS for predicate group attr_oper => exf$indexoper('='), attr_indexed => 'TRUE'), --- indexed predicate group exf$attribute (attr_name => 'Price', attr_oper => exf$indexoper('all'), attr_indexed => 'TRUE'), exf$attribute (attr_name => 'HorsePower(Model, Year)', attr_oper => exf$indexoper('=','<','>','>=','<='), attr_indexed => 'FALSE') --- stored predicate group ) ); END; /
For an expression set, create the Expression Filter index as follows:
CREATE INDEX InterestIndex ON Consumer (Interest) INDEXTYPE IS EXFSYS.EXPFILTER;
The index derives all its parameters from the defaults (Model
, Price
, and HorsePower(Model, Year)
) associated with the corresponding attribute set. If the defaults are not specified, it implicitly uses all the scalar elementary attributes (Model
, Year,Price
, and Mileage
) in the attribute set as its stored and indexed attributes.
You can fine-tune the default parameters derived from the attribute set for each expression set by using the PARAMETERS
clause when you create the index or by associating index parameters directly with the expression set. The following CREATE INDEX
statement with the PARAMETERS
clause configures the index with an additional stored predicate:
CREATE INDEX InterestIndex ON Consumer (Interest) INDEXTYPE IS exfsys.ExpFilter PARAMETERS ('ADD TO DEFAULTS STOREATTRS (CrashTestRating(Model, Year))');
For more information about creating indexes from default parameters, see DEFAULT_INDEX_PARAMETERS
Procedure and "CREATE INDEX" in Chapter 15.
Creating an Index from Exact Parameters
If there is a need to fine-tune the index parameters for each expression set associated with the common attribute set, you can assign the exact index parameters directly to the expression set, using the DBMS_EXPFIL.INDEX_PARAMETERS
procedure.
The following commands copy the index parameters from the defaults and then fine-tunes them for the given expression set. An expression filter index created for the expression set uses these parameters to configure its indexed and stored predicate groups.
BEGIN -- Derive index parameters from defaults -- DBMS_EXPFIL.INDEX_PARAMETERS(expr_tab => 'Consumer', expr_col => 'Interest', attr_list => null, operation => 'DEFAULT'); -- Fine-tune the parameters by adding another stored attribute -- DBMS_EXPFIL.INDEX_PARAMETERS(expr_tab => 'Consumer', expr_col => 'Interest', attr_list => exf$attribute_list ( exf$attribute ( attr_name => 'CrashTestRating(Model, Year)', attr_oper => exf$indexoper('all'), attr_indexed => 'FALSE')), operation => 'ADD'); END; / CREATE INDEX InterestIndex ON Consumer (Interest) INDEXTYPE IS EXFSYS.EXPFILTER;
For more information about creating indexes from exact parameters, see INDEX_PARAMETERS
Procedure and "CREATE INDEX" in Chapter 15.
See Chapter 13 for a discussion on indexing expressions with XPath predicates.
Creating an Index from Statistics
If a representative set of expressions is already stored in the table, the owner of the table can automate the index tuning process by collecting statistics on the expression set, using the DBMS_EXPFIL.GET_EXPRSET_STATS
procedure, and creating the index from these statistics, as shown in the following example:
BEGIN DBMS_EXPFIL.GET_EXPRSET_STATS (expr_tab => 'Consumer', expr_col => 'Interest'); END; / CREATE INDEX InterestIndex ON Consumer (Interest) INDEXTYPE IS EXFSYS.EXPFILTER PARAMETERS ('STOREATTRS TOP 4 INDEXATTRS TOP 2');
For the previous index, four stored attributes are chosen based on the frequency of the corresponding predicate left-hand sides in the expression set, and out of these four attributes, the top two are chosen as indexed attributes. When a TOP
n
clause is used, any defaults associated with the corresponding attribute set are ignored. The attributes chosen for an index can be viewed by querying the USER_EXPFIL_PREDTAB_ATTRIBUTES
view.
For more information about creating indexes from statistics, see GET_EXPRSET_STATS
Procedure and "CREATE INDEX" in Chapter 15.
A query using the SQL EVALUATE
operator on an Expression column can force the use of the index defined on such a column with an optimizer hint. (See the Oracle Database Performance Tuning Guide.) In other cases, the optimizer determines the cost of the Expression Filter index-based scan and compares it with the cost of alternate execution plans.
SELECT * FROM Consumer WHERE EVALUATE (Consumer.Interest, Car4Sale.getVarchar('Mustang',2000,18000,22000)) = 1 and Consumer.Zipcode BETWEEN 03060 and 03070;
For the previous query, if the Consumer
table has an Expression Filter index defined on the Interest
column and a native index defined on the Zipcode
column, the optimizer chooses the appropriate index based on their selectivity and their access cost. Beginning with release 10gR2 (10.2), the selectivity and the cost of an Expression Filter index are computed when statistics are collected on the expression column, the index, or the table storing expressions. These statistics are stored in the Expression Filter dictionary and are used to determine the optimal execution plan for the query with an EVALUATE
operator.
You can use the EXPLAIN PLAN
statement to see if the optimizer picked the Expression Filter index for a query.
The Expression Filter index uses persistent database objects to maintain the index on a column storing expressions. All these secondary objects are created in the schema in which the Expression Filter index is created. There are three types of secondary objects for each Expression Filter index, and they use the following naming conventions:
Conventional table called the predicate table: EXF$PTAB_
n
One or more indexes on the predicate table: EXF$PTAB_
n
_IDX_
m
Package called the Access Function package: EXF$AFUN_
n
To ensure the expression evaluation is valid, a table with an Expression column and the Expression Filter index on the Expression column should belong to the same schema. A user with CREATE INDEX
privileges on a table cannot create an Expression Filter index unless the user is the owner of the table. By default, the predicate table is created in the user's default tablespace. You can specify an alternate storage clause for the predicate table when you create the index by using the PREDSTORAGE
parameter clause. (See the section about the CREATE INDEX statement in Chapter 15.) The indexes on the predicate table are always created in the same tablespace as the predicate table.
An Expression Filter index created for an Expression column is automatically maintained to reflect any changes made to the expressions (with the SQL INSERT
, UPDATE
, or DELETE
statements or SQL*Loader). The bitmap indexes defined on the predicate table could become fragmented when a large number of expressions are modified, added to the set, or deleted. You can rebuild these indexes online to reduce the fragmentation using the DBMS_EXPFIL.DEFRAG_INDEX
procedure, as shown in the following example:
BEGIN DBMS_EXPFIL.DEFRAG_INDEX (idx_name => 'InterestIndex'); END; /
See DEFRAG_INDEX
Procedure for more information about this procedure.
You can rebuild the complete Expression Filter index offline by using the ALTER INDEX...REBUILD
statement. This is useful when the index is marked UNUSABLE following a table maintenance operation. When the default index parameters associated with an attribute set are modified, they can be incorporated into the existing indexes using the ALTER INDEX...REBUILD
statement with the DEFAULT
parameter clause. See the section about ALTER INDEX REBUILD statement in Chapter 15.