For a given expression, the WIDTH_BUCKET function returns the bucket number into which the value of this expression would fall after being evaluated.
The expression for which the histogram is being created. This expression must evaluate to a number or a datetime value. When expr evaluates to NA
, then the expression returns NA
.
An expression that resolves to the minimum end point of the acceptable range for expr. This expression must evaluate to number or datetime values, and cannot evaluate to NA
.
An expression that resolves to the maximum end point of the acceptable range for expr. This expression must evaluate to number or datetime values, and cannot evaluate to NA
.
An expression that resolves to a constant indicating the number of buckets. This expression must evaluate to a positive INTEGER
.
Underflow and Overflow Buckets
WIDTH_BUCKET also creates (when needed) an underflow bucket numbered 0
and an overflow bucket numbered num_buckets+1
. These buckets handle values less than min_value and more than max_value and are helpful in checking the reasonableness of endpoints.
Constructing Equiwidth Histograms
WIDTH_BUCKET lets you construct equiwidth histograms, in which the histogram range is divided into intervals that have identical size. (Compare this function with NTILE, which creates equiheight histograms.) Ideally each bucket is a "closed-open" interval of the real number line. For example, a bucket can be assigned to cores between 10.00
and 19.999...
to indicate that 10
is included in the interval and 20
is excluded. A bucket of this size is sometimes denoted as (10,
20)
.
Example 8-168 Grouping Values Into a Ten-Bucket Histogram
The following example limits the customers to those customers who have a customer_id
of 825 through 853 and, then, creates a ten-bucket histogram on the credit_limit
variable for those customer. It uses the credit_group
formula to calculate the bucket number for each customer. Customers with credit limits greater than the maximum value are assigned to the overflow bucket, 11.
DESCRIBE DEFINE customer_id DIMENSION INTEGER DEFINE cust_last_name VARIABLE TEXT <customer_id> DEFINE credit_limit VARIABLE INTEGER <customer_id> DEFINE credit_group FORMULA INTEGER <customer_id> EQ WIDTH_BUCKET(credit_limit, 100, 5000, 10) LIMIT customer_id to 825 TO 853 REPORT cust_last_name, credit_limit, credit_group CUSTOMER_ID CUST_LAST_NAME CREDIT_LIMIT CREDIT_GROUP -------------- ---------------- ---------------- ---------------- 825 Dreyfuss 500 1 826 Barkin 500 1 827 Siegel 500 1 828 Minnelli 2,300 5 829 Hunter 2,300 5 830 Dutt 3,500 7 831 Bel Geddes 3,500 7 832 Spacek 3,500 7 833 Moranis 3,500 7 834 Idle 3,500 7 835 Eastwood 1,200 3 836 Berenger 1,200 3 837 Stanton 1,200 3 838 Nicholson 3,500 7 839 Johnson 3,500 7 840 Elliott 1,400 3 841 Boyer 1,400 3 842 Stern 1,400 3 843 Oates 700 2 844 Julius 700 2 845 Fawcett 5,000 11 846 Brando 5,000 11 847 Streep 5,000 11 848 Olmos 1,800 4 849 Kaurusmdki 1,800 4 850 Finney 2,300 5 851 Brown 2,300 5 852 Tanner 2,300 5 853 Palin 400 1