Conversion between Oracle Database character sets and Unicode (16-bit, fixed-width Unicode encoding) is supported. Replacement characters are used if a character has no mapping from Unicode to the Oracle Database character set. Therefore, conversion back to the original character set is not always possible without data loss.
Table 22-6 lists the OCI character set conversion functions.
Table 22-6 OCI Character Set Conversion Functions
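Function | Purpose |
---|---|
OCICharSetConversionIsReplacementUsed() | Indicate whether replacement characters were used for characters that could not be converted in the last invocation of OCICharSetToUnicode() or OCINlsCharSetConvert() |
OCICharSetToUnicode() | Convert a multibyte string to Unicode |
OCINlsCharSetConvert() | Convert a string from one character set to another |
OCIUnicodeToCharSet() | Convert a Unicode string into multibyte |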
OCICharSetConversionIsReplacementUsed() indicates whether the replacement character was used for characters that could not be converted during the last invocation of OCICharSetToUnicode() or OCINlsCharSetConvert().
Conversion between the Oracle Database character set and Unicode (16-bit, fixed-width Unicode encoding) is supported. Replacement characters are used if there is no mapping for a character from Unicode to the Oracle Database character set. As a result, not every character survives a round-trip conversion, and data loss occurs for those characters.
The function returns TRUE if the replacement character was used when OCINlsCharSetConvert() or OCICharSetToUnicode() was last invoked. Otherwise, the function returns FALSE.
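The effect of a replacement character on round-trip conversion can be illustrated with a toy model. The sketch below is not OCI code: `narrow_to_ascii` is a hypothetical helper that narrows 16-bit Unicode code units to 7-bit ASCII, and the choice of `?` as the replacement character is an assumption made for illustration (the real replacement character depends on the target character set). The `replaced` flag plays the role that OCICharSetConversionIsReplacementUsed() plays after a real conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (NOT an OCI call): narrow 16-bit Unicode code units to 7-bit
 * ASCII, substituting '?' for anything the target set cannot represent.
 * Sets *replaced to 1 if any substitution happened, otherwise to 0.
 * Returns the number of characters written to dst. */
size_t narrow_to_ascii(const uint16_t *src, size_t srclen,
                       char *dst, size_t dstlen, int *replaced)
{
    assert(replaced != NULL);
    *replaced = 0;

    /* Stop at whichever limit is reached first: source or destination. */
    size_t n = srclen < dstlen ? srclen : dstlen;

    for (size_t i = 0; i < n; i++) {
        if (src[i] < 0x80) {
            dst[i] = (char)src[i];   /* representable: copy as is */
        } else {
            dst[i] = '?';            /* no mapping: original value is lost */
            *replaced = 1;
        }
    }
    return n;
}
```

Once a character has been replaced, converting the result back to Unicode yields the replacement character rather than the original, which is why the flag is worth checking before the converted data is stored.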
OCICharSetToUnicode() converts the multibyte string pointed to by src to Unicode and writes the result to the array pointed to by dst.
sword OCICharSetToUnicode ( void *hndl, ub2 *dst, size_t dstlen, const OraText *src, size_t srclen, size_t *rsize );
hndl: Pointer to an OCI environment or user session handle.
dst: Pointer to a destination buffer.
dstlen: The size of the destination buffer in characters.
src: Pointer to a multibyte source string.
srclen: The size of the source string in bytes.
rsize: The number of characters converted. If it is a NULL pointer, then nothing is returned.
The conversion stops when it reaches the source limitation or destination limitation. The number of characters converted into the Unicode string is returned in rsize. If dstlen is 0, then the function scans the string, counts the number of characters, and returns that count in rsize, but does not convert the string.
If OCI_UTF16ID is specified for SQL CHAR data in the OCIEnvNlsCreate() function, then this function produces an error.
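The count-only behavior when dstlen is 0 suggests a two-pass calling pattern: ask for the character count first, allocate a buffer of that size, then convert. The following is a minimal sketch of that pattern under stated assumptions. So that it compiles without the Oracle client headers, the converter is passed in as a function pointer; `to_unicode_fn`, `fake_to_unicode`, and `convert_two_pass` are hypothetical names, and `fake_to_unicode` is a stand-in that only widens single-byte input. With the real headers, OCICharSetToUnicode would be passed instead, provided the pointer types match (ub2 versus uint16_t, OraText versus unsigned char, sword versus int), which should be verified against oci.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same parameter order as OCICharSetToUnicode(). The typedef is hypothetical
 * and exists only so that this sketch compiles without the OCI headers. */
typedef int (*to_unicode_fn)(void *hndl, uint16_t *dst, size_t dstlen,
                             const unsigned char *src, size_t srclen,
                             size_t *rsize);

/* Stand-in converter (NOT OCI): widens single-byte input to 16-bit units and
 * mimics the documented "dstlen == 0 means count only" behavior.
 * Returns 0 on success (assumption: the success status is 0). */
int fake_to_unicode(void *hndl, uint16_t *dst, size_t dstlen,
                    const unsigned char *src, size_t srclen, size_t *rsize)
{
    (void)hndl;
    if (dstlen == 0) {                       /* count only, no conversion */
        if (rsize) *rsize = srclen;
        return 0;
    }
    size_t n = srclen < dstlen ? srclen : dstlen;  /* stop at either limit */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
    if (rsize) *rsize = n;
    return 0;
}

/* Two-pass conversion: size the buffer first, then convert.
 * Returns a malloc'd, zero-terminated buffer (caller frees) or NULL. */
uint16_t *convert_two_pass(to_unicode_fn conv, void *hndl,
                           const unsigned char *src, size_t srclen,
                           size_t *out_chars)
{
    size_t needed = 0, written = 0;
    assert(out_chars != NULL);
    *out_chars = 0;

    /* Pass 1: dstlen == 0 asks only for the character count. */
    if (conv(hndl, NULL, 0, src, srclen, &needed) != 0)
        return NULL;

    /* One extra unit leaves room for a terminator. */
    uint16_t *dst = malloc((needed + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;

    /* Pass 2: real conversion. Skipped for empty input, because a second
     * call with dstlen == 0 would only count again. */
    if (needed > 0 && conv(hndl, dst, needed, src, srclen, &written) != 0) {
        free(dst);
        return NULL;
    }
    dst[written] = 0;
    *out_chars = written;
    return dst;
}
```

Sizing the buffer from the first pass avoids guessing a worst-case expansion factor, at the cost of scanning the source string twice.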
OCINlsCharSetConvert() converts the string pointed to by srcp, in the character set specified by srcid, to the character set specified by dstid and writes the result to the array pointed to by dstp. The conversion stops when it reaches the data size limitation of either the source or the destination. The function returns the number of bytes converted into the destination buffer.
sword OCINlsCharSetConvert ( void *hndl, OCIError *errhp, ub2 dstid, void *dstp, size_t dstlen, ub2 srcid, const void *srcp, size_t srclen, size_t *rsize );
hndl: Pointer to an OCI environment or user session handle.
errhp: OCI error handle. If there is an error, then it is recorded in errhp and the function returns an error status. Diagnostic information can be obtained by calling OCIErrorGet().
dstid: Character set ID for the destination buffer.
dstp: Pointer to the destination buffer.
dstlen: The maximum size in bytes of the destination buffer.
srcid: Character set ID for the source buffer.
srcp: Pointer to the source buffer.
srclen: The length in bytes of the source buffer.
rsize: The number of bytes converted into the destination buffer. If the pointer is NULL, then nothing is returned.
Although either the source or the destination character set ID can be specified as OCI_UTF16ID, the length of the original and the converted data is represented in bytes, rather than in number of characters. Note that the conversion does not stop when it encounters null data. To get the character set ID from the character set name, use OCINlsCharSetNameToId(). To check if derived data in the destination buffer contains replacement characters, use OCICharSetConversionIsReplacementUsed(). The buffers should be aligned with the byte boundaries appropriate for the character sets. For example, the ub2 data type is necessary to hold strings in UTF-16.
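Because lengths are given in bytes even for UTF-16 data, a buffer declared as an array of 16-bit units has to be sized in bytes when it is passed as dstlen or srclen. The sketch below shows that arithmetic and a simple alignment check. Both helpers (`utf16_bytes`, `is_utf16_aligned`) are hypothetical, and `uint16_t` stands in for `ub2` on the assumption that both are 16-bit unsigned types.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte length to pass as dstlen or srclen for a buffer that holds the given
 * number of UTF-16 code units. */
size_t utf16_bytes(size_t code_units)
{
    return code_units * sizeof(uint16_t);
}

/* Nonzero if p can safely be accessed as 16-bit code units. Checking against
 * the size of the type is conservative: the required alignment is never
 * larger than the size. */
int is_utf16_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(uint16_t)) == 0;
}
```

Declaring the buffer with a 16-bit element type, for example `uint16_t buf[64]`, and passing `sizeof buf` as the length satisfies both requirements at once. Since the conversion does not stop at null data, a terminator is converted only if the source length includes it.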
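OCIUnicodeToCharSet() converts the Unicode string pointed to by src to a multibyte string and writes the result to the array pointed to by dst.
sword OCIUnicodeToCharSet ( void *hndl, OraText *dst, size_t dstlen, const ub2 *src, size_t srclen, size_t *rsize );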
hndl: Pointer to an OCI environment or user session handle.
dst: Pointer to a destination buffer.
dstlen: The size of the destination buffer in bytes.
src: Pointer to a Unicode string.
srclen: The size of the source string in characters.
rsize: The number of bytes converted. If it is a NULL pointer, then nothing is returned.
The conversion stops when it reaches the source limitation or destination limitation. The number of bytes converted into the multibyte string is returned in rsize. If dstlen is zero, then the function returns the number of bytes in rsize without performing the conversion.
If a Unicode character cannot be converted to the character set specified in the OCI environment or user session handle, then a replacement character is used. In this case, OCICharSetConversionIsReplacementUsed() returns TRUE.
If OCI_UTF16ID is specified for SQL CHAR data in the OCIEnvNlsCreate() function, then this function produces an error.