PostgreSQL 8.2.3 Chinese Documentation

9.4. String Functions and Operators

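This section describes functions and operators for examining and manipulating string values. Strings in this context include values of the types character, character varying, and text. Unless otherwise noted, all of the functions listed below work on all of these types, but be wary of the potential effects of automatic padding when using the character type. The functions described here can generally also be used on non-string types, provided the data is first converted to its string representation. Some functions can also handle the bit-string types.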

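SQL defines some string functions with a special syntax in which keywords, rather than commas, separate the arguments. Details are in Table 9-5. These functions are also implemented using the regular function-call syntax (see Table 9-6).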

Table 9-5. SQL String Functions and Operators

Function	Return Type	Description	Example	Result
`string \|\| string`	`text`	String concatenation	`'Post' \|\| 'greSQL'`	`PostgreSQL`
`bit_length(string)`	`int`	Number of bits in the string	`bit_length('jose')`	`32`
`char_length(string)` or `character_length(string)`	`int`	Number of characters in the string	`char_length('jose')`	`4`
`convert(string using conversion_name)`	`text`	Change the encoding using the specified conversion name. Conversions can be defined with `CREATE CONVERSION`. There are also some predefined conversion names; see Table 9-7 for the available conversions.	`convert('PostgreSQL' using iso_8859_1_to_utf8)`	`'PostgreSQL'` in UTF8 encoding
`lower(string)`	`text`	Convert the string to lower case	`lower('TOM')`	`tom`
`octet_length(string)`	`int`	Number of bytes in the string	`octet_length('jose')`	`4`
`overlay(string placing string from int [for int])`	`text`	Replace a substring	`overlay('Txxxxas' placing 'hom' from 2 for 4)`	`Thomas`
`position(substring in string)`	`int`	Location of the specified substring	`position('om' in 'Thomas')`	`3`
`substring(string [from int] [for int])`	`text`	Extract a substring	`substring('Thomas' from 2 for 3)`	`hom`
`substring(string from pattern)`	`text`	Extract the substring matching a POSIX regular expression. See Section 9.7 for more information on pattern matching.	`substring('Thomas' from '...$')`	`mas`
`substring(string from pattern for escape)`	`text`	Extract the substring matching an SQL regular expression. See Section 9.7 for more information on pattern matching.	`substring('Thomas' from '%#"o_a#"_' for '#')`	`oma`
`trim([leading \| trailing \| both] [characters] from string)`	`text`	Remove the longest string containing only the characters in `characters` (a space by default) from the start, the end, or both ends of `string`	`trim(both 'x' from 'xTomxx')`	`Tom`
`upper(string)`	`text`	Convert the string to upper case	`upper('tom')`	`TOM`
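
The length and position functions in Table 9-5 are easier to compare side by side. The following Python sketch models their behavior; it is not how PostgreSQL implements them. It rests on assumptions that the table does not spell out: the string is stored as UTF-8, positions are 1-based (as the `position` and `substring` examples suggest), `position` yields 0 when nothing matches, and `overlay` without `for` replaces as many characters as the new string contains. The Python function names simply mirror the SQL ones for illustration.

```python
from typing import Optional


def char_length(s: str) -> int:
    """Number of characters, independent of the encoding."""
    return len(s)


def octet_length(s: str, encoding: str = "utf-8") -> int:
    """Number of bytes once the string is encoded (UTF-8 assumed here)."""
    return len(s.encode(encoding))


def bit_length(s: str, encoding: str = "utf-8") -> int:
    """Number of bits: eight per byte."""
    return 8 * octet_length(s, encoding)


def position(substring: str, s: str) -> int:
    """1-based index of the first match; 0 when there is no match (assumed)."""
    return s.find(substring) + 1


def substring(s: str, start: int, count: int) -> str:
    """Extract `count` characters from 1-based `start` (assumes start >= 1)."""
    return s[start - 1:start - 1 + count]


def overlay(s: str, new: str, start: int, count: Optional[int] = None) -> str:
    """Replace `count` characters of `s`, beginning at 1-based `start`, with `new`."""
    if count is None:
        count = len(new)  # assumed default when `for` is omitted
    return s[:start - 1] + new + s[start - 1 + count:]


if __name__ == "__main__":
    word = "jos\u00e9"  # four characters, the last one takes two bytes in UTF-8
    print(char_length(word), octet_length(word), bit_length(word))
    print(position("om", "Thomas"), substring("Thomas", 2, 3))
    print(overlay("Txxxxas", "hom", 2, 4))
```

For a pure-ASCII string such as `'jose'` the three length functions agree up to the factor of eight, which is why the table shows `4`, `4`, and `32`; they diverge only once multibyte characters are involved.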

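Additional string manipulation functions are available and are listed in Table 9-6. Some of them are used internally to implement the SQL-standard string functions listed in Table 9-5.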
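
As an illustration of that relationship, the SQL-standard `trim` can be expressed in terms of `btrim`, `ltrim`, and `rtrim` from Table 9-6. The Python sketch below models this; pairing `trim` with these three functions is an assumption made for the example, and the `side` parameter is a hypothetical stand-in for the `leading`, `trailing`, and `both` keywords.

```python
def btrim(string: str, characters: str = " ") -> str:
    """Strip any of `characters` from both ends."""
    return string.strip(characters)


def ltrim(string: str, characters: str = " ") -> str:
    """Strip any of `characters` from the start only."""
    return string.lstrip(characters)


def rtrim(string: str, characters: str = " ") -> str:
    """Strip any of `characters` from the end only."""
    return string.rstrip(characters)


def trim(string: str, characters: str = " ", side: str = "both") -> str:
    """Model of trim([leading | trailing | both] [characters] from string)."""
    implementations = {"both": btrim, "leading": ltrim, "trailing": rtrim}
    return implementations[side](string, characters)


if __name__ == "__main__":
    print(trim("xTomxx", "x"))            # both ends
    print(trim("xTomxx", "x", "leading"))  # start only
    print(btrim("xyxtrimyyx", "xy"))       # a set of characters, not a prefix
```

Note that `characters` is treated as a set: `btrim('xyxtrimyyx', 'xy')` removes every leading and trailing `x` or `y`, not the literal string `xy`.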

Table 9-6. Other String Functions

Function	Return Type	Description	Example	Result
`ascii(string)`	`int`	ASCII code of the first character of the argument	`ascii('x')`	`120`
`btrim(string text [, characters text])`	`text`	Remove the longest string consisting only of characters in `characters` (a space by default) from the start and end of `string`	`btrim('xyxtrimyyx', 'xy')`	`trim`
`chr(int)`	`text`	Character with the given ASCII code	`chr(65)`	`A`
`convert(string text, [src_encoding name,] dest_encoding name)`	`text`	Convert a string originally encoded as `src_encoding` to `dest_encoding` (if `src_encoding` is omitted, the database encoding is assumed)	`convert( 'text_in_utf8', 'UTF8', 'LATIN1')`	`text_in_utf8` represented in ISO 8859-1 encoding
`decode(string text, type text)`	`bytea`	Decode binary data from `string` previously encoded with `encode`. The `type` parameter is the same as in `encode`.	`decode('MTIzAAE=', 'base64')`	`123\000\001`
`encode(data bytea, type text)`	`text`	Encode binary data to an ASCII-only representation. Supported types are: `base64`, `hex`, `escape`	`encode( E'123\\000\\001', 'base64')`	`MTIzAAE=`
`initcap(string)`	`text`	Convert the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters.	`initcap('hi THOMAS')`	`Hi Thomas`
`length(string)`	`int`	Number of characters in `string`	`length('jose')`	`4`
`lpad(string text, length int [, fill text])`	`text`	Fill up `string` to length `length` by prepending the characters `fill` (a space by default). If `string` is already longer than `length`, it is truncated on the right.	`lpad('hi', 5, 'xy')`	`xyxhi`
`ltrim(string text [, characters text])`	`text`	Remove the longest string containing only characters from `characters` (a space by default) from the start of `string`.	`ltrim('zzzytrim', 'xyz')`	`trim`
`md5(string)`	`text`	Calculate the MD5 hash of `string`, returning the result in hexadecimal.	`md5('abc')`	`900150983cd24fb0d6963f7d28e17f72`
`pg_client_encoding()`	`name`	Current client encoding name	`pg_client_encoding()`	`SQL_ASCII`
`quote_ident(string)`	`text`	Return the given string suitably quoted to be used as an identifier in an SQL statement. Quotes are added only if necessary (that is, if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled.	`quote_ident('Foo bar')`	`"Foo bar"`
`quote_literal(string)`	`text`	Return the given string suitably quoted to be used as a string literal in an SQL statement. Embedded quotes and backslashes are properly doubled.	`quote_literal( 'O\'Reilly')`	`'O''Reilly'`
`regexp_replace(string text, pattern text, replacement text [,flags text])`	`text`	Replace the substring matching a POSIX regular expression. See Section 9.7 for more information on pattern matching.	`regexp_replace('Thomas', '.[mN]a.', 'M')`	`ThM`
`repeat(string text, number int)`	`text`	Repeat `string` `number` times	`repeat('Pg', 4)`	`PgPgPgPg`
`replace(string text, from text, to text)`	`text`	Replace all occurrences in `string` of substring `from` with substring `to`	`replace( 'abcdefabcdef', 'cd', 'XX')`	`abXXefabXXef`
`rpad(string text, length int [, fill text])`	`text`	Fill up `string` to length `length` by appending the characters `fill` (a space by default). If `string` is already longer than `length`, it is truncated on the right.	`rpad('hi', 5, 'xy')`	`hixyx`
`rtrim(string text [, characters text])`	`text`	Remove the longest string containing only characters from `characters` (a space by default) from the end of `string`.	`rtrim('trimxxxx', 'x')`	`trim`
`split_part(string text, delimiter text, field int)`	`text`	Split `string` on `delimiter` and return the `field`-th resulting substring (counting from one).	`split_part('abc~@~def~@~ghi', '~@~', 2)`	`def`
`strpos(string, substring)`	`int`	Location of the specified substring. Same as `position(substring in string)`, but with the argument order reversed.	`strpos('high', 'ig')`	`2`
`substr(string, from [, count])`	`text`	Extract a substring. Same as `substring(string from from for count)`.	`substr('alphabet', 3, 2)`	`ph`
`to_ascii(string text [, encoding text])`	`text`	Convert `string` to ASCII from another encoding (only the `LATIN1`, `LATIN2`, `LATIN9`, and `WIN1250` encodings are supported).	`to_ascii('Karel')`	`Karel`
`to_hex(number int or bigint)`	`text`	Convert `number` to its hexadecimal representation	`to_hex(2147483647)`	`7fffffff`
`translate(string text, from text, to text)`	`text`	Any character in `string` that matches a character in `from` is replaced by the corresponding character in `to`	`translate('12345', '14', 'ax')`	`a23x5`
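
The padding, splitting, and character-mapping functions in Table 9-6 have a few edge cases that the one-line descriptions only hint at. The Python sketch below models them under stated assumptions and is not PostgreSQL's implementation: the fill string is repeated cyclically, an empty fill adds no padding, `split_part` returns an empty string for a field past the last one and expects a non-empty delimiter, and `translate` deletes characters of `from` that have no counterpart in `to`. The helper `_padding` is a hypothetical name introduced for this example.

```python
def _padding(fill: str, width: int) -> str:
    """Repeat `fill` cyclically and cut it to exactly `width` characters."""
    if width <= 0 or not fill:
        return ""  # assumption: nothing to pad with means no padding
    repeats = width // len(fill) + 1
    return (fill * repeats)[:width]


def lpad(string: str, length: int, fill: str = " ") -> str:
    """Pad on the left to `length`; truncate on the right if already longer."""
    length = max(length, 0)
    if len(string) >= length:
        return string[:length]
    return _padding(fill, length - len(string)) + string


def rpad(string: str, length: int, fill: str = " ") -> str:
    """Pad on the right to `length`; truncate on the right if already longer."""
    length = max(length, 0)
    if len(string) >= length:
        return string[:length]
    return string + _padding(fill, length - len(string))


def split_part(string: str, delimiter: str, field: int) -> str:
    """Return the 1-based `field`-th piece (assumes a non-empty delimiter)."""
    if field < 1:
        raise ValueError("field position must be greater than zero")
    parts = string.split(delimiter)
    return parts[field - 1] if field <= len(parts) else ""


def translate(string: str, from_chars: str, to_chars: str) -> str:
    """Map each character of `from_chars` to its counterpart in `to_chars`."""
    mapping = {}
    for index, char in enumerate(from_chars):
        if char not in mapping:  # the first occurrence in `from_chars` wins
            # assumption: characters without a counterpart are deleted
            mapping[char] = to_chars[index] if index < len(to_chars) else ""
    return "".join(mapping.get(char, char) for char in string)


if __name__ == "__main__":
    print(lpad("hi", 5, "xy"), rpad("hi", 5, "xy"), lpad("hello", 3))
    print(split_part("abc~@~def~@~ghi", "~@~", 2))
    print(translate("12345", "14", "ax"))
```

The `lpad('hello', 3)` call shows the truncation rule: both `lpad` and `rpad` cut from the right end, so the padding side does not affect which characters survive.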

Table 9-7. Built-in Conversions

Conversion Name[a]	Source Encoding	Destination Encoding
`ascii_to_mic`	`SQL_ASCII`	`MULE_INTERNAL`
`ascii_to_utf8`	`SQL_ASCII`	`UTF8`
`big5_to_euc_tw`	`BIG5`	`EUC_TW`
`big5_to_mic`	`BIG5`	`MULE_INTERNAL`
`big5_to_utf8`	`BIG5`	`UTF8`
`euc_cn_to_mic`	`EUC_CN`	`MULE_INTERNAL`
`euc_cn_to_utf8`	`EUC_CN`	`UTF8`
`euc_jp_to_mic`	`EUC_JP`	`MULE_INTERNAL`
`euc_jp_to_sjis`	`EUC_JP`	`SJIS`
`euc_jp_to_utf8`	`EUC_JP`	`UTF8`
`euc_kr_to_mic`	`EUC_KR`	`MULE_INTERNAL`
`euc_kr_to_utf8`	`EUC_KR`	`UTF8`
`euc_tw_to_big5`	`EUC_TW`	`BIG5`
`euc_tw_to_mic`	`EUC_TW`	`MULE_INTERNAL`
`euc_tw_to_utf8`	`EUC_TW`	`UTF8`
`gb18030_to_utf8`	`GB18030`	`UTF8`
`gbk_to_utf8`	`GBK`	`UTF8`
`iso_8859_10_to_utf8`	`LATIN6`	`UTF8`
`iso_8859_13_to_utf8`	`LATIN7`	`UTF8`
`iso_8859_14_to_utf8`	`LATIN8`	`UTF8`
`iso_8859_15_to_utf8`	`LATIN9`	`UTF8`
`iso_8859_16_to_utf8`	`LATIN10`	`UTF8`
`iso_8859_1_to_mic`	`LATIN1`	`MULE_INTERNAL`
`iso_8859_1_to_utf8`	`LATIN1`	`UTF8`
`iso_8859_2_to_mic`	`LATIN2`	`MULE_INTERNAL`
`iso_8859_2_to_utf8`	`LATIN2`	`UTF8`
`iso_8859_2_to_windows_1250`	`LATIN2`	`WIN1250`
`iso_8859_3_to_mic`	`LATIN3`	`MULE_INTERNAL`
`iso_8859_3_to_utf8`	`LATIN3`	`UTF8`
`iso_8859_4_to_mic`	`LATIN4`	`MULE_INTERNAL`
`iso_8859_4_to_utf8`	`LATIN4`	`UTF8`
`iso_8859_5_to_koi8_r`	`ISO_8859_5`	`KOI8`
`iso_8859_5_to_mic`	`ISO_8859_5`	`MULE_INTERNAL`
`iso_8859_5_to_utf8`	`ISO_8859_5`	`UTF8`
`iso_8859_5_to_windows_1251`	`ISO_8859_5`	`WIN1251`
`iso_8859_5_to_windows_866`	`ISO_8859_5`	`WIN866`
`iso_8859_6_to_utf8`	`ISO_8859_6`	`UTF8`
`iso_8859_7_to_utf8`	`ISO_8859_7`	`UTF8`
`iso_8859_8_to_utf8`	`ISO_8859_8`	`UTF8`
`iso_8859_9_to_utf8`	`LATIN5`	`UTF8`
`johab_to_utf8`	`JOHAB`	`UTF8`
`koi8_r_to_iso_8859_5`	`KOI8`	`ISO_8859_5`
`koi8_r_to_mic`	`KOI8`	`MULE_INTERNAL`
`koi8_r_to_utf8`	`KOI8`	`UTF8`
`koi8_r_to_windows_1251`	`KOI8`	`WIN1251`
`koi8_r_to_windows_866`	`KOI8`	`WIN866`
`mic_to_ascii`	`MULE_INTERNAL`	`SQL_ASCII`
`mic_to_big5`	`MULE_INTERNAL`	`BIG5`
`mic_to_euc_cn`	`MULE_INTERNAL`	`EUC_CN`
`mic_to_euc_jp`	`MULE_INTERNAL`	`EUC_JP`
`mic_to_euc_kr`	`MULE_INTERNAL`	`EUC_KR`
`mic_to_euc_tw`	`MULE_INTERNAL`	`EUC_TW`
`mic_to_iso_8859_1`	`MULE_INTERNAL`	`LATIN1`
`mic_to_iso_8859_2`	`MULE_INTERNAL`	`LATIN2`
`mic_to_iso_8859_3`	`MULE_INTERNAL`	`LATIN3`
`mic_to_iso_8859_4`	`MULE_INTERNAL`	`LATIN4`
`mic_to_iso_8859_5`	`MULE_INTERNAL`	`ISO_8859_5`
`mic_to_koi8_r`	`MULE_INTERNAL`	`KOI8`
`mic_to_sjis`	`MULE_INTERNAL`	`SJIS`
`mic_to_windows_1250`	`MULE_INTERNAL`	`WIN1250`
`mic_to_windows_1251`	`MULE_INTERNAL`	`WIN1251`
`mic_to_windows_866`	`MULE_INTERNAL`	`WIN866`
`sjis_to_euc_jp`	`SJIS`	`EUC_JP`
`sjis_to_mic`	`SJIS`	`MULE_INTERNAL`
`sjis_to_utf8`	`SJIS`	`UTF8`
`tcvn_to_utf8`	`WIN1258`	`UTF8`
`uhc_to_utf8`	`UHC`	`UTF8`
`utf8_to_ascii`	`UTF8`	`SQL_ASCII`
`utf8_to_big5`	`UTF8`	`BIG5`
`utf8_to_euc_cn`	`UTF8`	`EUC_CN`
`utf8_to_euc_jp`	`UTF8`	`EUC_JP`
`utf8_to_euc_kr`	`UTF8`	`EUC_KR`
`utf8_to_euc_tw`	`UTF8`	`EUC_TW`
`utf8_to_gb18030`	`UTF8`	`GB18030`
`utf8_to_gbk`	`UTF8`	`GBK`
`utf8_to_iso_8859_1`	`UTF8`	`LATIN1`
`utf8_to_iso_8859_10`	`UTF8`	`LATIN6`
`utf8_to_iso_8859_13`	`UTF8`	`LATIN7`
`utf8_to_iso_8859_14`	`UTF8`	`LATIN8`
`utf8_to_iso_8859_15`	`UTF8`	`LATIN9`
`utf8_to_iso_8859_16`	`UTF8`	`LATIN10`
`utf8_to_iso_8859_2`	`UTF8`	`LATIN2`
`utf8_to_iso_8859_3`	`UTF8`	`LATIN3`
`utf8_to_iso_8859_4`	`UTF8`	`LATIN4`
`utf8_to_iso_8859_5`	`UTF8`	`ISO_8859_5`
`utf8_to_iso_8859_6`	`UTF8`	`ISO_8859_6`
`utf8_to_iso_8859_7`	`UTF8`	`ISO_8859_7`
`utf8_to_iso_8859_8`	`UTF8`	`ISO_8859_8`
`utf8_to_iso_8859_9`	`UTF8`	`LATIN5`
`utf8_to_johab`	`UTF8`	`JOHAB`
`utf8_to_koi8_r`	`UTF8`	`KOI8`
`utf8_to_sjis`	`UTF8`	`SJIS`
`utf8_to_tcvn`	`UTF8`	`WIN1258`
`utf8_to_uhc`	`UTF8`	`UHC`
`utf8_to_windows_1250`	`UTF8`	`WIN1250`
`utf8_to_windows_1251`	`UTF8`	`WIN1251`
`utf8_to_windows_1252`	`UTF8`	`WIN1252`
`utf8_to_windows_1253`	`UTF8`	`WIN1253`
`utf8_to_windows_1254`	`UTF8`	`WIN1254`
`utf8_to_windows_1255`	`UTF8`	`WIN1255`
`utf8_to_windows_1256`	`UTF8`	`WIN1256`
`utf8_to_windows_1257`	`UTF8`	`WIN1257`
`utf8_to_windows_866`	`UTF8`	`WIN866`
`utf8_to_windows_874`	`UTF8`	`WIN874`
`windows_1250_to_iso_8859_2`	`WIN1250`	`LATIN2`
`windows_1250_to_mic`	`WIN1250`	`MULE_INTERNAL`
`windows_1250_to_utf8`	`WIN1250`	`UTF8`
`windows_1251_to_iso_8859_5`	`WIN1251`	`ISO_8859_5`
`windows_1251_to_koi8_r`	`WIN1251`	`KOI8`
`windows_1251_to_mic`	`WIN1251`	`MULE_INTERNAL`
`windows_1251_to_utf8`	`WIN1251`	`UTF8`
`windows_1251_to_windows_866`	`WIN1251`	`WIN866`
`windows_1252_to_utf8`	`WIN1252`	`UTF8`
`windows_1256_to_utf8`	`WIN1256`	`UTF8`
`windows_866_to_iso_8859_5`	`WIN866`	`ISO_8859_5`
`windows_866_to_koi8_r`	`WIN866`	`KOI8`
`windows_866_to_mic`	`WIN866`	`MULE_INTERNAL`
`windows_866_to_utf8`	`WIN866`	`UTF8`
`windows_866_to_windows_1251`	`WIN866`	`WIN1251`
`windows_874_to_utf8`	`WIN874`	`UTF8`
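Notes: a. Conversion names follow a standard naming scheme: the name of the source encoding with all non-alphanumeric characters replaced by underscores, followed by `_to_`, followed by the destination encoding name processed in the same way. The names may therefore deviate from the customary encoding names.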
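
The naming scheme in note a can be sketched in a few lines of Python. This is an illustration only: the function `conversion_name` is hypothetical, the input spellings such as `ISO-8859-1` and `KOI8-R` are assumed forms of the encoding names, and the lower-casing step is inferred from the names shown in Table 9-7 rather than stated in the note.

```python
import re


def conversion_name(source: str, destination: str) -> str:
    """Build a conversion name from two encoding names (illustrative)."""

    def normalize(name: str) -> str:
        # Replace every non-alphanumeric character with an underscore;
        # lower-casing is an assumption based on the names in Table 9-7.
        return re.sub(r"[^0-9A-Za-z]", "_", name).lower()

    return f"{normalize(source)}_to_{normalize(destination)}"


if __name__ == "__main__":
    print(conversion_name("ISO-8859-1", "UTF8"))
    print(conversion_name("KOI8-R", "WINDOWS-1251"))
```

This also shows why a conversion name can differ from the encoding identifiers in the table: `iso_8859_1_to_utf8` is derived from a spelling like `ISO-8859-1`, while the source encoding column lists `LATIN1`.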
