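AWK Built-in Functions

AWK provides many built-in functions that are always available to the programmer. This tutorial covers AWK's arithmetic, string, time, bit manipulation, and other miscellaneous functions, with examples.

Arithmetic Functions

AWK has the following built-in arithmetic functions: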

atan2(y, x)

It returns the arctangent of y/x in radians. The following simple example demonstrates this:

[jerry]$ awk 'BEGIN {
  PI = 3.14159265
  x = -10
  y = 10
  result = atan2 (y,x) * 180 / PI;

  printf "The arc tangent for (x=%f, y=%f) is %f degrees\n", x, y, result
}'

Executing the above code produces the following result:

The arc tangent for (x=-10.000000, y=10.000000) is 135.000000 degrees
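
Hard-coding PI limits precision to the digits typed in. A minimal sketch of an alternative, assuming only standard AWK: atan2(0, -1) is the angle of the point (-1, 0), which is pi radians. The shell variable out is just an illustrative name used to capture the result.

```shell
out=$(awk 'BEGIN {
  # atan2(0, -1) is the angle of the point (-1, 0), i.e. pi radians
  PI = atan2(0, -1)
  # same conversion to degrees as in the example above
  printf "%.6f\n", atan2(10, -10) * 180 / PI
}')
echo "$out"
```

This prints 135.000000, matching the hard-coded version.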

cos(expr)

This function returns the cosine of expr, where expr is expressed in radians. The following simple example demonstrates this:

[jerry]$ awk 'BEGIN {
  PI = 3.14159265
  param = 60
  result = cos(param * PI / 180.0);

  printf "The cosine of %f degrees is %f.\n", param, result
}'

Executing the above code produces the following result:

The cosine of 60.000000 degrees is 0.500000.

exp(expr)

This function computes the exponential of expr, that is, e raised to the power expr.

[jerry]$ awk 'BEGIN {
  param = 5
  result = exp(param);

  printf "The exponential value of %f is %f.\n", param, result
}'

Executing the above code produces the following result:

The exponential value of 5.000000 is 148.413159.

int(expr)

This function truncates expr to an integer value. The following simple example demonstrates this:

[jerry]$ awk 'BEGIN {
  param = 5.12345
  result = int(param)

  print "Truncated value =", result
}'

Executing the above code produces the following result:

Truncated value = 5
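
Because int() truncates rather than rounds, a common idiom for rounding a non-negative value to the nearest integer is int(x + 0.5). The sketch below is illustrative only; it also shows that the common AWK implementations truncate negative values toward zero.

```shell
out=$(awk 'BEGIN {
  # truncation drops the fractional part; adding 0.5 first rounds
  # a non-negative value to the nearest integer
  print int(5.9), int(-5.9), int(5.6 + 0.5)
}')
echo "$out"
```

This prints 5 -5 6.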

log(expr)

This function calculates the natural logarithm of expr.

[jerry]$ awk 'BEGIN {
  param = 5.5
  result = log (param)

  printf "log(%f) = %f\n", param, result
}'

Executing the above code produces the following result:

log(5.500000) = 1.704748
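
AWK has no built-in function for logarithms in other bases, but the change-of-base formula covers that case. A minimal sketch, where log10 is a hypothetical helper name and not a built-in:

```shell
out=$(awk '
# hypothetical helper: base-10 logarithm via the change-of-base formula
function log10(x) { return log(x) / log(10) }
BEGIN {
  printf "%.6f\n", log10(1000)
}')
echo "$out"
```

This prints 3.000000.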

rand

This function returns a random number N between 0 and 1, such that 0 <= N < 1. The following example generates three random numbers:

[jerry]$ awk 'BEGIN {
  print "Random num1 =" , rand()
  print "Random num2 =" , rand()
  print "Random num3 =" , rand()
}'

Executing the above code produces the following result:

Random num1 = 0.237788
Random num2 = 0.291066
Random num3 = 0.845814

sin(expr)

This function returns the sine of expr, where expr is expressed in radians. The following simple example demonstrates this:

[jerry]$ awk 'BEGIN {
  PI = 3.14159265
  param = 30.0
  result = sin(param * PI /180)

  printf "The sine of %f degrees is %f.\n", param, result
}'

Executing the above code produces the following result:

The sine of 30.000000 degrees is 0.500000.

sqrt(expr)

This function returns the square root of expr.

[jerry]$ awk 'BEGIN {
  param = 1024.0
  result = sqrt(param)

  printf "sqrt(%f) = %f\n", param, result
}'

Executing the above code produces the following result:

sqrt(1024.000000) = 32.000000

srand([expr])

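This function seeds the random number generator. It uses expr as the new seed; if expr is omitted, it uses the time of day as the seed. It returns the previous seed value.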

[jerry]$ awk 'BEGIN {
  param = 10

  printf "srand() = %d\n", srand()
  printf "srand(%d) = %d\n", param, srand(param)
}'

Executing the above code produces the following result:

srand() = 1
srand(10) = 1417959587
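
Seeding with a fixed value makes the sequence from rand() repeatable, which is handy for testing. The sketch below assumes an arbitrary seed of 42 and also shows one common way to scale rand() to an integer in the range 1 to 6; the variable names are illustrative.

```shell
out=$(awk 'BEGIN {
  srand(42); a = rand()
  srand(42); b = rand()          # same seed, so the same first number
  die = int(rand() * 6) + 1      # integer in the range 1..6
  same = (a == b)
  in_range = (die >= 1 && die <= 6)
  print same, in_range
}')
echo "$out"
```

Both checks print 1 (true).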

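String Functions

AWK has the following built-in string functions:

asort(arr [, d [, how] ])

This function sorts the contents of arr, using gawk's normal rules for comparing values, and replaces the indices of the sorted values with sequential integers starting at 1.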

[jerry]$ awk 'BEGIN {
	arr[0] = "Three"
	arr[1] = "One"
	arr[2] = "Two"

	print "Array elements before sorting:"
	for (i in arr) {
		print arr[i]
	}

	asort(arr)

	print "Array elements after sorting:"
	for (i in arr) {
		print arr[i]
	}
}'

Executing the above code produces the following result:

Array elements before sorting:
Three
One
Two
Array elements after sorting:
One
Three
Two

asorti(arr [, d [, how] ])

This function behaves like asort(), except that the array indices, rather than the values, are used for sorting.

[jerry]$ awk 'BEGIN {
	arr["Two"] = 1
	arr["One"] = 2
	arr["Three"] = 3

	asorti(arr)

	print "Array indices after sorting:"
	for (i in arr) {
		print arr[i]
	}
}'

Executing the above code produces the following result:

Array indices after sorting:
One
Three
Two

gsub(regex, sub, string)

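gsub stands for global substitution. It replaces every match of the regular expression regex in string with sub. The third parameter is optional; if it is omitted, $0 is used.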

[jerry]$ awk 'BEGIN {
	str = "Hello, World"

	print "String before replacement = " str

	gsub("World", "Jerry", str)

	print "String after replacement = " str
}'

Executing the above code produces the following result:

String before replacement = Hello, World
String after replacement = Hello, Jerry
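
Two details of gsub are easy to miss: it returns the number of substitutions made, and an & in the replacement text stands for the matched text. A small sketch with an illustrative input string:

```shell
out=$(awk 'BEGIN {
  str = "a1b22c333"
  # wrap every run of digits in angle brackets; & is the matched text
  n = gsub(/[0-9]+/, "<&>", str)
  print n, str
}')
echo "$out"
```

This prints 3 a<1>b<22>c<333>.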

index(str, sub)

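It checks whether sub is a substring of str. On success, it returns the position where sub starts; otherwise it returns 0. The first character of str is at position 1.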

[jerry]$ awk 'BEGIN {
	str = "One Two Three"
	subs = "Two"

	ret = index(str, subs)

	printf "Substring \"%s\" found at %d location.\n", subs, ret
}'

Executing the above code produces the following result:

Substring "Two" found at 5 location.

length(str)

It returns the length of the string str.

[jerry]$ awk 'BEGIN {
	str = "Hello, World !!!"

	print "Length =", length(str)
}'

Executing the above code produces the following result:

Length = 16

match(str, regex)

It returns the index of the first, longest match of the regular expression regex in the string str. It returns 0 if no match is found.

[jerry]$ awk 'BEGIN {
	str = "One Two Three"
	subs = "Two"

	ret = match(str, subs)

	printf "Substring \"%s\" found at %d location.\n", subs, ret
}'

Executing the above code produces the following result:

Substring "Two" found at 5 location.
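
Besides returning the position, match() sets the built-in variables RSTART and RLENGTH, which can be passed to substr() to extract the matched text. A minimal sketch, with a regular expression chosen purely for illustration:

```shell
out=$(awk 'BEGIN {
  str = "One Two Three"
  # match() sets RSTART (start position) and RLENGTH (match length)
  if (match(str, /T[a-z]+/))
    print RSTART, RLENGTH, substr(str, RSTART, RLENGTH)
}')
echo "$out"
```

This prints 5 3 Two, since the leftmost match wins.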

split(str, arr, regex)

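This function splits the string str into fields separated by the regular expression regex and loads the fields into the array arr. If regex is omitted, FS is used.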

[jerry]$ awk 'BEGIN {
	str = "One,Two,Three,Four"

	split(str, arr, ",")

	print "Array contains following values"

	for (i in arr) {
		print arr[i]
	}
}'

Executing the above code produces the following result:

Array contains following values
One
Two
Three
Four
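
The order in which for (i in arr) visits elements is not specified by AWK. Since split() returns the number of fields and stores them at indices 1 through that number, a counting loop gives a predictable order. A minimal sketch:

```shell
out=$(awk 'BEGIN {
  # split() returns the field count; fields are stored at arr[1]..arr[n]
  n = split("One,Two,Three,Four", arr, ",")
  for (i = 1; i <= n; i++)
    print i, arr[i]
}')
echo "$out"
```

Each line of the output shows the index followed by the field, from 1 One through 4 Four.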

sprintf(format, expr-list)

This function returns a string constructed from expr-list according to format.

[jerry]$ awk 'BEGIN {
	str = sprintf("%s", "Hello, World !!!")

	print str
}'

Executing the above code produces the following result:

Hello, World !!!
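
The format string accepts the same conversions as printf, so sprintf is useful for building padded or rounded fields before printing. A small sketch with illustrative values:

```shell
out=$(awk 'BEGIN {
  # %-6s left-justifies in 6 columns, %05d zero-pads to 5 digits,
  # %.2f keeps two digits after the decimal point
  line = sprintf("%-6s|%05d|%.2f", "ab", 42, 3.14159)
  print line
}')
echo "$out"
```

This prints ab followed by four spaces, then |00042|3.14.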

strtonum(str)

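This function examines str and returns its numeric value. If str begins with a leading 0, it is treated as an octal number. If str begins with 0x or 0X, it is treated as a hexadecimal number. Otherwise, it is assumed to be a decimal number.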

[jerry]$ awk 'BEGIN {
	print "Decimal num = " strtonum("123")
	print "Octal num = " strtonum("0123")
	print "Hexadecimal num = " strtonum("0x123")
}'

Executing the above code produces the following result:

Decimal num = 123
Octal num = 83
Hexadecimal num = 291

sub(regex, sub, string)

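This function performs a single substitution. It replaces the first occurrence of the regular expression regex with sub. The third parameter is optional; if it is omitted, $0 is used.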

[jerry]$ awk 'BEGIN {
	str = "Hello, World"

	print "String before replacement = " str

	sub("World", "Jerry", str)

	print "String after replacement = " str
}'

Executing the above code produces the following result:

String before replacement = Hello, World
String after replacement = Hello, Jerry

substr(str, start, l)

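This function returns the substring of str that starts at index start and has length l. If the length is omitted, it returns the suffix of str beginning at index start.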

[jerry]$ awk 'BEGIN {
	str = "Hello, World !!!"
	subs = substr(str, 1, 5)

	print "Substring = " subs
}'

Executing the above code produces the following result:

Substring = Hello
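
The two-argument form, with the length omitted, can be sketched the same way. Here the start index 8 is chosen because it is the position of the W in the sample string:

```shell
out=$(awk 'BEGIN {
  str = "Hello, World !!!"
  # no length given: everything from index 8 to the end of the string
  print substr(str, 8)
}')
echo "$out"
```

This prints World !!!.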

tolower(str)

This function returns a copy of str with all uppercase characters converted to lowercase.

[jerry]$ awk 'BEGIN {
	str = "HELLO, WORLD !!!"

	print "Lowercase string = " tolower(str)
}'

Executing the above code produces the following result:

Lowercase string = hello, world !!!

toupper(str)

This function returns a copy of str with all lowercase characters converted to uppercase.

[jerry]$ awk 'BEGIN {
	str = "hello, world !!!"

	print "Uppercase string = " toupper(str)
}'

Executing the above code produces the following result:

Uppercase string = HELLO, WORLD !!!

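Time Functions

AWK has the following built-in time functions:

systime

This function returns the current time of day as the number of seconds since the Epoch (1970-01-01 00:00:00 UTC on POSIX systems).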

[jerry]$ awk 'BEGIN {
	print "Number of seconds since the Epoch = " systime()
}'

Executing the above code produces the following result:

Number of seconds since the Epoch = 1418574432

mktime(datespec)

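This function converts the datespec string into a timestamp of the same form as that returned by systime(). The datespec is a string of the form YYYY MM DD HH MM SS and is interpreted in the local time zone. Values outside the usual ranges are normalized, so an hour of 30 rolls over into the following day.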

[jerry]$ awk 'BEGIN {
	print "Number of seconds since the Epoch = " mktime("2014 12 14 30 20 10")
}'

Executing the above code produces the following result:

Number of seconds since the Epoch = 1418604610

strftime([format [, timestamp[, utc-flag]]])

This function formats a timestamp according to the specification in format.

[jerry]$ awk 'BEGIN {
	print strftime("Time = %m/%d/%Y %H:%M:%S", systime())
}'

Executing the above code produces the following result:

Time = 12/14/2014 22:08:42

The following are the time formats supported by AWK:

Date format specification	Description
%a	The locale’s abbreviated weekday name.
%A	The locale’s full weekday name.
%b	The locale’s abbreviated month name.
%B	The locale’s full month name.
%c	The locale’s appropriate date and time representation. (This is %A %B %d %T %Y in the C locale.)
%C	The century part of the current year. This is the year divided by 100 and truncated to the next lower integer.
%d	The day of the month as a decimal number (01–31).
%D	Equivalent to specifying %m/%d/%y.
%e	The day of the month, padded with a space if it is only one digit.
%F	Equivalent to specifying %Y-%m-%d. This is the ISO 8601 date format.
%g	The year modulo 100 of the ISO 8601 week number, as a decimal number (00–99). For example, January 1, 1993 is in week 53 of 1992. Thus, the year of its ISO 8601 week number is 1992, even though its year is 1993. Similarly, December 31, 1973 is in week 1 of 1974. Thus, the year of its ISO week number is 1974, even though its year is 1973.
%G	The full year of the ISO week number, as a decimal number.
%h	Equivalent to %b.
%H	The hour (24-hour clock) as a decimal number (00–23).
%I	The hour (12-hour clock) as a decimal number (01–12).
%j	The day of the year as a decimal number (001–366).
%m	The month as a decimal number (01–12).
%M	The minute as a decimal number (00–59).
%n	A newline character (ASCII LF).
%p	The locale’s equivalent of the AM/PM designations associated with a 12-hour clock.
%r	The locale’s 12-hour clock time. (This is %I:%M:%S %p in the C locale.)
%R	Equivalent to specifying %H:%M.
%S	The second as a decimal number (00–60).
%t	A TAB character.
%T	Equivalent to specifying %H:%M:%S.
%u	The weekday as a decimal number (1–7). Monday is day one.
%U	The week number of the year (the first Sunday as the first day of week one) as a decimal number (00–53).
%V	The week number of the year (the first Monday as the first day of week one) as a decimal number (01–53).
%w	The weekday as a decimal number (0–6). Sunday is day zero.
%W	The week number of the year (the first Monday as the first day of week one) as a decimal number (00–53).
%x	The locale’s appropriate date representation. (This is %A %B %d %Y in the C locale.)
%X	The locale’s appropriate time representation. (This is %T in the C locale.)
%y	The year modulo 100 as a decimal number (00–99).
%Y	The full year as a decimal number (e.g. 2011).
%z	The time-zone offset in a +HHMM format (e.g., the format necessary to produce RFC 822/RFC 1036 date headers).
%Z	The time zone name or abbreviation; no characters if no time zone is determinable.

Bit Manipulation Functions

AWK has the following built-in bit manipulation functions:

and

Performs a bitwise AND operation.

[jerry]$ awk 'BEGIN {
	num1 = 10
	num2 = 6

	printf "(%d AND %d) = %d\n", num1, num2, and(num1, num2)
}'

Executing the above code produces the following result:

(10 AND 6) = 2

compl

Performs a bitwise complement operation.

[jerry]$ awk 'BEGIN {
	num1 = 10

	printf "compl(%d) = %d\n", num1, compl(num1)
}'

Executing the above code produces the following result:

compl(10) = 9007199254740981

lshift

Performs a bitwise left shift operation.

[jerry]$ awk 'BEGIN {
	num1 = 10

	printf "lshift(%d) by 1 = %d\n", num1, lshift(num1, 1)
}'

Executing the above code produces the following result:

lshift(10) by 1 = 20

rshift

Performs a bitwise right shift operation.

[jerry]$ awk 'BEGIN {
	num1 = 10

	printf "rshift(%d) by 1 = %d\n", num1, rshift(num1, 1)
}'

Executing the above code produces the following result:

rshift(10) by 1 = 5

or

Performs a bitwise OR operation.

[jerry]$ awk 'BEGIN {
	num1 = 10
	num2 = 6

	printf "(%d OR %d) = %d\n", num1, num2, or(num1, num2)
}'

Executing the above code produces the following result:

(10 OR 6) = 14

xor

Performs a bitwise XOR operation.

[jerry]$ awk 'BEGIN {
	num1 = 10
	num2 = 6

	printf "(%d XOR %d) = %d\n", num1, num2, xor(num1, num2)
}'

Executing the above code produces the following result:

(10 XOR 6) = 12

Miscellaneous Functions

AWK has the following miscellaneous functions:

close(expr)

This function closes a file or pipe.

[jerry]$ awk 'BEGIN {
	cmd = "tr [a-z] [A-Z]"
	print "hello, world !!!" |& cmd
	close(cmd, "to")
	cmd |& getline out
	print out;
	close(cmd);
}'

Executing the above code produces the following result:

HELLO, WORLD !!!

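Let us walk through the script:

The first statement, cmd = "tr [a-z] [A-Z]", is the command with which AWK establishes two-way communication.

The next statement, the print command, provides input to the tr command. Here |& indicates two-way communication.

The third statement, close(cmd, "to"), closes the "to" end of the two-way pipe once the input has been sent, so that tr sees the end of its input.

The next statement, cmd |& getline out, reads the output of the command into the variable out with the help of getline.

The following print statement prints the output, and finally the close function closes the command.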
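
close() is also useful with ordinary one-way pipes, which work in any POSIX AWK. The sketch below assumes the standard sort utility is available; closing the pipe makes AWK wait for sort to finish, so its output appears before anything printed afterwards.

```shell
out=$(awk 'BEGIN {
  cmd = "sort"
  print "b" | cmd
  print "a" | cmd
  close(cmd)        # sort sees end of input, prints its result, and exits
  print "done"
}')
echo "$out"
```

The output is a, b, and then done, each on its own line.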

delete

delete removes an element from an array. The following simple example shows its use:

[jerry]$ awk 'BEGIN {
	arr[0] = "One"
	arr[1] = "Two"
	arr[2] = "Three"
	arr[3] = "Four"

	print "Array elements before delete operation:"
	for (i in arr) {
		print arr[i]
	}

	delete arr[0]
	delete arr[1]

	print "Array elements after delete operation:"
	for (i in arr) {
		print arr[i]
	}
}'

Executing the above code produces the following result:

Array elements before delete operation:
One
Two
Three
Four
Array elements after delete operation:
Three
Four
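
Whether an element still exists after a delete can be checked with the in operator, which tests for an index without creating it. A minimal sketch with illustrative keys:

```shell
out=$(awk 'BEGIN {
  arr["a"] = 1
  arr["b"] = 2
  delete arr["a"]
  # (key in arr) is 1 if the index exists and 0 otherwise
  has_a = ("a" in arr)
  has_b = ("b" in arr)
  print has_a, has_b
}')
echo "$out"
```

This prints 0 1.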

exit

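This function stops the execution of the script. It also accepts an optional expr, which becomes AWK's return value (its exit status). The following example illustrates the use of exit.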

[jerry]$ awk 'BEGIN {
	print "Hello, World !!!"

	exit 10

	print "AWK never executes this statement."
}'

Executing the above code produces the following result:

Hello, World !!!
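
The value passed to exit is visible to the calling shell as the exit status of the awk command. A small sketch, where status is just an illustrative shell variable:

```shell
status=0
# the || branch runs because awk exits with a non-zero status
awk 'BEGIN { exit 10 }' || status=$?
echo "$status"
```

This prints 10.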

fflush

This function flushes any buffers associated with an open output file or pipe. Its syntax is as follows:

fflush([output-expr])

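If output-expr is not supplied, it flushes standard output. If output-expr is the null string (""), it flushes all open files and pipes. Note that in gawk 4.0.2 and later, calling fflush() with no argument also flushes all open output files and pipes.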

getline

This function instructs AWK to read the next line. The following example reads and displays the contents of the marks.txt file using getline.

[jerry]$ awk '{getline; print $0}' marks.txt

Executing the above code produces the following result:

2)	Rahul	Maths	90
4)	Kedar	English	85
5)	Hari	History	89

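Let us go through the code step by step:

At the start, AWK reads the first line from marks.txt and stores it in $0.

In the next statement, getline instructs AWK to read the next line. AWK therefore reads the second line and stores it in $0.

Finally, the print statement prints the second line. This process continues until the whole file has been read.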
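
getline can also read from a named file into a variable, leaving $0 untouched. The sketch below creates a temporary file purely for illustration; the names tmp, file, line, and n are assumptions of this example.

```shell
tmp=$(mktemp)
printf 'first\nsecond\nthird\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
  # the "> 0" test matters: getline returns 0 at end of file, -1 on error
  while ((getline line < file) > 0)
    n++
  close(file)
  print n, line     # line still holds the last record that was read
}')
rm -f "$tmp"
echo "$out"
```

This prints 3 third.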

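next

The next function changes the flow of the program. It stops the processing of the current record. The program then reads the next line and starts executing the commands again with the new line. For instance, the following program does not perform any processing when the pattern match succeeds.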

[jerry]$ awk '{if ($0 ~/Shyam/) next; print $0}' marks.txt

Executing the above code produces the following result:

1)	Amit	Physics	80
2)	Rahul	Maths	90
4)	Kedar	English	85
5)	Hari	History	89

nextfile

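The nextfile function changes the flow of the program. It stops processing the current input file and starts a new cycle through the pattern/action statements, beginning with the first record of the next file. For instance, the following example stops processing the first file when the pattern match succeeds.

First, create two files. The contents of file1.txt are as follows: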

file1:str1
file1:str2
file1:str3
file1:str4

The contents of file2.txt are as follows:

file2:str1
file2:str2
file2:str3
file2:str4

Now let us use the nextfile function:

[jerry]$ awk '{ if ($0 ~ /file1:str2/) nextfile; print $0 }' file1.txt file2.txt

Executing the above code produces the following result:

file1:str1
file2:str1
file2:str2
file2:str3
file2:str4

return

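This function can be used within a user-defined function to return a value. Note that the return value of the function is undefined if expr is not provided. The following example illustrates the use of return.

First, create a file named functions.awk containing the AWK commands shown below: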

function addition(num1, num2)
{
	result = num1 + num2

	return result
}

BEGIN {
	res = addition(10, 20)
	print "10 + 20 = " res
}

Executing this file with awk -f functions.awk produces the following result:

10 + 20 = 30
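
Variables assigned inside an AWK function are global unless they appear in the parameter list. A common convention, sketched below, is to declare locals as extra parameters separated by additional spaces; the name sum is an assumption of this example.

```shell
out=$(awk '
# the extra parameter "sum" acts as a local variable
function addition(num1, num2,    sum) {
  sum = num1 + num2
  return sum
}
BEGIN {
  sum = "untouched"            # global with the same name
  res = addition(10, 20)
  print res, sum
}')
echo "$out"
```

This prints 30 untouched, showing that the global sum was not overwritten.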

system

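This function executes the specified command and returns its exit status. A return status of 0 indicates that the command executed successfully; a non-zero value indicates that it failed. For instance, the following example displays the current date and also shows the return status of the command.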

[jerry]$ awk 'BEGIN { ret = system("date"); print "Return value = " ret }'

Executing the above code produces the following result:

Sun Dec 21 23:16:07 IST 2014
Return value = 0
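
To see a non-zero status as well, the sketch below runs the standard true and false utilities, which are assumed to be on the PATH. The exact value returned for a failing command can differ between AWK implementations.

```shell
out=$(awk 'BEGIN {
  ok = system("true")      # a command that succeeds
  bad = system("false")    # a command that fails
  print ok, bad
}')
echo "$out"
```

With gawk this prints 0 1.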