hpr2114 :: Gnu Awk - Part 1
An introduction to the awk text parsing tool

Hosted by Mr. Young on Thursday, 2016-09-08. Flagged as Explicit and released under a CC-BY-SA license.
Tags: awk, bash, linux.
    
	Duration: 00:22:30
Learning Awk.
Episodes about using Awk, the text manipulation language. It comes in various forms called awk, nawk, mawk and gawk, but the standard version on Linux is GNU Awk (gawk). It's a programming language optimised for the manipulation of delimited text.
Introduction to Awk
Awk is a powerful text parsing tool for Unix and Unix-like systems.
The basic syntax is:
awk [options] 'pattern {action}' file
Here is a simple example file that we will be using, called file1.txt:
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
First command:
awk '{print $2}' file1.txt
As you can see, the “print” command displays whatever follows it. In this case we are showing the second column using “$2”. This is intuitive. To display the whole line, with all of its columns, use “$0”.
This example will output:
color
red
yellow
red
purple
green
purple
brown
brown
yellow
Second command:
awk '$2=="yellow"{print $1}' file1.txt
This will output:
banana
pineapple
As you can see, the command selects the lines whose second column equals “yellow”, but prints column 1.
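The two commands above can be tried out with a short script. This is only a sketch: it recreates file1.txt in a scratch directory made with mktemp (an assumption, any directory would do), and the variable name red_rows is made up for illustration.

```shell
# Work in a scratch directory so nothing else is overwritten.
cd "$(mktemp -d)"

# Recreate the sample file from the show notes.
cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Action only: print columns 1 and 3 of every line.
# The comma makes awk put a single space between the two values.
awk '{print $1, $3}' file1.txt

# Pattern and action together: for lines whose second column is "red",
# print the first and third columns.
red_rows=$(awk '$2=="red" {print $1, $3}' file1.txt)
echo "$red_rows"
# apple 4
# strawberry 3
```

Leaving out the pattern runs the action on every line, while adding one restricts the action to the lines that match.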
Field separator
By default, awk uses white space as the field separator. You can change this by using the -F option. For instance, file1.csv looks like this:
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
A similar command as before:
awk -F"," '$2=="yellow" {print $1}' file1.csv
will still output:
banana
pineapple
Regular expressions work as well:
awk '$2 ~ /p.+p/ {print $0}' file1.txt
This returns:
grape      purple 10
plum       purple 2
Numbers are interpreted automatically:
awk '$3>5 {print $1, $2}' file1.txt
Will output:
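name color
banana yellow
grape purple
apple green
potato brown
The header line is printed as well, because “amount” is not a number and is therefore compared with 5 as a string.
Using output redirection, you can write your results to a file. For example:
awk -F, '$3>5 {print $1, $2}' file1.csv > output.txt
This writes the results of the query to the file output.txt.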
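With “$3>5” the header line is selected too, because “amount” is compared as a string. One possible way around that, shown here as a sketch rather than as the method used in the episode, is to add 0 to the field so that awk always treats it as a number (“amount” then counts as 0). The scratch directory and the variable name purple are assumptions made for this example.

```shell
# Work in a scratch directory so nothing else is overwritten.
cd "$(mktemp -d)"

# Recreate the CSV version of the sample file.
cat > file1.csv <<'EOF'
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
EOF

# $3+0 forces a numeric comparison, so the header line no longer matches.
# The result is redirected into output.txt by the shell.
awk -F, '$3+0 > 5 {print $1, $2}' file1.csv > output.txt
cat output.txt
# banana yellow
# grape purple
# apple green
# potato brown

# Regular expressions work with -F as well: names of the rows whose
# color contains a "p", at least one other character, and another "p".
purple=$(awk -F, '$2 ~ /p.+p/ {print $1}' file1.csv)
echo "$purple"
# grape
# plum
```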
Here’s a cool trick! You can automatically split a file into multiple files grouped by column. For example, if I want to split file1.txt into multiple files by color, here is the command.
awk '{print > $2".txt"}' file1.txt
This will produce files named yellow.txt, red.txt, etc. In upcoming episodes, we will show how to improve the outputs.
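The splitting trick can be tried safely in a scratch directory. The sketch below makes two assumptions that are not part of the episode: it uses mktemp for the directory, and it puts parentheses around the file name expression, which does the same thing in gawk but avoids an ambiguity that some other awk versions reject.

```shell
# Work in a scratch directory, because this creates one file per color.
cd "$(mktemp -d)"

# Recreate the sample file from the show notes.
cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Write each line to a file named after its second column.
# The parentheses make it explicit that the file name is $2 followed by ".txt".
awk '{print > ($2 ".txt")}' file1.txt

# One file per distinct value in column 2.
ls

# Each file holds the complete matching lines.
cat purple.txt
# grape      purple 10
# plum       purple 2
```

Note that the header line ends up in its own file, color.txt, because “color” is simply the value of the second column on that line.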
Resources
- https://www.theunixschool.com/p/awk-sed.html
- https://www.tecmint.com/category/awk-command/
- https://linux.die.net/man/1/awk
Coming up
- More options
- Built-in Variables
- Arithmetic operations
- Awk language and syntax