Introduction
If you’ve ever worked with Linux, you know how important it is to quickly process and analyze text files. That’s where AWK in Linux comes in. AWK is a powerful text-processing command and scripting language that helps you filter data, extract columns, and even automate reporting.
Professionals across a variety of fields rely on tools like AWK and GREP, including:
- System administrators (log analysis, user management, performance monitoring)
- Data analysts and scientists (cleaning and parsing structured/unstructured data)
- Cybersecurity specialists (log forensics, intrusion detection, pattern matching)
- Software developers (code searching, quick text transformations)
- DevOps engineers (automation, CI/CD pipelines, configuration checks)
- Network engineers (packet/log inspection, network monitoring)
- Researchers and academics (processing large datasets, text mining)
If you’re new to text searching, you might want to check out our beginner’s guide to GREP in Linux first, since GREP is often the starting point before diving into AWK.
In this guide, we’ll cover everything you need to know about the AWK command in Linux. From simple print statements to embedding AWK in shell scripts, you’ll see practical examples you can try immediately. We’ll also compare AWK with GREP and SED, explain GAWK (GNU AWK), and provide ready-to-use code snippets.
What is AWK in Linux?
AWK is a scripting language created by Alfred Aho, Peter Weinberger, and Brian Kernighan in the 1970s. Its name comes from their initials (A-W-K).
In Linux, AWK is both:
- A command-line tool used for text processing.
- A programming language with variables, loops, and functions.
AWK makes it easy to:
- Extract specific columns from files.
- Filter lines based on conditions.
- Generate reports.
There are different versions of AWK, the most common being GAWK (GNU AWK).
How to use AWK in Linux
The basic syntax of AWK is:
awk 'pattern { action }' filename
In this sample syntax, the pattern is the condition (and is optional), while the action is what you want AWK to perform (print, calculate, etc.).
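As a minimal sketch of how the pattern and the action fit together, the commands below run on three made-up lines (the names and numbers are placeholders, not data from this guide). A pattern alone prints the matching lines, an action alone runs on every line, and both together run the action only on matching lines.

```shell
# Hypothetical input: a name and a score on each line.
data='alice 42
bob 7
carol 19'

# Pattern only: print whole lines where the second field is greater than 10.
pattern_only=$(printf '%s\n' "$data" | awk '$2 > 10')

# Action only: print the first field of every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Pattern and action: print the first field of matching lines only.
both=$(printf '%s\n' "$data" | awk '$2 > 10 { print $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$pattern_only" "$action_only" "$both"
```

The three results are separated by `---` in the output, so you can compare them side by side.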
For some of our examples, we are going to work with a file called access.log, which contains the following data:
root@606e308656fe:/var/log/nginx# cat access.log
127.0.0.1 - - [31/Jul/2025:14:31:15 +0000] "GET /index.html HTTP/1.1" 200 1024
192.168.1.15 - - [31/Jul/2025:14:31:16 +0000] "POST /login HTTP/1.1" 302 512
203.0.113.42 - - [31/Jul/2025:14:31:17 +0000] "GET /about.html HTTP/1.1" 404 230
127.0.0.1 - - [31/Jul/2025:14:31:18 +0000] "GET /healthcheck HTTP/1.1" 200 64
198.51.100.23 - - [31/Jul/2025:14:31:19 +0000] "GET /contact.html HTTP/1.1" 200 840
Example: print every line from a file:
awk '{print}' access.log
This is the simplest AWK program: with no pattern, the action runs on every line, so running it against our log file returns the entire contents of the file, as shown below:
root@606e308656fe:/# awk '{print}' /var/log/nginx/access.log
127.0.0.1 - - [31/Jul/2025:14:31:15 +0000] "GET /index.html HTTP/1.1" 200 1024
192.168.1.15 - - [31/Jul/2025:14:31:16 +0000] "POST /login HTTP/1.1" 302 512
203.0.113.42 - - [31/Jul/2025:14:31:17 +0000] "GET /about.html HTTP/1.1" 404 230
127.0.0.1 - - [31/Jul/2025:14:31:18 +0000] "GET /healthcheck HTTP/1.1" 200 64
198.51.100.23 - - [31/Jul/2025:14:31:19 +0000] "GET /contact.html HTTP/1.1" 200 840
One of the ways we can use AWK to manipulate our data is by printing only the first column of our log file (the IP addresses).
Print First Column Using AWK in Linux
root@606e308656fe:/# awk '{print $1}' /var/log/nginx/access.log
127.0.0.1
192.168.1.15
203.0.113.42
127.0.0.1
198.51.100.23
As you can see, only the IP addresses have been printed. To include the 4th column of the log file in the output, we can modify our awk command as follows:
Print Multiple Columns Using AWK in Linux
root@606e308656fe:/# awk '{print $1, $4}' /var/log/nginx/access.log
127.0.0.1 [31/Jul/2025:14:31:15
192.168.1.15 [31/Jul/2025:14:31:16
203.0.113.42 [31/Jul/2025:14:31:17
127.0.0.1 [31/Jul/2025:14:31:18
198.51.100.23 [31/Jul/2025:14:31:19
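The timestamp column keeps its leading bracket because AWK splits on whitespace by default. One possible way to tidy that up, sketched here on a single line copied from the sample log, is to trim the first character with substr() and, if needed, change the output separator with the built-in OFS variable. The variable names `cleaned` and `as_csv` are only illustrative.

```shell
# One line taken from the sample access.log above.
line='127.0.0.1 - - [31/Jul/2025:14:31:15 +0000] "GET /index.html HTTP/1.1" 200 1024'

# substr($4, 2) returns field 4 starting at its second character,
# which drops the leading "[".
cleaned=$(printf '%s\n' "$line" | awk '{ print $1, substr($4, 2) }')

# OFS is the separator that print places between comma-separated arguments.
as_csv=$(printf '%s\n' "$line" | awk -v OFS=',' '{ print $1, substr($4, 2) }')

echo "$cleaned"
echo "$as_csv"
```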
This ability to print selected columns also makes the awk tool useful for simple CSV files, where the -F option sets the comma as the field separator.
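For a simple CSV file, a sketch might look like the following. The file contents and column names are hypothetical, and the approach assumes no quoted fields that contain commas, since plain field splitting does not understand CSV quoting.

```shell
# Hypothetical CSV; assumes no quoted fields that contain commas.
csv=$(mktemp)
cat > "$csv" <<'EOF'
name,department,salary
alice,engineering,5200
bob,support,3100
carol,engineering,4800
EOF

# -F',' splits each line on commas; NR > 1 skips the header row.
csv_out=$(awk -F',' 'NR > 1 { print $1, $3 }' "$csv")
echo "$csv_out"

rm -f "$csv"
```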
String Operations
In this section on AWK in Linux, we will demonstrate more advanced usage with a log file called messages. We will show how awk can manipulate strings by selecting columns 4 and 6 (server1 and Failed) and converting them to upper and lower case, respectively.
Refer to the sample log below (only the first few lines of the file are shown):
root@606e308656fe:/var/log# cat messages
Aug 16 09:12:04 server1 sshd[1324]: Failed password for invalid user admin from 192.168.1.20 port 51234 ssh2
Aug 16 09:12:05 server1 sshd[1324]: Failed password for root from 203.0.113.5 port 51102 ssh2
Aug 16 09:12:07 server1 sshd[1324]: Failed password for root from 203.0.113.5 port 51102 ssh2
Aug 16 09:13:15 server1 systemd[1]: Failed to start Apache HTTP Server.
Aug 16 09:14:21 server1 kernel: [12345.678901] critical temperature reached: shutting down CPU!
Aug 16 09:14:55 server1 systemd[1]: Failed to start MariaDB database server.
Aug 16 09:15:22 server1 sshd[1401]: error: maximum authentication attempts exceeded for root from 198.51.100.10 port 42412 ssh2
Example:
root@606e308656fe:/# awk '{print toupper($4), tolower($6)}' /var/log/messages
SERVER1 failed
SERVER1 failed
SERVER1 failed
SERVER1 failed
SERVER1 [12345.678901]
SERVER1 failed
SERVER1 error:
SERVER1 [12346.123456]
SERVER1 failed
SERVER1 [12347.654321]
SERVER1 pam_unix(sudo:auth):
SERVER1 pam_unix(sudo:auth):
SERVER1 failed
SERVER1 [12348.987654]
SERVER1 failed
SERVER1 error:
SERVER1 [12349.543210]
SERVER1 failed
SERVER1 [12350.111111]
SERVER1 failed
The 4th and 6th columns have been converted to uppercase and lowercase, respectively.
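Beyond changing case, AWK has other standard string functions such as sub() and substr(). As a rough sketch, the snippet below takes three lines from the sample log, strips the optional [PID] and trailing colon from the process column, and keeps only the hour and minute of the timestamp. The variable name `proc` is just an illustration.

```shell
# Three lines copied from the sample messages log above.
log=$(mktemp)
cat > "$log" <<'EOF'
Aug 16 09:12:05 server1 sshd[1324]: Failed password for root from 203.0.113.5 port 51102 ssh2
Aug 16 09:13:15 server1 systemd[1]: Failed to start Apache HTTP Server.
Aug 16 09:14:21 server1 kernel: [12345.678901] critical temperature reached: shutting down CPU!
EOF

procs=$(awk '{
    proc = $5
    sub(/(\[[0-9]+\])?:$/, "", proc)   # drop an optional [PID] and the trailing colon
    print proc, substr($3, 1, 5)       # first five characters of the time, HH:MM
}' "$log")
echo "$procs"

rm -f "$log"
```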
Using AWK in Shell Scripts and Bash
AWK can be embedded in Bash scripts for automation.
Example: a script that extracts the message text from each line of /var/log/messages, followed by its output:
#!/bin/bash
awk -F: '{print $4}' /var/log/messages
Failed password for invalid user admin from 192.168.1.20 port 51234 ssh2
Failed password for root from 203.0.113.5 port 51102 ssh2
Failed password for root from 203.0.113.5 port 51102 ssh2
Failed to start Apache HTTP Server.
[12345.678901] critical temperature reached
Failed to start MariaDB database server.
error
[12346.123456] ata2.00
Failed to mount /data.
[12347.654321] critical
pam_unix(sudo
pam_unix(sudo
Failed to start Docker Application Container Engine.
[12348.987654] usb 1-1
Failed to start Network Manager Wait Online.
error
[12349.543210] nvme0n1
Failed to start LSB
[12350.111111] EXT4-fs error (device sda1)
Failed to start PostgreSQL database server.
Explanation:
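- -F: sets the field separator to a colon (:).
- $4 prints the 4th field. Because the timestamp itself contains two colons, the 4th colon-separated field is the message text.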
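Scripts usually need to pass shell values into AWK. One common way is the -v option, sketched below in a script that counts "Failed password" lines per source IP. The sample lines are copied from the messages log above, while the variable names `keyword`, `kw`, and `report` are assumptions made for this illustration.

```shell
#!/bin/bash
# Sketch: count lines containing a keyword, grouped by source IP.
log=$(mktemp)
cat > "$log" <<'EOF'
Aug 16 09:12:04 server1 sshd[1324]: Failed password for invalid user admin from 192.168.1.20 port 51234 ssh2
Aug 16 09:12:05 server1 sshd[1324]: Failed password for root from 203.0.113.5 port 51102 ssh2
Aug 16 09:12:07 server1 sshd[1324]: Failed password for root from 203.0.113.5 port 51102 ssh2
Aug 16 09:13:15 server1 systemd[1]: Failed to start Apache HTTP Server.
EOF

keyword="Failed password"

# -v copies the shell variable into the AWK variable kw.
report=$(awk -v kw="$keyword" '
    index($0, kw) {                      # plain substring match on the whole line
        for (i = 1; i < NF; i++)
            if ($i == "from") count[$(i + 1)]++   # the field after "from" is the IP
    }
    END {
        for (ip in count) print count[ip], ip
    }
' "$log" | sort -rn)
echo "$report"

rm -f "$log"
```

Searching for the word "from" instead of using a fixed column number keeps the script working for both the "invalid user admin" and the "root" variants of the message.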
AWK vs GREP: What’s the Difference?
Both AWK and GREP are text-processing tools, but they serve different purposes.
- GREP → best for searching patterns.
- AWK → best for extracting and processing data.
Example with GREP:
root@606e308656fe:/# grep 'ERROR' /var/log/app.log
2025-07-31 14:31:15,789 ERROR Failed to connect to database: timeout
And the same search with the AWK command:
root@606e308656fe:/# awk '/ERROR/ {print $3}' /var/log/app.log
ERROR
- GREP finds the line.
- AWK finds the line and processes it (extracts the third column).
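To make the difference more concrete, here is a small sketch that counts log entries. Only the "Failed to connect" line comes from this guide; the other lines are made up for illustration. GREP counts the matches for one pattern, while AWK groups by the third column and counts every level in a single pass.

```shell
# Sample app.log; only the "Failed to connect" line comes from this guide,
# the other lines are made up for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
2025-07-31 14:31:12,101 INFO Starting application
2025-07-31 14:31:15,789 ERROR Failed to connect to database: timeout
2025-07-31 14:31:16,002 INFO User login succeeded
2025-07-31 14:31:17,450 ERROR Request handler crashed
EOF

# GREP can count the lines that match one pattern.
error_lines=$(grep -c 'ERROR' "$log")

# AWK can use the third column as an array key and count each level.
level_counts=$(awk '{ count[$3]++ } END { for (level in count) print level, count[level] }' "$log" | sort)

echo "ERROR lines: $error_lines"
echo "$level_counts"

rm -f "$log"
```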
AWK and SED: Working Together
SED is another Linux command for text manipulation. AWK and SED often complement each other.
Example: Remove empty lines with SED, then print the fourth column with AWK in Linux:
root@606e308656fe:/# sed '/^$/d' /var/log/app.log | awk '{print $4}'
Starting
Cache
Failed
User
Processing
Request
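For this particular task AWK can also work alone. A possible sketch, using a made-up two-entry log with blank lines in between, relies on the built-in NF variable (the number of fields), which is zero on an empty line and therefore acts as a false pattern.

```shell
# Hypothetical log with empty lines between entries.
log=$(mktemp)
cat > "$log" <<'EOF'
2025-07-31 14:31:12,101 INFO Starting application

2025-07-31 14:31:15,789 ERROR Failed to connect to database: timeout

EOF

# SED removes the empty lines, AWK prints the fourth column.
with_sed=$(sed '/^$/d' "$log" | awk '{ print $4 }')

# AWK alone: the pattern NF is true only for lines with at least one field.
awk_only=$(awk 'NF { print $4 }' "$log")

echo "$with_sed"
echo "$awk_only"

rm -f "$log"
```

One difference to keep in mind: NF also skips lines that contain only spaces or tabs, while the SED expression above removes only lines that are completely empty.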
Advanced AWK Program Examples
Find the Longest Line
root@606e308656fe:/# awk '{ if (length($0) > max) { max = length($0); line = $0 } } END { print line }' /var/log/app.log
2025-07-31 14:31:15,789 ERROR Failed to connect to database: timeout
This sample awk command scanned through our app.log file, identified the longest line, and printed it out.
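Another typical program is a small report. The sketch below reuses the access.log sample from earlier in this guide and assumes the status code is in column 9 and the response size in column 10, as in that sample. It counts requests and sums bytes per status code, then prints a grand total in the END block.

```shell
# The access.log sample from earlier in this guide.
log=$(mktemp)
cat > "$log" <<'EOF'
127.0.0.1 - - [31/Jul/2025:14:31:15 +0000] "GET /index.html HTTP/1.1" 200 1024
192.168.1.15 - - [31/Jul/2025:14:31:16 +0000] "POST /login HTTP/1.1" 302 512
203.0.113.42 - - [31/Jul/2025:14:31:17 +0000] "GET /about.html HTTP/1.1" 404 230
127.0.0.1 - - [31/Jul/2025:14:31:18 +0000] "GET /healthcheck HTTP/1.1" 200 64
198.51.100.23 - - [31/Jul/2025:14:31:19 +0000] "GET /contact.html HTTP/1.1" 200 840
EOF

summary=$(awk '
    # Runs on every line: group by status code ($9) and add up bytes ($10).
    { requests[$9]++; bytes[$9] += $10; total += $10 }
    # Runs once after the last line: print one row per status, then the total.
    END {
        for (status in requests)
            printf "%s count=%d bytes=%d\n", status, requests[status], bytes[status]
        printf "total bytes=%d\n", total
    }
' "$log" | sort)
echo "$summary"

rm -f "$log"
```

The output is piped through sort because AWK does not guarantee the order in which a for-in loop visits array keys.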
What is GAWK? (GNU AWK)
GAWK is the GNU implementation of AWK. It includes all AWK features plus:
- Better performance.
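- Additional built-in functions, such as gensub() and strftime().
- Wide availability across systems, although scripts that rely on GAWK-only extensions will not run on other AWK implementations.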
To check whether GAWK is installed and which version you have:
gawk --version
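On some systems the plain awk command is a link to a different implementation, so it can help to see what it points to. The sketch below is one way to check, assuming a readlink that supports the -f option (as GNU coreutils does). The variable names are only illustrative.

```shell
# Show which binary the plain awk command resolves to (it may be a symlink).
awk_path=$(command -v awk)
echo "awk resolves to: $(readlink -f "$awk_path")"

# Report whether a separate gawk binary is on the PATH.
if command -v gawk >/dev/null 2>&1; then
    gawk_status="installed"
else
    gawk_status="not installed"
fi
echo "gawk is $gawk_status"
```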
FAQs about AWK in Linux
Q: What is AWK in Linux used for?
A: AWK is used for text processing, filtering, and report generation.
Q: Is AWK better than GREP?
A: They’re different — GREP searches, AWK processes. Use AWK when you need to extract columns or perform calculations.
Q: How do I use AWK in Bash scripts?
A: You can embed AWK commands inside Bash scripts to parse files automatically.
Q: What’s the difference between AWK and GAWK?
A: GAWK is the GNU version of AWK with more features.
Conclusion
The AWK command in Linux is one of the most versatile tools for working with text. From simple printing to advanced data processing, AWK belongs in every sysadmin’s toolbox.
Next, you should learn how AWK works alongside GREP and SED. Check out our detailed guide on GREP in Linux to continue mastering Linux text-processing commands.