1.8 Comparing and Manipulating File Content with Find, Grep, Awk, and Sed

File manipulation is a fundamental skill for anyone working with the command line. In this blog, we'll explore how to compare and manipulate file content using find, grep, awk, and sed. Regular expressions will be our trusty companion throughout, enhancing the power and flexibility of these commands.

The Role of Regular Expressions

Before diving into specific commands, let's understand the importance of regular expressions. Regular expressions, often abbreviated as regex, are patterns that define a search. They enable us to express flexible and complex search criteria, making them invaluable for text processing.
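To make this concrete, here is a minimal sketch of two regex building blocks, an anchor with alternation and a case-insensitive match, run through grep -E. The sample lines are made up for illustration, and the variable names (sample, starts, error_count) are our own.

```shell
# Hypothetical log lines, used only for illustration.
sample='error: disk full
warning: low memory
ERROR: disk failure
info: all good'

# ^ anchors the match at the start of the line; (a|b) is alternation.
starts=$(printf '%s\n' "$sample" | grep -E '^(error|warning)')

# -i ignores case and -c prints the number of matching lines.
error_count=$(printf '%s\n' "$sample" | grep -Eic '^error')

printf '%s\n' "$starts"
printf 'lines starting with error (any case): %s\n' "$error_count"
```

The same patterns carry over, with small dialect differences, to find, awk, and sed.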

💡
For more details about regular expression (regex) operators, please refer to blog 1.7 of Module 1.

Find Command

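The find command locates files by attributes such as name, type, and modification time, searching the starting directory and everything below it. Its -name test matches a shell glob against the file name only, while -regex matches a regular expression against the whole path, which makes it a potent tool for precision searches.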

Example 1: Find all .txt files in the current directory and its subdirectories

find . -type f -regex ".*\.txt"

Example 2: Find files modified within the last 7 days

find . -type f -mtime -7

Example 3: Find files whose names start with "report"

find . -type f -regex ".*/report[^/]*"

Example 4: Find all directories whose names contain "data"

find . -type d -name "*data*"

Example 5: Find files with names matching a glob pattern (file0.txt through file9.txt)

find . -type f -name "file[0-9].txt"
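The difference between -name and -regex is easy to miss, so here is a small sandbox sketch. It assumes a find that supports -regex (GNU and BSD find both do), and every file and directory name in it is hypothetical. Because -regex is tested against the whole path, a pattern like ./report.* also matches files that merely sit inside a directory called reports.

```shell
# Throwaway directory tree; all names are made up for this sketch.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/reports" "$sandbox/data_2023"
touch "$sandbox/report_q1.txt" "$sandbox/reports/summary.txt" \
      "$sandbox/file1.txt" "$sandbox/file12.txt"

# Whole-path match: also picks up ./reports/summary.txt.
loose=$(cd "$sandbox" && find . -type f -regex './report.*' | LC_ALL=C sort)

# Restrict the match to the last path component with [^/]*.
strict=$(cd "$sandbox" && find . -type f -regex '.*/report[^/]*')

# -name takes a glob: [0-9] is exactly one digit, so file12.txt is skipped.
digits=$(cd "$sandbox" && find . -type f -name 'file[0-9].txt')

# Directories whose names contain "data".
dirs=$(cd "$sandbox" && find . -type d -name '*data*')

printf 'loose:\n%s\nstrict: %s\ndigits: %s\ndirs: %s\n' \
  "$loose" "$strict" "$digits" "$dirs"

rm -rf "$sandbox"
```

Quoting the pattern matters in every case: without quotes the shell may expand *data* or file[0-9].txt before find ever sees it.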

Grep Command

grep is a versatile tool for searching text patterns within files. It's an indispensable command for quickly isolating relevant information.

Example 1: Search for lines containing "error" in a log file

grep "error" logfile.txt

Example 2: Search for lines starting with "warning" in multiple files

grep "^warning" *.log

Example 3: Search for lines not containing "success" in a file

grep -v "success" data.txt

Example 4: Search for lines with words starting with "cat" in a file

grep "\<cat" animals.txt

Example 5: Search for lines with exactly 8 characters in a file

grep -E "^.{8}$" passwords.txt
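To see what these patterns actually select, here is a minimal sketch against a scratch file. The file contents are invented, and the \< word-start anchor is assumed to be available (it is a GNU grep extension that many other grep implementations also accept).

```shell
# Scratch file with hypothetical contents.
workdir=$(mktemp -d)
printf '%s\n' 'cat sat' 'concatenate' 'catalogs' 'success: cat fed' \
  > "$workdir/animals.txt"

# \< matches only at the start of a word, so "concatenate" does not count.
word_start=$(grep -c '\<cat' "$workdir/animals.txt")

# -v inverts the match: keep lines that do NOT contain "success".
no_success=$(grep -v 'success' "$workdir/animals.txt")

# ^.{8}$ matches lines that are exactly 8 characters long.
eight=$(grep -E '^.{8}$' "$workdir/animals.txt")

printf 'word-start matches: %s\n' "$word_start"
printf 'without success:\n%s\n' "$no_success"
printf 'exactly 8 chars: %s\n' "$eight"

rm -rf "$workdir"
```

Single quotes around the pattern keep the shell from interpreting characters such as $ and \ before grep receives them.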

Awk Command

awk is a powerful text processing tool that reads its input one record at a time (by default, one line) and splits each record into fields ($1, $2, and so on). Regular expressions enhance its ability to extract and manipulate data.

Example 1: Print the second column of a CSV file

awk -F, '{print $2}' data.csv

Example 2: Print lines containing more than 3 fields in a file

awk 'NF > 3' report.txt

Example 3: Replace "apple" with "orange" in the third column of a file

awk '{gsub(/apple/, "orange", $3); print}' fruits.txt

Example 4: Print lines where the first column starts with "A"

awk '$1 ~ /^A/' names.txt

Example 5: Calculate and print the average of the numbers in the second column

awk '{sum += $2} END {if (NR > 0) print sum / NR}' numbers.txt
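Here is a small sketch that runs the same kinds of awk programs on a comma-separated scratch file. The data is invented for illustration. Note one assumption that goes beyond the examples above: when a field is modified, awk rebuilds the line using the output separator OFS, so we set OFS to a comma to keep the CSV shape.

```shell
# Scratch CSV with hypothetical contents.
workdir=$(mktemp -d)
printf '%s\n' 'Alice,10,apple' 'Bob,20,pear' 'Anna,30,apple' \
  > "$workdir/data.csv"

# Second column only.
col2=$(awk -F, '{print $2}' "$workdir/data.csv")

# Names (field 1) that start with "A".
a_names=$(awk -F, '$1 ~ /^A/ {print $1}' "$workdir/data.csv")

# Replace within field 3; OFS=, keeps commas when the record is rebuilt.
swapped=$(awk -F, -v OFS=, '{gsub(/apple/, "orange", $3); print}' \
  "$workdir/data.csv")

# Average of field 2, guarded against empty input.
average=$(awk -F, '{sum += $2; n++} END {if (n) print sum / n}' \
  "$workdir/data.csv")

printf '%s\n' "$col2" "$a_names" "$swapped" "average: $average"

rm -rf "$workdir"
```

Counting rows in a variable (n) rather than relying on NR makes it easy to later skip headers or blank lines without skewing the average.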

Sed Command

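sed is a stream editor used for text transformations. It reads its input line by line, applies the editing commands, and writes the result to standard output, leaving the input file unchanged by default. Regular expressions make it a robust tool for search and replace operations.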

Example 1: Replace all occurrences of "apple" with "orange" in a file

sed 's/apple/orange/g' filename.txt

Example 2: Delete lines containing the word "obsolete"

sed '/obsolete/d' data.txt

Example 3: Append " - Processed" to lines starting with "Task:"

sed '/^Task:/ s/$/ - Processed/' tasks.txt

Example 4: Replace the first occurrence of "blue" with "red" on each line

sed 's/blue/red/' colors.txt

Example 5: Append a line of text after each line containing "important" (one-line form, as accepted by GNU sed)

sed '/important/a\New Line Inserted' notes.txt
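The sketch below runs these sed commands on a scratch file so the effect of each is visible. The file contents are hypothetical. It also uses the two-line form of the a command (a backslash followed by a newline), which should work in both GNU and BSD sed, and it checks that the original file is left untouched. In-place editing with GNU sed's -i option is deliberately not shown here.

```shell
# Scratch file with hypothetical contents.
workdir=$(mktemp -d)
printf '%s\n' 'Task: blue blue sky' 'obsolete note' 'important: apple' \
  > "$workdir/notes.txt"

# Without g, only the first match on the line is replaced.
first_only=$(sed -n '1s/blue/red/p' "$workdir/notes.txt")

# With g, every match on the line is replaced.
all_matches=$(sed -n '1s/blue/red/gp' "$workdir/notes.txt")

# Drop lines that contain "obsolete".
pruned=$(sed '/obsolete/d' "$workdir/notes.txt")

# Append a suffix to lines that start with "Task:".
tagged=$(sed '/^Task:/ s/$/ - Processed/' "$workdir/notes.txt" | head -n 1)

# Portable two-line form of the append command.
appended=$(sed '/important/a\
New Line Inserted' "$workdir/notes.txt" | tail -n 1)

# The file on disk still has its original first line.
on_disk=$(head -n 1 "$workdir/notes.txt")

printf '%s\n' "$first_only" "$all_matches" "$pruned" "$tagged" "$appended"

rm -rf "$workdir"
```

To keep a transformed copy, redirect the output to a new file rather than to the file being read.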

Regular expressions, when coupled with these commands, provide a powerful toolkit for manipulating file content on the command line. Mastering them will make you noticeably more efficient at handling textual data.
