1.9 Sort and Uniq: Comparing and Manipulating File Content

Comparing and Manipulating File Content with Sort and Uniq

Continuing our exploration of command-line file manipulation, this part of the blog focuses on the sort and uniq commands. Neither command interprets regular expressions itself: sort orders lines by keys you define with options such as -k and -t, and uniq filters adjacent duplicate lines. Together they make it easy to compare and manipulate file content.

Sort Command

The sort command arranges the lines of a file alphabetically, numerically, or by other orderings. Custom sorting criteria are defined with key options rather than regular expressions: -t sets the field separator, and -k selects the field to sort on and how to compare it.

Example 1: Sort lines in a file alphabetically

sort filename.txt

Example 2: Reverse sort lines based on the second column (numerically)

sort -k2,2nr data.txt
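To see why the n modifier matters, here is a minimal sketch using hypothetical sample data (a name and a score per line). The file contents and variable names are assumptions made for illustration only.

```shell
# Hypothetical sample data: a name and a score separated by a space.
tmp=$(mktemp)
printf '%s\n' 'alice 9' 'bob 100' 'carol 25' > "$tmp"

# -k2,2 restricts the key to field 2; n compares numerically, r reverses.
numeric=$(sort -k2,2nr "$tmp")

# Without n the key is compared as text, so "9" ranks above "25" and "100".
textual=$(LC_ALL=C sort -k2,2r "$tmp")

printf 'numeric:\n%s\n\ntextual:\n%s\n' "$numeric" "$textual"
rm -f "$tmp"
```

The numeric run lists bob, carol, alice; the textual run lists alice, carol, bob, which is rarely what you want for numbers.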

Example 3: Sort hyphen-separated dates (e.g., 15-Mar-2023) by month abbreviation, then by year

sort -t"-" -k2,2M -k3,3n dates.txt
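The command above orders by month first and year second, which is not the same as chronological order. The sketch below contrasts the two, assuming dates in a DD-Mon-YYYY layout with English month abbreviations and a sort that supports -M (GNU and BSD sort do; POSIX does not require it). The sample dates are made up.

```shell
# Hypothetical sample dates in DD-Mon-YYYY form.
tmp=$(mktemp)
printf '%s\n' '15-Mar-2023' '02-Jan-2024' '20-Dec-2022' '07-Jan-2023' > "$tmp"

# Month first (M = month-name order), then year as a tiebreaker.
by_month=$(LC_ALL=C sort -t"-" -k2,2M -k3,3n "$tmp")

# Chronological order: year, then month, then day.
chronological=$(LC_ALL=C sort -t"-" -k3,3n -k2,2M -k1,1n "$tmp")

printf 'by month:\n%s\n\nchronological:\n%s\n' "$by_month" "$chronological"
rm -f "$tmp"
```

LC_ALL=C is set so that month names are matched as English abbreviations regardless of the user's locale.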

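Example 4: Sort lines based on the last word in each line (sort has no "last field" key, so awk prepends the last word as a temporary key, which cut removes afterwards)

awk '{ print $NF "\t" $0 }' file.txt | sort -t "$(printf '\t')" -k1,1 | cut -f2-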
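sort has no option meaning "last field", so sorting by the last word is usually done by prepending that word as a temporary key, sorting on it, and stripping it again. A minimal sketch of that pattern, with hypothetical sample lines and assuming the lines themselves do not start with a tab:

```shell
# Hypothetical sample lines with differing word counts.
tmp=$(mktemp)
printf '%s\n' 'red ripe cherry' 'green apple' 'a very yellow banana' > "$tmp"

tab=$(printf '\t')

# 1. awk prepends the last word ($NF) plus a tab to every line.
# 2. sort uses the tab as separator and sorts on that first field only.
# 3. cut drops the temporary key, leaving the original line.
by_last=$(awk '{ print $NF "\t" $0 }' "$tmp" \
  | LC_ALL=C sort -t "$tab" -k1,1 \
  | cut -f2-)

printf '%s\n' "$by_last"
rm -f "$tmp"
```

The output is ordered apple, banana, cherry, and the original lines come back unchanged.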

Example 5: Sort lines, ignoring leading whitespace

sort -b data.txt
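The effect of -b is easiest to see in the C locale, where a space sorts before every letter. The following sketch uses made-up, inconsistently indented lines and assumes GNU sort behaviour for -b without an explicit key.

```shell
# Hypothetical lines with inconsistent indentation.
tmp=$(mktemp)
printf '%s\n' '   banana' 'apple' ' cherry' > "$tmp"

# Plain sort in the C locale: indented lines come first because ' ' < 'a'.
plain=$(LC_ALL=C sort "$tmp")

# With -b the leading blanks are skipped when comparing lines.
ignoring=$(LC_ALL=C sort -b "$tmp")

printf 'plain:\n%s\n\nwith -b:\n%s\n' "$plain" "$ignoring"
rm -f "$tmp"
```

Many non-C locales already give whitespace little weight when collating, so the difference may be less visible there.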

Uniq Command

The uniq command, as the name suggests, identifies and filters out repeated lines within a file. It compares only adjacent lines, which is why its input is usually sorted first.

Example 1: Collapse duplicates so that each distinct line appears once

sort data.txt | uniq
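uniq compares each line only with the one directly before it, so unsorted input can leave duplicates behind. A small sketch with hypothetical data shows the difference:

```shell
# Hypothetical data with duplicates that are not all adjacent.
tmp=$(mktemp)
printf '%s\n' 'apple' 'banana' 'apple' 'apple' 'banana' > "$tmp"

# uniq alone only merges the two neighbouring "apple" lines.
unsorted=$(uniq "$tmp")

# Sorting first makes every duplicate adjacent, so uniq removes them all.
sorted=$(sort "$tmp" | uniq)

# sort -u gives the same result in a single command.
short=$(sort -u "$tmp")

printf 'uniq only:\n%s\n\nsort | uniq:\n%s\n' "$unsorted" "$sorted"
rm -f "$tmp"
```

When no other uniq option is needed, sort -u is a convenient shorthand for the pipeline.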

Example 2: Count and display the number of occurrences of each line

sort logfile.txt | uniq -c
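A common follow-up is to rank the counts so the most frequent lines come first. The sketch below uses a hypothetical log containing only severity levels. The final awk step is there just to normalise the padding that uniq -c puts in front of each count, which differs between implementations.

```shell
# Hypothetical log reduced to one severity level per line.
tmp=$(mktemp)
printf '%s\n' 'ERROR' 'INFO' 'INFO' 'WARN' 'INFO' 'ERROR' > "$tmp"

# Count each distinct line, then sort the counts numerically, largest first.
ranked=$(sort "$tmp" | uniq -c | sort -k1,1nr | awk '{ print $1, $2 }')

printf '%s\n' "$ranked"
rm -f "$tmp"
```

Appending head -n N to the pipeline keeps only the N most frequent entries.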

Example 3: Display only repeated lines in a sorted file

sort data.txt | uniq -d

Example 4: Collapse lines that match after their first 5 characters (the sort key starts at character 6 so that such lines end up adjacent)

sort -k1.6 file.txt | uniq -s 5
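Because uniq only compares neighbouring lines, the sort has to group lines by the same portion that uniq -s 5 compares. A sketch under stated assumptions: the records are hypothetical, each starts with a 5-character prefix such as 0001:, and the key syntax -k1.6 (start at character 6 of field 1) is used as GNU sort interprets it.

```shell
# Hypothetical records: a 5-character ID prefix followed by an event name.
tmp=$(mktemp)
printf '%s\n' '0001:login' '0002:logout' '0003:login' > "$tmp"

# Sorting on the whole line leaves the two "login" records apart,
# so uniq -s 5 cannot merge them.
naive=$(LC_ALL=C sort "$tmp" | uniq -s 5)

# Sorting from character 6 onward groups records by event name first.
grouped=$(LC_ALL=C sort -k1.6 "$tmp" | uniq -s 5)

printf 'whole-line sort:\n%s\n\nkeyed sort:\n%s\n' "$naive" "$grouped"
rm -f "$tmp"
```

The whole-line sort still prints all three records, while the keyed sort prints one record per event name.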

Example 5: Display only lines that are not repeated (lines that occur exactly once)

sort data.txt | uniq -u
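The three modes of uniq are easy to confuse, so here is a side-by-side sketch on the same hypothetical input:

```shell
# Hypothetical data: apple x3, banana x2, cherry x1.
tmp=$(mktemp)
printf '%s\n' 'apple' 'banana' 'apple' 'cherry' 'banana' 'apple' > "$tmp"

# Default: one copy of every distinct line.
collapsed=$(sort "$tmp" | uniq)

# -d: one copy of each line that occurs more than once.
repeated=$(sort "$tmp" | uniq -d)

# -u: only the lines that occur exactly once.
singletons=$(sort "$tmp" | uniq -u)

printf 'uniq:    %s\n' "$(printf '%s' "$collapsed" | tr '\n' ' ')"
printf 'uniq -d: %s\n' "$(printf '%s' "$repeated" | tr '\n' ' ')"
printf 'uniq -u: %s\n' "$(printf '%s' "$singletons" | tr '\n' ' ')"
rm -f "$tmp"
```

Here the default mode yields apple, banana, cherry; -d yields apple and banana; -u yields only cherry.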

By combining sort and uniq, you gain precise control over how data is ordered and filtered. These commands, when used in conjunction with the previously discussed ones, provide a comprehensive toolkit for efficient file content manipulation on the command line.

Understanding these examples and experimenting with different keys and options will empower you to handle a wide range of file manipulation tasks effectively. sort and uniq do not match patterns themselves, but piped after the pattern-matching commands covered earlier they order and de-duplicate the results, forming a cohesive and powerful set for text processing and analysis.

Congratulations on completing module 1. You have learned a lot in this module and I hope you enjoyed it. Now you are ready to move on to module 2 and explore new topics. I look forward to hearing from you and wish you a happy learning experience.
