2.7 Ensuring System Health: A Comprehensive Guide to Verifying Resource Integrity and Availability

2.7 Ensuring System Health: A Comprehensive Guide to Verifying Resource Integrity and Availability

In today's dynamic and ever-evolving IT landscape, ensuring the integrity and availability of system resources is paramount for maintaining optimal performance. In this technical blog, we will delve into various commands used to verify the integrity and availability of essential resources on a Linux system. From disk space to CPU information, we will explore the applications of commands such as df, du, free, uptime, lscpu, and lspci, providing insights into their significance and practical usage.

#check the disk space
df -h

#check disk usage of a particular directory : s-summariezied h-human readable
du -sh dirName

#check RAM usage
free -h

#processor usage
uptime
17:24:55 up 32 min, 1 user, load average: 0.05, 0.05, 0.01
#after load avg
#0.05 - tells the load avg for last min
#0.05 - load avg for last 5 min
#0.01 - load avg for last 15 min

#check details of cpu
lscpu

#details about other hardware
lspci

#to check a filesystem for error we 1st unmount it
#to verify xfs system
sudo xfs_repair -v /dev/vdb1

#to check and repair ext4 file system (-v:verbose output, -f:forces a check, -p:preen mode)
sudo fsck.ext4 -v -f -p /dev/vdb2

#how to make sure if imp processes and programs are working correctly?
systemctl list-dependencies
#green:service running
#white:service stopped
  1. Disk Space Verification (df):

    • Command: df -h

    • Example 1: Check overall disk space usage.

    • Example 2: Display disk space for a specific file system.

    • Example 3: List disk space statistics in human-readable format.

    • Example 4: Show available and used inodes.

    • Example 5: Exclude certain file systems from the output.

  2. Disk Usage (du):

    • Command: du -h

    • Example 1: Analyze disk usage for a specific directory.

    • Example 2: Display the sizes of subdirectories within a directory.

    • Example 3: Sort and display top disk space-consuming directories.

    • Example 4: Exclude specific directories from the analysis.

    • Example 5: Provide a summary of disk space usage for multiple directories.

  3. System Memory Information (free):

    • Command: free -m

    • Example 1: Display total, used, and free memory in megabytes.

    • Example 2: Show memory usage with buffers and cache.

    • Example 3: Highlight swap usage alongside physical memory.

    • Example 4: Update memory statistics at regular intervals.

    • Example 5: Display memory information in a continuous stream.

  4. System Uptime (uptime):

    • Command: uptime

    • Example 1: Obtain the current system uptime.

    • Example 2: Display the number of logged-in users.

    • Example 3: Show system load averages over different time intervals.

    • Example 4: Customize the output format to include additional information.

    • Example 5: Check system uptime in a more human-readable format.

  5. CPU Information (lscpu):

    • Command: lscpu

    • Example 1: Retrieve detailed information about the CPU architecture.

    • Example 2: Display information about the number of CPU sockets.

    • Example 3: Show the CPU's maximum and minimum clock speeds.

    • Example 4: Highlight the CPU's thread, core, and socket information.

    • Example 5: Customize the output format to include specific details.

  6. PCI Devices Information (lspci):

    • Command: lspci

    • Example 1: List all PCI devices connected to the system.

    • Example 2: Display detailed information about a specific PCI device.

    • Example 3: Show PCI devices in a specific domain or bus.

    • Example 4: Output information in a machine-readable format.

    • Example 5: Filter and display only network-related PCI devices.

Regularly verifying the integrity and availability of system resources using commands like df, du, free, uptime, lscpu, and lspci is crucial for maintaining a healthy IT infrastructure. By incorporating these commands into your system monitoring and maintenance routines, you can proactively identify and address potential issues, ensuring the smooth operation of your systems.

Did you find this article valuable?

Support Vijay Kumar Singh by becoming a sponsor. Any amount is appreciated!