Bash idioms

In a nutshell

Bash idioms are tiny scripts, mostly one-liners, that accomplish a lot and can serve as building blocks in bigger scripts.

1.0 Find most frequent words

Suppose we have a bunch of text files and wish to find the most frequently used words in them. We can do that with the command:

cat * | tr -sc '[:graph:]' '\n' | sort | uniq -c | sort -nr

First, cat concatenates all the files and pipes the result to tr. With -c, tr takes the complement of the graphic (visible) characters, that is, whitespace and control characters, and translates each of them to a newline; -s squeezes runs of newlines into one. sort then puts the words in order, one word per line, so that duplicate words land on consecutive lines. uniq -c replaces each run of duplicates with a single line containing the count and the word. Finally, sort -nr sorts numerically in reverse order, putting the most frequent words at the top. The count is case-sensitive, so "the" and "The" are tallied separately.

For example, if the pipeline is put in an executable file named freq, with cat * replaced by cat "$@" so that it reads the files named on the command line, we can find the ten most frequent words with the command:

$ ./freq info.txt | head
    259 the
    126 and
    109 of
    106 to
     96 a
     73 in
     67 software
     63 is
     45 be
     41 The
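
The pipeline above can also be wrapped in a small function. The following is a minimal sketch, not the article's freq script: the name word_freq, the optional count argument, and the folding of uppercase to lowercase (so that "the" and "The" are counted together) are assumptions made for illustration. It also splits on non-alphanumeric characters rather than non-graphic ones, so punctuation is not counted as part of a word.

```shell
#!/usr/bin/env bash
# word_freq [N] [FILE...]: print the N most frequent words (default 10).
# Hypothetical helper; reads standard input when no FILE is given.
word_freq() {
    local n=${1:-10}
    [ "$#" -gt 0 ] && shift          # drop N, leaving only the file names
    cat -- "$@" |
        tr '[:upper:]' '[:lower:]' | # fold case so "The" and "the" match
        tr -sc '[:alnum:]' '\n' |    # one word per line
        sort | uniq -c | sort -nr | head -n "$n"
}
```

With this sketch, word_freq 10 info.txt would play the role of ./freq info.txt | head, except that the counts for differently capitalized forms are merged.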

2.0 Copy files and directories recursively

rsync is a versatile copying command. Usually, when we copy files, we want to preserve their attributes. If the source is a directory, we want it copied to the target recursively. And sometimes we wish to skip some files or directories during the copy. rsync provides all these facilities: in the examples below, -a (archive mode) recurses into directories and preserves file attributes, -v lists the files transferred, -z compresses data in transit, and --exclude skips names matching a pattern.

$ # copy file to current directory.
$ sudo rsync -avz ~/www/index.php .
sending incremental file list

sent 59 bytes  received 12 bytes  142.00 bytes/sec
total size is 529  speedup is 7.45
$
$ # copy directory recursively to current directory.
$ sudo rsync -avz ~/www/sites .
sending incremental file list
sites/
sites/example.sites.php
sites/all/
sites/all/modules/
sites/all/modules/admin_menu/
...
$
$ # copy sites to current directory recursively but skip the
$ # "all" and "default" sub-directories
$ sudo rsync -avz --exclude all --exclude default ~/www/sites .
sending incremental file list
sites/
sites/example.sites.php

sent 1,102 bytes  received 39 bytes  2,282.00 bytes/sec
total size is 2,365  speedup is 2.07
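
The exclude example above generalizes to any number of patterns. The sketch below assumes a hypothetical helper named copy_tree (not part of rsync) that turns each extra argument into an --exclude option. It drops -v and -z to keep the example quiet and local, and it omits sudo, which is only needed when the files are not readable or writable by the current user.

```shell
#!/usr/bin/env bash
# copy_tree SRC DEST [PATTERN...]: copy SRC into DEST with rsync -a,
# skipping every name that matches one of the PATTERNs.
# copy_tree is a hypothetical wrapper written for illustration.
copy_tree() {
    local src=$1 dest=$2
    shift 2
    local -a excludes=()
    local pattern
    for pattern in "$@"; do
        excludes+=(--exclude "$pattern")
    done
    rsync -a "${excludes[@]}" "$src" "$dest"
}

# Example (same effect as the last command above, minus -v and -z):
#   copy_tree ~/www/sites . all default
```

Two rsync details are worth keeping in mind when using such a wrapper: a trailing slash on the source (sites/) copies the directory's contents rather than the directory itself, and adding -n (--dry-run) shows what would be copied without copying anything.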

3.0 List files with names sorted numerically

If a version number is embedded in the file names, ls, which sorts names lexicographically by default, does not list the files in numerical order. With the -v option of GNU ls, the names are sorted by the version numbers they contain, which gives the correct order.

$ ls syslog*
syslog    syslog.10.gz  syslog.20.gz  syslog.2.gz   syslog.3.gz  syslog.5.gz  syslog.7.gz  syslog.9.gz
syslog.1  syslog.11.gz  syslog.24.gz  syslog.30.gz  syslog.4.gz  syslog.6.gz  syslog.8.gz
$ ls -v syslog*
syslog    syslog.2.gz  syslog.4.gz  syslog.6.gz  syslog.8.gz  syslog.10.gz  syslog.20.gz  syslog.30.gz
syslog.1  syslog.3.gz  syslog.5.gz  syslog.7.gz  syslog.9.gz  syslog.11.gz  syslog.24.gz

The same order is obtained by piping the ls output through sort, using . as the field separator (-t .) and sorting numerically on the second field (-n -k2,2).

$ ls syslog* | sort -t . -n -k2,2
syslog
syslog.1
syslog.2.gz
syslog.3.gz
syslog.4.gz
syslog.5.gz
syslog.6.gz
syslog.7.gz
syslog.8.gz
syslog.9.gz
syslog.10.gz
syslog.11.gz
syslog.20.gz
syslog.24.gz
syslog.30.gz

Piping ls -v through more gives the same listing, one name per line.

$ ls -v syslog* | more
syslog
syslog.1
syslog.2.gz
syslog.3.gz
syslog.4.gz
syslog.5.gz
syslog.6.gz
syslog.7.gz
syslog.8.gz
syslog.9.gz
syslog.10.gz
syslog.11.gz
syslog.20.gz
syslog.24.gz
syslog.30.gz
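
The sort-based listing can also be produced without ls by letting the shell expand the file names itself. This is a minimal sketch under stated assumptions: list_numeric is a hypothetical function name, the number is assumed to be the second dot-separated field as in the syslog examples, and file names are assumed not to contain newlines.

```shell
#!/usr/bin/env bash
# list_numeric PREFIX: print files whose names start with PREFIX,
# ordered numerically by the second dot-separated field.
# If nothing matches, the unexpanded pattern itself is printed.
list_numeric() {
    local prefix=$1
    printf '%s\n' "$prefix"* | sort -t . -k2,2n
}

# Example: list_numeric syslog
```

A name with no second field, such as plain syslog, has an empty sort key, which sort -n treats as zero, so it comes first.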

4.0 Find files based on matching patterns in contents

Consider the case where the find command gives a list of files and we wish to grep for a pattern in those files. This is easily accomplished with the xargs command, which builds and runs command lines from its standard input. For example,

$ find . -name '*.c' | xargs grep 'fread'
./alt/texttags.c:    fread( &lenght, sizeof( gsize ), 1, input );
./alt/texttags.c:    fread( data, sizeof( guint8 ), lenght, input );
./save.c:	fread( &lenght, sizeof( gsize ), 1, input );
./save.c:	fread( data, sizeof( guint8 ), lenght, input );
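
One caveat with the pipeline above is that xargs splits its input on whitespace, so a file name containing a space is passed to grep as two names. A hedged alternative is to let find run grep itself with -exec ... {} +, which passes the names intact. In this sketch, grep_c_files is a hypothetical function name, and -H (print the file name even for a single match) is an option provided by GNU and BSD grep.

```shell
#!/usr/bin/env bash
# grep_c_files PATTERN [DIR]: search all *.c files under DIR (default: .)
# for PATTERN, coping with spaces in file names.
# grep_c_files is a hypothetical helper written for illustration.
grep_c_files() {
    local pattern=$1 dir=${2:-.}
    find "$dir" -type f -name '*.c' -exec grep -H -- "$pattern" {} +
}

# Example: grep_c_files fread .
```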

5.0 List all sub-directories under a directory

ls -al lists all the files and sub-directories under a directory. But what if you want only the sub-directory listing? The answer is to pipe the ls output to grep, selecting the lines that start with d, the file-type character ls prints for a directory.

$ ls -al | grep '^d'
drwxrwxr-x  5 user1 user1 4096 Apr  1 07:11 . 
drwxr-xr-x 65 user1 user1 4096 Apr  1 07:06 ..
drwxrwxr-x  8 user1 user1 4096 Mar 31 19:43 HelloWorld
drwxrwxr-x  2 user1 user1 4096 Apr  1 07:10 new
drwxrwxr-x  2 user1 user1 4096 Apr  1 07:11 tmp
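
When only the names are needed, for example inside a script, the shell's own pattern matching can do the job without parsing ls output. The sketch below assumes a hypothetical function named list_subdirs. Unlike ls -al | grep '^d', it prints bare names, skips hidden directories as well as . and .., and includes symbolic links that point to directories.

```shell
#!/usr/bin/env bash
# list_subdirs [DIR]: print the names of the sub-directories of DIR
# (default: current directory), one per line.
# list_subdirs is a hypothetical helper written for illustration.
list_subdirs() {
    local dir=${1:-.} entry
    for entry in "$dir"/*/; do
        [ -d "$entry" ] || continue   # pattern did not match anything
        entry=${entry%/}              # drop the trailing slash
        printf '%s\n' "${entry##*/}"  # keep only the last path component
    done
}
```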

6.0 Manipulating file and path names

6.1 Remove extension from filename

${FILENAME%.*}

For example, if you have a bunch of files with extension .MOV and you wish to make the extension lowercase, i.e., .mov,

$ for FILENAME in *.MOV
> do
> mv "${FILENAME%.*}.MOV" "${FILENAME%.*}.mov"
> done
$ ls -s -l *MOV
ls: cannot access '*MOV': No such file or directory
$ ls -s -l *mov
  31544 -rwxr-xr-x 1 alice alice   32298265 Sep 24 13:43 DSC_2122.mov
 725760 -rwxr-xr-x 1 alice alice  743173020 Sep 24 13:43 DSC_2123.mov
         ...
1174020 -rwxr-xr-x 1 alice alice 1202190508 Sep 24 13:44 DSC_2142.mov

6.2 Get extension from filename

${FILENAME##*.}

6.3 Get directory name from pathname

${PATHNAME%/*}

6.4 Get filename from pathname

${PATHNAME##*/}
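
The four expansions in this section can be tried together on one sample value. The path below is a made-up example; % removes the shortest matching suffix and ## removes the longest matching prefix.

```shell
#!/usr/bin/env bash
# Hypothetical sample path, chosen only to illustrate the expansions.
PATHNAME=/home/alice/videos/DSC_2122.MOV

echo "${PATHNAME%/*}"     # directory name: /home/alice/videos
echo "${PATHNAME##*/}"    # file name:      DSC_2122.MOV

FILENAME=${PATHNAME##*/}
echo "${FILENAME%.*}"     # without extension: DSC_2122
echo "${FILENAME##*.}"    # extension only:    MOV
```

Note that a name without a dot is left unchanged by both ${FILENAME%.*} and ${FILENAME##*.}, so the "extension" of such a name is the whole name.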

7.0 Print the value of π

$ PI=$(echo "4*a(1)" | bc -l)
$ echo "$PI"
3.14159265358979323844
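
The command works because bc -l loads the math library, in which a(x) is the arctangent function, and arctan(1) equals π/4. The number of digits is controlled by bc's scale variable, which -l sets to 20 by default. A small sketch follows; pi_digits is a hypothetical function name.

```shell
#!/usr/bin/env bash
# pi_digits [N]: print pi computed by bc with N digits after the
# decimal point (default 20). pi_digits is a hypothetical helper.
pi_digits() {
    local n=${1:-20}
    echo "scale=$n; 4*a(1)" | bc -l
}

# Example: pi_digits 50
```

Because bc truncates intermediate results to scale digits, the last digit or two of the output may differ from the true value of π, as in the 20-digit result above.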
