Bash idioms

Bash idioms are short scripts, mostly one-liners, that accomplish a lot and can serve as building blocks in bigger scripts.

1.0 Find most frequent words

Suppose we have a set of text files and wish to find the words used most frequently in them. The following pipeline does that:

cat * | tr -sc '[:graph:]' '\n' | sort | uniq -c | sort -nr

First, cat concatenates all the input and pipes it to tr. With -c, tr acts on the complement of the graphic (visible, non-space) characters, which is chiefly whitespace, and translates each such character into a newline; -s squeezes every run of those newlines into one. The result is one word per line. sort then orders the words, so that duplicates land on consecutive lines. uniq -c collapses each run of duplicates into a single line holding the count and the word. Finally, sort -nr sorts numerically in reverse order, putting the most frequent words at the top.

For example, suppose the pipeline is saved in an executable file named freq, with cat * changed to cat "$@" so that the script reads the files named on its command line instead of every file in the current directory. The ten most frequent words are then found with:

$ ./freq info.txt | head
    259 the
    126 and
    109 of
    106 to
     96 a
     73 in
     67 software
     63 is
     45 be
     41 The
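
The listing above counts "the" and "The" separately, and because [:graph:] includes punctuation, "software" and "software," would also be counted as different words. A minimal sketch of a variant that folds case and keeps only letters might look like this; the function name word_freq is illustrative and not part of the original script.

```shell
#!/usr/bin/env bash
# word_freq: count words case-insensitively in the files given as arguments
# (or on standard input when no files are given).
# The name word_freq is illustrative, not part of the original script.
word_freq() {
    cat -- "$@" |
        tr '[:upper:]' '[:lower:]' |  # fold case so "The" and "the" match
        tr -sc '[:alpha:]' '\n' |     # each run of non-letters becomes one newline
        sed '/^$/d' |                 # drop the empty line a leading separator creates
        sort | uniq -c | sort -nr
}
```

It is used the same way as freq, for example word_freq info.txt | head. Keeping only letters also splits words such as "don't" in two, so whether this variant is preferable depends on the text being analyzed.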

2.0 Copy files and directories recursively

rsync is arguably the most capable copying command available. When copying files, we usually want to preserve their attributes. If the source is a directory, we want it copied to the target recursively. And sometimes we wish to skip some files or directories during the copy. rsync provides all these facilities: -a (archive) recurses into directories and preserves attributes, -v lists the files transferred, -z compresses the data in transit, and --exclude skips names matching a pattern.

$ # copy file to current directory.
$ sudo rsync -avz ~/www/index.php .
sending incremental file list

sent 59 bytes  received 12 bytes  142.00 bytes/sec
total size is 529  speedup is 7.45
$
$ # copy directory recursively to current directory.
$ sudo rsync -avz ~/www/sites .
sending incremental file list
sites/
sites/example.sites.php
sites/all/
sites/all/modules/
sites/all/modules/admin_menu/
...
$
$ # copy sites to current directory recursively but skip the
$ # "all" and "default" sub-directories
$ sudo rsync -avz --exclude all --exclude default ~/www/sites .
sending incremental file list
sites/
sites/example.sites.php

sent 1,102 bytes  received 39 bytes  2,282.00 bytes/sec
total size is 2,365  speedup is 2.07
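
Before a large copy it can help to preview what rsync would transfer. The following is a sketch of a small wrapper, assuming rsync is installed; copy_tree and the DRY_RUN variable are hypothetical names chosen for this example, while --dry-run and --exclude are regular rsync options.

```shell
#!/usr/bin/env bash
# copy_tree SRC DEST [NAME...]
# Copy SRC into DEST with rsync, skipping entries called NAME.
# copy_tree and DRY_RUN are hypothetical names chosen for this sketch.
copy_tree() {
    local src=$1 dest=$2
    shift 2
    local opts=(-av)
    local name
    for name in "$@"; do
        opts+=(--exclude "$name")
    done
    # With DRY_RUN set, --dry-run makes rsync list the files without copying them.
    if [ -n "${DRY_RUN:-}" ]; then
        opts+=(--dry-run)
    fi
    rsync "${opts[@]}" "$src" "$dest"
}
```

For example, DRY_RUN=1 copy_tree ~/www/sites . all default previews the last copy shown above, and the same call without DRY_RUN=1 performs it. Note that a trailing slash on the source changes the meaning: ~/www/sites copies the directory itself, whereas ~/www/sites/ copies only its contents.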

3.0 List files with names sorted numerically

When a version number is embedded in file names, ls sorts the names as plain text, so syslog.10.gz is listed before syslog.2.gz. With the -v option (available in GNU ls), the numbers within the names are compared numerically and the files are listed in the expected order.

$ ls syslog*
syslog    syslog.10.gz  syslog.20.gz  syslog.2.gz   syslog.3.gz  syslog.5.gz  syslog.7.gz  syslog.9.gz
syslog.1  syslog.11.gz  syslog.24.gz  syslog.30.gz  syslog.4.gz  syslog.6.gz  syslog.8.gz
$ ls -v syslog*
syslog    syslog.2.gz  syslog.4.gz  syslog.6.gz  syslog.8.gz  syslog.10.gz  syslog.20.gz  syslog.30.gz
syslog.1  syslog.3.gz  syslog.5.gz  syslog.7.gz  syslog.9.gz  syslog.11.gz  syslog.24.gz

The same order is obtained by piping the ls output through sort, using . as the field separator (-t .) and sorting numerically (-n) on the second field (-k2,2).

$ ls syslog* | sort -t . -n -k2,2
syslog
syslog.1
syslog.2.gz
syslog.3.gz
syslog.4.gz
syslog.5.gz
syslog.6.gz
syslog.7.gz
syslog.8.gz
syslog.9.gz
syslog.10.gz
syslog.11.gz
syslog.20.gz
syslog.24.gz
syslog.30.gz

For comparison, here is the ls -v output again, this time one name per line because it is piped to another command:

$ ls -v syslog* | more
syslog
syslog.1
syslog.2.gz
syslog.3.gz
syslog.4.gz
syslog.5.gz
syslog.6.gz
syslog.7.gz
syslog.8.gz
syslog.9.gz
syslog.10.gz
syslog.11.gz
syslog.20.gz
syslog.24.gz
syslog.30.gz
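
A third option, assuming GNU coreutils, is the -V (version sort) option of sort, which needs no field separator or key. The sketch below wraps it in a function; the name version_sort is illustrative.

```shell
#!/usr/bin/env bash
# version_sort: read names on standard input, one per line, and print them
# in version order. Assumes GNU sort, which provides -V (--version-sort).
# The name version_sort is illustrative.
version_sort() {
    sort -V
}

# Usage: printf emits one name per line, which also avoids parsing ls output.
#   printf '%s\n' syslog* | version_sort
```

Unlike the -t . -k2,2 approach, this does not depend on the number being in a particular field, so it also handles names such as app-1.10.tar.gz.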

4.0 Find files based on matching patterns in contents

Suppose the find command produces a list of files and we wish to grep for a pattern in those files. This is easily accomplished with the xargs command, which builds command lines from its standard input. For example,

$ find . -name '*.c' | xargs grep 'fread'
./alt/texttags.c:    fread( &lenght, sizeof( gsize ), 1, input );
./alt/texttags.c:    fread( data, sizeof( guint8 ), lenght, input );
./save.c:	fread( &lenght, sizeof( gsize ), 1, input );
./save.c:	fread( data, sizeof( guint8 ), lenght, input );
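
The plain find | xargs pipeline splits its input on whitespace, so it breaks on file names that contain spaces. A more robust sketch, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do), is shown below; search_sources is an illustrative name.

```shell
#!/usr/bin/env bash
# search_sources DIR PATTERN: grep for PATTERN in every *.c file under DIR.
# search_sources is an illustrative name chosen for this sketch.
search_sources() {
    local dir=$1 pattern=$2
    # -print0 and -0 separate file names with NUL bytes, so spaces and
    # newlines inside names are handled correctly.
    # -H prints the file name even when only one file is searched.
    find "$dir" -name '*.c' -print0 | xargs -0 grep -H -- "$pattern"
}
```

For example, search_sources . fread performs the same search as the command above. An alternative that needs no xargs at all is find . -name '*.c' -exec grep -H 'fread' {} +.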

5.0 List all sub-directories under a directory

ls -al lists all the files and sub-directories in a directory. To list only the sub-directories, pipe the ls output to grep and select the lines starting with d, the file-type character for directories in the long listing. Note that the result includes the . and .. entries.

$ ls -al | grep '^d'
drwxrwxr-x  5 user1 user1 4096 Apr  1 07:11 . 
drwxr-xr-x 65 user1 user1 4096 Apr  1 07:06 ..
drwxrwxr-x  8 user1 user1 4096 Mar 31 19:43 HelloWorld
drwxrwxr-x  2 user1 user1 4096 Apr  1 07:10 new
drwxrwxr-x  2 user1 user1 4096 Apr  1 07:11 tmp
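
When only the names are needed, for instance inside a script, a glob ending in / avoids parsing the ls output. The following is a minimal sketch; list_subdirs is an illustrative name. It skips hidden directories, omits . and .., and also matches symbolic links that point to directories.

```shell
#!/usr/bin/env bash
# list_subdirs [DIR]: print the names of the sub-directories of DIR
# (default: the current directory), one per line.
# list_subdirs is an illustrative name chosen for this sketch.
list_subdirs() {
    local dir=${1:-.}
    local entry
    for entry in "$dir"/*/; do
        [ -d "$entry" ] || continue    # skip the unexpanded pattern when nothing matches
        entry=${entry%/}               # strip the trailing slash
        printf '%s\n' "${entry##*/}"   # keep only the last path component
    done
}
```

If hidden directories matter, find . -maxdepth 1 -type d is another option, assuming a find that supports -maxdepth.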

6.0 Manipulating file and path names

6.1 Remove extension from filename

${FILENAME%.*}

6.2 Get extension from filename

${FILENAME##*.}

6.3 Get directory name from pathname

${PATHNAME%/*}

6.4 Get filename from pathname

${PATHNAME##*/}
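
In these expansions, % removes the shortest suffix matching the pattern and ## removes the longest prefix matching it. The example below applies all four to a single path; the path and the variable names DIR, STEM and EXT are made up for illustration.

```shell
#!/usr/bin/env bash
# The path below is hypothetical; any pathname works the same way.
PATHNAME=/home/user1/project/archive.tar.gz

DIR=${PATHNAME%/*}        # /home/user1/project  (shortest suffix "/*" removed)
FILENAME=${PATHNAME##*/}  # archive.tar.gz       (longest prefix "*/" removed)
STEM=${FILENAME%.*}       # archive.tar          (only the last extension is removed)
EXT=${FILENAME##*.}       # gz                   (everything up to the last dot is removed)

printf '%s\n' "$DIR" "$FILENAME" "$STEM" "$EXT"
```

Two edge cases are worth keeping in mind: if FILENAME contains no dot, ${FILENAME##*.} returns the whole name rather than an empty string, and if PATHNAME contains no slash, ${PATHNAME%/*} returns PATHNAME unchanged.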

7.0 Print the value of π

The -l option loads the bc math library, in which a(x) is the arctangent function. Since arctan(1) = π/4, multiplying the result by 4 gives π.

$ PI=$(echo "4*a(1)" | bc -l)
$ echo "$PI"
3.14159265358979323844

8.0 See also