tr Command in Linux

1.0 tr

The tr command is a filter that reads standard input, translates or deletes characters, and writes the result to standard output. Its syntax is:

tr [OPTION]... SET1 [SET2]

tr translates each character of SET1 found in the input into the corresponding character of SET2 and writes the resulting text to standard output. For example, to convert lowercase to uppercase and vice versa:

$ cat names
Alan Bloggs
Erika Mustermann
James Bond
Jane Doe
Jimmy Fernandes
Joe Bloggs
John Doe
John Roe
Max Mustermann
Richard Roe
Tommy Atkins
$ # Convert lowercase to uppercase
$ tr 'a-z' 'A-Z' < names
ALAN BLOGGS
ERIKA MUSTERMANN
JAMES BOND
JANE DOE
JIMMY FERNANDES
JOE BLOGGS
JOHN DOE
JOHN ROE
MAX MUSTERMANN
RICHARD ROE
TOMMY ATKINS
$ # Convert uppercase to lowercase
$ tr 'A-Z' 'a-z' < names
alan bloggs
erika mustermann
james bond
jane doe
jimmy fernandes
joe bloggs
john doe
john roe
max mustermann
richard roe
tommy atkins

Ideally, SET1 and SET2 should be the same length. If SET2 is shorter than SET1, GNU tr repeats the last character of SET2 as many times as necessary to make both the same length (POSIX leaves this case unspecified, so other implementations may differ). If SET2 is longer than SET1, the excess characters of SET2 are ignored.
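
The padding and truncation rules are easier to see on a short input. The following is a small sketch that assumes GNU tr; the variable names short_set2 and long_set2 are only for illustration.

```shell
# Assumes GNU tr; other implementations may treat a short SET2 differently.

# SET2 shorter than SET1: the last character of SET2 ('y') is repeated,
# so c, d and e all map to 'y'.
short_set2=$(printf 'abcde\n' | tr 'abcde' 'xy')
echo "$short_set2"

# SET2 longer than SET1: the extra characters ('3' and '4') are ignored.
long_set2=$(printf 'abcde\n' | tr 'ab' '1234')
echo "$long_set2"
```

With GNU tr, the first command should print xyyyy and the second 12cde.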

2.0 Specifying sets

Sets are strings of characters. Most characters in a set represent themselves. A backslash (\), however, introduces an escape sequence that stands for a special character, and some notations stand for ranges, repetitions or classes of characters.

tr - Interpreted sequences
Sequence Description
\NNN Character with octal value NNN.
\\ Backslash.
\a Bell.
\b Backspace.
\f Form feed.
\n Newline.
\r Carriage return.
\t Horizontal tab.
\v Vertical tab.
CHAR1-CHAR2 All characters from CHAR1 to CHAR2, in ascending order.
[CHAR*] In SET2, copies of CHAR until the size of SET2 equals that of SET1.
[CHAR*REPEAT] REPEAT copies of CHAR. REPEAT is considered octal if it starts with 0.
[:alnum:] Alphanumeric; letters and digits.
[:alpha:] Alphabetic: letters only.
[:blank:] Horizontal white space characters.
[:cntrl:] Control characters.
[:digit:] Digits, 0-9.
[:graph:] Printable characters, excluding white space characters.
[:lower:] All the lowercase characters.
[:print:] All the printable characters, including space.
[:punct:] All the punctuation characters.
[:space:] White space characters, horizontal and vertical.
[:upper:] All the uppercase characters.
[:xdigit:] All hexadecimal digits, 0-9, a-f and A-F.

Using the above definitions, we can rewrite the tr commands for changing case:

$ # change uppercase to lowercase
$ tr '[:upper:]' '[:lower:]' < names
alan bloggs
erika mustermann
james bond
jane doe
jimmy fernandes
joe bloggs
john doe
john roe
max mustermann
richard roe
tommy atkins
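
A few of the other notations from the table can be tried in the same way. This is a minimal sketch assuming GNU tr; the sample strings and variable names are made up for illustration.

```shell
# Octal escape: \040 is the space character; replace it with an underscore.
underscored=$(printf 'Jane Doe\n' | tr '\040' '_')
echo "$underscored"

# Backslash escape: turn each colon into a horizontal tab.
tabbed=$(printf 'a:b:c\n' | tr ':' '\t')
echo "$tabbed"

# [CHAR*] in SET2: pad SET2 with '#' up to the length of SET1,
# so every digit is replaced by '#'.
masked=$(printf 'PIN 4921\n' | tr '[:digit:]' '[#*]')
echo "$masked"
```

The sets are quoted so that the shell passes the backslashes and brackets to tr unchanged.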

3.0 Delete characters

The -d option deletes the characters specified in SET1 from the input. For example, text files on Windows end each line with CR-LF, whereas text files on Linux end each line with just LF. Converting a Windows text file for use on Linux therefore involves deleting the CR from each line. We can do this with the tr command:

$ file 404.php 
404.php: PHP script, ASCII text, with CRLF line terminators
$ tr -d '\r' < 404.php > 404-new.php
$ file 404-new.php
404-new.php: PHP script, ASCII text
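
If no CR-LF file is at hand, the effect of -d can be checked on generated input. The sketch below uses a made-up two-line sample and counts bytes before and after; the variable names are illustrative.

```shell
# Two lines with Windows-style CR-LF endings: 3 + 2 + 3 + 2 = 10 bytes.
before=$(printf 'one\r\ntwo\r\n' | wc -c)

# Deleting the two carriage returns should leave 8 bytes.
after=$(printf 'one\r\ntwo\r\n' | tr -d '\r' | wc -c)

# $((...)) strips any padding that some wc implementations add.
echo "before: $((before)) bytes, after: $((after)) bytes"

# -d accepts any set, for example a character class: delete all digits.
letters=$(printf 'a1b2c3\n' | tr -d '[:digit:]')
echo "$letters"
```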

4.0 Squeeze repeated characters

With the -s option, tr replaces each run of a repeated character listed in SET1 with a single occurrence of that character. For example, if the input has runs of spaces and blank lines, we can squeeze each run of spaces into a single space and remove the blank lines (which are just runs of newlines) with the tr command:

$ cat names
Alan            Bloggs
Erika     Mustermann

James Bond



Jane Doe
Jimmy       Fernandes
Joe Bloggs
John Doe
John Roe
Max Mustermann
Richard Roe
Tommy Atkins
$ tr -s '[:space:]' < names
Alan Bloggs
Erika Mustermann
James Bond
Jane Doe
Jimmy Fernandes
Joe Bloggs
John Doe
John Roe
Max Mustermann
Richard Roe
Tommy Atkins
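
Two details are worth trying on a smaller input: only characters listed in the set are squeezed, and -s can be combined with translation. The following sketch assumes GNU tr and uses made-up sample strings.

```shell
# Squeeze only runs of spaces and newlines; the doubled 'o' in "too"
# is left alone because 'o' is not in the set.
squeezed=$(printf 'too    many\n\n\nspaces\n' | tr -s ' \n')
echo "$squeezed"

# With two sets, tr translates first and then squeezes the characters of
# SET2, so each run of spaces becomes a single newline (one word per line).
words=$(printf 'one   two  three\n' | tr -s ' ' '\n')
echo "$words"
```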

5.0 Complement of SET1

With the -c option, we can ask tr to use the complement of SET1, that is, all the characters not in SET1. For example, if we wish to delete all the unprintable characters in a file, leaving only the printable characters (letters, digits, punctuation and space), we can execute the command:

$ sed 's/$/ /' names | tr -cd '[:print:]'
Alan Bloggs Erika Mustermann James Bond Jane Doe Jimmy Fernandes Joe Bloggs John Doe John Roe Max Mustermann Richard Roe Tommy Atkins $

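We first add a space at the end of each line of the file using sed, so that the names stay separated once the newlines are gone. Then we pipe the output to the tr command. tr deletes all the unprintable characters, including the newlines, which is why the shell prompt ($) appears on the same line as the output.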
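
The -c option combines with -d and -s in the same way for other sets. This is a small sketch assuming GNU tr; the phone number and names are made-up sample data.

```shell
# -c complements SET1 and -d deletes: remove everything except
# digits and the newline.
digits=$(printf 'Phone: +1 (555) 010-9999\n' | tr -cd '[:digit:]\n')
echo "$digits"

# -c with -s and a SET2: translate every non-alphanumeric character to a
# newline, then squeeze the newlines, giving one word per line.
tokens=$(printf 'Jane Doe, John Roe\n' | tr -cs '[:alnum:]' '\n')
echo "$tokens"
```

Including \n in the first set keeps the line ending, so the output does not run into the shell prompt.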
