tr - translate characters
Synopsis
Description
Environment Variables
Examples
See Also
Notes
tr [-cCds] [string1 [string2]]
Tr copies the standard input to the standard output with substitution or deletion of selected characters. Input characters found in string1 are mapped into the corresponding characters of string2. When string2 is shorter than string1, no mapping occurs beyond its last character. Any combination of the options -cds may be used:
-c
    complements the set of characters in string1 with respect to the universe of characters whose byte codes are 01 through 0377 octal, or with a multibyte character set, whose wide character codes start at 1.
-d
    deletes all input characters in string1.
-s
    squeezes all strings of repeated output characters that are in string2 to single characters.

The following option has been introduced by POSIX.1-2001:

-C
    complements the set of characters in string1 like -c, but orders ranges according to the collation sequence.

In either string the following character sequences are treated specially:
\octal
    The character '\' followed by 1, 2 or 3 octal digits stands for the character whose byte code is given by those digits. Multibyte characters can be specified as a sequence of octal bytes.
\char
    The escape sequences '\a' (bell), '\b' (backspace), '\f' (form feed), '\n' (newline), '\r' (carriage return), '\t' (horizontal tabulator), and '\v' (vertical tabulator) are supported. A '\' followed by any other character (other than an octal digit) stands for that character.
[a-z]
    (/usr/5bin/tr, /usr/5bin/s42/tr) means a range of characters from a to z in increasing byte order, or with a multibyte character set, in increasing wide character order. With the -C option, characters are ordered according to the collation sequence.
a-z
    (/usr/5bin/posix/tr, /usr/5bin/posix2001/tr) means a range of characters from a to z in increasing byte order, or with a multibyte character set, in increasing wide character order. With the -C option, characters are ordered according to the collation sequence.
[:class:]
    means all characters that belong to character class class in the current LC_CTYPE locale in increasing byte order, or with a multibyte character set, in increasing wide character order. With the -C option, characters are ordered according to the collation sequence. If both [:upper:] and [:lower:] appear at the same position in either string, upper-case characters are mapped to lower-case characters (and vice versa).
[=c=]
    where c is a collating symbol in the current LC_COLLATE locale, means all characters that belong to the same equivalence class as c, i.e. have the same collating weight as c.
[a*n]
    means n repetitions of the character a, with n as an octal number if it starts with '0' and as a decimal number otherwise. If n is omitted or zero, it is taken to be huge (useful for padding string2 to the length of string1).
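The options and special sequences described above can be tried out on short inputs. The following is a minimal sketch, not part of the original page: it assumes that the tr found on PATH accepts the '[:class:]' and '[a*]' notations, and the sample inputs and variable names are purely illustrative.

```shell
# Map characters position by position: 'l' -> '0', 'o' -> '1'.
mapped=$(printf 'hello world\n' | tr 'lo' '01')

# -d: delete every 'l' and 'o' from the input.
deleted=$(printf 'hello world\n' | tr -d 'lo')

# -s: translate spaces to '_' and squeeze each run of '_' to one.
squeezed=$(printf 'a  b   c\n' | tr -s ' ' '_')

# -c with -d: delete everything that is NOT a digit (newline included).
digits=$(printf 'tel: 555-0100\n' | tr -cd '0123456789')

# [:lower:] and [:upper:] at the same position: map lower case to upper case.
upper=$(printf 'Hello\n' | tr '[:lower:]' '[:upper:]')

# [z*] pads string2 with 'z' up to the length of string1.
padded=$(printf 'abcde\n' | tr 'abcde' 'xy[z*]')

# \octal: \011 is a tab, \040 is a space.
untabbed=$(printf 'a\tb\n' | tr '\011' '\040')

printf '%s\n' "$mapped" "$deleted" "$squeezed" "$digits" "$upper" "$padded" "$untabbed"
```

The strings are single-quoted throughout so that the Shell passes '\' and '[' to tr unchanged.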
LANG, LC_ALL
    See locale(7).
LC_COLLATE
    Affects the composition of equivalence classes.
LC_CTYPE
    Determines the mapping of bytes to characters in translation strings and input files, and the availability and composition of character classes.
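Since these variables change what a character class contains, a script that wants plain ASCII behaviour can set the locale for a single invocation. The following is a sketch under the assumption that the "C" locale is available; the variable name result is illustrative.

```shell
# Force the "C" locale for this one command so that [:lower:] and [:upper:]
# cover only the ASCII letters, whatever the caller's environment says.
result=$(printf 'locale test\n' | LC_ALL=C tr '[:lower:]' '[:upper:]')
printf '%s\n' "$result"
```

Setting LC_ALL on the command line affects only that tr process and leaves the rest of the script in the caller's locale.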
The following examples create a list of all the words in 'file1' one per line in 'file2', where a word is taken to be a maximal string of alphabetics. The strings are quoted to protect '\' and '[' from the Shell. 012 is the octal ASCII code for newline.
    /usr/5bin/tr -cs '[A-Z][a-z]' '[\012*]' <file1 >file2
    /usr/5bin/s42/tr -cs '[A-Z][a-z]' '[\012*]' <file1 >file2
    /usr/5bin/posix/tr -cs A-Za-z '[\012*]' <file1 >file2
    /usr/5bin/posix2001/tr -cs A-Za-z '[\012*]' <file1 >file2
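To see what the word list looks like, the POSIX-style form can be run on a short piece of text. This sketch assumes that the tr on PATH accepts unbracketed ranges such as A-Za-z; the sample sentence stands in for 'file1' and is not taken from the original page.

```shell
# Every character that is not a letter is turned into a newline (-c),
# and each run of newlines is squeezed to a single one (-s).
words=$(printf 'Hello, world! Hello again.\n' | tr -cs A-Za-z '[\012*]')
printf '%s\n' "$words"
```

Each of the four words ends up on a line of its own; punctuation and blanks between them collapse into one newline.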
ed(1), ascii(7), locale(7)
/usr/5bin/tr and /usr/5bin/s42/tr do not handle ASCII NUL in string1 or string2; they always delete NUL from input.

The LC_COLLATE variable is not respected; equivalence classes consist of exactly one character, and the -c and -C options produce identical results.
Portable programs must prefix the '[' and '-' characters with a backslash and cannot use them otherwise, unless a '[a-z]' sequence appears at exactly the same position in both operand strings. The '\octal' construct cannot be used portably preceding or following the '-' range metacharacter. The '\char' escapes for control characters are not present in old implementations; use octal escapes instead.
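The two portable idioms mentioned above might look as follows. This is a sketch only, with illustrative inputs and variable names.

```shell
# '[a-z]' and '[A-Z]' sit at the same position in both strings, so an
# implementation that takes '[' and ']' literally maps them to themselves.
shout=$(printf 'portable\n' | tr '[a-z]' '[A-Z]')

# Octal escapes instead of '\n': \012 is newline, \040 is space.
# The input has no trailing newline, so no trailing space is produced.
joined=$(printf 'one\ntwo\nthree' | tr '\012' '\040')

printf '%s\n' "$shout" "$joined"
```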
Heirloom Toolchest | TR (1) | 8/6/05 |