tr - (BSD) translate characters
Synopsis
Description
Environment Variables
Examples
See Also
Notes
/usr/ucb/tr [-cCds] [string1 [string2]]
tr copies the standard input to the standard output, substituting or deleting selected characters. Input characters found in string1 are mapped to the corresponding characters of string2. When string2 is shorter than string1, it is padded to the length of string1 by duplicating its last character. Any combination of the options -cds may be used:
-c      Complements the set of characters in string1 with respect to the universe of characters whose byte codes are 01 through 0377 octal or, with a multibyte character set, whose wide character codes start at 1.
-d      Deletes all input characters in string1.
-s      Squeezes all strings of repeated output characters that are in string2 to single characters.

The following option has been introduced by POSIX.1-2001:

-C      Complements the set of characters in string1 like -c, but orders ranges according to the collation sequence.

In either string the following character sequences are treated specially:
\octal  The character '\' followed by 1, 2 or 3 octal digits stands for the character whose byte code is given by those digits. Multibyte characters can be specified as a sequence of octal bytes.
\char   The escape sequences '\a' (bell), '\b' (backspace), '\f' (form feed), '\n' (newline), '\r' (carriage return), '\t' (horizontal tabulator), and '\v' (vertical tabulator) are supported. A '\' followed by any other character (other than an octal digit) stands for that character.
a-z     A range of characters from a to z in increasing byte order or, with a multibyte character set, in increasing wide character order. With the -C option, characters are ordered according to the collation sequence.
[:class:]
        All characters that belong to character class class in the current LC_CTYPE locale, in increasing byte order or, with a multibyte character set, in increasing wide character order. If both [:upper:] and [:lower:] appear at the same position in either string, upper-case characters are mapped to lower-case characters (and vice versa).
[=c=]   Where c is a collating symbol in the current LC_COLLATE locale, all characters that belong to the same equivalence class as c, i.e. that have the same collating weight as c.
[a*n]   n repetitions of the character a, with n read as an octal number if it starts with '0' and as a decimal number otherwise. If n is omitted or zero, it is taken to be huge (useful for padding string2 to the length of string1).
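The options and special sequences described above can be tried out with a few one-liners. This is only a sketch: TR is a hypothetical variable introduced here so that /usr/ucb/tr can be substituted where it is installed, and the commands assume that the tr found on PATH pads a short string2 and accepts [a*n] as described in this page.

```shell
# TR is a hypothetical override, not part of tr itself;
# set TR=/usr/ucb/tr to exercise the utility documented here.
TR=${TR:-tr}
LC_ALL=C; export LC_ALL

# Plain mapping: each character of string1 maps to its partner in string2.
mapped=$(printf 'hello\n' | "$TR" a-y b-z)

# string2 shorter than string1: per the text above, it is padded with 'y'.
padded=$(printf 'abcde\n' | "$TR" abcde xy)

# -d deletes every input character listed in string1.
deleted=$(printf 'hello world\n' | "$TR" -d lo)

# -c with -d deletes the complement: everything except digits.
digits=$(printf 'a1b2c3\n' | "$TR" -cd 0-9)

# -s squeezes runs of output characters that appear in string2.
squeezed=$(printf 'aabbcc\n' | "$TR" -s abc xyz)

# Character classes, an octal escape, and an open-ended [a*n] repetition.
upper=$(printf 'hello\n' | "$TR" '[:lower:]' '[:upper:]')
lines=$(printf 'a b c\n' | "$TR" ' ' '\012')
filled=$(printf 'abc\n' | "$TR" abc '[x*]')

printf '%s\n' "$mapped" "$padded" "$deleted" "$digits" \
    "$squeezed" "$upper" "$lines" "$filled"
```

The strings containing '\' or '[' are quoted so that the shell passes them to tr unchanged.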
LANG, LC_ALL
        See locale(7).
LC_COLLATE
        Affects the composition of equivalence classes.
LC_CTYPE
        Determines the mapping of bytes to characters in translation strings and input files, and the availability and composition of character classes.
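The effect of LC_CTYPE on character classes can be illustrated by forcing the C locale for a single invocation. This is a sketch under the assumption that, in the C locale, only the ASCII letters belong to [:lower:]; TR is a hypothetical variable used here to select the tr binary.

```shell
# TR is a hypothetical override; set TR=/usr/ucb/tr where available.
TR=${TR:-tr}

# 'caf\303\251' is "cafe" with an accented final letter, encoded in UTF-8.
# In the C locale each byte is its own character, and the bytes 0303 and
# 0251 are not lower-case letters, so only "caf" is expected to change.
c_locale=$(printf 'caf\303\251\n' | LC_ALL=C "$TR" '[:lower:]' '[:upper:]')

printf '%s\n' "$c_locale"
```

Setting LC_ALL on the command line affects only that one tr process and leaves the calling shell's locale untouched.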
The following example creates a list of all the words in 'file1', one per line, in 'file2', where a word is taken to be a maximal string of alphabetic characters. The second string is quoted to protect '\' from the shell. 012 is the octal ASCII code for newline.
/usr/ucb/tr -cs A-Za-z '\012' <file1 >file2
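To see how the word-list command behaves on concrete input, the same options can be applied to a short string instead of a file. This is a sketch only: the sample sentence is made up, and TR is a hypothetical variable that defaults to the tr found on PATH.

```shell
# TR is a hypothetical override; set TR=/usr/ucb/tr where available.
TR=${TR:-tr}

# -c turns every non-letter into a newline (string2 is padded to cover
# the whole complement); -s then squeezes each run of newlines to one.
words=$(printf 'Hello, world! 42 times.\n' |
    LC_ALL=C "$TR" -cs A-Za-z '\012')

printf '%s\n' "$words"
```

Digits and punctuation are not alphabetic, so "42" disappears along with the separators. Input that begins with a non-alphabetic character yields one leading empty line, because the first run of separators is also squeezed to a single newline.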
ed(1), ascii(7), locale(7)
tr does not handle ASCII NUL in string1 or string2; it always deletes NUL from input. The LC_COLLATE variable is not respected; equivalence classes consist of exactly one character, and the -c and -C options produce identical results.
Heirloom Toolchest | TR (1B) | 8/6/05 |