cat is one of the simplest UNIX commands, and one that often makes people ask 'why?' until they come upon some deeper enlightenment about the underlying nature of UNIX.
At its most basic, cat is short for 'concatenate'. When run, cat takes the files named on the command line and prints them to standard output in order. If only one file is given as an argument, it prints just that one file.
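A minimal sketch of the concatenation itself, assuming a POSIX shell with mktemp available; the file names a, b and both are made up for the illustration:

```shell
# Work in a scratch directory so nothing in the current directory is touched.
dir=$(mktemp -d)
printf 'first\n' > "$dir/a"     # hypothetical input files
printf 'second\n' > "$dir/b"

# cat writes each named file to standard output, in the order given.
# Redirecting that output is how several files become one.
cat "$dir/a" "$dir/b" > "$dir/both"
cat "$dir/both"
```

Swapping the two arguments swaps the order of the lines in the result.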
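Common to most versions of cat are a few options: -b, -e, -n, -s, -t, -u and -v. Some versions (most notably GNU's) have some others. The examples use the following file, which contains two blank lines, a bell character (Control-G, marked 'bell' in the listing), a line ending in three spaces, and a line holding a single space: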
% cat file
This is a file


It has several lines.
And some bell special characters.
This line has 3 spaces after the.

The line above has one space.
%
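To follow along, the sample file can be recreated roughly as follows. This is a sketch assuming a POSIX printf (where \a is the bell) and mktemp; the exact position of the bell within its line is a guess based on the -e output shown further down.

```shell
# Build the sample file in a scratch directory.
dir=$(mktemp -d)

# \n\n\n after the first line gives two empty lines; \a is the bell (Control-G).
# The sixth line ends in three spaces, the seventh holds a single space.
printf 'This is a file\n\n\nIt has several lines.\nAnd some \a special characters.\nThis line has 3 spaces after the.   \n \nThe line above has one space.\n' > "$dir/file"

# Eight lines in total.
wc -l < "$dir/file"
```

Running the options described below against "$dir/file" should give output along the lines of the listings in each section.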
- -b
The -b option is one of two options for numbering lines; the other is -n, described further down. If both are given, -b is the one used. It numbers only lines that have some content in them (white space and non-printable characters count as content). Empty lines are printed as they are, without a number, and are skipped in the count.
% cat -b file
1 This is a file


2 It has several lines.
3 And some special characters.
4 This line has 3 spaces after the.
5
6 The line above has one space.
%
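The precedence of -b can be checked with a small made-up file. A sketch, assuming mktemp is available and a cat that supports both options; the file name demo is hypothetical:

```shell
dir=$(mktemp -d)
# Five lines: 'one', an empty line, 'two', a line holding one space, 'three'.
printf 'one\n\ntwo\n \nthree\n' > "$dir/demo"

# -b numbers the four non-empty lines; the single-space line counts as content.
cat -b "$dir/demo"

# Adding -n changes nothing, because -b takes precedence.
with_b=$(cat -b "$dir/demo")
with_bn=$(cat -n -b "$dir/demo")
[ "$with_b" = "$with_bn" ] && echo "-b wins over -n"
```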
- -e
This option has two functions:
- print a '$' at the end of the line
- print control characters using the '^' notation
Some versions split these two functions into separate options. There, -e is shorthand for '-vE': -v shows the non-printing characters and -E places the '$' at the end of each line.
% cat -e file
This is a file$
$
$
It has several lines.$
And some ^G special characters.$
This line has 3 spaces after the. $
$
The line above has one space.$
%
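One everyday use of -e is spotting trailing white space, which is otherwise invisible. A small sketch using made-up input piped in through printf:

```shell
# The second line ends in three spaces; -e makes them visible by
# marking where each line really ends.
printf 'clean\ntrailing   \n' | cat -e

# Lines where the '$' marker is preceded by a space have trailing blanks.
printf 'clean\ntrailing   \n' | cat -e | grep ' \$$'
```

The grep pattern is a space, a literal '$' (escaped), and then the end-of-line anchor.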
- -n
Similar to the -b option above, the -n option numbers the output lines. However, in this case, all lines are numbered - including those that are blank.
% cat -n file
1 This is a file
2
3
4 It has several lines.
5 And some special characters.
6 This line has 3 spaces after the.
7
8 The line above has one space.
%
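Because -n numbers every line, the last number it prints should agree with the line count from wc -l. A sketch with a hypothetical three-line file, assuming mktemp and awk are available:

```shell
dir=$(mktemp -d)
printf 'a\n\nb\n' > "$dir/demo"   # three lines, the middle one empty

# Take the number in front of the final line of the numbered output.
last=$(cat -n "$dir/demo" | tail -n 1 | awk '{ print $1 }')

# wc -l counts newline characters; tr strips any padding some versions add.
total=$(wc -l < "$dir/demo" | tr -d ' ')

echo "last number: $last, lines: $total"
```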
- -s
The -s option is short for 'squeeze'. It replaces each run of consecutive blank lines with a single blank line. The number of blank lines that were in the run is lost (if you want to keep that information, consider the uniq program, whose -c option reports how many times each repeated line occurred).
% cat -s file
This is a file

It has several lines.
And some special characters.
This line has 3 spaces after the.

The line above has one space.
%
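The difference between squeezing and counting can be seen side by side. A sketch on made-up input, assuming a cat that supports -s; note that uniq collapses any run of identical adjacent lines, not only blank ones:

```shell
# Three empty lines between 'top' and 'bottom' collapse to one.
printf 'top\n\n\n\nbottom\n' | cat -s

# uniq -c collapses the run too, but prefixes each output line with the
# number of times it occurred, so the count of blank lines survives.
printf 'top\n\n\n\nbottom\n' | uniq -c
```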
- -t
Similar to the -e option above, the -t option prints control characters in the ^ notation. Unlike -e, it doesn't put a '$' at the end of each line, and it renders tabs as '^I', whereas -e (and -v) print a tab as a tab rather than as its control character equivalent.
% cat -t file
This is a file


It has several lines.
And some ^G special characters.
This line has 3 spaces after the.

The line above has one space.
%
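The sample file has no tabs in it, so the tab handling doesn't show up in the listing. A short sketch with a made-up tab-separated line makes the difference from -v visible:

```shell
# The same line through -v and through -t.
printf 'name\tvalue\n' | cat -v    # the tab passes through unchanged
printf 'name\tvalue\n' | cat -t    # the tab is shown as ^I
```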
- -u
The -u option is mostly something left over from days of old, though it still has an occasional use (GNU cat accepts it and ignores it). This option causes the output to be unbuffered. Normally data is collected in a buffer and sent on in chunks rather than as it is generated. That tends to suit a network better, since each chunk of data (big or small) carries the same overhead, and one big chunk is preferable to lots of small ones. The difference is only noticeable (and maybe not even then) if cat is running across a network rather than on the local system and someone is snooping the traffic.
- -v
'-v' is the option at the core of '-e' and '-t' above. Without any fuss, '-v' prints out the file with control characters shown in the ^ notation. Not mentioned above (though still part of it): high ASCII characters, those with the eighth bit set, are printed as 'M-#', where '#' is the character for the lower 7 bits (the 'M' stands for 'meta', which is the key used to get these characters on most UNIX systems).
% cat -v file
This is a file


It has several lines.
And some ^G special characters.
This line has 3 spaces after the.

The line above has one space.
%
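Both notations can be tried on single bytes. A sketch assuming a POSIX printf (for \a and octal escapes); LC_ALL=C is set here as a precaution so that cat treats the input as plain bytes regardless of the local character set:

```shell
# Byte 0x07 (bell) is a control character and comes out as ^G.
printf 'bell: \a\n' | LC_ALL=C cat -v

# Byte 0xE9 (octal 351) has the high bit set. Its lower 7 bits are 0x69,
# the letter 'i', so it comes out as M-i.
printf 'high: \351\n' | LC_ALL=C cat -v
```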
A special 'file' that may be used by many programs is '-'. Specifying '-' as a file instructs the program to read from standard input rather than from a named file on the disk. cat processes each line as it is entered. If no files are specified at all, standard input is read.
% cat -n - file
abc
1 abc
def
2 def
ghi
3 ghi
^D
4 This is a file
5
6
7 It has several lines.
8 And some special characters.
9 This line has 3 spaces after the.
10
11 The line above has one space.
%
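The same thing works in a pipeline, where '-' picks up whatever is piped in. A sketch that puts a header in front of a file; the file name body and both lines of text are made up, and mktemp is assumed to be available:

```shell
dir=$(mktemp -d)
printf 'body line\n' > "$dir/body"    # hypothetical file

# '-' stands for standard input, so the piped header comes out first,
# followed by the contents of the named file.
printf 'header line\n' | cat - "$dir/body"
```

Putting '-' after the file name instead would print the file first and the piped text last.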
The flip side of cat is what the UNIX world knows as 'cat abuse'. Many of the regulars of comp.lang.perl often pointed out cases where cat was used when it didn't need to be.
One example:
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Calculating word frequencies within files?
Date: 1997/01/18
Message-ID: <5br1i3$q5l$1@csnews.cs.colorado.edu>
references: <32DAFC78.347B@ecst.csuchico.edu>
<5bgtqe$aj42@quest.lmtas.lmco.com>
<5bopbi$jo@engnews1.Eng.Sun.COM>
content-type: text/plain; charset=ISO-8859-1
organization: Perl Consulting and Training
reply-to: tchrist@mox.perl.com (Tom Christiansen)
newsgroups: comp.lang.perl.misc
originator: tchrist@mox.perl.com
[courtesy cc of this posting sent to cited author via email]
In comp.lang.perl.misc,
falk@peregrine.eng.sun.com (Ed Falk) writes:
: cat file |
cat abuse.
: sed -e '/\./d' | # remove troff directives
: fmt -1 | # one word per line
: sort | # group words together
: uniq -c | # produce counts
: sort -rn # sort by decreasing frequency.
--tom
--
Tom Christiansen Perl Consultant, Gamer, Hiker tchrist@mox.perl.com
/* This is the one truly awful dwimmer necessary to conflate C and sed. */
--Larry Wall, from toke.c in the v5.0 perl distribution
In the above example one doesn't need to use cat. Instead of starting with
cat file |
sed -e '/\./d' |
...
the pipeline can start with sed reading the file directly:
sed -e '/\./d' file |
...
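That the two forms give the same result is easy to confirm. A sketch with a made-up two-line input, using the sed command from the post unchanged and assuming mktemp is available:

```shell
dir=$(mktemp -d)
printf '.TH DEMO 1\nplain words here\n' > "$dir/file"   # hypothetical input

via_cat=$(cat "$dir/file" | sed -e '/\./d')   # an extra process and a pipe
direct=$(sed -e '/\./d' "$dir/file")          # sed opens the file itself
redirect=$(sed -e '/\./d' < "$dir/file")      # or the shell opens it for sed

[ "$via_cat" = "$direct" ] && [ "$direct" = "$redirect" ] && echo "same output"
```

The redirect form is handy for programs that read only standard input and take no file arguments.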
The kin of cat are head, tail, and more, though countless other programs exist to print files to standard output (if one is a real glutton for punishment, dd is recommended).
In the DOS world, the closest equivalent is the type command.