Since the interpreter is passed the name of the file containing the script, and so reads the #! line itself, it is useful for it to interpret the hash sign (#) as introducing a comment.

A short self-reproducing "program" in such systems (actually not properly a program, unless all executable files are considered one language) is

#!/bin/cat
This "script" copies itself when executed!
Store this in a file "self", then chmod +x self. Executing self will print the contents of the file.

Whilst #! appears to be something read by the Unix-like operating system rather than by Perl, Windows has no inbuilt understanding of #!. Instead, Windows Perl interpreters (such as ActiveState Perl) will, when running a Perl program with a #! line, read that line and apply any switches on it as if they had been given on the command line. For example:

#!d:\perl\bin\perl -w

will run Perl with the -w switch, which turns on warnings, telling Perl to throw a wobbly at any slightly dubious statements.

The interesting thing is that it ignores the pathname since it is already running under perl, so using:

#!/usr/bin/perl -w

gives exactly the same effects, as well as incorporating Unix compatibility.

It should be remembered, however, that Apache Web Server does parse #!'s on Windows machines in the usual way, so #!/usr/bin/perl -w won't work: you'll need the correct path.


With thanks to ariels, who pointed out that Perl always checks the parameters at the end of a shebang line... for 'various arcane reasons', and to rp, who pointed out that it's Linux, not Bash, that does the work.

Well, I would like to correct and add to the above writeups slightly.

First, a small correction to the wording of one of the posters. I believe someone implied that the entire contents of the script file get passed to the interpreter. This is not the case.

The only thing that gets passed to the interpreter is an argument list. Any of you who have done any C programming know that one of the things C programs have (provided that the host operating system supports it) is an argument list, which is basically an array of C-style strings and usually comes in as a parameter to main(). The prototype for the main() C function can look like the following:

int main (int argc, char *argv[]);

In the above prototype, the second parameter is the argument list: an array of C-style strings (char *'s).
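
For anyone who hasn't poked at an argument list before, here is a minimal sketch of walking one. The helper names count_args and dump_args are made up for this illustration; the only real assumption is the usual convention that the array ends with a null pointer.

```c
#include <assert.h>
#include <stdio.h>

/* Count the entries in a NULL-terminated argument list. */
int count_args(char *const argv[])
{
    int n = 0;

    assert(argv != NULL);
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Print each argument on its own line, in the form argv[i] = "...". */
void dump_args(FILE *out, char *const argv[])
{
    assert(out != NULL && argv != NULL);
    for (int i = 0; argv[i] != NULL; i++)
        fprintf(out, "argv[%d] = \"%s\"\n", i, argv[i]);
}
```

A main() that just calls dump_args(stdout, argv) makes a handy stand-in interpreter: put its path on a #! line and it shows exactly what it was handed.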

The interesting thing that I always thought was cool was the piggy-back nature of arguments and how they ultimately end up being tacked on to the end of each other and given to the interpreter in the #! line. It is implemented kernel-side as follows on Unix-style systems:

  1. 'Receive' an exec() system call from a user process. One of the parameters of this exec() call is the argument list for the new process, which becomes the new process's char *argv[] array.
  2. Find the file indicated by the exec() call (the pathname is a separate parameter to exec(), but by convention it is usually also the first element of the argv[] array, argv[0]) and read the first two bytes to determine the type of executable.
  3. If the first two bytes are '#' and '!', treat the file as a script and carry on with the remaining steps.
  4. Parse from after the '!' up to the first ' ' (space) or newline and take that as the name of the executable program to use as the script interpreter.
  5. Parse whatever remains on the #! line after the name of the interpreter. This becomes the leading part of a new argv[] array, custom-built by the kernel, that will be given to the interpreter. (Some systems split the remainder on spaces into separate tokens; Linux passes it along as one single argument.) The name of the script file comes next, followed by the original arguments.
  6. If said script interpreter exists, do an exec() on that binary instead, essentially going back to step 1.
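
The parsing steps above can be sketched in user-space C. This is only an illustration, not kernel code: parse_shebang is a made-up name, and the sketch assumes the Linux-style behaviour of treating everything after the interpreter name as one optional argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a "#!" line in place (the buffer is modified).
 * Returns 1 and sets *interp, plus *arg (NULL if there is none), when the
 * line starts with "#!" and names an interpreter; returns 0 otherwise.
 * Assumption: the rest of the line after the interpreter is ONE argument.
 */
int parse_shebang(char *line, char **interp, char **arg)
{
    assert(line != NULL && interp != NULL && arg != NULL);

    if (line[0] != '#' || line[1] != '!')
        return 0;                       /* not a script */

    char *p = line + 2;
    p[strcspn(p, "\n")] = '\0';         /* cut the line at the newline */

    p += strspn(p, " \t");              /* skip blanks after "#!" */
    if (*p == '\0')
        return 0;                       /* no interpreter named */
    *interp = p;

    p += strcspn(p, " \t");             /* find end of interpreter path */
    if (*p == '\0') {
        *arg = NULL;                    /* nothing follows the path */
        return 1;
    }
    *p++ = '\0';                        /* terminate the path */

    p += strspn(p, " \t");              /* skip blanks before the argument */
    size_t len = strlen(p);
    while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
        p[--len] = '\0';                /* trim trailing blanks */

    *arg = (*p != '\0') ? p : NULL;
    return 1;
}
```

Fed the line "#!/usr/bin/perl -w", this yields the interpreter "/usr/bin/perl" and the argument "-w"; fed "#!/bin/cat", it yields "/bin/cat" and no argument.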

Now, let me translate this into quasi-English:

Let's say you have a perl script called 'echo.pl' and you want to simply echo its parameters back to you so that a sample run would be:

   [me@mybox]$ ./echo.pl everything2 is a great site
    everything2 is a great site
   [me@mybox]$

Now let's say the contents of echo.pl were:

#!/usr/bin/perl -w

foreach my $arg(@ARGV) { print "$arg "; } print "\n";

What really happens behind the scenes when you execute the above script with the example argument list is that first your shell program builds an argv[] array for the exec() system call. The elements of the array are as follows:

argv[0] = "./echo.pl";
argv[1] = "everything2";
argv[2] = "is";
argv[3] = "a";
argv[4] = "great";
argv[5] = "site";
argv[6] = 0; /* null pointer termination required */

Now the kernel realizes that ./echo.pl is actually a script because it begins with #!. It then does the 'piggybacking' I described above. It takes the original argv[] array and tacks on to the beginning all the arguments that should go to your interpreter, plus the name of your script file so that finally when the perl binary is exec'd, it sees:

argv[0] = "/usr/bin/perl";
argv[1] = "-w";
argv[2] = "./echo.pl";
argv[3] = "everything2";
argv[4] = "is";
argv[5] = "a";
argv[6] = "great";
argv[7] = "site";
argv[8] = 0; /* null pointer termination required */
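
The piggybacking itself can be sketched in C as follows. Again, this is a user-space illustration rather than what any kernel literally does, and build_interp_argv is a hypothetical name. It assumes the #! line carries at most one argument, and it drops the caller's original argv[0] in favour of the script path.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Build the argument list handed to the interpreter:
 *   interp, [arg], script, then the caller's arguments after argv[0].
 * Returns a malloc'd, NULL-terminated array (the caller frees it), or NULL
 * if allocation fails. The strings are shared with the inputs, not copied.
 */
char **build_interp_argv(char *interp, char *arg, char *script,
                         char *const old_argv[])
{
    assert(interp != NULL && script != NULL && old_argv != NULL);

    size_t old_n = 0;
    while (old_argv[old_n] != NULL)
        old_n++;

    size_t extra = old_n > 0 ? old_n - 1 : 0;     /* old argv[0] is dropped */
    size_t n = 1 + (arg != NULL ? 1 : 0) + 1 + extra;

    char **out = malloc((n + 1) * sizeof *out);   /* +1 for the NULL */
    if (out == NULL)
        return NULL;

    size_t i = 0;
    out[i++] = interp;
    if (arg != NULL)
        out[i++] = arg;
    out[i++] = script;
    for (size_t j = 1; j < old_n; j++)
        out[i++] = old_argv[j];
    out[i] = NULL;                                /* required terminator */
    return out;
}
```

Called with "/usr/bin/perl", "-w", "./echo.pl" and the shell's original six-element list, it produces the eight-element list shown above.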

I think that's cool. Note also the possibility of infinite recursion. (If a script names itself as its own interpreter, the system could end up trying to execute it over and over, so a kernel has to cut such a chain off at some point.)

Linux used to do something weird: older kernels applied the #! rule only once. I am not sure whether that was a standard Unix thing or whether the Linux kernel just didn't like having too many chains of indirection before getting to a real binary interpreter. Either way, on such a kernel an interpreted script can *never* be the #! interpreter for another script. Later Linux kernels do allow the interpreter to be a script itself, but only up to a small nesting limit.

All this basically means that on a system that applies the rule only once, you couldn't put a Perl interpreter written in Perl directly on a #! line.. :(
