Different platforms use different end-of-line characters. If you're programming in something like Perl, this information may just save your butt one day:

  • Classic Mac OS (before Mac OS X) uses a carriage return (\r, or 0x0D)
  • DOS/Windows uses a carriage return followed by a line feed (\r\n, or 0x0D 0x0A)
  • UNIX/Linux uses a line feed (\n, or 0x0A)
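
As a rough illustration of the list above, here is a minimal Python sketch that guesses which convention a chunk of bytes follows by counting each kind of terminator. The function name detect_newline_style and the returned labels are made up for this example; real files can mix conventions, so treat the answer as a hint only.

```python
def detect_newline_style(data: bytes) -> str:
    """Guess the dominant line-ending convention of a block of bytes."""
    crlf = data.count(b"\r\n")
    # Lone CRs and lone LFs: subtract the ones that are part of a CRLF pair.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    counts = {
        "DOS/Windows (CRLF)": crlf,
        "Unix (LF)": lf,
        "classic Mac OS (CR)": cr,
    }
    style, n = max(counts.items(), key=lambda kv: kv[1])
    return style if n > 0 else "no line endings found"
```

Read the file in binary mode before calling it (open(path, "rb")), since text mode may translate the line endings before you get to look at them.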

newline /n[y]oo'li:n/ n.

1. [techspeak, primarily Unix] The ASCII LF character (0001010), used under Unix as a text line terminator. Though the term `newline' appears in ASCII standards, it never caught on in the general computing world before Unix. 2. More generally, any magic character, character sequence, or operation (like Pascal's writeln procedure) required to terminate a text record or separate lines. See crlf, terpri.

--The Jargon File version 4.3.1, ed. ESR, autonoded by rescdsk.

The character representation of a newline differs between operating systems. Besides being a major source of software portability problems, this difference can leave text files from another computer not directly readable.

History of the newline

The old Westrex teletypes have a friction-fed roll of paper and a print head (like the golf ball in an electric typewriter). There are two mechanical functions connected with a new line: one is to return the print head to the beginning of the line, and the other is to advance the paper roll by a line.

It makes sense to assign different codes to these, essentially distinct, functions. The ASCII code uses \015 for "Carriage Return" and \012 for "Line Feed". In order for the teletype to do the right thing, it must receive both codes.
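
For reference, the two codes can be checked from Python. This is only a small sketch; the constant names are chosen for the example. The octal values are the \015 and \012 quoted above.

```python
CR = "\r"  # carriage return: move the print head back to column one
LF = "\n"  # line feed: advance the paper by one line

# The same two codes written in octal, hexadecimal and decimal.
assert ord(CR) == 0o15 == 0x0D == 13
assert ord(LF) == 0o12 == 0x0A == 10

# What the teletype needs to receive for a complete new line.
TELETYPE_NEWLINE = CR + LF
print(TELETYPE_NEWLINE.encode("ascii"))  # b'\r\n'
```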

If you switch one of these teletypes to local mode, it behaves like a typewriter. Any keys you type are echoed. In this mode, you have to do both a carriage return and a line feed to get a new line.

When it comes to VDUs and command windows, the same rules apply. A carriage return moves the cursor to the left hand column of the screen (window), and line feed moves it down a line, scrolling if necessary.

The Fortran programming language reserves the first 'column' (byte) of every line, to provide what is called carriage control. Normally, this is a space, which causes the line to be printed between newlines (actually, on VMS Fortran and probably others, it causes a line feed to be sent to the printer before the line, and a carriage return afterwards). By varying the carriage control byte, you can alter this behaviour.
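
A minimal Python sketch of the idea, under stated assumptions: the space case follows the VMS behaviour described above (line feed before, carriage return after). The other control characters ('0' for double spacing, '1' for a new page, '+' for overprinting) are the conventional Fortran ones and are not taken from this writeup. The function name render_record is made up for the example.

```python
def render_record(record: str) -> str:
    """Turn one carriage-control record into the characters sent to a printer."""
    control, text = record[:1], record[1:]
    prefixes = {
        " ": "\n",    # single spacing: one line feed before the text
        "0": "\n\n",  # double spacing (assumed conventional meaning)
        "1": "\f",    # form feed: start a new page (assumed conventional meaning)
        "+": "",      # no advance: overprint the previous line (assumed)
    }
    # Unknown control characters are treated like a space in this sketch.
    prefix = prefixes.get(control, "\n")
    return prefix + text + "\r"
```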

Representation in files

Most early operating systems stored text files as is, and copied them to the printer byte for byte. The operating system that departed from this rule was Unix. Here, a single character, line feed, was used to represent a new line. This means that translation needs to be done in order to print a file; the operating system adds the extra carriage returns needed for display or printing. Similarly, the Macintosh used a single carriage return.

VMS uses a different concept, as it considers lines in a text file as records, and it outputs a carriage return and line feed when it sees an end of record.

FTP and file sharing

Sharing files between computers running different operating systems highlights the problem with line termination. If you try to view a Unix text file with Notepad, you will see a continuous splurge all on one line (WordPad and TextPad are cleverer). Editing an MS-DOS file with vi shows ^M at the end of each line; these are the carriage returns.

To avoid these difficulties, ftp makes a distinction between ASCII (text) files and binary files. Text files are converted to a common representation as they are transmitted (the network form defined in RFC 959, which ends each line with a carriage return and line feed). The line termination is then converted again on arrival, into the destination platform's representation.
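
The two-step conversion can be sketched in Python as follows. This is not how any particular ftp client is written; the function names are invented for the example, and the wire form is assumed to be CR LF, as in the ASCII type of RFC 959.

```python
WIRE_EOL = b"\r\n"  # assumed network line ending (RFC 959 ASCII type)

def convert_eol(data: bytes, src: bytes, dst: bytes) -> bytes:
    """Replace every occurrence of the src line ending with dst."""
    if src == dst:
        return data
    return dst.join(data.split(src))

def send_as_text(local: bytes, local_eol: bytes) -> bytes:
    """Sender side: local representation -> common wire representation."""
    return convert_eol(local, local_eol, WIRE_EOL)

def receive_as_text(wire: bytes, local_eol: bytes) -> bytes:
    """Receiver side: wire representation -> destination representation."""
    return convert_eol(wire, WIRE_EOL, local_eol)
```

For example, receive_as_text(send_as_text(b"one\ntwo\n", b"\n"), b"\r") turns a Unix file into a classic Macintosh one. Running a binary file through the same path would corrupt it, which is why the distinction between the two transfer types matters.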

NFS has similar issues. Here, the standard is a network served Unix file system, hence any text files on it are expected to be line feed terminated. Samba is designed to look like a Windows file share, but it serves file contents byte for byte and does not add carriage returns, so Windows clients see the Unix line endings unless the files are converted separately.

The difference in binary storage of newlines in text files (<lf> versus <crlf> versus <cr>) is something of a nuisance, but well-programmed applications can cope with this in a transparent, automatic manner. A much greater difficulty arises from cultural differences in how newlines should be used.

This is how Emacs does it ("line wrapped text"):

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do<lf>
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad<lf>
minim veniam, quis nostrud exercitation ullamco laboris nisi ut<lf>
aliquip ex ea commodo consequat. Duis aute irure dolor in<lf>
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla<lf>
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in<lf>
culpa qui officia deserunt mollit anim id est laborum.<lf>
<lf>
Next paragraph, etc.

This is produced by the fill command or by auto-fill-mode.

This is how most newer applications work ("flowed text"):

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ... ... laborum.<lf>
Next paragraph, etc.

In this case the application usually word-wraps the text according to the current window size.

Emacs won't gracefully produce flowed text. Turning off auto-fill-mode reverts to letter-wrap.

 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor i*
*ncididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud*
* exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute *
*irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat null*
*a pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui off*
*icia deserunt mollit anim id est laborum.

('*' indicates an on-screen curly arrow line-wrap indicator.)

Here, words are arbitrarily broken at the end of the line, and the up and down keys move past whole paragraphs at once. Searching the web, it seems that Emacs users are very hostile to the idea of supporting flowed text, although it should be possible, given the extensible nature of Emacs. Converting line wrapped text to flowed text is sometimes possible, assuming that blank lines separate paragraphs and that there are no intentional intra-paragraph newlines (such as would be found in a list). RFC 2646 (since obsoleted by RFC 3676) describes a method for email (transmitted as line wrapped text) to be converted to flowed text.
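
The simple case can be sketched in a few lines of Python. This is not the RFC's algorithm, just the naive conversion under the assumptions stated above (blank lines between paragraphs, no intentional newlines inside a paragraph); the function name unwrap is made up for the example.

```python
import re

def unwrap(text: str) -> str:
    """Convert line wrapped text to flowed text, one line per paragraph."""
    # Paragraphs are separated by one or more blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text.strip("\n"))
    flowed = []
    for paragraph in paragraphs:
        # Join the wrapped lines of a paragraph with single spaces.
        lines = [line.strip() for line in paragraph.split("\n")]
        flowed.append(" ".join(line for line in lines if line))
    return "\n".join(flowed) + "\n"
```

A list or a block of code run through this would be mangled into a single line, which is exactly the ambiguity that makes the conversion only "sometimes possible".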

Some of the more advanced Windows text editors (such as TextPad) will produce line wrapped text. It is a simple operation to convert flowed text to line wrapped text.
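
To show how simple, here is a sketch using Python's standard textwrap module. The function name wrap_flowed is invented for the example, and the default width of 70 is an assumption chosen to match the column at which the Emacs examples on this page are filled.

```python
import textwrap

def wrap_flowed(text: str, width: int = 70) -> str:
    """Convert flowed text (one line per paragraph) to line wrapped text."""
    paragraphs = text.rstrip("\n").split("\n")
    # Fill each paragraph to the given width, breaking only at whitespace
    # unless a single word is longer than the width.
    wrapped = [textwrap.fill(paragraph, width=width) for paragraph in paragraphs]
    # Separate paragraphs with a blank line, as in the line wrapped examples.
    return "\n\n".join(wrapped) + "\n"
```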

Each style has its own advantages. Line wrapped text offers an unambiguous on-screen representation of the text file and is more suitable for use with standard Unix tools such as grep and diff. It is the style of choice for program source code. Flowed text seems more suitable when writing documents. It automatically fits any column width and is more malleable. For example, when re-editing line wrapped text in Emacs you get something like this:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do<lf>
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad<lf>
minim veniam, quis nostrud exercitation ullamco laboris nisi ut<lf>
aliquip ex ea commodo consequat. Foo bar foo bar blah blah blah. Duis aute irure *
*dolor in<lf>
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla<lf>
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in<lf>
culpa qui officia deserunt mollit anim id est laborum.<lf>
<lf>
Next paragraph, etc.

Fixing this requires the user to issue a manual 're-fill' command (M-q). There is a re-fill mode to handle this automatically, but the documentation describes it as preliminary and probably not robust.
