Ahem. Let me clarify:

The 'bug' I referred to shows up in C compilers that aren't prepared for Microsoft stupidity. You see, normally, when you want to continue a literal (#define, "") over a linefeed, you escape it, because a linefeed normally terminates a #define and gives a syntax error inside "". So you put a '\' at the end of the line. Now, when the compiler reads it, the bytes it sees are '\' 0x0a, that is, backslash and '\n'. Since backslash followed by some character can be made to do something interesting, backslash-linefeed was defined as "nothing" so as to make this line continuation easy.

Now, in the light of the above writeup... what do you get when you escape? Right. You get "backslash-0x0d-0x0a" (or '\' '\r' '\n'). Now the compiler will escape the \r, leaving the \n there. And since the DOS idiots just had to make linefeeds work that way, there is no way to insert another \ between \r and \n... bye-bye line continuation.
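To make the difference concrete, here is a rough sketch in C of the splicing step, not taken from any real compiler. The function name splice_lines and the tolerate_crlf switch are made up for illustration. In strict mode only backslash + LF is removed, so a backslash followed by CR LF survives untouched and the line is not continued; in tolerant mode the three-byte sequence is removed as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Copies src to dst while removing line continuations.
 *   tolerate_crlf == 0: only backslash + LF is spliced out.
 *   tolerate_crlf != 0: backslash + CR + LF is spliced out as well.
 * dst must have room for strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t splice_lines(const char *src, char *dst, int tolerate_crlf)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;                       /* Unix-style continuation */
        } else if (tolerate_crlf &&
                   src[0] == '\\' && src[1] == '\r' && src[2] == '\n') {
            src += 3;                       /* DOS-style continuation */
        } else {
            dst[out++] = *src++;            /* ordinary byte, copy it */
        }
    }
    dst[out] = '\0';
    return out;
}
```

Feeding it the five bytes 'a' '\' '\r' '\n' 'b' gives back all five bytes in strict mode and just "ab" in tolerant mode, which is the whole disagreement in miniature.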

I don't know, but I'd guess that most DOS compilers work around this in a nonstandard way that really makes a three-character escape out of '\' '\r' '\n', which surely broke something and added to the bloat, but decent Unix compilers like the Cygwin port of gcc do just what is logical: complain about a syntax error as if you had not put that '\' there.


Now, as for the '0x0d 0x0a' behaviour itself being a bug... no comments. This whole DoS thing is a boot sector virus, so why not...

This isn't really DOS's or Microsoft's fault. IBM should probably be blamed. The problem originates with the video BIOS (which, unless I'm mistaken, was written by IBM).

The standard method for writing text to the screen via the BIOS is to use interrupt 0x10, subfunction 0x0e. This is the "teletype" command: it writes a character to the screen and updates the cursor position.

This function interprets "line feed" as a line feed; it drops to the next line without changing the cursor's column. It interprets "carriage return" as carriage return; it returns the cursor to the first column of the current row.

Now, it's possible to just write directly to video memory if the OS keeps track of the cursor position itself, and it's certainly possible for the OS to send 0x0d 0x0a to the BIOS whenever 0x0a is printed, but the simple thing to do is just pass strings on to the BIOS one character at a time and let it worry about everything.

So this 'bug' isn't Microsoft's fault; they just didn't fix it.

...then by the time you get to Windows, Microsoft was already trapped by backwards compatibility.

Using CRLF as the line separator is not DOS-specific. Up until Unix, this was the standard for ASCII-using systems that treated text files as strings of bytes (as opposed to, say, record-based systems). For most ARPANET and Internet protocols, the line separator is and always has been the two-character CR-LF sequence. When Unix and C were introduced, they used a single character for newline, for simplicity.

When the IBM PC and MS-DOS were introduced in the very early 80s, C and Unix were not yet the industry-dominating forces they would later become. It only made sense to have text files compatible with the major commercial operating systems. No one was writing programs for the PC in C: it was usually assembly, BASIC, or Pascal.

When C started being ported to non-Unix OSes, the incompatibility had to be bridged somehow. Thus C compilers for DOS and other OSes have two modes for files: binary and text. In text mode, "\n" is converted to "\r\n" before being written, and "\r\n" is converted to "\n" when read; thus a single byte can represent the newline in memory, yet the OS standard can be followed on disk (and for terminal I/O).
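The translation itself is small enough to sketch in C. This is only an illustration of the idea, not how any particular runtime library does it; lf_to_crlf and crlf_to_lf are made-up names, and a real library does this inside its stdio buffering rather than on whole arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Write side of a hypothetical text mode: every '\n' becomes "\r\n".
 * dst must have room for 2 * n bytes. Returns the bytes produced. */
size_t lf_to_crlf(const char *src, size_t n, char *dst)
{
    size_t i, out = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        if (src[i] == '\n') {
            dst[out++] = '\r';
        }
        dst[out++] = src[i];
    }
    return out;
}

/* Read side: every "\r\n" pair becomes a single '\n'.
 * A '\r' that is not followed by '\n' is copied unchanged.
 * dst must have room for n bytes. Returns the bytes produced. */
size_t crlf_to_lf(const char *src, size_t n, char *dst)
{
    size_t i, out = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        if (src[i] == '\r' && i + 1 < n && src[i + 1] == '\n') {
            continue;               /* drop the CR, keep the LF */
        }
        dst[out++] = src[i];
    }
    return out;
}
```

So the four bytes "a\nb\n" in memory become six bytes on disk, and reading those six bytes back yields the original four. In standard C the untranslated mode is requested by adding "b" to the fopen mode string, for example "rb" or "wb".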

There is no sense blaming either Microsoft or IBM for the mistakes of others. They have made enough of their own.
