cIf nsEnglish vptUsed pdoThis, psIt vppWould vLook advLike pcThis.

But interestingly enough, many natural languages have cases, which are slightly similar markers! Of course, "part of speech" is not the same as "type"; the same part of speech is usually used for many different types in a programming language. For that matter, "cases" are rather less than full identification of the part of speech.

Hungarian notation is a naming convention for variables in source code. It dictates rules for choosing a variable name that reflects the variable's type. Specifically, each type has a prefix, which is prepended to a meaningful name in each declaration of that type. For example, an integer blat would be iBlat and a float would be fBlat. The following shows some common prefixes:

The prefixes are cumulative: more than one can be used by stacking them up at the start of the variable name. For example, lpszGreep would be a long pointer to a nul-terminated string named greep.
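As a rough illustration of the stacking, the hypothetical C declarations below build up from a plain integer to a pointer to a zero-terminated string. The variable names reuse the examples above; the helper cchGreep is invented for this sketch and uses "cch" (count of characters), another prefix seen in Windows-style code. On a modern flat-memory system the "lp" part describes nothing more than an ordinary pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, chosen only to show how prefixes stack. */
int iBlat = 42;                 /* i     : integer */
float fBlat = 1.5f;             /* f     : float */
char szGreep[] = "greep";       /* sz    : zero-terminated string */
char *lpszGreep = szGreep;      /* lp+sz : (long) pointer to a zero-terminated string */

/* cch : count of characters in the string lpszGreep points to. */
size_t cchGreep(void) {
    return strlen(lpszGreep);
}
```

Reading right to left, each prefix letter group peels off one layer of the type, which is the whole point of the stacking rule.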

The idea is that the variable, when seen outside the context of its declaration, carries type information with it. The programmer does not have to refer back to the declaration to find out what it is. In practice, this turns out to be of limited use, and most non-Microsoft coding styles steer clear of Hungarian. The style tends to make code look cluttered and unreadable. It can be confusing in situations where the type of an object is less important than the interface, such as when polymorphism is used.

A variant of Hungarian notation is sometimes used with object-oriented languages such as C++ or Java. Here, the scope of the variable, rather than the type, is represented by the prefix. Some examples are prepending "m" or "m_" to member variables, or prepending an underscore to private and protected members and methods.

This naming convention gets most of its popularity from Microsoft's influence. The variables and types one interacts with when programming for their Windows operating system all use Hungarian notation. The name comes from its inventor, Charles Simonyi, a talented programmer and Chief Architect at Microsoft. Simonyi is Hungarian, and some think the resulting names look like they are written in a foreign language.

If I were to wax polemical, I would say the fundamental problem with Hungarian notation is that it emphasizes exactly the information that is already syntactically apparent. That is, if I see

class Foo {
    int *bar;
    ...

the one thing I can say about bar is that it is a member variable of Foo and that it is a pointer to int. It is impossible to deduce anything about what the programmer intended it to be used for, which is why meaningful variable names are important. If the code is changed to read

class Foo {
    int *m_piBar;
    ...

then we still get no more information, but we have used up space that could have gone to a more clarifying name. In other words, if we happen to program in a language that forces us to write out the type of every variable, we might as well take advantage of that.

Some advocates of Hungarian notation stress that it can give us more information about a variable than the type alone. For example, char *szFoo explicitly states that Foo points to a zero-terminated string, while char *Foo might point to a single char somewhere in memory.
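The distinction the advocates have in mind can be sketched in C as follows. The function names are invented for this example, and "pch" (pointer to a single char) is assumed as the counterpart prefix to "sz". The compiler sees the same type, const char *, in both parameters, so the prefix is the only place the difference is recorded.

```c
#include <stddef.h>
#include <string.h>

/* Safe only because the sz prefix promises a terminating '\0'. */
size_t cchOf(const char *szText) {
    return strlen(szText);
}

/* pch : pointer to one char; nothing is promised about what follows it. */
char firstChar(const char *pchOne) {
    return *pchOne;
}
```

Passing the address of a lone char to cchOf would compile without complaint and then read past the end of the object, which is exactly the mistake the sz prefix is meant to make visible to a human reader.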

One way of including such extra information in a way that is closer to how C types are usually written is to use typedefs. For example, we can typedef char *string, and then use string foo instead of char *foo where appropriate. A less trivial example of this idea: when writing an interpreter for, say, Lisp or another dynamically typed language, one typically ends up with some type POBJECT representing pointers to generic Lisp objects. When then writing C code to act on Lisp data, I would use typedefs like these:

typedef POBJECT Object;
typedef POBJECT Fixnum;
typedef POBJECT String;
typedef POBJECT Cons;
/* ... */

Object car(Cons x) {
    CHECK_ISCONS(x);
    /* return the car field of the cons cell x... */
}

Using this style, one can include more information about what kind of object x points to without cluttering up every variable name.

Hungarian notation is a computer programming technique. It is a method of naming identifiers in source code. Identifier here means mostly variable names, but really refers to anything in the program that you need to name: variables, constants, functions, classes etc. A prefix is prepended to the variable name to indicate the intended type and scope of the variable.

Hungarian notation encodes type information into variable names

Hungarian notation was developed by ace computer programmer Charles Simonyi. According to one source, he started coding like this in the early 1970s while working for Xerox PARC. However, he joined Microsoft in 1981 and was a senior programmer there for "more than a decade". During this time his naming convention became better known.

Since his co-workers couldn't pronounce "Simonyi notation", they named it after his nationality, and after the way the resulting variable names look more like some foreign Eastern European language with too few vowels than like plain English. (Assuming that you are English-speaking. No offence meant if you speak plain Hungarian and find English bizarre and counterintuitive.) The name is also a joking reference to Reverse Polish notation.

The system is designed to make variable names contain more information, so that a programmer can tell at a glance that lpszName is a "long pointer to a null-terminated string" described as "name".

It is also meant to standardise and simplify the naming of local scratch variables, for which the prefix may be all the variable name that is needed (see "i" is a meaningful variable name).

In Simonyi's original paper "Program Identifier Naming Conventions" he outlines four benefits of his identifier naming convention:
1. Mnemonic value: The programmer will remember the name as it is constructed logically not whimsically or randomly.
2. Suggestive value: The name will help others understand the use of the variable on reading the code. Variable names can only help a little, but do not underestimate the importance of readable (i.e. maintainable) code.
3. Consistency: The names used throughout the program will be consistent, as they will have been produced by the same rules.
4. Speed of the decision: When naming variables, the variable name decision will be mechanical and thus quick.

As with most conventions (e.g. on which side of the road to drive), there is no one right way to do things; however, the people who must work together all benefit if they can agree on a common standard. The number of people working together can be as low as two programmers on a project, but could be all programmers working with an operating system API. Considering that Hungarian notation is used in the Windows API, this is indeed a large number of people.

This system became widely used inside Microsoft. The Hungarian naming convention is quite useful—it's one technique among many that helps programmers produce better code faster.

Perhaps the most important publication that encouraged the use of Hungarian notation was the first book read by almost every Windows programmer: Charles Petzold's Programming Windows. It used a dialect of Hungarian notation throughout and briefly described the notation in its first chapter.

Hungarian notation is rather out of favour with many programmers who have never grokked it. The prefixes are referred to as warts. To a programmer this is a fairly obvious term - the prefix is the ugly lowercase protrusion before the first capital letter in the variable name.

This distrust is strongest in non-Microsoft aligned circles. This may be due to mistrust of its originating company. This reason is, in my opinion, foolish. Microsoft wouldn't be big without doing some things right, and it is well documented that they do use their wealth to hire talented programmers such as Charles Simonyi. Don't judge ideas by ad hominem standards.

Hungarian Notation is the tactical nuclear weapon of source code obfuscation techniques; use it! Due to the sheer volume of source code contaminated by this idiom nothing can kill a maintenance engineer faster than a well planned Hungarian Notation attack.
- How to write unmaintainable code

The other reason is that the Hungarian notation system in all its detail is seen as inflexible, overly complex and unreadable. "What happens when I change my short int to a long int?" cries the C programmer. "That shouldn't necessitate a variable name change!"
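The C programmer's complaint can be made concrete with a small sketch. It assumes a dialect in which "n" marks a short and "l" marks a long; the variable and function are hypothetical. The counter started out as a short, was later widened to a long, and nobody went back to rename it.

```c
/* Hypothetical counter. It began life as "short nTotal"; the type was
   later widened to long, but every use site still spells it nTotal. */
long nTotal = 0;   /* the prefix still says short, the type says long */

/* l : long parameter, named correctly because it was written later. */
long addToTotal(long lAmount) {
    nTotal += lAmount;
    return nTotal;
}
```

The code compiles and behaves correctly, but the name now misinforms the reader. The alternatives are to rename the variable everywhere it is used or to live with a prefix that lies, which is the objection in a nutshell.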

Also, the names can become very long. For example, the name g_ppllarrPixels reads as "global pointer to a pointer to an array of LONGLONG structures containing pixels". But then again, if you really have a data structure like that, you are in fact doing something really complex in a low-level language and perhaps need to be reminded of that.

Hungarian notation can refer to systems that range between two extremes:

  • At its most specific, the system for C and C++ programs as initiated by Charles Simonyi and detailed within Microsoft.
  • At its most general, any scheme that prefixes variable names with characters representing type and scope.

My own personal Hungary

I don't favour Hungarian notation in the first meaning in my code. I can't anyway, I'm not coding in C/C++. I do favour the second, for much the reason that Hungarian notation was proposed in the first place: It makes code more readable.

I get around the C programmer's objection by not distinguishing between different integral types (or other similar types). When reading code, it doesn't often matter if the int is long or short, but it matters a lot if the variable is an int or a string. If you are changing an int variable into a string variable, that is a big enough change that you probably do want to change the variable name anyway, in order to look at everywhere it is being used. (I work in a strongly typed language.)

I try to keep it simple. A multi-page specification of multiple levels of prefixing will take a lot of time to specify and to use, and thus is not recommended unless you feel like being anal, bureaucratic and counterproductive.

I like to use two characters in the variable name prefix.

The first character represents variable scope:
P = parameter to current routine
L = local to current routine
G = global to program
M = global to the module
F = private or protected field on current object or record

The second character represents data type
B = Boolean
I = integer
S = string
F = float
C = object (class) type
P = pointer
Etc.

You can't find a prefix for every type in an object-oriented language, so cease striving for completeness, hit the common ones, make defaults (e.g. 'e' = any enumerated type) and move on to more important problems.
So at a glance, I can tell a lot about code like
if lbStatusOK then
    fcNames[liIndex] := psNewName;

It's not complex, nor inflexible. And it makes code more readable.
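The same two-character scheme carries over to C with little change. The following is a minimal sketch, not a prescribed style: the Roster struct, the setName function and the size constants are hypothetical, and the prefixes are written in lowercase as in the snippet above.

```c
#include <stdbool.h>
#include <string.h>

#define NAME_COUNT 4
#define NAME_LENGTH 32

struct Roster {
    char fsNames[NAME_COUNT][NAME_LENGTH];  /* f + s : field, strings */
};

/* pp : parameter, pointer.  pi : parameter, integer.  ps : parameter, string. */
bool setName(struct Roster *ppRoster, int piIndex, const char *psNewName) {
    /* lb : local, Boolean. True only if the index and the length both fit. */
    bool lbStatusOK = piIndex >= 0 && piIndex < NAME_COUNT
                      && strlen(psNewName) < NAME_LENGTH;
    if (lbStatusOK) {
        strcpy(ppRoster->fsNames[piIndex], psNewName);
    }
    return lbStatusOK;
}
```

Inside the function body, the first letter of each name says where the value came from and the second says roughly what it is, without any need to scroll back to a declaration.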


Sources: Microsoft website, kuro5hin, other websites via google, own experience.

IMHO pfft is dead wrong. Variable declaration and variable use could be separated by 1000 or more lines of code. Local information is very helpful.
