Suppose you have a Python class. For the sake of example, let's call it
Bishop:
class Bishop(ChessPiece):
    canMoveForward = canMoveSideways = False
    canMoveDiagonally = True
    reach = ChessPiece.unlimitedReach
When a programmer needs an instance, he calls the class to get one:
whitePieces.append(Bishop(ChessPiece.White))
A programmer of C++ or Java will assume the almost certain existence of a constructor function
in the class, named in accordance with the language's rules. Those languages require at least one
constructor, which the compiler will supply if the programmer does not. Java classes should always have a
constructor, because every instance is created dynamically and its attributes would otherwise have
undefined (well, "zero" actually, as appropriate to the type) values, while a C++ object may be allocated
dynamically or with static storage duration, the latter of which gives default (zero) initial values to its attributes.
Python's extreme take on dynamicness (dynamicity, dynamism, …) means that the instance
attributes are not initialized either, but that is simply because they do not exist. So a Python
class usually wants to initialize them also, if only to avoid an AttributeError
later.
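A minimal sketch of that point, written for Python 3. The Pawn class and the hasMoved attribute are hypothetical and not part of the chess example above; the sketch only shows that an attribute nobody has assigned simply is not there:

```python
class Pawn:
    """Hypothetical class with no initializer at all."""
    pass

p = Pawn()

# The attribute does not exist until something assigns it.
try:
    p.hasMoved
except AttributeError as exc:
    missing = str(exc)

# Assignment creates the attribute, but only on this one instance.
p.hasMoved = False
q = Pawn()   # a second instance still has no such attribute
```

Putting the assignment into an initializer is what guarantees that every instance starts out with the attribute.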
The use of the name constructor may predate its use in C++, though that was my
first experience with it.[1] In early versions of that language, the function actually had to
construct – i.e., allocate – the instance. This is no longer the case (and never
was with Java). Since the actual object is created by the language, Guido
chose to emphasize the difference; the name of the class initialization method, which
is the same in all classes, is __init__. Unlike the other languages I've been
mentioning, Python does not allow method overloading, so there can be only one initializer
in each class; if necessary, it can use the usual tricks of default parameter values and
*args and **kwargs to support apparently differing signatures.
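As an illustration of those tricks, here is a sketch in Python 3 of a single initializer that accepts three call styles. The Square class, its attributes, and the call styles themselves are all assumptions made up for this example:

```python
class Square:
    """Hypothetical board square accepting several call styles."""

    def __init__(self, *args, **kwargs):
        if len(args) == 1 and isinstance(args[0], str):
            # Square("e4"): algebraic notation
            name = args[0]
            self.file = ord(name[0]) - ord("a")
            self.rank = int(name[1:]) - 1
        elif len(args) == 2:
            # Square(4, 3): zero-based file and rank
            self.file, self.rank = args
        else:
            # Square(file=4, rank=3); each keyword defaults to 0
            self.file = kwargs.get("file", 0)
            self.rank = kwargs.get("rank", 0)

a = Square("e4")
b = Square(4, 3)
c = Square(file=4, rank=3)
```

All three calls end up in the same method, which has to sort out for itself which "signature" the caller meant.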
The initializer method is optional; if present, it must not return a value (returning anything
other than None raises a TypeError). The value returned to the caller must be the instance, so the
only reference you could return would be self, but that's taken care of by Python. Its call
signature looks the same as that of any other method: the first parameter is self, followed by
whatever arguments were present in the call to the class object.
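A short sketch of that rule in Python 3, using two hypothetical classes (Talkative and Quiet are names invented for the example):

```python
class Talkative:
    def __init__(self, name):
        self.name = name
        return self   # not allowed: __init__ must return None

try:
    Talkative("rook")
except TypeError as exc:
    complaint = str(exc)

class Quiet:
    def __init__(self, name):
        self.name = name   # falls off the end, so it returns None

piece = Quiet("rook")   # the call to the class still yields the instance
```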
Following Python's normal rules of name resolution, if a class does not define an __init__ method,
one will be searched for in the inheritance hierarchy (if any).
So the Bishop class, as written above, will use the __init__ from ChessPiece.
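To see the lookup at work, here is a sketch in Python 3. The body of ChessPiece is never shown above, so the one below (including the colour attribute and the values of White, Black and unlimitedReach) is purely an assumption:

```python
class ChessPiece:
    # Hypothetical base class; everything in this body is assumed.
    White, Black = "white", "black"
    unlimitedReach = None

    def __init__(self, colour):
        self.colour = colour

class Bishop(ChessPiece):
    canMoveForward = canMoveSideways = False
    canMoveDiagonally = True
    reach = ChessPiece.unlimitedReach

whitePieces = []
whitePieces.append(Bishop(ChessPiece.White))

# Bishop defines no initializer of its own; lookup finds the inherited one.
inherited = "__init__" not in Bishop.__dict__
```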
With Python version 2.2, it became possible to subclass built-in types. (Prior to that, the base class of
a class implemented in Python had to be implemented in Python itself.) Great!, I thought. I
had a need for that. I wanted to make a class to represent an IP address; the class would be a subclass
of int, but would display its values in dot-quad notation. I figured, my __init__
method would accept an int or a string parameter, convert the string to the equivalent int if necessary,
then call the __init__ method of int to set the value of the object.
This totally didn't work.
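A reconstruction of roughly what such an attempt runs into, written for Python 3. This is not the original code; the class name BrokenIP and the attribute it sets are hypothetical:

```python
class BrokenIP(int):
    """Hypothetical first attempt: conversion placed in __init__."""

    def __init__(self, value):
        # Too late: by the time this runs, int.__new__ has already been
        # handed the same argument and has fixed the integer's value.
        self.original = value

small = BrokenIP(5)   # works, but only because 5 is already an int

try:
    BrokenIP("1.2.3.4")   # int.__new__ sees the raw string first
except ValueError as exc:
    failure = str(exc)
```

The string never reaches __init__ at all: the value of an int is settled before the initializer gets a chance to run.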
As usual, Guido was way ahead of me. As he says, if that had worked, you could take an int
object in hand, call its __init__, and change its value! Go back to FORTRAN if you want to
play those games. But he didn't leave us totally twisting in the wind.
Along with the ability to inherit from builtins came the exposure of an earlier step in the object
instantiation process: the __new__ method (technically a static method that takes the class as its
first argument, though it need not be declared as one).
Unlike __init__, __new__
does return a value, being the new instance. But, since there is no concept of "allocating memory"
in Python, this is not redolent of that early C++; instead, the instance is generally created by instantiating
another type, most typically the type of the base class. That is then returned, and the instantiation machinery
later passes it to __init__ (provided it is an instance of the class being instantiated). Note that this
gets around the value modification problem mentioned above, because if you call __new__ on an existing
object, it just returns to you a new object without affecting the original one.
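The order of events can be traced with a small sketch in Python 3. The Traced class and the calls list are hypothetical names used only for this illustration:

```python
calls = []

class Traced(int):
    def __new__(cls, value):
        calls.append("__new__")
        # Build the instance by asking the base type for one.
        return int.__new__(cls, value)

    def __init__(self, value):
        # Receives the object that __new__ returned.
        calls.append("__init__")

t = Traced(7)

# Calling __new__ through an existing object makes a separate object
# and leaves the original alone; __init__ is not run for it.
u = t.__new__(Traced, 99)
```

After this runs, calls holds "__new__", "__init__", "__new__" in that order, t is still 7, and u is a different object with the value 99.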
Here is an example of how you could implement my IP address class:
class IPAddress(int):
    def __new__(cls, value):
        try:
            # Convert a dotquad string to int;
            # "" + value raises TypeError unless value is a string
            "" + value
            value = DotQuadToInt(value)
        except TypeError:
            pass
        return int.__new__(cls, value)
    def __str__(self):
        return IntToDotQuad(self)
>>> a = IPAddress(16909060)
>>> print a
1.2.3.4
>>> a = IPAddress('1.2.3.4')
>>> print a
1.2.3.4
>>> print int(a)
16909060
>>> print a + 1
16909061
>>>
Note that, since my instance of IPAddress is an int, it can be passed to int's methods; int(a) hands back
a plain integer, which is why I can now choose whether to get a dotquad representation or an integer
representation when I convert it to a string. And, since my class does not implement any arithmetic
methods, when I add 1 to my address, I get the result as a simple integer (and not as a new instance
of IPAddress).
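The example leans on two helpers, DotQuadToInt and IntToDotQuad, that are not shown. The sketch below supplies one plausible pair so the class can be run under Python 3; the helper bodies are assumptions (they do no validation of the input), and the string test uses isinstance in place of the "" + value trick:

```python
def DotQuadToInt(text):
    """Hypothetical helper: '1.2.3.4' -> 16909060."""
    result = 0
    for part in text.split("."):
        result = result * 256 + int(part)
    return result

def IntToDotQuad(number):
    """Hypothetical helper: 16909060 -> '1.2.3.4'."""
    number = int(number)   # work on a plain int
    octets = [(number >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)

class IPAddress(int):
    def __new__(cls, value):
        if isinstance(value, str):
            value = DotQuadToInt(value)
        return int.__new__(cls, value)

    def __str__(self):
        return IntToDotQuad(self)

a = IPAddress("1.2.3.4")
total = a + 1   # arithmetic is inherited from int, so this is a plain int
```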
[1] AT and ariels tell me that the term originated with Smalltalk; believable since much of OO can trace its roots back to there.