Python Pitfalls - Chapter: Classes

Uncovering a few common misconceptions behind Python classes

1. The legend (read: myth) of __init__

Ever since I started learning and practising Python, almost every article or video I have seen about OOP in Python has, in one way or another, cemented the idea that __init__ is the constructor in Python. Even the GeeksforGeeks article reinforces this in its examples.

Prepare to be scandalised: this is not the case, and it never has been.

The __init__ method is not a constructor in Python classes; it does not create a new instance of a class. Instances in Python are created by the __new__ method. The instance that __new__ returns is then passed to __init__ as self.

__init__ is a value initializer. It just populates instance variables with their values.

The fact that __new__ is almost never seen in the classes we create, while __init__ is the first magic method to appear, explains why __init__ hogs the limelight. It is also why the thin line between __init__ and __new__ has blurred even further.
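The order of events can be sketched with a small example. Point is a hypothetical class made up for illustration, and the module-level calls list exists only to record which hook runs first.

```python
calls = []  # records the order in which the two hooks run


class Point:
    def __new__(cls, x, y):
        calls.append("__new__")
        # object.__new__ is what actually creates the instance
        instance = super().__new__(cls)
        return instance

    def __init__(self, x, y):
        calls.append("__init__")
        # self is the instance that __new__ just returned
        self.x = x
        self.y = y


p = Point(1, 2)
print(calls)     # ['__new__', '__init__']
print(p.x, p.y)  # 1 2
```

Calling Point(1, 2) runs __new__ first to create the object and only then hands that object to __init__ to be filled in.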

2. The Constructor War

Another pick from my "Ever since I started learning Python ..." list is the struggle I endured with the datetime module. I have never been able to figure out what the correct way to get a timestamp is, because there are so many.

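from datetime import datetime

print(datetime(2021, 3, 3))
> 2021-03-03 00:00:00

# fromtimestamp() converts to local time, so this output depends on the machine's time zone
print(datetime.fromtimestamp(1263383616))
> 2010-01-13 17:23:36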

print(datetime.fromordinal(734000))
> 2010-08-16 00:00:00

print(datetime.now())
> 2021-09-22 11:22:19.569373

And this is not an exhaustive list of the ways to create a timestamp with the datetime module; there are others as well. But why are there so many ways to create a datetime object? It turns out that different systems around the world represent time in different ways, for example as a proleptic Gregorian ordinal, which counts days starting from January 1 of year 1. To build an application that works with such ordinals, there must be an easy way to create a datetime from that representation, instead of having to write a multi-line function just to convert one time format into another. This is why we have fromordinal(), among many other functions.

Another question: how can there be so many ways to create a datetime object when a class can have only one __init__ method?

Python supports initializing a class from completely different sets of values, which keeps your classes extensible. This is done using the classmethod decorator.
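As a rough illustration of a few of those other ways, the sketch below builds the same moment in time through four different standard-library entry points. The variable names are made up for this example.

```python
from datetime import date, datetime, time

base = datetime(2021, 3, 3)

# Round trip through the proleptic Gregorian ordinal (day 1 is 0001-01-01)
from_ordinal = datetime.fromordinal(base.toordinal())

# Parse an ISO 8601 string (available since Python 3.7)
from_iso = datetime.fromisoformat("2021-03-03T00:00:00")

# Parse a custom layout described by a format string
from_text = datetime.strptime("03/03/2021", "%d/%m/%Y")

# Join a separate date and a separate time of day
combined = datetime.combine(date(2021, 3, 3), time(0, 0))

print(from_ordinal == from_iso == from_text == combined == base)  # True
```

Each call starts from a different representation, yet all of them hand back an ordinary datetime object.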

Let's see how it works.

Suppose we have a Service class which takes in a YAML configuration file for its initialization.

class Service:

    def __init__(self, yaml_file):
        self.yaml_file = yaml_file

    # code to parse yaml and return config

Looks good enough. If a later use case requires the service to be initialized with an XML file, should we add another parameter to __init__ and break the code for everyone using the current version? No, we should not.

Instead, we write a classmethod that accepts an XML file and returns the same kind of Service object that we get when we initialize the class with a YAML file.

class Service:

    def __init__(self, yaml_file):
        self.yaml_file = yaml_file

    @classmethod
    def from_xml(cls, xml_file):
        # a classmethod has no self; parse_xml is assumed to be a helper defined on the class (not shown)
        xml = cls.parse_xml(xml_file)
        return cls(xml)

Now we can initialize our class using both YAML and XML files.

svc = Service('config.yaml')
svc_xml = Service.from_xml('config.xml')

Here svc is a <__main__.Service object at 0x00000171F06F2F98> and svc_xml is a <__main__.Service object at 0x00000171F06F25F8>. Both are instances of Service.

Also, notice how from_xml is called directly on the Service class itself; no instance is needed to access it.

Similarly, this Service class can be extended to support other configuration file formats such as .ini or .py. How cool is that?
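Here is a minimal, self-contained sketch of that idea using only the standard library. It makes several assumptions that are not in the example above: __init__ takes an already-parsed dict instead of a YAML file, the constructors read from strings rather than file paths so the snippet runs as-is, and from_json, from_ini and LoggingService are hypothetical names.

```python
import configparser
import json


class Service:
    def __init__(self, config):
        # assumption: __init__ receives an already-parsed dict
        self.config = config

    @classmethod
    def from_json(cls, text):
        # build the dict from JSON text, then defer to __init__
        return cls(json.loads(text))

    @classmethod
    def from_ini(cls, text, section="service"):
        # build the dict from one section of INI text
        parser = configparser.ConfigParser()
        parser.read_string(text)
        return cls(dict(parser[section]))


class LoggingService(Service):
    """Hypothetical subclass; it inherits both alternative constructors."""


svc_json = Service.from_json('{"port": "8080"}')
svc_ini = LoggingService.from_ini("[service]\nport = 8080\n")

print(svc_json.config)         # {'port': '8080'}
print(svc_ini.config)          # {'port': '8080'}
print(type(svc_ini).__name__)  # LoggingService
```

Because each alternative constructor calls cls(...) rather than Service(...), a subclass gets instances of its own type without overriding anything.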

3. Python Privacy Facade

It has been well established that Python isn't about privacy. There's no concept of public, private and protected variables in Python. Everything in Python is public (and also an object, of course!).

More often than not, articles claim that prefixing a variable with a double underscore makes it private in Python.

Brace yourself for another myth buster: this is false. __name is not a private 'name' variable. And I will reiterate that Python is simply not concerned with making objects private.

We all are consenting adults in a Python world.
- Raymond Hettinger

Then what actually is __name?
__name is a class local reference variable. By that I mean the variable can be referred to as __name only inside the class it is defined in.

Doesn't that make it a private variable?
No, __name can still be accessed outside the class.

How?
Python uses name mangling: inside a class body, any name that starts with two underscores and does not end with two underscores is rewritten as _ClassName__name.

So, if we define:

class Service:
    __type = "WebService"

and try to access the __type attribute on an svc object, we would encounter an AttributeError: 'Service' object has no attribute '__type'. This is because Python has renamed __type to _Service__type, which comes to light when we do a dir(Service).

We can see this output:

['_Service__type', '__class__', '__delattr__',...]

Notice the first item in the list. _Service__type is what __type has been converted into. Now we can access the value of __type in this way:

svc = Service()
print(svc._Service__type)

The output would be WebService.
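A short sketch shows what the mangling is for: it keeps a subclass from accidentally overwriting a name its parent relies on. The describe method and the MailService subclass are hypothetical additions for illustration.

```python
class Service:
    __type = "WebService"  # stored as _Service__type

    def describe(self):
        # inside this class body, __type is rewritten to _Service__type
        return self.__type


class MailService(Service):
    __type = "MailService"  # stored as _MailService__type, so no clash


svc = MailService()
print(svc.describe())          # WebService
print(svc._MailService__type)  # MailService
print(hasattr(svc, "__type"))  # False
```

Service.describe still sees its own value even though the subclass defines a variable with the same spelling, and neither value is hidden from anyone who uses the mangled name.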

These were the three pitfalls related to Python classes that I wanted to write about. If you enjoyed reading this article, consider following this series, as I will be writing more on such Python Pitfalls.

Sources: Raymond Hettinger YouTube video