Python Pitfalls - Chapter: Classes
Uncovering a few common misconceptions behind Python classes
1. The legend (read: myth) of __init__
Ever since I started learning and practising Python, almost every article or video I saw about OOP in Python has in some way cemented the idea that __init__ is the constructor in Python. Even the GeeksforGeeks article reinforces this in its examples.
Prepare to be scandalised: this is not the case, and it never has been.
The __init__ method is not a constructor in Python classes; it does not create new instances of a class. Instances in Python are created by the __new__ method. The instance that __new__ returns is then passed to __init__ as self.
__init__ is an initializer. It just populates the instance's attributes with their values.
Because __new__ is almost never seen in the classes we write, and __init__ is usually the first magic method to appear, __init__ ends up hogging the limelight. That is why the thin line between __init__ and __new__ has blurred even more.
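To see the two steps for yourself, here is a minimal sketch. The Point class and the calls list are made up for illustration; the only thing it relies on is that Python calls __new__ first and then hands the result to __init__.

```python
calls = []  # records the order in which the two methods run


class Point:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # object.__new__ builds the bare instance; it has no attributes yet
        return super().__new__(cls)

    def __init__(self, x, y):
        calls.append("__init__")
        # self is the instance that __new__ just returned
        self.x = x
        self.y = y


p = Point(1, 2)
print(calls)     # ['__new__', '__init__']
print(p.x, p.y)  # 1 2

# The same two steps, done by hand
q = Point.__new__(Point)  # created, but not yet initialized
print(hasattr(q, "x"))    # False
q.__init__(3, 4)
print(q.x, q.y)           # 3 4
```

Notice that q already exists as a Point before __init__ ever runs, which is exactly why calling __init__ "the constructor" is misleading.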
2. The Constructor War
Another pick from my "Ever since I started learning Python ..." list would be the struggle I endured with the datetime module. I have never been able to figure out the one correct way to get a timestamp, because there are so many.
from datetime import datetime

print(datetime(2021, 3, 3))
> 2021-03-03 00:00:00
print(datetime.fromtimestamp(1263383616))  # local time, so the result depends on your timezone
> 2010-01-13 17:23:36
print(datetime.fromordinal(734000))
> 2010-08-16 00:00:00
print(datetime.now())
> 2021-09-22 11:22:19.569373
And this is not an exhaustive list of the ways to create a timestamp with the datetime module; there are others as well. But why are there so many ways to create a datetime object? It turns out that different systems in the world represent time in different ways. One example is the proleptic Gregorian ordinal, which counts days with January 1 of year 1 as day 1. To build an application that works with ordinals, there should be an easy way to create a timestamp from one, instead of writing a multi-line function just to convert one time format into another. This is why we have the fromordinal() function, among many others.
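As a quick sketch of what the number passed to fromordinal() means: it is a day count in which January 1 of year 1 is day 1. The variable names below are mine; the calls are all from the standard datetime module.

```python
from datetime import date, datetime

# Day 1 of the ordinal scale is January 1 of year 1
first_day = date(1, 1, 1)
print(first_day.toordinal())  # 1

# fromordinal() and toordinal() are inverses of each other
d = datetime.fromordinal(734000)
print(d)                      # 2010-08-16 00:00:00
print(d.toordinal())          # 734000

# Another alternative constructor, this time from an ISO 8601 string
parsed = datetime.fromisoformat("2010-08-16T00:00:00")
print(parsed == d)            # True
```

Each of these calls returns an ordinary datetime (or date) object; only the input format differs.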
Another question: how can there be so many ways to create a datetime object when a class can have only a single __init__ method?
Python lets a class offer additional ways of building an instance from completely different sets of values, which keeps your classes extensible. This is done with the classmethod decorator.
Let's see how it works.
Suppose we have a Service class that takes a YAML configuration file for its initialization.
class Service:
    def __init__(self, yaml_file):
        self.yaml_file = yaml_file
        # code to parse the YAML and store the config
Looks good enough. If a later use case requires the service to be initialized with an XML file, should we add another parameter to __init__ and break the code for everyone using the current version? No, we should not.
Instead, we write a classmethod that accepts an XML file and returns the same kind of Service object we get when we initialize the class with a YAML file.
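class Service:
    def __init__(self, yaml_file):
        self.yaml_file = yaml_file

    @staticmethod
    def parse_xml(xml_file):
        # placeholder: a real version would convert the XML into what __init__ expects
        return xml_file

    @classmethod
    def from_xml(cls, xml_file):
        # there is no instance yet, so go through cls rather than self
        xml = cls.parse_xml(xml_file)
        return cls(xml)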
Now we can initialize our class using both YAML and XML files.
svc = Service('config.yaml')
svc_xml = Service.from_xml('config.xml')
Here svc is a <__main__.Service object at 0x00000171F06F2F98>
and svc_xml is <__main__.Service object at 0x00000171F06F25F8>
Also, notice how from_xml is called directly on the Service class and does not need an existing object.
Similarly, this Service
class can extend support for parsing other configuration file formats like .ini or .py. How cool is that?
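Here is a runnable sketch of the same idea. It is a variation, not the Service class above: I am assuming __init__ takes an already parsed dict, and I use JSON and INI text instead of YAML and XML files only because their parsers ship with the standard library. The names from_json, from_ini and LoggingService are hypothetical.

```python
import configparser
import json


class Service:
    def __init__(self, config):
        # Assumption for this sketch: config is a plain dict of settings
        self.config = dict(config)

    @classmethod
    def from_json(cls, text):
        # Parse JSON text, then hand the dict to the normal initializer
        return cls(json.loads(text))

    @classmethod
    def from_ini(cls, text, section="service"):
        # Parse INI text and use a single section as the settings
        parser = configparser.ConfigParser()
        parser.read_string(text)
        return cls(dict(parser[section]))


class LoggingService(Service):
    pass


svc_json = Service.from_json('{"name": "web", "port": "8080"}')
svc_ini = Service.from_ini("[service]\nname = web\nport = 8080\n")
print(svc_json.config == svc_ini.config)  # True

# Because the methods build the object through cls, subclasses get them for free
sub = LoggingService.from_json('{"name": "log"}')
print(type(sub).__name__)                 # LoggingService
```

Returning cls(...) instead of Service(...) is the design choice that matters here: a subclass that calls the alternative constructor gets an instance of itself back, not of the parent.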
3. Python Privacy Facade
It has been well established that Python isn't about privacy. There's no concept of public, private and protected variables in Python. Everything in Python is public (and also an object, of course!).
Articles often claim that prefixing a variable with a double underscore makes it private in Python.
Brace yourself for another myth buster: this is false. __name is not a private 'name' variable. And I would reiterate that Python is just not concerned with making objects private.
We all are consenting adults in a Python world.
- Raymond Hettinger
Then what is __name, actually?
__name is a class local reference. By that I mean the name __name, written exactly like that, only resolves inside the body of the class it is defined in.
Doesn't that make it a private variable?
No, __name can still be accessed from outside the class.
How?
Python uses name mangling: any identifier inside a class body that starts with two underscores (and does not end with two) is renamed to _Class__objectname.
So, if we define:
class Service:
    __type = "WebService"
and try to access the __type attribute on a Service object, we get AttributeError: 'Service' object has no attribute '__type'. This is because Python has renamed __type to _Service__type, which comes to light when we do a dir(Service).
We can see this output:
['_Service__type', '__class__', '__delattr__',...]
Notice the first item in the list. _Service__type is what __type has been converted into. Now we can access the value of __type in this way:
svc = Service()
print(svc._Service__type)
The output would be WebService.
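So what is mangling good for, if not privacy? A small sketch, with a hypothetical CachedService subclass and method names of my own, suggests the answer: it keeps a class's double-underscore names from colliding with a subclass that happens to reuse the same name.

```python
class Service:
    __type = "WebService"  # stored as _Service__type

    def describe(self):
        # Inside Service, self.__type is compiled as self._Service__type
        return self.__type


class CachedService(Service):
    __type = "CachedWebService"  # stored as _CachedService__type

    def describe_child(self):
        # Inside CachedService, self.__type is self._CachedService__type
        return self.__type


svc = CachedService()
print(svc.describe())        # WebService
print(svc.describe_child())  # CachedWebService
print(svc._Service__type)    # WebService

# Outside any class body the name is not mangled, so the lookup fails
try:
    svc.__type
except AttributeError:
    print("no attribute named __type")
```

The subclass did not overwrite the parent's __type; each class keeps its own copy under its own mangled name, and both remain reachable from outside.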
These were the three pitfalls related to Python classes that I wanted to write about. If you enjoyed reading this article, consider following this series, as I will be writing a little more on such Python pitfalls.
Sources: Raymond Hettinger YouTube video