The internals of Python are actually pretty straightforward, but they’re still worth a dive. I recently gave a talk at Zillow about them, so I thought I’d share some points here as well.

Everything here prefixed with >>> can be typed into the Python interpreter (started by typing ‘python’ in your shell if you have Python installed). The sessions shown use Python 2 syntax. I strongly encourage playing and trying some of this stuff out yourself!

Basics #

At the core, everything in python is an object. Each object has three properties:

  • a unique identifier of the object via ‘id()’

  • a type of the object via ‘type()’

  • and its value
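To make the three properties concrete, here is a minimal sketch in Python 3 syntax. The variable names are made up for illustration; the point is only that identity, type, and value are separate things.

```python
# Every object carries an identity, a type, and a value.
value = [1, 2, 3]
alias = value            # a second name for the same object
clone = list(value)      # a different object holding an equal value

print(id(value) == id(alias))   # True: same identity
print(value is clone)           # False: distinct objects
print(value == clone)           # True: equal values
print(type(value))              # <class 'list'>
```

Two objects can compare equal while having different ids, which is why ‘is’ and ‘==’ answer different questions.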

The base object is represented by the keyword ‘object’ in python:

>>> object
<type 'object'>

And you can always find the methods available on any object (i.e. anything) using ‘dir’:

>>> dir(object)
['__class__',
 '__delattr__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

So let’s talk a little bit about the more interesting ones:

  • __class__ returns the type of an object. If the object is itself a type, it typically returns the type ‘type’

  • __doc__ is the docstring attached to an object. This is the string (usually triple-quoted) placed directly below a function, class, or module declaration.

  • __new__ is called whenever a new instance of an object is created. It is almost always followed by a call to __init__

  • __sizeof__ gets the size of the object. One can also use sys.getsizeof. This isn’t the most reliable measure because it counts only the object itself, not the objects it references.

  • __delattr__, __getattribute__, and __setattr__ are used to delete, get, and set the attributes of a particular object. However, you should use the built-in delattr, getattr, and setattr functions instead of calling these directly.
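The attributes above can be poked at with a small sketch like the following (Python 3 syntax). The ‘Greeter’ class and its ‘calls’ list are hypothetical, there only to record which hook runs first.

```python
import sys

class Greeter(object):
    """Says hello."""              # this string becomes Greeter.__doc__

    calls = []                     # records the order of the construction hooks

    def __new__(cls, *args):
        Greeter.calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, name):
        Greeter.calls.append("__init__")
        self.name = name

g = Greeter("world")
print(g.__class__ is Greeter)      # True
print(Greeter.__class__ is type)   # True
print(Greeter.__doc__)             # Says hello.
print(Greeter.calls)               # ['__new__', '__init__']

# getsizeof is shallow: a one-element list reports less than the big list it holds
outer = [list(range(10000))]
print(sys.getsizeof(outer) < sys.getsizeof(outer[0]))   # True
```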

Types #

Types are a special kind of object in Python, designed to be constructors for other objects. It’s not possible to create a new object (aside from built-in shorthand like {} for dictionaries and [] for lists) without taking a type object and instantiating something with it:

>>> object()
<object object at 0x7f1e14eee080>
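As a quick illustration in Python 3 syntax, the literal shorthand and an explicit call to the type produce the same kind of object, and the type of any existing object can be called to build another one. The names ‘number’ and ‘another’ are just placeholders.

```python
# Literal shorthand and calling the type directly give the same kind of object.
print(type({}) is dict, type([]) is list)   # True True
print(dict() == {}, list() == [])           # True True

# The type of an existing object can be called to make a new instance.
number = 42
another = type(number)("7")    # the same as int("7")
print(another)                 # 7
```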

exec, eval, and compile #

exec, eval, and compile are also built-in functions in Python. They compile and evaluate code.

‘exec’ executes a particular string of code:

>>> exec("print 'hello world'")
hello world

‘eval’ evaluates an expression. Note that it cannot take a statement, such as an assignment.

>>> eval("1")
1

‘compile’ compiles an expression or statement into a ‘code’ object, which contains the byte-compiled executable code and is what ultimately gets executed by Python.

Note that you have to choose a mode, either ‘eval’ or ‘exec’, for the string passed. The source can also be read from a file; the second argument is only the filename reported in tracebacks.

>>> compile(open('./test.py').read(), './test.py', 'exec')
>>> compile('print "hello world"', '', 'exec')
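Here is a rough sketch of how the three fit together, written in Python 3 syntax (where ‘exec’ is a function rather than a statement). The ‘<sketch>’ filename and the ‘namespace’ dictionary are placeholders of my own.

```python
namespace = {}

# 'exec' mode accepts statements; running the code fills the namespace
stmt = compile("x = 6 * 7", "<sketch>", "exec")
exec(stmt, namespace)
print(namespace["x"])          # 42

# 'eval' mode accepts a single expression and yields its value
expr = compile("x + 1", "<sketch>", "eval")
result = eval(expr, namespace)
print(result)                  # 43

# a statement is rejected when compiled in 'eval' mode
try:
    compile("y = 1", "<sketch>", "eval")
except SyntaxError:
    rejected = True
else:
    rejected = False
print(rejected)                # True

print(type(stmt).__name__)     # code
```

Compiling once and evaluating many times is the usual reason to call ‘compile’ yourself; otherwise ‘exec’ and ‘eval’ happily take the string directly.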

Functions #

Functions (or methods) consist of two objects:

  • a code object, containing the bytecode for the function

  • a globals dictionary, containing the global variables necessary

One can’t instantiate functions directly, so we have to get the type of a function first:

>>> ftype = type(lambda: None)
>>> fn = ftype(compile('print test', '', 'exec'), {'test': "hello world"})
>>> fn
<function <module> at 0x...>
>>> fn()
hello world

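So what’s actually going on here?

  • I get the type object of function. The easiest way to do this is to take the type of a lambda that returns None. Since the type of a lambda is ‘function’, it’s the quickest way to get what we need.

  • I call that type with the two pieces a function needs: a code object produced by ‘compile’, and a dictionary to use as its globals, which is where ‘test’ gets looked up.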
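The same idea can be sketched in Python 3 syntax. This version is a variation rather than a translation: instead of compiling a string, it borrows the code object of an ordinary function (the hypothetical ‘template’) and pairs it with a fresh globals dictionary. ‘types.FunctionType’ is the standard library’s name for the function type obtained above.

```python
import types

def template():
    # 'test' is not defined here; it is looked up in the function's globals
    return "hello " + test

# a code object plus a globals dictionary is enough to build a function
fn = types.FunctionType(template.__code__, {"test": "world"}, "greet")

print(types.FunctionType is type(lambda: None))   # True
print(fn.__name__)                                # greet
print(fn())                                       # hello world
```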

If you want to modify a function directly, you can! There are a large number of attributes available that you can play with.

>>> filter(lambda x: x.startswith('func'), dir(fn))
['func_closure',
 'func_code',
 'func_defaults',
 'func_dict',
 'func_doc',
 'func_globals',
 'func_name']
>>> fn.func_name
'<module>'
>>> fn.func_name = 'hello_world'
>>> fn.func_name
'hello_world'
>>> fn.func_code = compile('print "not " + test', '', 'exec')
>>> fn()
not hello world
>>> fn.func_globals['test'] = "goodbye world"
>>> fn()
not goodbye world

Classes #

Classes are basically just custom types. How can you tell? They’re made using the ‘type’ constructor!

The ‘type’ built-in can not only return the type of an object, it can create one for you too! Since ‘type’ is itself a type object, it can be used to instantiate new types.

>>> a = type('MyClassType', (), {'test': lambda self: 1 })
>>> b = a()
>>> b.test
<bound method MyClassType.<lambda> of <__main__.MyClassType object at 0x7f524b71e510>>
>>> b.test()
1

The syntax is:

type(name, parents, attributes)

  • name: the name of the new type

  • parents: a tuple of references to the parent classes

  • attributes: a dictionary mapping each attribute name of the class to its value
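To see that the three-argument call really is the class statement in disguise, here is a small side-by-side sketch in Python 3 syntax. ‘Base’, ‘Declared’, and ‘Dynamic’ are illustrative names only.

```python
class Base(object):
    def describe(self):
        return "I am " + type(self).__name__

# the familiar class statement...
class Declared(Base):
    answer = 42

# ...and a roughly equivalent call to type(name, parents, attributes)
Dynamic = type("Dynamic", (Base,), {"answer": 42})

print(Dynamic.__name__)                          # Dynamic
print(Dynamic.__bases__ == Declared.__bases__)   # True
print(Dynamic.answer)                            # 42
print(Dynamic().describe())                      # I am Dynamic
```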

Python’s objects are incredibly malleable. You can actually modify class methods directly:

>>> a.test = lambda self: "noooo!"
>>> b.test()
'noooo!'

You can also override the method on the instance directly. A function stored on an instance is not bound to it, so it takes no ‘self’:

>>> b.test = lambda: "yes!"
>>> b.test()
'yes!'
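A short sketch of the difference between the two kinds of override, in Python 3 syntax. The ‘Widget’ class is hypothetical, and ‘types.MethodType’ is used here as one way to bind a function to a single instance when you do want ‘self’.

```python
import types

class Widget(object):
    def test(self):
        return 1

first, second = Widget(), Widget()

# rebinding on the class changes the method for every instance
Widget.test = lambda self: "noooo!"
print(first.test(), second.test())     # noooo! noooo!

# binding on one instance shadows the class attribute for that instance only
first.test = types.MethodType(lambda self: "yes!", first)
print(first.test(), second.test())     # yes! noooo!

# the override lives in the instance's own dictionary
print("test" in first.__dict__, "test" in second.__dict__)   # True False
```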

So how does this work? Well, every Python object whose type isn’t a built-in (think str, int) contains a dictionary-like object with all of its attributes. This can be viewed through the ‘__dict__’ attribute of an object:

>>> class ABC:
...     pass
...
>>> abc = ABC()
>>> print abc
<__main__.ABC instance at 0x19879e0>
>>> abc.__dict__
{}

So how does Python know which attribute to use? This is actually dictated by a method! If you noticed, when I ran dir() on the object, there was an attribute ‘__getattribute__’. Roughly speaking, this method defaults to:

  • if the attribute is in the object’s own ‘__dict__’, then use that value.

  • if not, look for it on the object’s class, and then on each parent class in turn until it is found.
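The lookup order above can be watched in action with a sketch like this one (Python 3 syntax, made-up class names). It ignores descriptors, which complicate the real rules somewhat.

```python
class Parent(object):
    colour = "red"
    shape = "square"

class Child(Parent):
    shape = "circle"        # shadows Parent.shape

c = Child()
c.size = "large"            # stored on the instance itself

print(c.__dict__)           # {'size': 'large'}
print(c.size)               # large   (found in the instance's __dict__)
print(c.shape)              # circle  (found on Child)
print(c.colour)             # red     (found on Parent)

# the order in which classes are searched
print([k.__name__ for k in Child.__mro__])   # ['Child', 'Parent', 'object']
```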

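One catch with ‘__dict__’ is that it’s not always directly writable: an instance’s ‘__dict__’ is a regular dictionary, but the ‘__dict__’ of a new-style class is a read-only proxy. If you want to read or modify attributes on an object, Python provides built-in functions for this: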

  • hasattr(foo, ‘bar’) returns true if the object foo has the attribute ‘bar’

  • getattr(foo, ‘bar’) returns the attribute foo.bar

  • setattr(foo, ‘bar’, val) is equivalent to foo.bar = val
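A quick tour of these built-ins, in Python 3 syntax, using a throwaway ‘Foo’ class. The three-argument form of ‘getattr’ and the ‘delattr’ function are not mentioned above but belong to the same family.

```python
class Foo(object):
    pass

foo = Foo()

print(hasattr(foo, "bar"))                  # False
setattr(foo, "bar", 10)                     # same as foo.bar = 10
print(foo.bar, getattr(foo, "bar"))         # 10 10

# getattr accepts a fallback for attributes that do not exist
print(getattr(foo, "missing", "default"))   # default

delattr(foo, "bar")                         # same as del foo.bar
print(hasattr(foo, "bar"))                  # False
```

These are most useful when the attribute name is only known at runtime, for example when it arrives as a string.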

Back to classes and types: there are some interesting hidden features as well.

You can find all the direct superclasses of a type with ‘__bases__’ (here ‘a’ is the MyClassType type created earlier):

>>> a.__bases__
(object,)

And all of its direct subclasses with ‘__subclasses__()’:

>>> str.__subclasses__()
[<class 'apt.package.__dstr'>]

So how could I find all the classes in my scope? Since every (new-style) class ultimately derives from ‘object’, we start from its subclasses. Note that ‘__subclasses__()’ only returns direct subclasses, so finding everything means recursing into each result.

>>> object.__subclasses__()
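Since ‘__subclasses__()’ stops at one level, a small helper is needed to walk the whole tree. The function below is a sketch of my own in Python 3 syntax, not something built into Python, and the animal classes exist only to exercise it.

```python
def all_subclasses(cls):
    """Collect the direct and indirect subclasses of cls."""
    found = []
    seen = set()
    stack = list(type.__subclasses__(cls))
    while stack:
        sub = stack.pop()
        if sub in seen:
            continue                      # reachable through more than one parent
        seen.add(sub)
        found.append(sub)
        # type.__subclasses__(sub) also works when sub is 'type' itself
        stack.extend(type.__subclasses__(sub))
    return found

class Animal(object):
    pass

class Dog(Animal):
    pass

class Puppy(Dog):
    pass

direct = [c.__name__ for c in Animal.__subclasses__()]
everything = sorted(c.__name__ for c in all_subclasses(Animal))
print(direct)        # ['Dog']
print(everything)    # ['Dog', 'Puppy']
```

Calling ‘all_subclasses(object)’ lists every class currently loaded in the interpreter, which can be a long list.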

Pop Quiz: Is object a subclass of type, or vice versa?

Answer: type is a subclass of object, but not the other way around. What does go both ways is isinstance: each is an instance of the other!

>>> isinstance(object, type)
True
>>> isinstance(type, object)
True
>>> issubclass(object, type)
False
>>> issubclass(type, object)
True
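The circular relationship is easier to see by asking each one for its type and its parents directly. A short sketch in Python 3 syntax:

```python
print(type(object) is type)    # True: object is an instance of type
print(type(type) is type)      # True: type is an instance of itself

print(type.__bases__)          # (<class 'object'>,)
print(object.__bases__)        # ()   object sits at the top of the hierarchy

print(type.__mro__)            # (<class 'type'>, <class 'object'>)
```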

Frames #

Want to look at the stack frames within python? That’s possible too.

>>> import sys
>>> sys._getframe()

This will get you the current frame object, with references to the local and global variables, the calling frame, and more!
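A small sketch of what a frame exposes, in Python 3 syntax. ‘sys._getframe’ is a CPython implementation detail, so this may not work on other interpreters; the ‘inner’ and ‘outer’ functions are just scaffolding.

```python
import sys

def inner():
    local_value = "inside"
    frame = sys._getframe()                 # the frame currently executing
    return (
        frame.f_code.co_name,               # name of this function
        frame.f_back.f_code.co_name,        # name of the caller
        frame.f_locals["local_value"],      # a local variable, read via the frame
    )

def outer():
    return inner()

print(outer())    # ('inner', 'outer', 'inside')
```

Following ‘f_back’ repeatedly walks the whole call stack, which is essentially what a traceback does.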

Conclusion #

There’s a lot of interesting stuff going on under the hood of Python, way beyond the brief discussion here. The interpreted nature of Python is one that promotes exploration, so don’t hesitate! Explore the wonderful world of Python internals.