Posts Tagged ‘python’

Why Reddit uses Python

Wednesday, April 8th, 2009

During Steve Huffman’s and Alexis Ohanian’s PyCon keynote, someone asked why Reddit was moved from Lisp to Python. The reason for moving wasn’t too interesting, but why they have stayed is. Steve gave two “huge” reasons Reddit continues to use Python:

The biggest thing that has kept us on Python … well, there are two huge things. One are the libraries. There’s a library for everything. We’ve been learning a lot of these technologies and a lot of these architectures as we go. And, so, when I didn’t understand connection pools, I can just find a library until I understand it better myself and write our own. Don’t understand web frameworks, so we’ll use someone else’s until we make our own. Don’t understand a lot of stuff. And Python has an awesome crutch like that. And now, as we’ve been learning more, pulling more stuff back in house — just so we can have things the way we like them — it’s made the transition super super easy.

The other thing that keeps us on Python, and this is the major thing, is how readable and writable it is. When we hire new employees … I don’t think we’ve yet hired an employee who knew Python. I just say, “everything you write needs to be in Python.” Just so I can read it. And it’s awesome because I can see from across the room, looking at their screen, whether their code is good or bad. Because good Python code has a very obvious structure. And that makes my life so much easier. […] It’s extremely expressive, extremely readable, and extremely writable. And that just keeps life smooth.

The question is asked at around 25:54 in the video.

Flexicommunity of web frameworks

Tuesday, March 24th, 2009

The group I work with has thought about using Django a few times, but our biggest concern has been the lock-in you get with their base components. We’re already happy using SQLAlchemy and Genshi for a system application in our project. Besides liking the tools we’ve picked, we’d like to avoid using multiple tools for the same task. On the other hand, if a second set of tools provides a framework we’ll totally rock with, sign us up!

A few days ago, Zed Shaw spooged about how rad Django is. There isn’t a lot of meat to his story, but there are enticing morsels like generic views, the Pinax component list, and:

Django has been pushing the idea of having discrete “applications” that act within a “site” as cooperating but separate components.

Inspired, the first thing I looked at was the ORM. The schema we’ve inherited and need to redesign is absurdly complex[1], so I suspected this is where I’d first hit limitations.

The redesign will use natural keys when possible and reasonable to help protect against duplicate data entry[2]; the system we inherited uses only surrogate keys and has duplicate data in a variety of places. Also, we add protection (against errors, bad data, etc.) at multiple layers, so this approach fits well with our philosophy.

When choosing a natural key, sometimes more than one column is needed to guarantee uniqueness. This is called a composite primary key. Unfortunately, composite primary keys are not supported by Django[3], and their wiki page about Multi-Column Primary Key support is disheartening. The only reason they might implement this feature is to support “legacy databases (whose schema cannot be changed)”. This, plus the automatic generation of primary key “id” columns, tells me they have made a choice for us: use surrogate keys.
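To make the difference concrete, here is a minimal sketch using Python’s built-in sqlite3 module. It is not our schema: the enrollment tables, the column names, and the helper functions are all made up for illustration. It inserts the same (student, course) pair twice into a table keyed by a surrogate id and into a table keyed by a composite natural key.

```python
import sqlite3


def build_tables(conn):
    # Hypothetical schema, for illustration only.
    # Surrogate key: an auto-generated id is the only uniqueness guarantee.
    conn.execute(
        "CREATE TABLE enrollment_surrogate ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " student TEXT NOT NULL,"
        " course TEXT NOT NULL)"
    )
    # Composite natural key: the (student, course) pair itself must be unique.
    conn.execute(
        "CREATE TABLE enrollment_natural ("
        " student TEXT NOT NULL,"
        " course TEXT NOT NULL,"
        " PRIMARY KEY (student, course))"
    )


def insert_twice(conn, table):
    """Insert the same pair twice; return True if the duplicate was rejected."""
    sql = f"INSERT INTO {table} (student, course) VALUES (?, ?)"
    conn.execute(sql, ("alice", "databases"))
    try:
        conn.execute(sql, ("alice", "databases"))
    except sqlite3.IntegrityError:
        return True
    return False


def count_rows(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
build_tables(conn)
surrogate_rejected = insert_twice(conn, "enrollment_surrogate")
natural_rejected = insert_twice(conn, "enrollment_natural")
```

With the surrogate id, the database happily stores the duplicate because each row gets a fresh id. With the composite key, the second insert raises an IntegrityError, so the protection lives in the database rather than only in application code.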

Our model is now outside Django’s current scope and there is no way to leverage another ORM that can support our needs. Sadder still, the brilliance of an entire database toolkit[4] perfectly suited to work with the M in our MVC project is inaccessible.

This is the problem with frameworks like Django, Ruby on Rails, and others: they enforce their “pragmatic design”[5] on you. You must use their ORM, their template language, their request dispatcher, etc. You are sequestered within their community.

In frameworks like Pylons (and others) where you get both structure and a choice of base components, all the different communities of ORM creators, template engine writers, etc. are available and intermingling. Communication between projects is high and even the frameworks themselves are mixing![6]

Our project will be growing for years, so the longevity and flexibility of the community is critical. The last thing we want is to realize, four years from now, that we’re dependent on a suite of core components nobody is using.

  1. We actually blew the Python call stack with our relation references when using SQLAlchemy and declarative. You might be thinking that’s because the model is crap, and you’d be right. We are actually keeping a tally of how many times we exclaim “WTF!?” each day as we try to beat the beast into submission.
  2. See Josh Berkus’ Primary Keyvil series for an explanation of how natural keys can provide a database-level way of protecting against duplicate data.
  3. “Each model requires exactly one field to have primary_key=True.”
  4. SQLAlchemy is not just an ORM, it’s an expression language and more.
  5. From the Django project homepage.
  6. TurboGears 2 was rewritten on top of Pylons.