Python deque implementation

Python deque is a double-ended queue. You can append to both ends and pop from both ends. The complexity of those operations amortizes to constant time. We are going to look at the Python 3 internal implementation of deques. It uses a linked list of blocks of 64 pointers to objects. This reduces memory overhead... » read more

Cambridge city geospatial statistics

Using the Cambridge (Massachusetts) GIS data, we can compute some interesting geospatial statistics. Subway stations Using the address points (20844 points) and the subway stations data files, we can find out how many Cambridge addresses are in a certain radius of a Cambridge subway station. More particularly, we want to find the percentage of Cambridge... » read more

Python, Twitter statistics and the 2012 French presidential election

This post describes how Pytolab was designed to process Tweets related to the 2012 French presidential election in real-time. This post also goes over some of the statistics computed over a period of nine months. Note: I presented this project at EuroSciPy 2012: abstract. ArchitectureStatistics Architecture The posts are received from the Twitter streaming API... » read more

Twitter sentiment analysis using Python and NLTK

This post describes the implementation of sentiment analysis of tweets using Python and the natural language toolkit NLTK. The post also describes the internals of NLTK related to this implementation. Background The purpose of the implementation is to be able to automatically classify a tweet as a positive or negative tweet sentiment wise. The classifier... » read more

Python dictionary implementation

This post describes how dictionaries are implemented in the Python language. Dictionaries are indexed by keys and they can be seen as associative arrays. Let’s add 3 key/value pairs to a dictionary: The values can be accessed this way: The key ‘d’ does not exist so a KeyError exception is raised. Hash tables Python dictionaries... » read more

Python string objects implementation

This article describes how string objects are managed by Python internally and how string search is done. PyStringObject structure New string object Sharing string objects String search PyStringObject structure A string object in Python is represented internally by the structure PyStringObject. “ob_shash” is the hash of the string if calculated. “ob_sval” contains the string of... » read more

Python integer objects implementation

This article describes how integer objects are managed by Python internally. An integer object in Python is represented internally by the structure PyIntObject. Its value is an attribute of type long. To avoid allocating a new integer object each time a new integer object is needed, Python allocates a block of free unused integer objects... » read more