This post describes how Pytolab was designed to process tweets related to the 2012 French presidential election in real time. It also goes over some of the statistics computed over a period of 9 months. Note: I presented this project at EuroSciPy 2012: abstract. Architecture Statistics Architecture The posts are received from the Twitter streaming [...]
This post describes the implementation of sentiment analysis of tweets using Python and the Natural Language Toolkit (NLTK). The post also describes the internals of NLTK related to this implementation. Background The purpose of the implementation is to be able to automatically classify a tweet as positive or negative in sentiment. The classifier [...]
This article describes how string objects are managed by Python internally and how string search is done. PyStringObject structure New string object Sharing string objects String search PyStringObject structure A string object in Python is represented internally by the structure PyStringObject. “ob_shash” is the hash of the string if calculated. “ob_sval” contains the string of [...]
This article describes how integer objects are managed by Python internally. An integer object in Python is represented internally by the structure PyIntObject. Its value is an attribute of type long. To avoid allocating a new integer object each time one is needed, Python allocates a block of free unused integer objects [...]
We are going to talk about the toolkit pycrypto and how it can help us speed up development when cryptography is involved. Hash functions Encryption algorithms Public-key algorithms Hash functions A hash function takes a string and produces a fixed-length string based on the input. The output string is called the hash value. Ideal hash [...]
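As a minimal sketch of the fixed-length property described in the excerpt above, the following uses the standard library's hashlib rather than pycrypto, so that it runs without extra packages. The helper name `sha256_hex` is an assumption made for illustration and does not come from the post.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 hash value of a string as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Inputs of different lengths produce hash values of the same length.
short_digest = sha256_hex("abc")
long_digest = sha256_hex("abc" * 1000)
print(len(short_digest), len(long_digest))

# A small change in the input gives a different hash value.
print(sha256_hex("abc") != sha256_hex("abd"))
```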
This post describes how lists are implemented in the Python language. Lists in Python are powerful and it is interesting to see how they are implemented internally. Following is a simple Python script appending some integers to a list and printing them. As you can see, lists are iterable. List object C structure A list [...]
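The script mentioned in the excerpt above is not included in it. A short sketch of what such a script could look like follows, with the integer values chosen arbitrarily for illustration.

```python
# Append some integers to a list (the values are illustrative).
numbers = []
for value in (1, 2, 3):
    numbers.append(value)

# Lists are iterable, so a for loop can walk over the elements in order.
for item in numbers:
    print(item)
```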
This post describes how to solve mazes using two algorithms implemented in Python: a simple recursive algorithm and the A* search algorithm. Maze The maze we are going to use in this article is 6 cells by 6 cells. The walls are colored in blue. The starting cell is at the bottom left (x=0 and [...]
This post describes how dictionaries are implemented in the Python language. Dictionaries are indexed by keys and they can be seen as associative arrays. Let’s add 3 key/value pairs to a dictionary: The values can be accessed this way: The key ‘d’ does not exist so a KeyError exception is raised. Hash tables Python dictionaries [...]
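The code the excerpt above refers to is not part of it. The sketch below shows the same three steps under assumed keys and values ('a', 'b', 'c' mapped to 1, 2, 3). The helper `lookup` is hypothetical and only exists to show the KeyError being raised and caught.

```python
# Add 3 key/value pairs to a dictionary (keys and values are assumptions).
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3


def lookup(mapping, key):
    """Return the value stored under key, or None when the key is missing."""
    try:
        return mapping[key]
    except KeyError:
        # Indexing with a key that is not present raises KeyError.
        return None


print(lookup(d, 'a'))
print(lookup(d, 'd'))
```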
This article describes the Python threading synchronization mechanisms in detail. We are going to study the following types: Lock, RLock, Semaphore, Condition, Event and Queue. Also, we are going to look at the Python internals behind those mechanisms. The source code of the programs below can be found at github.com/laurentluce/python-tutorials under threads/. First, let’s look [...]
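As a small illustration of the first type in that list, here is a sketch of a Lock protecting a shared counter. It is not taken from the repository mentioned above. The function name `count_with_lock` and its parameters are assumptions made for this example.

```python
import threading


def count_with_lock(num_threads=4, increments=10000):
    """Increment a shared counter from several threads, guarded by a Lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time can hold the lock, so the
            # read-modify-write on counter is never interleaved.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


print(count_with_lock())
```

Using the lock as a context manager releases it even if the guarded block raises, which avoids a forgotten release() call.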
This article describes the internals of launching an instance in OpenStack Nova. Overview Launching a new instance involves multiple components inside OpenStack Nova: API server: handles requests from the user and relays them to the cloud controller. Cloud controller: handles the communication between the compute nodes, the networking controllers, the API server and the scheduler. [...]