This tutorial shows how to download REST resources to your database using Python and SQLAchemy. We are going to download manufacturers and cars resources to our database. We will also show how easy it is to query them once they are stored.
Requirements
We need to install SQLAchemy and simplejson Python libraries:
easy_install SQLAlchemy easy_install simplejson
You can use any database supported by SQLAlchemy: List of databases. I am using MySQL.
Imports
We need to import the following modules for our code to work fine.
import urllib2 import simplejson from sqlalchemy import Column, Integer, String, ForeignKey, create_engine from sqlalchemy.orm import sessionmaker, relationship, backref from sqlalchemy.ext.declarative import declarative_base
Tables, classes and mappers
SQLAchemy declarative configuration style allows us to define our tables, resources classes and the mapping between the class and the tables just by declaring 2 classes derived from the class Base.
Base = declarative_base()
Let’s define the manufacturers table, class and mapper.
class Manufacturer(Base): """ Manufacturer resource """ __tablename__ = 'manufacturers' id = Column(Integer, primary_key=True) name = Column(String(32), unique=True) country = Column(String(32)) def __init__(self, resource): """ Class instantiation @param resource resource JSON attribute """ self.name = resource['name'] self.country = resource['country'] def __repr__(self): return "<manufacturer('%s in %s')>" % (self.name, self.country)
Let’s define the cars table, class and mapper. Note how the relationship between the car and its manufacturer is defined.
class Car(Base): """ Car resource """ __tablename__ = 'cars' id = Column(Integer, primary_key=True) name = Column(String(32), unique=True) max_speed = Column(Integer) year_released = Column(Integer) manufacturer_id = Column(Integer, ForeignKey('manufacturers.id')) manufacturer = relationship(Manufacturer, backref=backref('cars', order_by=id)) def __init__(self, resource): """ Class instantiation @param resource resource JSON attribute """ self.name = resource['name'] self.max_speed = resource['max_speed'] self.year_released = resource['year_released'] def __repr__(self): return "<car('%s - max speed: %d, released in %s')>" % (self.name, self.max_speed, self.year_released)
Our main class is named “Resources” and we will add methods to setup a connection with the database, create the tables and download the resources.
class Resources: """ Main class """ def __init__(self): """ Class instantiation @param settings dict of settings """ # database engine self.engine = None # Collection of tables and their associated schema constructs self.metadata = None # database session self.session = None self.setup_connection() self.setup_tables() self.process() def setup_connection(self): """ Create DB session """ ... def setup_tables(self): """ Create tables in database. """ ... def process(self): """ Get resources and add them to DB """ ...
Let’s look at how we can setup a connection to our database. We call create_engine() and pass the database dialect and connection arguments. In our case, we use the string: ‘mysql://root:password@localhost:3306/vroom’ where vroom is the name of the database. This allows us to define our session using the Session factory. This session is the ORM’s handle to the database.
def setup_connection(self): """ Create DB session """ s = 'mysql://root:password@localhost:3306/vroom' self.engine = create_engine(s) Session = sessionmaker(bind=self.engine) self.session = Session()
Next is the tables creation using our tables definition. create_all() can be called multiple times safely and only missing tables will be created.
def setup_tables(self): """ Create tables in database. """ self.metadata = Base.metadata self.metadata.create_all(self.engine)
Using simplejson and urllib2, it is really easy to get resources and retrieve them as a dictionary. Let’s get the list of manufacturers.
def process(self): ... data = simplejson.load(urllib2.urlopen("http://api_url/manufacturers"))
data will contain something like this:
{"manufacturers": [ {"name": "Peugeot", "country": "France"}, {"name": "Chevrolet", "country": "USA"}, ] }
For each manufacturer, we need to get the list of cars so we can add the resources to the database. SQLAchemy will take care of setting up the foreign keys values for us: ie: relations between the manufacturers and the cars.
for i, m in enumerate(data['manufacturers']): # create Manufacturer object manufacturer = Manufacturer(m) # get cars for this manufacturer data2 = simplejson.load(urllib2.urlopen("http://api_url/manufacturers/%s" % manufacturer['name'])) # fill-up list of Car objects cars = [] for c in data2['cars'][i]: # create Car object and add it to cars list cars.append(Car(c)) # add cars list to manufacturer's cars list manufacturer.cars = cars # add manufacturer object to DB self.session.add(manufacturer) # commit changes to DB self.session.commit()
Once the objects are in the database, it is easy to query them using the ORM. For example, let’s print the list of manufacturers along with their cars.
for m in self.session.query(Manufacturer).order_by(Manufacturer.id): print m for c in m.cars: print '-- %s' % c
The output should be something like this:
<manufacturer('Peugeot in France')> -- <car('307 - max speed: 180, released in 2002')> <manufacturer('Chevrolet in USA')> -- <car('Malibu - max speed: 190, released in 2010')>
That’s it for now. Please add a comment if you have any feedback.