Musings of an Anonymous Geek

Made with only the finest 1's and 0's

PyTPMOTW: PsycoPG2

Posted on April 21, 2010 by bkjones

What is this module for?

Interacting with a PostgreSQL database in Python.

What is PostgreSQL?

PostgreSQL is an open source relational database product. It has some more advanced features, like built-in networking-related and GIS-related datatypes, the ability to script stored functions in multiple languages (including Python), etc. If you have never heard of PostgreSQL, get out from under your rock!

Making Contact

Using the psycopg2 module to connect to a PostgreSQL database couldn’t be simpler. You can call the module’s connect() function, passing in either the individual arguments required to make contact (dbname, user, etc.) as keyword arguments, or one long “DSN” string, like this:

import psycopg2
import psycopg2.extensions

dsn = "host=localhost port=6000 dbname=testdb user=jonesy"
conn = psycopg2.connect(dsn)
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

The DSN value is a space-delimited collection of key=value pairs, which I construct before passing it to the psycopg2.connect() function. Once we have a connection object, the very first thing I do is set the connection’s isolation level to ‘autocommit’, so that INSERT and UPDATE statements are committed automatically without my having to call conn.commit() after each one. There are several isolation levels defined in the psycopg2.extensions module, and they’re defined in ‘extensions’ because they go beyond what is defined in the DB API 2.0 spec that is typically used as a reference in creating Python database modules.
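If the connection settings live in a config file or a dictionary, the DSN string can be assembled rather than typed by hand. Here’s a minimal sketch of one way to do that. The build_dsn() helper is hypothetical (it is not part of psycopg2), and the quoting follows the libpq rule that empty values or values containing spaces are wrapped in single quotes, with backslashes and single quotes escaped by a backslash:

```python
def build_dsn(params):
    """Build a libpq-style "key=value key=value" string from a mapping.

    Hypothetical helper for illustration; not part of psycopg2.
    """
    parts = []
    for key, value in params.items():
        text = str(value)
        # Quote empty values and values containing spaces, quotes or
        # backslashes; escape backslashes first, then single quotes.
        if text == "" or any(ch in text for ch in " '\\"):
            text = "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"
        parts.append("%s=%s" % (key, text))
    return " ".join(parts)


dsn = build_dsn({"host": "localhost", "port": 6000,
                 "dbname": "testdb", "user": "jonesy"})
print(dsn)  # host=localhost port=6000 dbname=testdb user=jonesy
```

The result can be handed to psycopg2.connect(dsn). If you would rather skip the string entirely, the same settings can be passed as keyword arguments, e.g. psycopg2.connect(host="localhost", port=6000, dbname="testdb", user="jonesy").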

Simple Queries and Type Conversion

In order to get anything out of the database, we have to know how to talk to it. Of course this means writing some SQL, but it also means sending query arguments in a format understood by the database. I’m happy to report that psycopg2 does a pretty good job of making things “just work” when it comes to converting your input into PostgreSQL types, and converting the output directly into Python types for easy manipulation in your code. That said, understanding how to properly use these features can be a bit confusing at first, so let me address the source of a lot of early confusion right away:

cur = conn.cursor()
# Query arguments must be passed as a sequence, hence the one-element tuple
cur.execute("""SELECT id, fname, lname, balance FROM accounts WHERE balance > %s""", (min_balance,))

Chances are, min_balance is an integer, but we’re using ‘%s’ anyway. Why? Because this isn’t really you telling Python to do a string formatting operation, it’s you telling psycopg2 to convert the incoming data using the default psycopg2 method, which converts integers into the PostgreSQL INT type. So, you can use “%s” in the ‘execute()’ method to properly convert integers, strings, dates, datetimes, timedeltas, lists, tuples and most other native Python types to a corresponding PostgreSQL type. There are adapters built into psycopg2 as well if you need more control over the type conversion process.
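To make the placeholder rules concrete, here’s a small sketch of the two styles psycopg2 accepts: positional ‘%s’ placeholders with a tuple, and named ‘%(name)s’ placeholders with a dict. The function names are made up for illustration, the accounts table is the one from the query above, and ‘cur’ is assumed to be an open psycopg2 cursor. In both cases the values go in as the second argument to execute(), never through Python’s own % operator:

```python
def accounts_above(cur, min_balance):
    """Positional style: one %s per value, values passed as a tuple."""
    cur.execute(
        "SELECT id, fname, lname, balance FROM accounts WHERE balance > %s",
        (min_balance,),  # a one-element tuple needs the trailing comma
    )
    return cur.fetchall()


def accounts_between(cur, low, high):
    """Named style: %(name)s placeholders, values passed as a dict."""
    cur.execute(
        "SELECT id, fname, lname, balance FROM accounts "
        "WHERE balance BETWEEN %(low)s AND %(high)s",
        {"low": low, "high": high},
    )
    return cur.fetchall()
```

The named style tends to pay off once a query has more than two or three arguments, since the order of the values no longer matters.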

Cursors

Psycopg2 makes it pretty easy to get your results back in a format that is easy for the receiving code to deal with. For example, the projects I work on tend to use the RealDictCursor type, because the code tends to require accessing the parts of the resultset rows by name rather than by index (or just via blind looping). Here’s how to set up and use a RealDictCursor:

import psycopg2.extras

curs = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
curs.execute("SELECT id, name FROM users")
rs = curs.fetchall()
for row in rs:
    # Index the current row, not the whole result set
    print(row['id'], row['name'])

It’s possible you have two sections of code that’ll rip apart a result set, and one needs by-name access, and the other just wants to loop blindly or access by index number. If that’s the case, just replace ‘RealDictCursor’ with ‘DictCursor’, and you can have it both ways!
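One way to keep that choice flexible is to pass the cursor class in as an argument. This is only a sketch: users_by_id() is a hypothetical helper, the users table is the one from the example above, and ‘conn’ is assumed to be an open psycopg2 connection:

```python
def users_by_id(conn, cursor_factory):
    """Return {id: name}, reading each row by column name."""
    curs = conn.cursor(cursor_factory=cursor_factory)
    try:
        curs.execute("SELECT id, name FROM users")
        # row["id"] works with both RealDictCursor and DictCursor rows;
        # row[0] would only work with DictCursor rows.
        return {row["id"]: row["name"] for row in curs.fetchall()}
    finally:
        curs.close()
```

You would call it as users_by_id(conn, psycopg2.extras.RealDictCursor) or users_by_id(conn, psycopg2.extras.DictCursor), depending on what the rest of the code expects.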

Another nice thing about psycopg2 is the cursor.query attribute and the cursor.mogrify() method. Mogrify lets you see how a query will look after all input variables are bound, but before the query is sent to the server. The cursor.query attribute holds the exact query that was actually sent over the wire. I use cursor.query in my logging output all the time to catch out-of-order parameters, mismatched input types, etc. Here’s an example:

try:
    curs.callproc('myschema.myprocedure', callproc_params)
except Exception as out:
    print(out)
    # The exact query that was sent to the server
    print(curs.query)
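If you want this in your logs rather than on stdout, the same idea can be wrapped in a couple of helpers. This is a sketch under a few assumptions: preview() and execute_logged() are hypothetical names, ‘curs’ is an open psycopg2 cursor, and cursor.query may be None (or hold an earlier query) if the failure happened before anything was sent:

```python
import logging

logger = logging.getLogger(__name__)


def preview(curs, sql, params=None):
    """Return the query as it would look after binding, without running it."""
    return curs.mogrify(sql, params)


def execute_logged(curs, sql, params=None):
    """Run a query; if it fails, log the last query sent, then re-raise."""
    try:
        curs.execute(sql, params)
    except Exception:
        # getattr() keeps this usable with cursors that lack .query
        logger.exception("Query failed; last query sent: %r",
                         getattr(curs, "query", None))
        raise
```

Re-raising after logging leaves the decision about what to do next with the caller, which is usually where it belongs.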

Calling Stored Functions

Stored procedures, or ‘functions’ in PostgreSQL-speak, can be immensely useful in large complex applications where you want to enforce business rules in a single place outside the domain of the main application developers. It can also in some cases be more efficient to put functionality in the database than in the main application code. In addition, if you’re hiring application developers, they should develop in the standard language for your environment, not SQL: the SQL should be written by database administrators and database developers and exposed to the application developers as stored functions, so all the application developers have to do is call the newly-exposed function. Here’s how to call a function using psycopg2:

callproc_params = [uname, fname, lname, uid]
cur.callproc("myschema.myproc", callproc_params)

The first argument to ‘callproc()’ is the name of the stored procedure, and the second argument is a sequence holding the input parameters to the function. The input parameters should be in the order that the stored procedure expects them, and I’ve found after quite a bit of usage that the module typically is able to convert the types perfectly well without my intervention, with one exception…
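If the function returns rows, they come back through the cursor’s usual fetch methods. Here’s a minimal sketch; call_function() is a hypothetical wrapper, and ‘cur’ is assumed to be an open psycopg2 cursor:

```python
def call_function(cur, name, params=()):
    """Call a stored function and return whatever rows it produced."""
    # Parameters must be in the order the stored function expects them
    cur.callproc(name, list(params))
    return cur.fetchall()
```

With the example above, that would be call_function(cur, "myschema.myproc", [uname, fname, lname, uid]).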

The UUID Array

PostgreSQL has built-in support for lots of interesting data types, like INET types for supporting IP addresses and CIDR network blocks, and GIS-related data types. In addition, PostgreSQL supports a type that is an array of UUIDs. This comes in handy if you use a UUID to identify items and want to store an array of them to associate with an order, or you use UUIDs to track messages and want to store an array of them together to represent a message thread or conversation. Getting a UUID array into the database is really not too difficult. If you have a list of UUID strings, you can do a quick conversion, call one registration function, and then use the array like any other input parameter:

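import uuid
import psycopg2.extras

# Convert the UUID strings to uuid.UUID objects
my_uuid_arr = [uuid.UUID(i) for i in my_uuid_arr]
# Teach psycopg2 how to adapt uuid.UUID objects
psycopg2.extras.register_uuid()
callproc_params = [
    myvar1,
    myvar2,
    my_uuid_arr,
]

curs.callproc('myschema.myproc', callproc_params)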

Connection Status

It’s not a given that your database connection lives on from query to query, and you shouldn’t really just assume that because you did a query a fraction of a second ago that it’s still around now. Actually, to speak about things more Pythonically, you *should* assume the connection is still there, but be ready for failure, and check the connection status to diagnose and help get things back on track. You can check the ‘status’ attribute of your connection object. Here’s one way you might do it:

    @property
    def active_dbconn(self):
        return self.conn.status in [psycopg2.extensions.STATUS_READY, psycopg2.extensions.STATUS_BEGIN]

So, I’m assuming here that you have some object that holds a connection object it refers to as ‘self.conn’. This one-liner uses Python’s built-in @property decorator, so the other methods in the class can read it like an attribute. They can either check the connection status before attempting a query:

if self.active_dbconn:
    try:
        curs.execute(...)
    except Exception as out:
        logging.error("Houston we have a problem")

Or you can flip that around like this:

try:
    curs.execute(...)
except Exception as out:
    if not self.active_dbconn:
        logging.error("Execution failed because your connection is dead")
    else:
        logging.error("Execution failed in spite of live connection: %s" % out)
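To show how those pieces might fit together in one object, here’s a sketch of the second pattern as a small class. The Database class and its query() method are hypothetical, and the status values that count as ‘live’ are passed in by the caller, so the class itself makes no assumption about which ones you care about:

```python
import logging

logger = logging.getLogger(__name__)


class Database(object):
    """Hypothetical wrapper that owns a connection as self.conn."""

    def __init__(self, conn, live_statuses):
        self.conn = conn
        # e.g. (psycopg2.extensions.STATUS_READY,
        #       psycopg2.extensions.STATUS_BEGIN)
        self.live_statuses = tuple(live_statuses)

    @property
    def active_dbconn(self):
        return self.conn.status in self.live_statuses

    def query(self, sql, params=None):
        """Run a query; on failure, log whether the connection looks alive."""
        curs = self.conn.cursor()
        try:
            curs.execute(sql, params)
            return curs.fetchall()
        except Exception as exc:
            if not self.active_dbconn:
                logger.error("Execution failed because the connection is dead")
            else:
                logger.error(
                    "Execution failed in spite of live connection: %s", exc)
            raise
```

The exception is re-raised after logging, so the caller still gets to decide whether to reconnect, retry, or give up.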

Read On…

A database is a large, complex beast. There’s no way to cover the entirety of a database or a module that talks to it in a simple blog post, but I hope I’ve been able to show some of the more common features, and maybe one or two other items of interest. If you want to know more, I’m happy to report that, after a LONG time of being unmaintained, the project has recently sprung back to life and is pretty well-documented these days. Check it out!
