Musings of an Anonymous Geek

Generating Reports with Charts Using Python: ReportLab

Posted on October 22, 2008 (updated March 26, 2010) by bkjones

UPDATE (Mar. 26, 2010) Just realized I never posted the link to the PDF the code here generates: here it is. My bad.

I’ve been doing a little reporting project, and I’ve been searching around for quite some time for a good graphing and charting solution for general-purpose use. I had come across ReportLab before, but it just looked so huge and convoluted to me, given the simplicity of what I wanted at the time, that I moved on. This time was different.

This time I needed a lot of the capabilities of ReportLab. I needed to generate PDFs (this is not a web-based project), I needed to generate charts, and I wanted the reports I was generating to contain various types of text objects in addition to the charts and such.

I took the cliff-dive into the depths of the ReportLab documentation. I discovered three things:

  1. There is quite a lot of documentation.
  2. ReportLab is quite a capable library.
  3. The documentation actually belies the simplicity of the library.

It’s a decent bit easier than it looks in the documentation, so I thought I’d take you through an example. This example is dead simple, but I still think it’s a little more practical than what I was able to find. The ReportLab documentation refers to what sounds like a great reference example, but the problem is that the tarball I downloaded didn’t contain the files it was making reference to 🙁

I started out by investigating one of the small example projects in the “demo” directory of the ReportLab distribution. It was called “gadflypaper”. (Coincidentally, it was written by Aaron Watters. I worked in the cube outside of his office for several months last year. Hi Aaron!) Aaron’s example was very simple, and a great starting point for understanding how to put together a very basic document. It’s not infested with abstractions: just a few simple functions, and a lot of text. I ripped out a lot of the text until I had just an example of each function in action, and then set to work.

The Basic Process

To simplify the work of doing page layout minutiae, I (like the example) used PLATYPUS, which is built into ReportLab and abstracts away some of the low-level layout details. If you *want* low-level control, however, you can do whatever you want with the pdfgen module, also included (PLATYPUS is basically a layer built on top of it).

With PLATYPUS, you get access to a bunch of prebuilt layout-related objects, representing things like paragraphs, tables, frames, and other things. You also have access to page templates, so that dealing with things like frame placement is a little easier.

So, to give you a rundown of the high-level steps:

  1. Choose a page template, and use it to create a document object.
  2. Create your “flowables” (paragraphs, charts, images, etc.), and put them all into a list object. In the ReportLab documentation, this list is often named “story”.
  3. Pass the list object to the build() method of the document object you created in step 1.

Phase 1: Let’s Get Something Working

As a first phase, let’s just make sure we can do the simplest of documents. Here’s some code that should work if you have a good installation of ReportLab (I’m using whatever was the latest version in early October, 2008.) Note that we’ll be cleaning this up and simplifying it as we go along.

#!/usr/bin/env python

from reportlab.platypus import *
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.rl_config import defaultPageSize
from reportlab.lib.units import inch

PAGE_HEIGHT=defaultPageSize[1]
styles = getSampleStyleSheet()
Title = Paragraph("Generating Reports with Python", styles["Heading1"])
Author = Paragraph("Brian K. Jones", styles["Normal"])
URL = Paragraph("https://protocolostomy.com", styles["Normal"])
email = Paragraph("bkjones +_at_+ gmail.com", styles["Normal"])
Abstract = Paragraph("""This is a simple example document that illustrates how to put together a basic PDF with a chart.
I used the PLATYPUS library, which is part of ReportLab, and the charting capabilities built into ReportLab.""", styles["Normal"])

Elements = [Title, Author, URL, email, Abstract]

def go():
   doc = SimpleDocTemplate('gfe.pdf')
   doc.build(Elements)

go()

Not a lot of actual code here. It’s mostly variable assignments. The variables are mostly just strings, but because I want to control how they’re arranged, I need to make them “Flowables”. Remember that PLATYPUS puts together a document by processing a list of Flowable objects and drawing them onto the document. So all of our strings are “Paragraph” objects. You’ll note, too, that Paragraph objects can be styled using definitions accessed from getSampleStyleSheet, which returns a stylesheet object. If you create one of these at the Python interpreter, and call the resulting object’s ‘list()’ method, you’ll see what styles are available, and you’ll also see what attributes each style has. Try running this code to make sure things work. Change the strings if you like 🙂

Phase 2: Simple Cleanup

I haven’t yet created insane layers of abstraction in my own code, because I’ve been working on deadlines and doing things that are relatively simple. This will inevitably change 🙂  However, there are some things you can do to make life a bit simpler and cleaner.

#!/usr/bin/env python

from reportlab.platypus import *
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.rl_config import defaultPageSize
from reportlab.lib.units import inch

PAGE_HEIGHT=defaultPageSize[1]
styles = getSampleStyleSheet()
Title = "Generating Reports with Python"
Author = "Brian K. Jones"
URL = "https://protocolostomy.com"
email = "bkjones@gmail.com"
Abstract = """This is a simple example document that illustrates how to put together a basic PDF with a chart.
I used the PLATYPUS library, which is part of ReportLab, and the charting capabilities built into ReportLab."""
Elements=[]
HeaderStyle = styles["Heading1"]
ParaStyle = styles["Normal"]
PreStyle = styles["Code"]

def header(txt, style=HeaderStyle, klass=Paragraph, sep=0.3):
    s = Spacer(0.2*inch, sep*inch)
    Elements.append(s)
    para = klass(txt, style)
    Elements.append(para)

def p(txt):
    return header(txt, style=ParaStyle, sep=0.1)

def go():
    doc = SimpleDocTemplate('gfe.pdf')
    doc.build(Elements)

header(Title)
header(Author, sep=0.1, style=ParaStyle)
header(URL, sep=0.1, style=ParaStyle)
header(email, sep=0.1, style=ParaStyle)
header("ABSTRACT")
p(Abstract)

go()

So, this is still simple. Simplistic, even. All I did was move the repetitive bits to functions. The ‘header’ and ‘p’ functions are (for now) unaltered from the gadflypaper demo. The good part here is that strings can be defined as ‘just strings’. Paragraphs and headers are just plain old string variables, and then at the bottom I just call the ‘header’ and ‘p’ functions and pass in the variables. The order in which I call the functions determines the order my document will appear in.

Phase 3: Returning Flowables

There’s kind of an issue with the way these functions work, at least for my needs. The problem is that they just go ahead and add things to the “Elements” list automagically. This might be ok for some quick and dirty tasks, but in my case I found that I needed more control. Things were crossing page boundaries where I didn’t want them to, and if I wanted to add formatting or apply built-in functionality, I couldn’t do it on a per-object basis without loading up the argument list.

I also wanted to have a relatively easy way to move *sections* of reports around, where a section might consist of a heading, a paragraph, and a source code listing — three different “Flowable” objects. So I altered these functions to make them return flowables instead of just adding things to the Elements list for me:

#!/usr/bin/env python

from reportlab.platypus import *
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.rl_config import defaultPageSize
from reportlab.lib.units import inch

PAGE_HEIGHT=defaultPageSize[1]
styles = getSampleStyleSheet()
Title = "Generating Reports with Python"
Author = "Brian K. Jones"
URL = "https://protocolostomy.com"
email = "bkjones@gmail.com"
Abstract = """This is a simple example document that illustrates how to put together a basic PDF with a chart.
I used the PLATYPUS library, which is part of ReportLab, and the charting capabilities built into ReportLab."""
Elements=[]
HeaderStyle = styles["Heading1"]
ParaStyle = styles["Normal"]
PreStyle = styles["Code"]

def header(txt, style=HeaderStyle, klass=Paragraph, sep=0.3):
    s = Spacer(0.2*inch, sep*inch)
    para = klass(txt, style)
    sect = [s, para]
    result = KeepTogether(sect)
    return result

def p(txt):
    return header(txt, style=ParaStyle, sep=0.1)

def pre(txt):
    s = Spacer(0.1*inch, 0.1*inch)
    p = Preformatted(txt, PreStyle)
    precomps = [s,p]
    result = KeepTogether(precomps)
    return result

def go():
    doc = SimpleDocTemplate('gfe.pdf')
    doc.build(Elements)

mytitle = header(Title)
myname = header(Author, sep=0.1, style=ParaStyle)
mysite = header(URL, sep=0.1, style=ParaStyle)
mymail = header(email, sep=0.1, style=ParaStyle)
abstract_title = header("ABSTRACT")
myabstract = p(Abstract)
head_info = [mytitle, myname, mysite, mymail, abstract_title, myabstract]
Elements.extend(head_info)

code_title = header("Basic code to produce output")
code_explain = p("""This is a snippet of code. It's an example using the Preformatted flowable object, which
                 makes it easy to put code into your documents. Enjoy!""")
code_source = pre("""
def header(txt, style=HeaderStyle, klass=Paragraph, sep=0.3):
    s = Spacer(0.2*inch, sep*inch)
    para = klass(txt, style)
    sect = [s, para]
    result = KeepTogether(sect)
    return result

def p(txt):
    return header(txt, style=ParaStyle, sep=0.1)

def pre(txt):
    s = Spacer(0.1*inch, 0.1*inch)
    p = Preformatted(txt, PreStyle)
    precomps = [s,p]
    result = KeepTogether(precomps)
    return result

def go():
    doc = SimpleDocTemplate('gfe.pdf')
    doc.build(Elements)
    """)
codesection = [code_title, code_explain, code_source]
src = KeepTogether(codesection)
Elements.append(src)
go()

So, this isn’t too bad. It’s still plain procedural code built from a few functions. I’ll revamp it in another post to use objects, but for those readers who are still learning all of this, it might help to leave out the abstraction for now. What I liked about the gadflypaper demo was that it was quick and dirty. You could read it line by line, top to bottom, and understand what just happened without jumping back and forth between main() code and object code.

As you can see, I’m using the KeepTogether flowable in two different ways. In the functions, I use it to bundle each spacer with its text, so I don’t have to go back later and manually add spacer elements to the Elements list. Then, toward the bottom, I create a preformatted code snippet, and I use KeepTogether again to make sure that all parts of the code section stay together without flowing across a page boundary. There are other options you can use to customize how your document deals with ‘orphan’ and ‘widow’ elements as well, so definitely check out the documentation for that (or keep reading this blog. I’ll get to it eventually).

So what’s left?

Phase 4: The Grand Finale

The rest of the code I add is to connect to a database, make a query, and then pass the data returned from the database to a function that creates a chart. I add the chart to the Elements, and we’re in business!

#!/usr/bin/env python
import MySQLdb
import sys
from reportlab.graphics.shapes import Drawing
from reportlab.graphics.charts.linecharts import HorizontalLineChart
from reportlab.graphics.widgets.markers import makeMarker  # used in graphout()
from reportlab.platypus import *
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.rl_config import defaultPageSize
from reportlab.lib.units import inch

dbhost = 'localhost'
dbname = 'httplog'
dbuser = 'jonesy'
dbpasswd = 'mypassword'

PAGE_HEIGHT=defaultPageSize[1]
styles = getSampleStyleSheet()
Title = "Generating Reports with Python"
Author = "Brian K. Jones"
URL = "https://protocolostomy.com"
email = "bkjones@gmail.com"
Abstract = """This is a simple example document that illustrates how to put together a basic PDF with a chart.
I used the PLATYPUS library, which is part of ReportLab, and the charting capabilities built into ReportLab."""
Elements=[]
HeaderStyle = styles["Heading1"]
ParaStyle = styles["Normal"]
PreStyle = styles["Code"]

def header(txt, style=HeaderStyle, klass=Paragraph, sep=0.3):
    s = Spacer(0.2*inch, sep*inch)
    para = klass(txt, style)
    sect = [s, para]
    result = KeepTogether(sect)
    return result

def p(txt):
    return header(txt, style=ParaStyle, sep=0.1)

def pre(txt):
    s = Spacer(0.1*inch, 0.1*inch)
    p = Preformatted(txt, PreStyle)
    precomps = [s,p]
    result = KeepTogether(precomps)
    return result

def connect():
   try:
      conn1 = MySQLdb.connect(host = dbhost, user = dbuser, passwd = dbpasswd, db = dbname)
      return conn1
   except MySQLdb.Error as e:
      # "as e" and print() work on Python 2.6+ as well as Python 3
      print("Error %d: %s" % (e.args[0], e.args[1]))
      sys.exit(1)

def getcursor(conn):
   cursor = conn.cursor()
   return cursor

def totalevents_hourly(rcursor):
    rcursor.execute("""select hour, count(*) as hits from hits group by hour;""")
    return rcursor

def graphout(catnames, data):
    drawing = Drawing(400, 200)
    lc = HorizontalLineChart()
    lc.x = 30
    lc.y = 50
    lc.height = 125
    lc.width = 350
    lc.data = data
    catNames = catnames
    lc.categoryAxis.categoryNames = catNames
    lc.categoryAxis.labels.boxAnchor = 'n'
    lc.valueAxis.valueMin = 0
    lc.valueAxis.valueMax = 1500
    lc.valueAxis.valueStep = 300
    lc.lines[0].strokeWidth = 2
    lc.lines[0].symbol = makeMarker('FilledCircle') # added to make filled circles.
    lc.lines[1].strokeWidth = 1.5
    drawing.add(lc)
    return drawing

def go():
    doc = SimpleDocTemplate('gfe.pdf')
    doc.build(Elements)

mytitle = header(Title)
myname = header(Author, sep=0.1, style=ParaStyle)
mysite = header(URL, sep=0.1, style=ParaStyle)
mymail = header(email, sep=0.1, style=ParaStyle)
abstract_title = header("ABSTRACT")
myabstract = p(Abstract)
head_info = [mytitle, myname, mysite, mymail, abstract_title, myabstract]
Elements.extend(head_info)

code_title = header("Basic code to produce output")
code_explain = p("""This is a snippet of code. It's an example using the Preformatted flowable object, which
                 makes it easy to put code into your documents. Enjoy!""")
code_source = pre("""
def header(txt, style=HeaderStyle, klass=Paragraph, sep=0.3):
    s = Spacer(0.2*inch, sep*inch)
    para = klass(txt, style)
    sect = [s, para]
    result = KeepTogether(sect)
    return result

def p(txt):
    return header(txt, style=ParaStyle, sep=0.1)

def pre(txt):
    s = Spacer(0.1*inch, 0.1*inch)
    p = Preformatted(txt, PreStyle)
    precomps = [s,p]
    result = KeepTogether(precomps)
    return result

def go():
    doc = SimpleDocTemplate('gfe.pdf')
    doc.build(Elements)
    """)
codesection = [code_title, code_explain, code_source]
src = KeepTogether(codesection)
Elements.append(src)

hourly_title = header("Hits logged, per hour")
hourly_explain = p("""This shows aggregate hits across a 24-hour period. """)

conn = connect()
cur = getcursor(conn)
te_hourly = totalevents_hourly(cur)
catnames = []
data = []
values = []
for row in te_hourly:
   catnames.append(str(row[0]))
   values.append(row[1])

data.append(values)
hourly_chart = graphout(catnames, data)
hourly_section = [hourly_title, hourly_explain, hourly_chart]
Elements.extend(hourly_section)

go()

So, I’ve muddied things up a bit. If you’ve written database code before, you can just look past it all. I don’t do anything magical there. In fact, the chart creation isn’t magical either. I’m sure there’s even a cleaner way to do it – but this works for the moment.

I get a connection object, use it to get a cursor, then pass the cursor to the query function, which passes back the same cursor, now holding the query results: te_hourly. The chart I’m going to create needs ‘category’ names for the x-axis (the category axis), and then values to plot against the y-axis. In my case, the hour is row[0] and the total hits for that hour are in row[1]. I build my catnames and data lists, and then create “hourly_chart” by passing my lists to the graphout function. Finally, I add the chart, along with its title and explanation, to the Elements list. Done!
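If you don’t have a MySQL server handy, the row-to-list step can be tried on its own. This sketch swaps in the standard library’s sqlite3 module purely as a stand-in, with a made-up ‘hits’ table and a few hypothetical rows. I also added an ORDER BY, since GROUP BY alone doesn’t promise that the hours come back sorted.

```python
import sqlite3

# sqlite3 is only a stand-in here so the example runs anywhere.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table hits (hour integer, url text)")
# Hypothetical sample rows: (hour of day, requested URL).
cur.executemany(
    "insert into hits (hour, url) values (?, ?)",
    [(0, "/"), (0, "/about"), (1, "/"), (2, "/"), (2, "/feed"), (2, "/about")],
)

# Same aggregate as in the post, with an explicit order for the hours.
cur.execute("select hour, count(*) as hits from hits group by hour order by hour")
rows = cur.fetchall()

catnames = [str(hour) for hour, _ in rows]  # labels for the category axis
values = [count for _, count in rows]       # one line on the chart
data = [values]                             # chart data is a list of series

conn.close()
```

The extra nesting in ‘data’ is the part that tends to trip people up: the chart expects a list of series, even when there is only one line to draw.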

For its part, the graphout function is mostly just a bunch of parameters I need to configure my HorizontalLineChart object. Once the chart is all set to go, I need to add it onto my Drawing object, and return the Drawing flowable object.
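One thing graphout hardcodes is the value axis range (0 to 1500 in steps of 300), which only suits my data. Here is one possible way to derive those numbers from the data instead. The ‘axis_bounds’ helper is my own invention for illustration, not part of ReportLab.

```python
import math


def axis_bounds(values, steps=5):
    """Return (value_max, value_step) that cover the largest value.

    The step is rounded up to 1, 2, 5 or 10 times a power of ten so the
    tick labels stay readable.
    """
    top = max(values) if values else 0
    if top <= 0:
        return steps, 1  # fall back to a small non-empty axis
    raw_step = top / float(steps)
    magnitude = 10 ** math.floor(math.log10(raw_step))
    for multiple in (1, 2, 5, 10):
        step = multiple * magnitude
        if step >= raw_step:
            break
    value_max = step * math.ceil(top / step)
    return value_max, step


# Hypothetical hourly hit counts.
value_max, value_step = axis_bounds([120, 480, 1310, 900])
```

Inside graphout you would then assign the two results to lc.valueAxis.valueMax and lc.valueAxis.valueStep in place of the fixed numbers.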

Not yet what I’d call “Beautiful Code”, but it works, and it’s likely to help some other folks get over the ‘getting started’ hump with ReportLab. Hope it was useful.
