Static analysis of Python extension modules using GCC

David Malcolm, Red Hat

dmalcolm@redhat.com

PyCon US 2012

Prerequisites

I'm going to assume basic familiarity with Python, and with either C or C++

Hopefully you've used, debugged, or written a Python extension module in C (perhaps via SWIG or Cython)

Straw poll?

Overview

Code: https://fedorahosted.org/gcc-python-plugin/

GNU General Public License v3 (or later)

C extension code

The Problem

How many mistakes can you see in the following C code?

PyObject *
buggy_list_of_random_ints(PyObject *self,
                          PyObject *args)
{
    PyObject *list, *item;
    long count, i;

    if (!PyArg_ParseTuple(args, "i",
                          &count)) {
         return NULL;
    }
    /* continued on the next slide... */
          

The Problem (continued)

{
    PyObject *list, *item;
    long count, i;

    if (!PyArg_ParseTuple(args, "i",
                          &count)) {
         return NULL;
    }
    /* ...continued from previous slide */
    list = PyList_New(0);

    for (i = 0; i < count; i++) {
        item = PyLong_FromLong(random());
        PyList_Append(list, item);
    }

    return list;
}

What could possibly go wrong?

Mismatching types to PyArg_ParseTuple

long count, i;

if (!PyArg_ParseTuple(args, "i", &count)) {
     return NULL;
}

If sizeof(long) > sizeof(int), only part of "count" will be written to.

Total lack of error-checking

PyList_New() can fail if there's not enough memory, returning NULL:

list = PyList_New(0); // BUG!

for (i = 0; i < count; i++) {
    item = PyLong_FromLong(random());
    PyList_Append(list, item); //BUG!
}
          

and PyList_Append(NULL, item) will then segfault the interpreter

(even if "item" is also NULL)

Total lack of error-checking: part 2

PyLong_FromLong() can fail if there's not enough memory, returning NULL:

item = PyLong_FromLong(random());  //BUG!
PyList_Append(list, item);
          

though in this case PyList_Append(list, NULL) will not segfault the interpreter.

(Does anyone know all of the rules for which CPython extension function arguments are safe to pass NULL to, and which aren't? More to come...)

Total lack of error-checking: part 3

PyList_Append() can fail if there's not enough memory, returning -1:

PyList_Append(list, item);

If this happens, the length of the list will be less than what the caller was expecting.

Reference leaks

item = PyLong_FromLong(random());
PyList_Append(list, item);

PyLong_FromLong() returns "a new reference" (if it succeeds)

PyList_Append() doesn't steal the reference, it adds a new one (if it succeeds).

Whether or not PyList_Append() succeeds or fails, "item" still owns a reference on the result of PyLong_FromLong (if non-NULL).

The code needs a Py_DECREF(item)

(as well as all the error checking)

Why are reference leaks bad?

The random ints will become immortal, and will not be cleaned up until the process exits, using up memory unnecessarily.

This will foul up CPython's arena allocators: entire 256KB chunks of RAM can be forced to be kept in-use just to keep a leaked object alive.

But this is a compiled language!

I got sick of tracking down references leaks by hand.

How can we automatically detect these bugs?

There are hundreds of packages within, say, Fedora 17 that link against libpython, written in a mixture of C, C++ (plus inline assembler, and even some Fortran)

Image source: http://hyperboleandahalf.blogspot.com/2010/06/this-is-why-ill-never-be-adult.html

Approaches to code analysis

LLVM

LLVM's clang-analyzer

http://clang-analyzer.llvm.org/

Sparse

https://sparse.wiki.kernel.org/

CIL

http://cil.sourceforge.net/

See e.g. the work we did to detect errors in libvirt:

http://berrange.com/posts/2009/05/15/static-analysis-to-validate-mutex-locking-in-libvirt-using-ocaml-cil/

Coccinelle

http://coccinelle.lip6.fr/

See my experiment from November 2009 on PyArg_ParseTuple

http://dmalcolm.livejournal.com/3689.html

Using a Python library to parse C

Proprietary tools

(non-starter for Fedora: everything built using Open Source/Free Software)

GCC plugins

Aha! Embed Python scripting into GCC

Embedding CPython within GCC

GCC 4.5 gained the ability to have plugins

(My plugin actually requires GCC >= 4.6 for other reasons)

I quickly got a "hello world" GCC plugin working:

            int
            plugin_init (/* args omitted */)
            {
                /* plugin boilerplate omitted */
                Py_Initialize();
                PyRun_SimpleString("from time import time,ctime\n"
                "print 'Today is',ctime(time())\n");
                Py_Finalize();
            }
          

(this is the example from http://docs.python.org/extending/embedding.html )

It worked

            $ gcc -fplugin=python.so test.c
            Today is Wed Mar 20 23:10:15 2011
          

Generalize it to run arbitrary Python scripts:

$ ./gcc-with-python script.py test.c

Getting at GCC's internal representation

Wrapping GCC's types

Wrapping a million-line, 25-year-old body of C code in Python

GCC's internals, as seen by the Python plugin

Our buggy example code is processed by the C frontend

Converted to a form called "GIMPLE":

Simple expressions, mostly just 3 operands at a time.

The GIMPLE is divided into a control flow graph of basic blocks in which any branching logic is done between basic blocks.

I am grossly oversimplifying things

(Etymology: the GCC version of SIMPLE IL, from the McCAT compiler project at McGill University)

Our first new GCC compilation pass

Here's a Python script to run inside the plugin to print the structure:

import gcc

# We can create custom compilation pass by subclassing e.g. gcc.GimplePass:
class DumpGimple(gcc.GimplePass):
  # passes have execute() methods:
  def execute(self, fun):

    # see next slide
          
  def execute(self, fun):
    print('FUNCTION: %s' % fun.decl.name)

    for bb in fun.cfg.basic_blocks:

      # we now have a gcc.BasicBlock
      print('BLOCK %i' % bb.index)

      # see next slide...
          
  def execute(self, fun):
    print('FUNCTION: %s' % fun.decl.name)
    for bb in fun.cfg.basic_blocks:
      print('BLOCK %i' % bb.index)

      # (from previous slide)

      if bb.gimple:
        for stmt in bb.gimple:
          # we now have a gcc.Gimple
          print(' %r' % stmt)
          print('   %s' % stmt)

      for edge in bb.succs:
        print(' edge to block %i'
              % edge.dest.index)

  # see next slide
          
class DumpGimple(gcc.GimplePass):
  def execute(self, fun):
      # (body of function omitted)

# Finally, we have to instantiate our compilation pass, and register it
# relative to one of GCC's existing passes:
ps = DumpGimple(name='dump-gimple')
ps.register_after('cfg')
          

Don't call your pass "pass" - it's a reserved word!

We can now run our script like this:

$ ./gcc-with-python \
         dump-gimple.py \
           -I/usr/include/python2.7 \
         bug.c
          

(bug.c is Python extension code, hence we need the -I compilation flag here so gcc can find <Python.h>)

Here's the output

FUNCTION: make_a_list_of_random_ints_badly
BLOCK 0
  edge to block 2
BLOCK 1
BLOCK 2
 gcc.GimpleCall()
   D.12024 = PyArg_ParseTuple (args, "i", &count);
 gcc.GimpleCond()
   if (D.12024 == 0)
  edge to block 3
  edge to block 4
BLOCK 3
 gcc.GimpleAssign()
   D.12027 = 0B;
  edge to block 8
        
BLOCK 4
 gcc.GimpleCall()
   list = PyList_New (0);
 gcc.GimpleAssign()
   i = 0;
  edge to block 6
BLOCK 5
 gcc.GimpleCall()
   D.12028 = random ();
 gcc.GimpleCall()
   item = PyLong_FromLong (D.12028);
 gcc.GimpleCall()
   PyList_Append (list, item);
 gcc.GimpleAssign()
   i = i + 1;
  edge to block 6
        
BLOCK 6
 gcc.GimpleAssign()
   count.0 = count;
 gcc.GimpleCond()
   if (i < count.0)
  edge to block 5
  edge to block 7
BLOCK 7
 gcc.GimpleAssign()
   D.12027 = list;
  edge to block 8
BLOCK 8
 gcc.GimpleReturn()
   return D.12027;
  edge to block 1
          

Too much text!

We're walking over GCC's internal representation, but it's hard to see what's going on

So I wrote a "show-gimple.py" script to visualize the structure, using GraphViz:

            $ ./gcc-with-python \
                     examples/show-gimple.py \
                       -I/usr/include/python2.7 \
                     bug.c
          

Many optimization passes later...

Various other internal representations that aren't relevant for our purposes today

What have we achieved?

We can now create new GCC warnings using Python, and perform other code analysis

Examples (by other people):

cpychecker: a static analyzer for CPython extension modules

Code: https://fedorahosted.org/gcc-python-plugin/

Structure of the checker

Interpreter for GIMPLE (simple form of abstract interpretation)

See libcpychecker/absinterp.py in the plugin's source tree

We generate a set of all possible traces through the function

(but give up when it gets too complicated)

A trace is a recording of a list of states, with transitions between them.

Structure of the checker (continued)

Transitions can have descriptive information (e.g. "when PyList_Append() fails")

A state has:

(this lets us handle the semantics of pointers)

AbstractValue subclasses

The checker "knows" how to perform operations on combinations of these: addition, subtraction, comparisons, etc

e.g. most operations on a DeallocatedMemory() instance will be detected as an error

Facets

States are pluggable: extra domain-specific information ("facets") can be added.

For now, the only plugin is a "cpython" one which adds:

See libcpychecker/refcounts.py in the plugin's source tree

Example API implementation

Given a C function e.g. PyMapping_Size, the cpython facet can provide an implementation which describes how to get from an input state to a list of all possible transitions:

def impl_PyMapping_Size(self, stmt, v_o):
    fnmeta = FnMeta(name='PyMapping_Size',
                    docurl='http://docs.python.org/c-api/mapping.html#PyMapping_Size',
                    prototype='Py_ssize_t PyMapping_Size(PyObject *o)',
                    defined_in='Objects/abstract.c',
                    notes='Can cope with NULL (sets exception)')
    # (continued on next slide)
        
# (continued from last slide)

    t_success = \
       self.state.mktrans_assignment(stmt.lhs,
         UnknownValue.make(stmt.lhs.type,
                           stmt.loc),
         fnmeta.desc_when_call_succeeds())
    t_failure = \
       self.state.mktrans_assignment(stmt.lhs,
         ConcreteValue(stmt.lhs.type,
                       stmt.loc,
                       -1),
         fnmeta.desc_when_call_fails())
    t_failure.dest.cpython.set_exception(
         'PyExc_TypeError',
         stmt.loc)
    return [t_success, t_failure]
          

We can also describe how an API can fail:

def impl_PyList_Append(self, stmt,
                       v_op, v_newitem):
    fnmeta = FnMeta(name='PyList_Append', )
                    # etc
    self.state.raise_any_null_ptr_func_arg(
       stmt, 0, v_op,
       why=invokes_Py_TYPE_via_macro(
              fnmeta, 'PyList_Check'))

    # It handles newitem being NULL:
    if v_newitem.is_null_ptr():
        s_failure = \
           self.state.mkstate_concrete_return_of(
              stmt, -1)
        s_failure.cpython.bad_internal_call(stmt.loc)
        return [Transition(self.state,
                           s_failure,
                           'returning -1 from %s() due to NULL item' % fnmeta.name)]
          

Trying it out on our buggy example code

It emits new compilation warnings:

            $ ./gcc-with-cpychecker \
                  -I/usr/include/python2.7/ \
                 bug.c
            bug.c: In function ‘make_a_list_of_random_ints_badly’:
            bug.c:10:26: warning: Mismatching type in call to PyArg_ParseTuple with format code "i" [enabled by default]
            argument 3 ("&count") had type
              "long int *" (pointing to 64 bits)
            but was expecting
              "int *" (pointing to 32 bits)
            for format code "i"
          

It detects the reference leak of "item"

            bug.c:22:1: warning: ob_refcnt of '*item' is 1 too high [enabled by default]
            bug.c:22:1: note: was expecting final ob_refcnt to be N + 1 (for some unknown N)
            bug.c:22:1: note: due to object being referenced by: PyListObject.ob_item[0]
            bug.c:22:1: note: but final ob_refcnt is N + 2
            bug.c:17:14: note: PyLongObject allocated at:   item = PyLong_FromLong(random());
          

Reference-counting model

For every object and possible trace it tracks all reference that it knows about, and assumes that there is some unknown number of other references N that it isn't tracking.

It knows the lower limit of N, and can use this when analyzing Py_DECREF() invocations.

Paths through the code

The reference count warnings emits a verbose description of the path followed through the code that leads to the error:

            bug.c:10:26: note: when PyArg_ParseTuple() succeeds at:
                if (!PyArg_ParseTuple(args, "i", &count)) {
            bug.c:10:8: note: taking False path at:
                if (!PyArg_ParseTuple(args, "i", &count)) {
            bug.c:14:10: note: reaching:
                list = PyList_New(0);
          

Paths through the code (continued)

            bug.c:14:10: note: when PyList_New() succeeds at:
                list = PyList_New(0);
            bug.c:16:5: note: when considering range:  1 <= value <= 0x7fffffff at:
                for (i = 0; i < count; i++) {
            bug.c:16:5: note: taking True path at:
                for (i = 0; i < count; i++) {
            bug.c:17:31: note: reaching:
                    item = PyLong_FromLong(random());
          

Actual references vs ob_refcnt

The checker emits a message when either changes:

            bug.c:17:14: note: when PyLong_FromLong() succeeds at:
                    item = PyLong_FromLong(random());
            bug.c:17:14: note: ob_refcnt is now refs: 1 + N where N >= 0
            bug.c:18:22: note: when PyList_Append() succeeds at:
                    PyList_Append(list, item);
            bug.c:18:22: note: ob_refcnt is now refs: 2 + N where N >= 0
            bug.c:18:22: note: '*item' is now referenced by 1 non-stack value(s): PyListObject.ob_item[0]
          

Paths through the code (continued)

The checker continues following the execution of the code, reaching the top of the loop:

            bug.c:16:5: note: when considering value == (int)1 from bug.c:10 at:
                for (i = 0; i < count; i++) {
            bug.c:16:5: note: taking False path at:
                for (i = 0; i < count; i++) {
            bug.c:21:5: note: reaching:
                return list;
            bug.c:22:1: note: returning
          

Deduplication

The checker attempts to follow every path through the function.

In this case it detects two similar traces, and only reports the warning for the first such trace:

            bug.c:22:1: note: found 1 similar trace(s) to this
          

HTML reports

The above information is too verbose, so the checker also emits the same information in HTML form:

            bug.c:6:1: note: graphical error report for
            function 'make_a_list_of_random_ints_badly' written out to 'bug.c.make_a_list_of_random_ints_badly-refcount-errors.html'
          

Lack of error-checking

It also detects the lack of error-checking for the PyList_New() call:

            bug.c:18:22: warning: calling PyList_Append with NULL as argument 1 (list) at bug.c:18 [enabled by default]
            bug.c:10:26: note: when PyArg_ParseTuple() succeeds at:
                if (!PyArg_ParseTuple(args, "i", &count)) {
            bug.c:10:8: note: taking False path at:
                if (!PyArg_ParseTuple(args, "i", &count)) {
            bug.c:14:10: note: reaching:
                list = PyList_New(0);
            bug.c:14:10: note: when PyList_New() fails at:
                list = PyList_New(0);
          

Why does it crash?

            bug.c:18:22: note: PyList_Append() invokes Py_TYPE() on the pointer via the PyList_Check() macro, thus accessing (NULL)->ob_type
            bug.c:18:22: note: found 1 similar trace(s) to this
          

A real bug found by the tool

int rpmfdFromPyObject(PyObject *obj,
                      rpmfdObject **fdop)
{
    rpmfdObject *fdo = NULL;

    /* (logic to create "fdo" omitted,
        to keep it all on one slide) */

    if (Ferror(fdo->fd)) {
        Py_DECREF(fdo);
        PyErr_SetString(
              PyExc_IOError,
              Fstrerror(fdo->fd));
        return 0;
    }
    *fdop = fdo;
    return 1;
}
          
rpmfd-py.c:34:17: warning: reading from deallocated memory at rpmfd-py.c:34: memory deallocated at rpmfd-py.c:32
rpmfd-py.c:31:8: note: taking True path at:
     if (Ferror(fdo->fd)) {
rpmfd-py.c:32:2: note: reaching:
         Py_DECREF(fdo);
rpmfd-py.c:32:2: note: when taking False path at:
         Py_DECREF(fdo);
rpmfd-py.c:32:2: note: calling tp_dealloc on new ref from call to PyObject_Call allocated at rpmfd-py.c:26 at:
         Py_DECREF(fdo);
rpmfd-py.c:35:17: note: reaching:
             Fstrerror(fdo->fd));
          

How does this work?

It's working on the preprocessed source

After preprocessing, a Py_DECREF(op) looks like this:

          do { if ( --((PyObject*)(op))->ob_refcnt != 0) ; else ( (*(((PyObject*)((PyObject *)(op)))->ob_type)->tp_dealloc)((PyObject *)((PyObject *)(op)))); } while (0);
          
or, simplified (removing casts):
          if ( --op->ob_refcnt == 0) {
             op->ob_type->tp_dealloc(op);
          }
          

How does this work? (continued)

          if ( --op->ob_refcnt == 0) {
             op->ob_type->tp_dealloc(op);
          }
          

All non-NULL (PyObject*) values are assigned an ob_refcnt value of type RefcountValue

A RefcountValue instance "knows" how to be incremented, decremented and compared against 0 (amongst other operations)

All non-NULL (PyObject*) have their ob_type set to a sane PyTypeObject. The tp_dealloc is set to an custom AbstractValue representing a sane function pointer that marks the object as a "poisoned" DeallocatedMemory value

Limitations

Status

I'm working on automatically running the checker against all Python extension code within Fedora 17

I've found and reported bugs against 27 projects so far... (~340 to go...); several are now fixed

Getting involved

Please try the checker on your extension code

Please report any false positives or tracebacks you run into

Help build out the coverage of the CPython API

Summary

Q & A

We're hiring!

http://www.redhat.com/about/work/

Extra slides

For the compiler geeks...

The GIMPLE is converted to Static Single Assignment form, which we can also see from Python

            $ ./gcc-with-python \
                     examples/show-ssa.py \
                       -I/usr/include/python2.7 \
                     bug.c
          

Many optimization passes later...

The GIMPLE-SSA form is converted to another representation Register Transfer Language, which is close to machine code

I've only barely wrapped this part of GCC in Python. Patches welcome!

Many further optimization passes later, this gets converted to assembler, and then machine code