When does my code get called?

(callbacks and passes)

Simple examples of using the plugin

implementing a new compilation pass in Python
implementing a semantic search tool in Python (e.g. examining C++ code to find callsites that use a particular C++ method)
code visualizations in Python

Implementing a new compilation pass in Python

Implementing a semantic search tool in Python (e.g. examining C++ code to find callsites that use a particular C++ method)

Code visualizations in Python

Caveats

It only traces zero or one times any loop; it gives up after that.

l-values and r-values

Trying it out on our buggy example code

It emits new compilation warnings:

$ ./gcc-with-cpychecker \
     -I/usr/include/python2.7/  bug.c
bug.c: In function ‘make_a_list_of_random_ints_badly’:
bug.c:10:26: warning: Mismatching type in
call to PyArg_ParseTuple with format code "i" [enabled
by default]
  argument 3 ("&count") had type
    "long int *" (pointing to 64 bits)
  but was expecting
    "int *" (pointing to 32 bits)
  for format code "i"

It detects the reference leak of "item"

bug.c:22:1: warning: ob_refcnt of '*item' is
1 too high [enabled by default]
bug.c:22:1: note: was expecting final
ob_refcnt to be N + 1 (for some unknown N)
bug.c:22:1: note: due to object being
referenced by: PyListObject.ob_item[0]
bug.c:22:1: note: but final ob_refcnt is
N + 2
bug.c:17:14: note: PyLongObject allocated
at:
      item = PyLong_FromLong(random());

Reference-counting model

For every object it tracks all reference that it knows about, and assumes that there is some unknown number of other references N that it isn't tracking.

It knows the lower limit of N, and can use this when analyzing Py_DECREF() invocations.

FIXME

Paths through the code

The reference count warnings emits a verbose description of the path followed through the code that leads to the error:

bug.c:10:26: note: when PyArg_ParseTuple()
succeeds at:
   if (!PyArg_ParseTuple(args, "i", &count)) {
bug.c:10:8: note: taking False path at:
   if (!PyArg_ParseTuple(args, "i", &count)) {
bug.c:14:10: note: reaching:
   list = PyList_New(0);

Paths through the code (continued)

bug.c:14:10: note: when PyList_New()
succeeds at:
   list = PyList_New(0);
bug.c:16:5: note: when considering range:
   1 <= value <= 0x7fffffff at:
   for (i = 0; i < count; i++) {
bug.c:16:5: note: taking True path at:
   for (i = 0; i < count; i++) {
bug.c:17:31: note: reaching:
      item = PyLong_FromLong(random());

Actual references vs ob_refcnt

The checker emits a message when either changes:

bug.c:17:14: note: when PyLong_FromLong()
succeeds at:         item = PyLong_FromLong(random());
bug.c:17:14: note: ob_refcnt is now refs:
1 + N where N >= 0
bug.c:18:22: note: when PyList_Append()
succeeds at:         PyList_Append(list, item);
bug.c:18:22: note: ob_refcnt is now refs:
2 + N where N >= 0
bug.c:18:22: note: '*item' is now referenced by
1 non-stack value(s): PyListObject.ob_item[0]

Paths through the code (continued)

The checker continues following the execution of the code, reaching the top of the loop:

bug.c:16:5: note: when considering
value == (int)1 from bug.c:10 at:
     for (i = 0; i < count; i++) {
bug.c:16:5: note: taking False path at:
     for (i = 0; i < count; i++) {
bug.c:21:5: note: reaching:
   return list;
bug.c:22:1: note: returning

Deduplication

The checker attempts to follow every path through the function.

In this case it detects two similar traces, and only reports the warning for the first such trace:

bug.c:22:1: note: found 1 similar trace(s)
to this

HTML reports

The above information is too verbose, so the checker also emits the same information in HTML form:

bug.c:6:1: note: graphical error report for
function 'make_a_list_of_random_ints_badly' written out
to 'bug.c.make_a_list_of_random_ints_badly
-refcount-errors.html'

FIXME: screenshot

Lack of error-checking

It also detects the lack of error-checking for the PyList_New() call:

bug.c:18:22: warning: calling PyList_Append with NULL
as argument 1 (list) at bug.c:18 [enabled by default]
bug.c:10:26: note: when PyArg_ParseTuple() succeeds at:
  if (!PyArg_ParseTuple(args, "i", &count)) {
bug.c:10:8: note: taking False path at:
  if (!PyArg_ParseTuple(args, "i", &count)) {
bug.c:14:10: note: reaching:
  list = PyList_New(0);
bug.c:14:10: note: when PyList_New() fails at:
  list = PyList_New(0);

Why does it crash?

bug.c:16:5: note: when considering range:
1 <= value <= 0x7fffffff at:
  for (i = 0; i < count; i++) {
bug.c:16:5: note: taking True path at:
  for (i = 0; i < count; i++) {
bug.c:17:31: note: reaching:
      item = PyLong_FromLong(random());
bug.c:17:14: note: when PyLong_FromLong() succeeds at:
      item = PyLong_FromLong(random());
bug.c:18:22: note: PyList_Append() invokes Py_TYPE() on
the pointer via the PyList_Check() macro, thus accessing
(NULL)->ob_type
bug.c:18:22: note: found 1 similar trace(s) to this
bug.c:6:1: note: graphical error report for function 'make_a_list_of_random_ints_badly' written out to 'bug.c.make_a_list_of_random_ints_badly-refcount-errors.html'

Another Screenshot

Limitations

It only complains about crashers and reference-count issues

It doesn't complain about the lack of error checking for PyLong_FromLong: this failure case can't crash the interpreter.

Trying it out on real code

Examples of errors that it can detect

Success stories

some of the reference-count bugs found (and fixed) with the tool (and other bugs)