I'm going to assume basic familiarity with Python, and with either C or C++.
Hopefully you've used gdb at least once.
You need gdb 7.0 or later, built with Python embedding enabled.
As it happens, the crashing program was itself written in Python.
Simple interactive usage:
(gdb) python a = [i*2 for i in range(10)]
(gdb) python print a
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Larger fragments of code:
(gdb) python
..write python code here..
[Ctrl^D]
Use help if you get lost:
(gdb) python help(gdb)
Help on package gdb:

NAME
    gdb

FILE
    (built-in)

PACKAGE CONTENTS
(..etc..)
Here's the C code I had to debug:
static PyObject *interned; /* actually a PyDictObject */

typedef struct _dictobject PyDictObject;
struct _dictobject {
    PyObject_HEAD
    /* (Fields snipped for simplicity) */

    /* Something within here was being corrupted: */
    PyDictEntry *ma_table;
};
In gdb terms:
(gdb) print /r interned
$1 = (PyObject *) 0x602590
In Python terms:
(gdb) python \
    val = gdb.parse_and_eval('interned')
(gdb) python print val
..prints the representation of the value..
We now have a gdb.Value named val:
(gdb) python print type(val)
<type 'gdb.Value'>
with a gdb.Type:
(gdb) python print val.type
PyObject *
Use long to extract a pointer value from a gdb.Value:
(gdb) python print hex(long(val))
0x602590
Not to be confused with the address of the gdb.Value wrapper within the gdb process:
(gdb) python print repr(val)
<gdb.Value object at 0x7f52bf44bdc0>
I needed to find the corrupt data within a dynamically-allocated block within:
((PyDictObject*)interned)->ma_table
where interned is a PyObject*.
In gdb I could use its C-like mini-language to access that:
(gdb) print ((PyDictObject*)interned)->ma_table
$2 = (PyDictEntry *) 0x7ffff7f28010
How to do this in Python?
In Python terms, first we need the type:
(gdb) python \
    type_dict_ptr = \
    gdb.lookup_type('PyDictObject').pointer()
(gdb) python print type_dict_ptr
PyDictObject *
Note how we used gdb.Type.pointer() to turn the PyDictObject type into the pointer type PyDictObject *.
Now we can cast val:
(gdb) python val2 = val.cast(type_dict_ptr)
(gdb) python print val2
0x602590
(gdb) python print val2.type
PyDictObject *
So val2 is equivalent to ((PyDictObject*)interned)
Treat a gdb.Value as a dictionary to get at the fields of the underlying data:
(gdb) python val3 = val2['ma_table']
(gdb) python print val3
0x7ffff7f28010
(gdb) python print val3.type
PyDictEntry *
So we now have val3, equivalent to:
((PyDictObject*)interned)->ma_table
I just showed you the difficult way to do this
The easy way is to use gdb.parse_and_eval directly on a gdb expression:
(gdb) python \
    val3 = gdb.parse_and_eval('((PyDictObject*)interned)->ma_table')
Pointer and array gdb.Value instances support the Python indexing syntax:
(gdb) python print val3[2]
{me_hash = 0, me_key = 0x0, me_value = 0x0}
This is equivalent to the underlying C pointer/array syntax:
((PyDictObject*)interned)->ma_table[2]
We can then use Python to find all entries in the table satisfying a criterion:
(gdb) python print [i for i in range(8192) if long(val3[i]['me_value']) == 0]
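The one-liner above hard-codes the table size, the field and the test. A minimal sketch of the same scan as a reusable function might look like this. The names find_entries and is_null are hypothetical (they are not part of gdb's API), and the code assumes only that the table supports integer indexing and that each entry supports lookup by field name, as a gdb.Value does. It uses int rather than long so that it also runs where gdb embeds Python 3.

```python
def find_entries(table, count, field, predicate):
    """Return the indices of the first `count` entries whose `field` passes.

    `table` is anything indexable by integer (for example a pointer
    gdb.Value); each entry must support entry[field], and the field
    must be convertible with int().
    """
    return [i for i in range(count) if predicate(int(table[i][field]))]


def is_null(address):
    """True for a NULL pointer value."""
    return address == 0
```

Inside gdb, the search shown above would then be written as find_entries(val3, 8192, 'me_value', is_null).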
Before:
(gdb) print pWndContents
(String *) 0x7f842941fcf0
After:
(gdb) print pWndContents
String(u'Hello world')
This will show up everywhere in GDB, including backtraces.
typedef struct _UniStringData
{
    sal_Int32   mnRefCount;
    sal_Int32   mnLen;
    sal_Unicode maStr[1];
} UniStringData;

class String
{
private:
    UniStringData* mpData;
};
Get the program to some known state
Go hunting for instances of the type:
(gdb) p pSVData->maAppData->mpAppName
$14 = (String *) 0x7f842941fcf0
(gdb) p $14->mpData
$15 = (UniStringData *) 0x7f264fb6fda0
(gdb) p *$15
$16 = {mnRefCount = 1, mnLen = 7, maStr = {115}}
Now capture it as a Python variable, to make it easy to go peeking inside it:
(gdb) python
appName = gdb.parse_and_eval('pSVData->maAppData->mpAppName')
[CTRL-D]
(gdb) python print appName
0x7f264fb79028
Here's the fragment of Python code I came up with for printing (String*) values:
(gdb) python mpData = appName['mpData']
(gdb) python print(repr(u"".join([unichr(int(mpData['maStr'][i])) for i in range(int(mpData['mnLen']))])))
Giving this output:
u'soffice'
A prettyprinter is a class:
class StringPrinter(object):
    def __init__(self, val):
        # "val" is a gdb.Value
        # representing a (String *)
        # in the inferior process
        self.val = val
with a to_string method:
    def to_string(self):
        mpData = self.val['mpData']
        length = int(mpData['mnLen'])
        maStr = mpData['maStr']
        chars = [unichr(int(maStr[i]))
                 for i in xrange(length)]
        result = u"".join(chars)
        return "String(%r)" % result
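def pp_lookup(gdbval):
    # Match "String" values and "String *" pointers
    type = gdbval.type.unqualified()
    if type.code == gdb.TYPE_CODE_PTR:
        type = type.target().unqualified()
    t = str(type)
    # Compare with ==: `t in ("String")` would be a substring
    # test, since ("String") is a plain str, not a tuple
    if t == "String":
        return StringPrinter(gdbval)
    # Otherwise fall through and return None, so that gdb
    # tries the next pretty-printer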
def register(obj):
    if obj is None:
        obj = gdb
    # Wire up the pretty-printer
    obj.pretty_printers.append(pp_lookup)

register(gdb.current_objfile())
See the documentation for more details:
http://sourceware.org/gdb/current/onlinedocs/gdb/Python.html
Checking for a NULL pointer:
if 0 == long(self.val):
    return 'NULL'
Safety limit:
# Don't send gdb into a long loop if it
# encounters corrupt data:
length = min(length, 1024)
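Putting the NULL check and the safety limit together with the printer gives something like the sketch below. This is one possible arrangement rather than the printer as shipped: decode_unistring and MAX_CHARS are hypothetical names, the value is only assumed to behave like a gdb.Value (int() gives the pointer value, indexing by name gives a field), and int, chr and range stand in for long, unichr and xrange so that it runs on Python 3.

```python
MAX_CHARS = 1024  # same cap as the safety limit above


def decode_unistring(val, max_chars=MAX_CHARS):
    """Return the text behind a (String *) value, or None if it is NULL.

    `val` must behave like a gdb.Value for a String pointer:
    int(val) yields the address and val['mpData'] the data block.
    """
    if int(val) == 0:
        return None
    data = val['mpData']
    # Don't trust mnLen blindly: corrupt data could claim a huge length.
    length = min(int(data['mnLen']), max_chars)
    chars = data['maStr']
    return u"".join(chr(int(chars[i])) for i in range(length))


class StringPrinter(object):
    def __init__(self, val):
        self.val = val

    def to_string(self):
        text = decode_unistring(self.val)
        if text is None:
            return 'NULL'
        return "String(%r)" % text
```

Keeping the decoding in a plain function means it can be exercised with stand-in objects outside gdb, which is handy given how awkward it is to debug code running inside the debugger.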
Locate some data of the type in question:
$ PYTHONPATH=$(pwd) gdb --args PROGRAM
(gdb) python import YOUR_DEBUG_CODE
(gdb) print SOME_DATA
You don't need to restart the program each time. Edit YOUR_DEBUG_CODE.py, repeat:
(gdb) python reload(YOUR_DEBUG_CODE)
(gdb) print SOME_DATA
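The built-in reload used above exists only in Python 2; on Python 3 the equivalent is importlib.reload. A small sketch of a helper that covers both the first import and later reloads, assuming a gdb that embeds Python 3 (load_or_reload is a hypothetical name, not something gdb provides):

```python
import importlib
import sys


def load_or_reload(module_name):
    """Import `module_name` the first time, re-read its source afterwards."""
    # Make sure files created since the last import are noticed.
    importlib.invalidate_caches()
    if module_name in sys.modules:
        return importlib.reload(sys.modules[module_name])
    return importlib.import_module(module_name)
```

The edit-and-retry loop then uses one call, load_or_reload('YOUR_DEBUG_CODE'), each time the file changes.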
See Lib/test/test_gdb.py in CPython's source code for examples of this
Use print /r to bypass the pretty-printer and see the raw value:
(gdb) print /r pWndContents
(String *) 0x7f842941fcf0
Create a subclass of gdb.Command, and write its invoke method.
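As an illustration, here is a minimal sketch of such a command. The command name dump-table and the helper parse_count are invented for this example, and the expression it evaluates is simply the one from the dictionary-debugging session earlier. The import is guarded so that the argument parsing can be tried outside gdb; the command itself is only defined and registered when the gdb module is available.

```python
try:
    import gdb  # only importable inside a Python-enabled gdb
except ImportError:
    gdb = None


def parse_count(argument, default=10):
    """Turn the command's argument string into a positive entry count."""
    text = argument.strip()
    if not text:
        return default
    count = int(text)
    if count <= 0:
        raise ValueError("count must be positive, got %d" % count)
    return count


if gdb is not None:
    class DumpTable(gdb.Command):
        """dump-table [COUNT]: print the first COUNT entries of ma_table."""

        def __init__(self):
            # Hypothetical command name; COMMAND_DATA files it under
            # "help data".
            super(DumpTable, self).__init__("dump-table", gdb.COMMAND_DATA)

        def invoke(self, argument, from_tty):
            count = parse_count(argument)
            table = gdb.parse_and_eval(
                '((PyDictObject*)interned)->ma_table')
            for i in range(count):
                gdb.write("%d: %s\n" % (i, table[i]))

    # Instantiating the class is what registers the command with gdb.
    DumpTable()
```

Once the file has been imported, (gdb) dump-table 4 would print the first four entries, and the class docstring doubles as the text shown by (gdb) help dump-table.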
I've done this for CPython:
- py-bt
- py-up and py-down
- py-list
See Tools/gdb/libpython.py in CPython's source code
Basic overview of poking at a process using Python from gdb
- How to walk a corrupt data-structure in a crashed process from Python
Writing a pretty-printer for a C/C++ library's data structures
Lots of other Python/gdb functionality
- Backtrace printers
- Convenience Functions
- Custom breakpoints
- etc
gdb documentation: http://sourceware.org/gdb/current/onlinedocs/gdb/Python.html
Tom Tromey's blog: http://sourceware.org/gdb/wiki/PythonGdbTutorial
The LibreOffice string pretty-printer I wrote: https://bugs.freedesktop.org/show_bug.cgi?id=34745
Python code that groks GNU libc's malloc/free implementation: https://fedorahosted.org/gdb-heap/
- the GNU implementation of the C++ STL
- hooks for CPython itself: Tools/gdb/libpython.py
- for the GTK object system