Diagnostic and location-tracking improvements for GCC 8
GNU Tools Cauldron 2017
David Malcolm <dmalcolm@redhat.com>
https://dmalcolm.fedorapeople.org/presentations/cauldron-2017/
GNU Tools Cauldron 2017
David Malcolm <dmalcolm@redhat.com>
https://dmalcolm.fedorapeople.org/presentations/cauldron-2017/
Source locations are encoded as 32-bit values
(location_t
a.k.a source_location
).
Effectively a key into a database (line_table
).
- early versions of GCC merely tracked filename and line number
- column numbers were added in 2004
- tracking of macro expansions in 2011 (Dodji, I believe)
- I added tracking of ranges of source code (rather that just points) in GCC 6
location_t
vs rich_location
A location_t
is a caret+start+finish, often with caret==start
e.g.:
dst = foo * bar;
~~~~^~~~~
dst = foo * bar;
^~~
A location_t
can also represent a macro expansion
location.
location_t
vs rich_location
(2)rich_location
is a bundle of information for
diagnostic_show_locus()
:location_t
location_t
This talk is about location_t
.
int test (int first, int second)
{
return foo (100, first * 42, second);
}
We capture a location (of sorts) for the FUNCTION_DECL:
int test (int first, int second)
^~~~
but we throw away these locations:
return type:
int test (int first, int second)
^~~
param locations (FIXME: do we?):
int test (int first, int second)
^~~~~~~~~ ^~~~~~~~~~
We capture a location for the CALL_EXPR:
return foo (100, first * 42, second);
~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
We capture locations for compound expressions e.g. the MULT_EXPR:
return foo (100, first * 42, second)
~~~~~~^~~~
...but we don't permanently capture locations of constants and uses of decls:
return foo (100, first * 42, second)
^-- ^-----
(see PR 43486 "Preserve variable-use locations", filed 2010-03-22)
Other locations we discard during parsing:
locations of attributes of a function
locations of individual tokens like close parens and semicolons:
int test (int first, int second)
^ ^ ^
{
^
return foo (100, first * 42, second);
^ ^ ^ ^^
}
^
Missing location information limits our ability to implement "cousins" of a compiler on top of the GCC codebase e.g.:
- code refactoring tools,
- code reformatting tools
- IDE support daemons
- etc
Ultimately, it makes our diagnostics harder to read than they could be.
Leaf nodes in many expressions don't have location information.
Quoting tree.h:
/* The source location of this expression. Non-tree_exp nodes such as
decls and constants can be shared among multiple locations, so
return nothing. */
#define EXPR_LOCATION(NODE) \
(CAN_HAVE_LOCATION_P ((NODE)) ? (NODE)->exp.locus : UNKNOWN_LOCATION)
Nasty workarounds:
#define EXPR_LOC_OR_LOC(NODE, LOCUS) (EXPR_HAS_LOCATION (NODE) \
? (NODE)->exp.locus : (LOCUS))
location_t loc = EXPR_LOC_OR_LOC (src, input_location);
struct c_expr
{
/* The value of the expression. */
tree value;
/* [...snip...] */
/* The source range of this expression. This is redundant
for node values that have locations, but not all node kinds
have locations (e.g. constants, and references to params, locals,
etc), so we stash a copy here. */
source_range src_range;
/* [...snip...] */
};
/* A tree node, together with a location, so that we can track locations
(and ranges) during parsing.
The location is redundant for node kinds that have locations,
but not all node kinds do (e.g. constants, and references to
params, locals, etc), so we stash a copy here. */
class cp_expr
{
public:
cp_expr () :
m_value (NULL), m_loc (UNKNOWN_LOCATION) {}
cp_expr (tree value) :
m_value (value), m_loc (EXPR_LOCATION (m_value)) {}
cp_expr (tree value, location_t loc):
m_value (value), m_loc (loc) {}
/* [...snip...] */
};
When | Best source of location_t in gcc 7 |
---|---|
C frontend | c_expr, vec<location_t> at callsites |
C++ frontend | cp_expr |
generic tree | EXPR_LOCATION () |
gimple | EXPR_LOCATION () |
gimple-SSA | EXPR_LOCATION () |
RTL | EXPR_LOCATION () |
int test (int first, int second)
{
return foo (100, first * 42, second);
}
first * 42
is a MULT_EXPR
, which has a
location_t
:
return foo (100, first * 42, second);
~~~~~~^~~~
and this compound location is retained past the frontend:
When | Location of MULT_EXPR |
---|---|
C frontend | Available |
C++ frontend | Available |
generic tree | Available |
gimple | Available |
gimple-SSA | Available |
RTL | Available |
100
is usage of an INTEGER_CST
; the location:
return foo (100, first * 42, second);
^~~
is tracked via workarounds within the frontend, but doesn't make it into generic tree:
When | Location of INTEGER_CST param |
---|---|
C frontend | c_expr, vec<location_t> at callsites |
C++ frontend | cp_expr, but not at callsites |
generic tree | Not available |
gimple | Not available |
gimple-SSA | Not available |
RTL | Not available |
Similarly second
is a usage of a PARM_DECL
; the location:
return foo (100, first * 42, second);
^~~~~~
is tracked via workarounds within the frontend, but doesn't doesn't survive past the frontend:
When | Location of PARM_DECL at callsite |
---|---|
C frontend | c_expr, vec<location_t> at callsites |
C++ frontend | cp_expr, but not at callsites |
generic tree | Not available |
gimple | Not available |
gimple-SSA | Not available |
RTL | Not available |
The missing location information means we can't always emit useful locations for diagnostics in the middle-end.
TODO: example
extern int callee (int one, const char *two, float three);
int caller (int first, int second, float third)
{
return callee (first, second, third);
}
GCC 7's C++ FE reports:
test.c: In function ‘int caller(int, int, float)’:
test.c:5:38: error: invalid conversion from ‘int’ to ‘const char*’
[-fpermissive]
return callee (first, second, third);
^
test.c:1:12: note: initializing argument 2 of ‘int callee(int,
const char*, float)’
extern int callee (int one, const char *two, float three);
^~~~~~
GCC 7's C FE does better:
test.c: In function ‘caller’:
test.c:5:25: warning: passing argument 2 of ‘callee’ makes pointer
from integer without a cast [-Wint-conversion]
return callee (first, second, third);
^~~~~~
test.c:1:12: note: expected ‘const char *’ but argument is of type
‘int’
extern int callee (int one, const char *two, float three);
^~~~~~
The ideal: highlight both argument and param:
test.c: In function ‘caller’:
test.c:5:25: warning: passing argument 2 of ‘callee’ makes pointer
from integer without a cast [-Wint-conversion]
return callee (first, second, third);
^~~~~~
test.c:1:12: note: expected ‘const char *’ but argument is of type
‘int’
extern int callee (int one, const char *two, float three);
^~~~~~~~~~~~~~~
Committed gcc 8 patch:
This takes the C frontend from e.g.:
printf("hello %i %i %i ", foo, bar, baz);
~^
%s
to:
printf("hello %i %i %i ", foo, bar, baz);
~^ ~~~
%s
Proposed gcc 8 patch:
This fixes the location at the callsite, for C++ frontend warnings, so that:
error: invalid conversion from 'int' to 'const char*' [-fpermissive]
return callee (first, second, third);
^
becomes:
error: invalid conversion from 'int' to 'const char*' [-fpermissive]
return callee (first, second, third);
^~~~~~
Doesn't help for the middle-end.
Patch to C/C++ frontends to retain more information about what was seen during parsing.
Screenshot of dump:
BLT is complementary to our existing IR:
- captures the locations the FEs are currently throwing away
- doesn't bother "looking inside functions": we already have location information there (to avoid bloating representation)
- could handle the insides of functions if we wanted to
Toy IDE (~150 lines of PyGTK code):
Click on usage of "struct foo" and click on "Goto Definition"
IDE makes request to cc1:
cc1: note: received http request: ‘POST /jsonrpc HTTP/1.\x0d\x0aHost
: localhost:400\x0d\x0aConnection: keep-aliv\x0d\x0aAccept-Encoding:
gzip, deflat\x0d\x0aAccept: */\x0d\x0aUser-Agent: python-requests/2.
13.\x0d\x0acontent-type: application/jso\x0d\x0aContent-Length: 18\x
0d\x0a\x0d\x0a{"jsonrpc": "2.0", "params": {"position": {"line": 8,
"character": 17}, "textDocument": {"uri": "../../src/gcc/testsuite/g
cc.dg/lsp/test.c"}}, "method": "textDocument/definition", "id": 8}’
i.e. this JSON-RPC request:
{"jsonrpc": "2.0",
"params": {"position": {"line": 8, "character": 17},
"textDocument": {"uri": "../../src/gcc/testsuite/gcc.dg/lsp/test.c"}},
"method": "textDocument/definition",
"id": 8}
cc1 uses BLT to locate what's at the file-line-column
cc1 uses BLT to locate the definition of it
cc1 issues RPC response:
[{"range": {"end": {"character": 1, "line": 6},
"start": {"character": 0, "line": 2}},
uri": "../../src/gcc/testsuite/gcc.dg/lsp/test.c"}]
like this:
cc1: note: sending http response: ‘HTTP/1.1 200 O\x0d\x0aContent-Len
gth:13\x0d\x0a\x0d\x0a[{"range": {"end": {"character": 1, "line": 6}
, "start": {"character": 0, "line": 2}}, "uri": "../../src/gcc/tests
uite/gcc.dg/lsp/test.c"}]’
IDE highlights the returned region of source:
Before:
test.c: In function ‘caller’:
test.c:5:25: warning: passing argument 2 of ‘callee’ makes pointer
from integer without a cast [-Wint-conversion]
return callee (first, second, third);
^~~~~~
test.c:1:12: note: expected ‘const char *’ but argument is of type ‘int’
extern int callee (int one, const char *two, float three);
^~~~~~
With BLT capturing the param locations:
test.c: In function ‘caller’:
test.c:5:25: warning: passing argument 2 of ‘callee’ makes pointer
from integer without a cast [-Wint-conversion]
return callee (first, second, third);
^~~~~~
test.c:1:12: note: expected ‘const char *’ but argument is of type ‘int’
extern int callee (int one, const char *two, float three);
^~~~~~~~~~~~~~~
Also in the patch kit:
Highlighting the return type in the function defn when compaining about mismatches, e.g.:
Before:
warning: 'return' with a value, in function returning void
return 42;
^~
note: declared here
void test_1 (void)
^~~~~~
After:
warning: 'return' with a value, in function returning void
return 42;
^~
note: the return type was declared as 'void' here
void test_1 (void)
^~~~
Also in the patch kit: fix-it hints to -Wsuggest-override:
test.cc:16:15: warning: ‘virtual void B::f()’ can be marked
override [-Wsuggest-override]
virtual void f();
^
override
--- test.cc
+++ test.cc
@@ -13,5 +13,5 @@
{
B();
virtual ~B();
- virtual void f();
+ virtual void f() override;
};
Ideas for other uses of this infrastructure (not yet done):
class blt_node
{
/* [... lots of methods/accessors ...] */
private:
enum blt_kind m_kind;
blt_node *m_parent;
blt_node *m_first_child;
blt_node *m_last_child;
blt_node *m_prev_sibling;
blt_node *m_next_sibling;
location_t m_start;
location_t m_finish;
tree m_tree;
};
Current, unoptimized content (x86_64 host):
(gdb) p sizeof(blt_node)
$1 = 64
blt_context
containing an obstack?Should we store token pointers rather than location_t?
class blt_node
{
/* [...] */
#if 1
location_t m_start;
location_t m_finish;
#else
// C++: tokens are currently released after lexing...
cp_token *m_first_token;
cp_token *m_last_token;
// C: tokens are released *during* lexing
#endif
/* [...] */
};
- I hope to come up with a subset of this work (before close of stage1) that improves diagnostics, with no noticeable effect on compile time/memory
- mothballing the LSP work for now (any volunteers?)
How to reliably get at locations from middle-end?
Possible solutions (see next slides):
NOP_EXPR
or CONVERT_EXPR
?DECL_USAGE_EXPR
)Status:
- work-in-progress
- examples in the above slides work, but...
- ...much of testsuite fails, and:
- ...doesn't yet bootstrap
Lots of issues:
what about folding?
a new tree code? what about the hundreds of
switch (TREE_CODE (node))
impact on memory usage? (not yet known; still trying to get it to work)
how do the gimple representations interact with SSA and with optimization?
return foo (100, first * 42, second);
GENERIC, status quo:
return foo (100, first * 42, second);
GENERIC, with wrapper nodes:
return foo (100, first * 42, second);
GIMPLE, status quo:
_1 = first * 42;
D.1799 = foo (100, _1, second);
return D.1799;
GIMPLE, status quo:
_1 = first * 42;
D.1799 = foo (100, _1, second);
return D.1799;
GIMPLE idea 1: don't flatten the wrappers:
_1 = wrapper_1(first) * wrapper_2(42); // using the wrapper nodes
D.1799 = foo (wrapper_3(100), _1, wrapper_4(second));
return D.1799;
GIMPLE, status quo:
_1 = first * 42;
D.1799 = foo (100, _1, second);
return D.1799;
GIMPLE idea 2, turning wrappers into temporaries:
_w_first = first; // this stmt retains the location of the usage of "first"
_w_42 = 42; // likewise for the usage of "42"
_1 = _w_first * _w_42;
_w_second = second; // likewise
D.1799 = foo (_w_100, _1, _w_second);
return D.1799;
GIMPLE SSA, status quo:
_1 = first_2(D) * 42;
_6 = foo (100, _1, second_4(D));
GIMPLE SSA with idea 1 (unflattened wrappers):
_1 = wrapper_1(first) * wrapper_2(42);
_6 = foo (wrapper_3(100), _1, wrapper_4(second));
GIMPLE SSA, status quo:
_1 = first_2(D) * 42;
_6 = foo (100, _1, second_4(D));
GIMPLE SSA with idea 2 (flattened wrappers):
_w_first_1 = first;
_w_42_1 = 42;
_1 = _w_first_1 * _w_42_1;
_w_second_1 = second;
_6 = foo (_w_100_1, _1, _w_second_1);
The def-stmts for the wrappers have their location_t
.
EXPR_LOCATION
goes awaystruct tloc
{
tree node;
location_t loc;
tloc () : node (NULL), loc (UNKNOWN_LOCATION) {}
tloc (tree node_) : node (node_), loc (UNKNOWN_LOCATION) {}
tloc (tree node_, location_t loc_) : node (node_), loc (loc_) {}
/* FIXME: for now. */
operator tree () const { return node; }
//operator const_tree () const { return node; }
tree operator -> () const { return node; }
};
Status:
- doesn't even compile yet
- e.g. issues with
tree
vsconst_tree
"tree" is currently a typedef to a pointer:
typedef union tree_node *tree;
What if it was instead something like:
struct tree
{
union tree_node *node;
location_t loc;
};
How do advanced users ask for more information on what GCC's optimizers are doing?
e.g.
etc
Current UI:
- turn on dump flags
- examine
foo.c.SOMETHING
- where "SOMETHING" is undocumented, and changes from revision to revision of the compiler (e.g. "foo.c.029t.einline")
- no easy way to parse (both for humans and scripts)
- what is important, and what isn't?
- e.g. "only tell me about the hot loops"
Possible UI:
a simple way to enable sending optimization information through the diagnostic subsystem, e.g.:
-Rvectorization -fdiagnostics-hotness-threshold=500
easy-to-read output e.g.:
foo.c:23:2: remark: unable to vectorize this loop...
[-Rvectorization, hotness=1000]
for (i = 0; i < n; i++)
^
foo.c:24:4: note: ...due to this read
a[i] = b[i] * some_global;
^~~~~~~~~~~
Taking another leaf from clang:
-fsave-optimization-record -foptimization-record-file=foo.yaml
(perhaps with a compatible format? they have viewers)
What should the internal API look like?
e.g.:
if (!some_test_for_validity_of_optimization_p (...))
{
if (failed_vectorization_remark (loop_stmt))
inform (read_stmt->location, "...due to this read");
return false;
}
extern bool remark (location_t, int, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,4);
extern bool remark (gimple *, int, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,4);
// or pass in gcov_type or a frequency?
Is it acceptable to build up a parallel API:
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file,
"can't frobnicate this stmt:\n");
print_gimple_stmt (dump_file, stmt, 0, 0);
}
remark (loop_stmt, OPT_remark_foo,
"can't frobnicate this stmt");
The idea (this is a work-in-progress):
bar.c:12:5: remark: unable to vectorize loop [-Rvectorization,
hotness=1000]
for (i = 0; i < n; i++)
^
foo.c:23:2: note: ...in inlined copy of 'init' here [einline]
init (&something);
~~~~~^~~~~~~~~~~~
i.e. stash away extra info in the location_t
to describe
where the code we're optimizing came from (injecting the note
above).
Class hierarchy for describing code cloning events:
/* Base class for describing a code-cloning event. */
struct GTY(()) cloning_info
{
enum location_clone_kind m_kind;
opt_pass GTY((skip)) *m_pass;
/* Hook for adding a note to a diagnostic. */
virtual void describe (diagnostic_context *dc) = 0;
/* [...snip...] */
};
Example subclass: inlining:
struct inlining_info : public cloning_info
{
inlining_info (opt_pass *pass, source_location call_loc, tree fndecl)
: cloning_info (LOCATION_CLONE_INLINING, pass),
m_call_loc (call_loc), m_fndecl (fndecl) {}
void describe (diagnostic_context *dc) FINAL OVERRIDE;
location_t m_call_loc;
tree m_fndecl;
};
Example subclass: loop peeling (hand-waving here):
bar.c:12:5: remark: unable to vectorize loop [-Rvectorization,
hotness=1000]
for (i = 0; i < n; i++)
^
bar.c:12:5: note: ...in peeled epilog copy of loop, for elements
after last multiple of 8 [slp]
struct peeled_loop_info : public cloning_info
{
/* TODO. */
void describe (diagnostic_context *dc) FINAL OVERRIDE;
};
RAII way to register a "cloning event"
e.g. within tree-inline.c:expand_call_inline:
auto_pop_cloning_info ci (new inlining_info (current_pass,
call_stmt->location,
fn);
Pushes/pops the cloning event:
- all cloned gimple statements get a new
location_t
that refers to the cloning event, until the RAII object goes out of scope.
Overloading to get rid of "error_at_rich_loc" verbosity?
Adding a fix-it hint currently involves changing e.g.:
error_at (token->location,
"unknown type name %qE; did you mean %qs?",
token->value, hint);
to:
gcc_rich_location richloc (token->location);
richloc.add_fixit_replace (hint);
error_at_rich_loc (&richloc,
"unknown type name %qE; did you mean %qs?",
token->value, hint);
Could use overloading of error_at
to simplify it to just from this:
error_at (token->location,
"unknown type name %qE; did you mean %qs?",
token->value, hint);
to:
gcc_rich_location richloc (token->location);
richloc.add_fixit_replace (hint);
error_at (&richloc,
"unknown type name %qE; did you mean %qs?",
token->value, hint);
Try to get some/all of this in good enough shape for GCC 8 before stage 1 closes.
Thanks for listening!
(thanks to Red Hat for funding this work)
URL for these slides: https://dmalcolm.fedorapeople.org/presentations/cauldron-2017/