.. _topglobals:

Appendix 1: Top 40 globals (by count of usage sites)
----------------------------------------------------
These notes are based on r199560 (trunk, 2013-05-31).

`struct recog_data_d recog_data` (11802 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Recently (2013-05-18) renamed from `struct recog_data` to
`struct recog_data_d`.

Declared in recog.h::

    /* The following vectors hold the results from insn_extract.  */

    struct recog_data_d
    {
      /* ...snip... */
    };

    extern struct recog_data_d recog_data;

Function `insn_extract` is generated by `genextract.c` which uses
"machine description to extract operands from insn as rtl".

`insn_extract` is called by `extract_insn` in `recog.c`.  The latter is
called throughout the RTL passes.

Plan: This appears to be a singleton, so the best approach may be to make
this a field of context, and add MAYBE_STATIC throughout (probably in a
`class recog` to hold similar such recog.c state).

`struct gcc_options global_options` (11431 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`options.c` and `options.h` are autogenerated by `optc-gen.awk` and
`opth-gen.awk`.

`options.h` contains per-option boilerplate of the form::

    #ifdef GENERATOR_FILE
    extern option_type name_of_option; // as a global
    #else
      option_type x_name_of_option; // as a field of struct gcc_options
    #define name_of_option global_options.x_name_of_option
    #endif

hence the code that uses these macros contains implicit references to::

    global_options.x_name_of_option

The generated `options.c` contains the definitions::

    struct gcc_options global_options;
    struct gcc_options global_options_set;

Plan:

* move global_options to be a field of context::

     class context
     {
     public:
         /* ... */
         MAYBE_STATIC struct gcc_options global_options_;
         /* ... */
     };

  and update the awk files to make all the macros go through the singleton
  context instance::

     #define name_of_option the_uni.global_options_.x_name_of_option

  In a GLOBAL_STATE build this is effectively::

     context::global_options_.x_name_of_option

  i.e. it's still a simple field lookup.

* then a series of patches to remove the places where the macros are used,
  making them explicit references to the `x_` fields, so that the lookup
  above becomes explicit.  Where a more specific context can be accessed,
  use it, rather than relying on the `the_uni` singleton.

* Eventually, nowhere will be using the macros, and the macro generation
  can be removed from the awk files.

`union tree_node *[TI_MAX] global_trees` (5472 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in tree.h::

    /* Standard named or nameless data types of the C compiler.  */
    enum tree_index
    {
      /* snip TI_ values for 0...132 */
      TI_MAX
    };

    extern GTY(()) tree global_trees[TI_MAX];

followed by dozens of macros for specific trees, e.g.::

    #define void_type_node          global_trees[TI_VOID_TYPE]
    /* The C type `void *'.  */

Ideas: Given that TI_MAX is defined in tree.h and there is no preprocessor
trickery going on in the enum, this is of constant size between all
frontends and backends.  Hence context could contain::

    class context
    {
    public:
        /* ... */
        MAYBE_STATIC tree global_trees[TI_MAX];
        /* ... */
    };

and it will need to visit these during GC and PCH.

This will require users of context.h to include tree.h.

The macros in tree.h would need to be updated to go through a context::

    #define void_type_node          the_uni.global_trees[TI_VOID_TYPE]
    /* The C type `void *'.  */

How to avoid the implicit dependency on `the_uni` throughout?  Perhaps add
methods to context for looking up trees, and use these?

Alternatively, introduce a wrapper class for this: `struct global_trees_d`
and have a singleton within the context.  Or put it in a `class frontend`.

Plan: put it in a `class frontend`::

    class GTY((user)) frontend : public gc_base
    {
    protected:
        /* ... */
        MAYBE_STATIC tree global_trees[TI_MAX];
        /* ... */
    };

`struct FILE * dump_file` (4832 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in dumpfile.h::

    /* Global variables used to communicate with passes.  */
    extern FILE *dump_file;
    extern FILE *alt_dump_file;
    extern int dump_flags;
    extern const char *dump_file_name;

Defined in dumpfile.c::

    /* These are currently used for communicating between passes.
       However, instead of accessing them directly, the passes can use
       dump_printf () for dumps.  */
    FILE *dump_file = NULL;
    FILE *alt_dump_file = NULL;
    const char *dump_file_name;
    int dump_flags;

The code is full of this pattern::

    if (dump_file)
      fprintf (dump_file, FORMAT_STRING, ARGS);

and less frequently::

    if (dump_file && (dump_flags & TDF_DETAILS))
      {

Plan: not yet sure.  One idea might be to replace these with::

    FILE *dump_file = the_uni.dump_file_;
    if (dump_file)

and make dump_file a field of context, rather than a global (as in the
`class context` above).

This would be a largish patch though: adding a lookup to the top of many
functions.  Initially this could be of the form::

    void foo(void)
    {
       /* Use the TLS lookup of the context in lieu of nothing better: */
       FILE *dump_file = g->dump_file_;

       /* ... */

       if (dump_file)
         fprintf (dump_file, FORMAT_STRING, ARGS);

       /* ... */

       if (dump_file)
         fprintf (dump_file, FORMAT_STRING, ARGS);
    }

but the lookup could be converted to::

    unsigned int pass_foo::execute_hook(void)
    {
       /* Get the context as "this->ctxt_" */
       FILE *dump_file = ctxt_.dump_file_;

In both cases, I'm hoping that in a GLOBAL_STATE build the optimizer can
identify that the context isn't used, and optimize away the lookups as
equivalent to::

    unsigned int pass_foo::execute_hook(void)
    {
       context &unused = this->ctxt_;
       FILE *dump_file = context::dump_file_;

thus avoiding the lookup costs in the GLOBAL_STATE build.

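The local-lookup idea for `dump_file` can be tried out in isolation. The
following is a minimal sketch, not GCC code: `GLOBAL_STATE`, `MAYBE_STATIC`
and `dump_file_` follow the naming used in these notes, the `context` class
is a toy stand-in, and `count_dump_writes` is a hypothetical function that
plays the role of a pass body.

```cpp
#include <cassert>
#include <cstdio>

/* Assumption: GLOBAL_STATE selects the classic single-instance build.  */
#ifndef GLOBAL_STATE
#define GLOBAL_STATE 1
#endif

#if GLOBAL_STATE
#define MAYBE_STATIC static
#else
#define MAYBE_STATIC
#endif

/* Toy stand-in for the proposed context class.  */
class context
{
public:
  MAYBE_STATIC FILE *dump_file_;
  MAYBE_STATIC int dump_flags_;
};

#if GLOBAL_STATE
/* Static data members need a single out-of-class definition.  */
FILE *context::dump_file_ = NULL;
int context::dump_flags_ = 0;
#endif

/* Hypothetical pass body: one lookup at the top, after which the existing
   "if (dump_file)" idiom is left untouched.  Returns the number of dump
   messages written.  */
int
count_dump_writes (context &ctxt)
{
  /* "ctxt.dump_file_" is valid whether the member is static or not.  */
  FILE *dump_file = ctxt.dump_file_;
  int writes = 0;

  if (dump_file)
    {
      fprintf (dump_file, "first message\n");
      writes++;
    }

  if (dump_file)
    {
      fprintf (dump_file, "second message\n");
      writes++;
    }

  return writes;
}
```

Because a static member may be named through an object expression, the same
function body compiles in both configurations; only the definition of
`MAYBE_STATIC` changes.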

`struct function * cfun` (4602 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There is some non-trivial state around the `cfun` global, with an API for
changing it that calls a `set_current_function` target hook when the value
changes.

Implementations of `set_current_function`:

* rs6000
* i386
* avr
* mips
* rx
* a default implementation

There's also a stack, with a push_cfun/pop_cfun API that calls into
set_cfun, and also sets `current_function_decl`.

=======================  =========
Function                 Callsites
=======================  =========
`set_cfun`               31
`push_cfun`              50
`pop_cfun`               57
`push_function_context`  19
`pop_function_context`   20
=======================  =========

Calls to push_cfun / pop_cfun are almost all balanced within the same
function.  The exception is modify_function in tree-sra.c, which pops then
pushes.

One part of the puzzle is that various header files in the build define
macros that reference the "cfun" global, e.g.::

    #define n_basic_blocks           (cfun->cfg->x_n_basic_blocks)

so there are about 4600 sites that use the global.

I'd hoped to eliminate `cfun` in favor of simply passing the `function*`
around as a parameter, but I don't think that's realistic for this
milestone.

Plan: cfun remains a global in a GLOBAL_STATE build, and becomes a macro
lookup in a shared-library build, using a TLS lookup::

    #if GLOBAL_STATE
    /* Status quo: */
    #define cfun (cfun + 0)
    #else
    /* (the "+ 0" ensures it's not a lvalue, so can't be assigned to) */
    #define cfun (g->m_cfun + 0)
    #endif

This is efficient for the global state case, but leads to thousands of
implicit TLS reads in the shared library case (often within loops).

I've been working on patches to remove these macros, making uses of cfun
explicit.

http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01564.html

It may then be possible to amortize these lookups, since cfun->cfg rarely
changes inside a function: cfun only changes when one of the API calls is
invoked, and a function's cfg ptr is only set in `init_flow` and during
cleanup.

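To make the two ideas above concrete (the "+ 0" rvalue trick and hoisting the
cfg lookup to the top of a function), here is a small self-contained sketch.
The two structs are toy stand-ins for the real ones, `sum_block_counts` is a
hypothetical helper, and defining `set_cfun` ahead of the macro is just one
way, assumed here, of keeping a single place that may assign to the global.

```cpp
#include <cassert>
#include <cstddef>

/* Toy stand-ins for the real structures.  */
struct control_flow_graph { int x_n_basic_blocks; };
struct function { control_flow_graph *cfg; };

/* The global itself.  In this sketch set_cfun is defined before the macro
   below, so it is the only code that can assign to the variable.  */
function *cfun = NULL;

void
set_cfun (function *new_cfun)
{
  cfun = new_cfun;
}

/* From here on, every use of "cfun" is an rvalue and cannot be assigned.
   A macro's own name is not re-expanded inside its replacement.  */
#define cfun (cfun + 0)

/* Hypothetical helper: read cfun->cfg once, then reuse the local pointer
   inside the loop instead of repeating the lookup on every iteration.  */
int
sum_block_counts (int iterations)
{
  control_flow_graph *cfg = cfun->cfg;   /* the only lookup */
  int total = 0;

  for (int i = 0; i < iterations; i++)
    total += cfg->x_n_basic_blocks;

  return total;
}
```

With the macro in place, a stray `cfun = fn;` written after the `#define`
fails to compile, since `(cfun + 0)` is not an lvalue.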
These macro removals and cfun->cfg consolidation may help the global state
case also: the compiler can't prove cfun->cfg doesn't change if the body of
the function makes a call into a function it can't see inside.

Notes: `set_cfun` (`function.c`) directly sets cfun, and when it changes
calls `invoke_set_current_function_hook` on the new function's decl.  This
potentially updates `optimization_current_node`, calls the
`set_current_function` hook on the target, and potentially calls
`init_tree_optimization_optabs` on the new optimizations.

Status
^^^^^^
Removal of the cfun-using macros is approved; see
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01878.html
and
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00780.html
replacing::

    if (n_basic_blocks <= NUM_FIXED_BLOCKS + 1)

with::

    if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1)

However, given that cfun will remain accessed via thread-local store in a
shared-library build, I'd rather work on CFGs, and consolidate the TLS CFG
lookup at the top of a function, giving::

    struct control_flow_graph *cfg = cfun->cfg;
    if (cfg->m_n_basic_blocks <= NUM_FIXED_BLOCKS + 1)

Though the above change may give us a route there.

..
  Note to self: my working copy for this aspect is
  `gcc-git-remove-cfun-macros`

`struct rtx_def *[MAX_SAVED_CONST_INT * 2 + 1] const_int_rtx` (3744 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `rtl.h`::

    #define MAX_SAVED_CONST_INT 64
    extern GTY(()) rtx const_int_rtx[MAX_SAVED_CONST_INT * 2 + 1];

    #define const0_rtx (const_int_rtx[MAX_SAVED_CONST_INT])
    #define const1_rtx (const_int_rtx[MAX_SAVED_CONST_INT+1])
    #define const2_rtx (const_int_rtx[MAX_SAVED_CONST_INT+2])
    #define constm1_rtx (const_int_rtx[MAX_SAVED_CONST_INT-1])

Defined in `emit-rtl.c`::

    rtx const_int_rtx[MAX_SAVED_CONST_INT * 2 + 1];

representing small integers (-64 <= i <= 64).

Used extensively by `insn-emit.c` (generated by `genemit.c`) and
`insn-recog.c` (generated by `genrecog.c`), but not in the rest of the
sources.

Plan: `const_int_rtx_` to be MAYBE_STATIC within a backend class within the
context, with const_int_rtx to become a macro::

    class backend
    {
    public:
       MAYBE_STATIC rtx const_int_rtx_[MAX_SAVED_CONST_INT * 2 + 1];

       /* with gty hooks in the vfunc */
    };

    #if GLOBAL_STATE
    /* Make sure the optimizer doesn't do unnecessary work: */
    #define const_int_rtx (backend::const_int_rtx_)
    #else
    #define const_int_rtx (g->get_backend ().const_int_rtx_)
    #endif

with the const0_rtx etc remaining as before.

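The `const0_rtx`, `const1_rtx` and `constm1_rtx` macros above all follow one
rule: the value `i` lives at index `MAX_SAVED_CONST_INT + i`. The sketch
below spells that out. `MAX_SAVED_CONST_INT` is taken from the `rtl.h`
excerpt; the two helper functions are hypothetical names used only for
illustration and do not exist in GCC.

```cpp
#include <cassert>

/* Value taken from the rtl.h excerpt above.  */
const int MAX_SAVED_CONST_INT = 64;
const int CONST_INT_RTX_SLOTS = MAX_SAVED_CONST_INT * 2 + 1;

/* Hypothetical helper: true if VALUE has a preallocated slot, i.e. lies
   in the range -64 <= VALUE <= 64.  */
bool
has_saved_const_int (long value)
{
  return value >= -MAX_SAVED_CONST_INT && value <= MAX_SAVED_CONST_INT;
}

/* Hypothetical helper: index of VALUE's slot within const_int_rtx.  */
int
saved_const_int_index (long value)
{
  assert (has_saved_const_int (value));
  return (int) value + MAX_SAVED_CONST_INT;
}
```

Since the index arithmetic depends only on the macro constant, moving the
array into a `class backend` leaves `const0_rtx` and friends unchanged, as
the plan says.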

`union tree_node *[(int) ATTR_LAST] built_in_attributes` (2186 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`builtin-attrs.def` is shared by all frontends.

`built_in_attributes` has declarations in 3 of the frontends::

    ada/gcc-interface/utils.c:5948:static GTY(()) tree built_in_attributes[(int) ATTR_LAST];
    c-family/c-common.c:5000:static GTY(()) tree built_in_attributes[(int) ATTR_LAST];
    lto/lto-lang.c:128:static GTY(()) tree built_in_attributes[(int) ATTR_LAST];

All three of these have a::

    enum built_in_attribute
    {
      /* use builtin-attrs.def */
      ATTR_LAST
    };

immediately prior to the `built_in_attributes`, so if I'm reading things
right, they all have the same meaning of ATTR_LAST.

Plan: move the array into context, though perhaps a new frontend class
should be added to hold them?

`union tree_node *[(int) BT_LAST + 1] builtin_types` (2099 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`builtin-types.def` is shared by all frontends.

These are declared per-frontend::

    ada/gcc-interface/utils.c:5812:static GTY(()) tree builtin_types[(int) BT_LAST + 1];
    c-family/c-common.c:5047:static tree builtin_types[(int) BT_LAST + 1];
    fortran/f95-lang.c:651:  tree builtin_types[(int) BT_LAST + 1];
    lto/lto-lang.c:173:static GTY(()) tree builtin_types[(int) BT_LAST + 1];

Both ada and c-family have::

    enum c_builtin_type
    {
      /* use builtin-types.def */
      BT_LAST
    };

and lto has::

    enum lto_builtin_type
    {
      /* use builtin-types.def */
      BT_LAST
    };

whereas fortran manages to hide it in function scope::

    gfc_init_builtin_functions (void)
    {
      enum builtin_type
      {
      // etc...

Plan: as for built_in_attributes

`int which_alternative` (1758 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `recog.h`::

    /* Set by constrain_operands to the number of the alternative that
       matched.  */
    extern int which_alternative;

Defined in `recog.c`::

    /* On return from `constrain_operands', indicate which alternative
       was satisfied.  */

    int which_alternative;

Plan: move into context, perhaps within a new class recog (using
MAYBE_STATIC)::

    class recog
    {
    public:
        MAYBE_STATIC int which_alternative;
    };

`unsigned char[87] mode_size` (1495 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `machmode.h`::

    /* Get the size in bytes and bits of an object of mode MODE.  */

    extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
    #define GET_MODE_SIZE(MODE)    ((unsigned short) mode_size[MODE])
    #define GET_MODE_BITSIZE(MODE) \
      ((unsigned short) (GET_MODE_SIZE (MODE) * BITS_PER_UNIT))

and also accessed by macro in `regs.h`::

    #define REG_BYTES(R) mode_size[(int) GET_MODE (R)]

`CONST_MODE_SIZE` is defined in `insn-modes.h`, which is generated by
`genmodes.c` (e.g. "generated automatically from machmode.def and
config/i386/i386-modes.def" on this build).  It is blank on my build, via
this logic::

    printf ("#define CONST_MODE_SIZE%s\n", adj_bytesize ? "" : " const");

to allow for the (autogenerated) function `init_adjust_machine_modes` to
tweak them.

Plan: TODO

`struct df_d * df` (1205 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`struct df_d` is declared in df.h.

df.h has::

    extern struct df_d *df;
    #define df_scan    (df->problems_by_index[DF_SCAN])
    #define df_rd      (df->problems_by_index[DF_RD])
    #define df_lr      (df->problems_by_index[DF_LR])
    #define df_live    (df->problems_by_index[DF_LIVE])
    #define df_chain   (df->problems_by_index[DF_CHAIN])
    #define df_word_lr (df->problems_by_index[DF_WORD_LR])
    #define df_note    (df->problems_by_index[DF_NOTE])
    #define df_md      (df->problems_by_index[DF_MD])

Defined in `df-core.c`::

    struct df_d *df;

It's created in `rest_of_handle_df_initialize` which is the execute hook
for `pass_df_initialize_opt` (aka `dfinit`).

It's freed in `rest_of_handle_df_finish` which is the execute hook for
`pass_df_finish` (aka `dfinish`).

Both of these are implemented in `df-core.c`.

Plan: add::

    MAYBE_STATIC struct df_d *df_;

to context, and remove the global.

TODO: what to do about the macros?
Perhaps::

    #define df (g->get_df ())

`struct gcc_target targetm` (1069 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`target.h` declares::

    extern struct gcc_target targetm;

which appears to use `target.def` and the `HOOK_` system.

Should this be a C++ class???

TODO

`struct _IO_FILE * stderr` (990 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stderr thread-safety?  likelihood of interleaved errors?

`location_t input_location` (930 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `input.h`::

    extern location_t input_location;

which also has these macros::

    #define input_line LOCATION_LINE (input_location)
    #define input_filename LOCATION_FILE (input_location)

`tree.h` also implicitly refers to it in this macro::

    #define EXPR_LOC_OR_HERE(NODE) (EXPR_HAS_LOCATION (NODE) \
                                    ? (NODE)->exp.locus : input_location)

Defined in `input.c`::

    /* Current position in real source file.  */

    location_t input_location;

`input_line` is only used in 13 places.

`input_filename` is used in 21 places.

`EXPR_LOC_OR_HERE` is used in 45 places.

`input_location` is used in about 2600 places.

Plan:

* eliminate the `input_line` and `input_filename` macros

  Patch posted as:
  http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00072.html

* move `input_location` into context::

     class context
     {
     public:
         /* ... */
         MAYBE_STATIC location_t input_location_;
         /* ... */
     };

  and convert `input_location` into a macro that accesses the current
  context's `input_location_`::

     #define input_location (GET_UNI().input_location_)

`struct saved_scope * scope_chain` (930 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in cp/cp-tree.h::

    extern GTY(()) struct saved_scope *scope_chain;

along with numerous macros that access the fields of the struct.

Defined in cp/name-lookup.c::

    struct saved_scope *scope_chain;

Plan: TODO; the macros make it hard.


`int dump_flags` (927 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in dumpfile.h::

    extern int dump_flags;

Defined in dumpfile.c::

    int dump_flags;

Plan: make dump_flags be a field of the context, then::

    FILE *dump_file = g->get_dump_file ();
    int dump_flags = g->get_dump_flags ();

    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        /* use dump_file */
      }

preserving the bulk of the existing code (albeit with one big patch to add
the locals to all scopes that need it).

`struct rtl_data x_rtl` (885 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `function.h`::

    extern GTY(()) struct rtl_data x_rtl;

    /* Accessor to RTL datastructures.  We keep them statically allocated
       now since we never keep multiple functions.  For threaded compiler
       we might however want to do differently.  */
    #define crtl (&x_rtl)

along with numerous macros that add implicit uses of x_rtl::

    #define return_label (crtl->x_return_label)
    #define naked_return_label (crtl->x_naked_return_label)
    #define stack_slot_list (crtl->x_stack_slot_list)
    #define parm_birth_insn (crtl->x_parm_birth_insn)
    #define frame_offset (crtl->x_frame_offset)
    #define stack_check_probe_note (crtl->x_stack_check_probe_note)
    #define arg_pointer_save_area (crtl->x_arg_pointer_save_area)
    #define used_temp_slots (crtl->x_used_temp_slots)
    #define avail_temp_slots (crtl->x_avail_temp_slots)
    #define temp_slot_level (crtl->x_temp_slot_level)
    #define nonlocal_goto_handler_labels (crtl->x_nonlocal_goto_handler_labels)
    #define frame_pointer_needed (crtl->frame_pointer_needed)
    #define stack_realign_fp (crtl->stack_realign_needed && !crtl->need_drap)
    #define stack_realign_drap (crtl->stack_realign_needed && crtl->need_drap)

TODO: not yet sure how to deal with this.
One approach would be an analog of the cfun approach: make a field inside
context::

    class context
    {
    public:
        struct rtl_data crtl_;
    };

    #if GLOBAL_STATE
    #define crtl (context::crtl_)
    #else
    #define crtl (g->get_crtl ())
    #endif

`union tree_node *[13] integer_types` (846 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`tree.h` declares::

    /* The standard C integer types.  Use integer_type_kind to index into
       this array.  */
    extern GTY(()) tree integer_types[itk_none];

along with access macros::

    #define char_type_node integer_types[itk_char]
    #define signed_char_type_node integer_types[itk_signed_char]
    /* etc */

Defined in `tree.c`::

    tree integer_types[itk_none];

TODO: not yet sure how to tackle this; perhaps another field in context
hidden with a macro?

`unsigned char[X86_TUNE_LAST] ix86_tune_features` (815 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `config/i386/i386.h`::

    extern unsigned char ix86_tune_features[X86_TUNE_LAST];

which also declares numerous macros of the form `TARGET_FOO`, e.g.::

    #define TARGET_SLOW_IMUL_IMM32_MEM \
          ix86_tune_features[X86_TUNE_SLOW_IMUL_IMM32_MEM]

Defined in `config/i386/i386.c`::

    /* Feature tests against the various tunings.  */
    unsigned char ix86_tune_features[X86_TUNE_LAST];

TODO

`union tree_node * current_function_decl` (756 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `tree.h`::

    extern GTY(()) tree current_function_decl;

Implicitly exposed by macros in `tree-diagnostic.h`::

    #define diagnostic_last_function_changed(DC, DI) /* snip */
    #define diagnostic_set_last_function(DC, DI) /* snip */

both of which act on a diagnostic_context.

Defined in `toplev.c`::

    /* The FUNCTION_DECL for the function currently being compiled,
       or 0 if between functions.  */
    tree current_function_decl;

There are about 500 uses of current_function_decl in the sources.

TODO: is this *always* in sync with cfun?

Idea: if we put it in context, put a `context *` into each
diagnostic_context, so that the macros can easily get at the correct
context.

`unsigned char[302][64] tree_contains_struct` (623 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `tree.h`::

    extern unsigned char tree_contains_struct[MAX_TREE_CODES][64];

describing the structure of the tree "subclass" hierarchy.

It's implicitly used by this macro::

    #define CODE_CONTAINS_STRUCT(CODE, STRUCT) (tree_contains_struct[(CODE)][(STRUCT)])

It's initialized using various macros all of the form `MARK_TS_FOO` e.g.::

    #define MARK_TS_BASE(C)                                 \
      do {                                                  \
        tree_contains_struct[C][TS_BASE] = 1;               \
      } while (0)

which are used by `initialize_tree_contains_struct`, but it's really
constant data; it's only non-constant because of the way it's initialized.

Each frontend adds extra stuff::

    ada/gcc-interface/misc.c:832:/* Initialize language-specific bits of tree_contains_struct.  */
    c-family/c-common.c:11420:/* Initialize language-specific-bits of tree_contains_struct.  */
    fortran/f95-lang.c
    lto/lto-lang.c:1245:  tree_contains_struct[NAMESPACE_DECL][TS_DECL_MINIMAL] = 1;

TODO: is there a way of making this const?  (e.g. moving definition to the
frontend, and generating initializers?)

`int reload_completed` (606 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `rtl.h`::

    /* Nonzero after end of reload pass.
       Set to 1 or 0 by reload1.c.  */

    extern int reload_completed;

and implicitly used by this macro in rtl.h::

    #define can_create_pseudo_p() (!reload_in_progress && !reload_completed)

Used throughout source code and autogenerated code.

Plan: TODO: perhaps convert to a field of context, add a compatibility
macro to get it relative to a local `context *`, and see how much still
compiles???  Perhaps add a `class reload_state` ???

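One possible shape for the `reload_completed` plan, sketched under several
assumptions: `reload_state` is only the class name floated above, the member
names and the `reload_` field are invented for this example, and the
compatibility macro assumes every caller has a local variable named `ctxt`
of type `context *` in scope.

```cpp
#include <cassert>

/* Hypothetical class gathering the two flags that can_create_pseudo_p
   reads today (reload_in_progress and reload_completed).  */
class reload_state
{
public:
  reload_state () : in_progress_ (0), completed_ (0) {}

  void begin () { in_progress_ = 1; }
  void finish () { in_progress_ = 0; completed_ = 1; }

  /* Same test as the existing macro, expressed on the fields.  */
  bool pseudos_allowed () const { return !in_progress_ && !completed_; }

private:
  int in_progress_;
  int completed_;
};

/* Toy stand-in for the context class.  */
class context
{
public:
  reload_state reload_;
};

/* Compatibility macro: resolves relative to a local "ctxt" pointer, so
   any caller without one in scope fails to compile.  */
#define can_create_pseudo_p() (ctxt->reload_.pseudos_allowed ())

/* Hypothetical caller that already has a context to hand.  */
bool
may_emit_new_pseudo (context *ctxt)
{
  return can_create_pseudo_p ();
}
```

The compile failures from callers that lack a local `ctxt` would give a
direct measure of "how much still compiles", i.e. which functions need a
context threaded through to them.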

`struct target_hard_regs default_target_hard_regs` (585 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`hard-reg-set.h` declares::

    extern struct target_hard_regs default_target_hard_regs;
    #if SWITCHABLE_TARGET
    extern struct target_hard_regs *this_target_hard_regs;
    #else
    #define this_target_hard_regs (&default_target_hard_regs)
    #endif

followed by 18 field-access macros that implicitly access
`this_target_hard_regs`.

Defined in reginfo.c::

    struct target_hard_regs default_target_hard_regs;
    struct target_regs default_target_regs;
    #if SWITCHABLE_TARGET
    struct target_hard_regs *this_target_hard_regs = &default_target_hard_regs;
    struct target_regs *this_target_regs = &default_target_regs;
    #endif

Appears to be set up by `init_reg_sets` in `reginfo.c` but can then be
modified by switches.

TODO

`struct target_rtl default_target_rtl` (552 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`rtl.h` declares::

    extern GTY(()) struct target_rtl default_target_rtl;
    #if SWITCHABLE_TARGET
    extern struct target_rtl *this_target_rtl;
    #else
    #define this_target_rtl (&default_target_rtl)
    #endif

and various macros that create implicit uses of them.

emit-rtl.c defines them::

    struct target_rtl default_target_rtl;
    #if SWITCHABLE_TARGET
    struct target_rtl *this_target_rtl = &default_target_rtl;
    #endif

TODO

`int flag_isoc99` (547 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TODO: Used in two places in `builtins.def` and in many places in C frontend
(do_scope)

`struct rtl_hooks rtl_hooks` (501 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `rtl.h`::

    /* Each pass can provide its own.  */
    extern struct rtl_hooks rtl_hooks;

TODO

`struct reload[MAX_RELOADS] rld` (498 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `reload.h`::

    extern struct reload rld[MAX_RELOADS];
    extern int n_reloads;

TODO; perhaps part of a new `class reload_state` ???

The value of `MAX_RELOADS` is backend-specific - reload.h has::

    #define MAX_RELOADS (2 * MAX_RECOG_OPERANDS * (MAX_REGS_PER_ADDRESS + 1))

and both of the macros on the right-hand side are backend-specific:
`MAX_RECOG_OPERANDS` is defined in `insn-config.h` (which is autogenerated
by genconfig.c from the machine description file), and
`MAX_REGS_PER_ADDRESS` is defined in headers in the config subdirectory for
the target in use.

`union tree_node *[CPTI_MAX] cp_global_trees` (489 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
These are important tree nodes used by the C++ frontend.

Declared in `cp/cp-tree.h`::

    extern GTY(()) tree cp_global_trees[CPTI_MAX];

Defined in `cp/decl.c`::

    tree cp_global_trees[CPTI_MAX];

Only ever used implicitly, via a set of macros defined immediately after
(in `cp/cp-tree.h`) e.g.::

    #define this_identifier cp_global_trees[CPTI_THIS_IDENTIFIER]

`cp/cp-tree.h` is used by the source files in `cp`, but it's also used in a
few other places, and exposed to plugins:

* `config/sol2-cxx.c` uses it, but doesn't seem to use `cp_global_trees`
* `config/i386/winnt-cxx.c`
* `objc/objc-next-runtime-abi-02.c`
* `objc/objc-act.c`
* `objc/objc-encoding.c`
* `objc/objc-gnu-runtime-abi-01.c`
* `objcp/objcp-decl.c`
* `objcp/objcp-lang.c`
* `testsuite/g++.dg/plugin/header_plugin.c`

Plan:

* verify that the scope of usage of cp_global_trees is confined to the `cp`
  directory

* introduce a `class cp_state` (or `class cp_frontend`) to hold
  `cp_global_trees` as MAYBE_STATIC *private* data, and change everywhere
  using the access macros to be a MAYBE_STATIC member function of the
  class, so that the "cp_global_trees" in the access macros is accessing a
  (possibly static) field of the class (adding a suitable comment to the
  macros, since this is magic).

How much does such a `class cp_frontend` need to see the rest of the
compiler?  It needs a reference to the `gc_heap` that it's using, but does
it need a `context *`?
If we can get away with just providing a `gc_heap *`, that's more modular.

`LOC vect_location` (467 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
See notes in tree-vectorizer.c below

`struct target_ira default_target_ira` (440 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"Communication between the Integrated Register Allocator (IRA) and the
rest of the compiler."

`ira.h` has::

    extern struct target_ira default_target_ira;
    #if SWITCHABLE_TARGET
    extern struct target_ira *this_target_ira;
    #else
    #define this_target_ira (&default_target_ira)
    #endif

followed by 18 macros for accessing fields of this_target_ira.

The definitions are in `ira.c`

TODO

`struct cxx_pretty_printer scratch_pretty_printer` (402 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cp/error.c has::

    /* The global buffer where we dump everything.  It is there only for
       transitional purpose.  It is expected, in the near future, to be
       completely removed.  */
    static cxx_pretty_printer scratch_pretty_printer;
    #define cxx_pp (&scratch_pretty_printer)

    /* Translate if being used for diagnostics, but not for dump files or
       __PRETTY_FUNCTION.  */
    #define M_(msgid) (pp_translate_identifiers (cxx_pp) ? _(msgid) : (msgid))

so all uses are confined to this source file.

It was added on 2000-09-29 in da901964100f7c7fbabc841a1eb751fec549b093 aka
r36666.

TODO

`machine_mode word_mode` (377 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`machmode.h` has::

    extern enum machine_mode word_mode;

and the definition is in `emit-rtl.c`::

    enum machine_mode word_mode;    /* Mode whose width is BITS_PER_WORD.  */

It is used directly in about 300 places.

The value is computed in `init_emit_once` in `emit-rtl.c` which::

    /* Create some permanent unique rtl objects shared between all
       functions.  */

and is called by `backend_init` in `toplev.c`.

Also `defaults.h` has::

    #ifndef STACK_SIZE_MODE
    #define STACK_SIZE_MODE word_mode
    #endif

TODO; perhaps part of a `class backend` that's part of the context?


`struct line_maps * line_table` (376 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `input.h`::

    extern GTY(()) struct line_maps *line_table;

which also defines these macros that reference it::

    #define LOCATION_LOCUS(LOC) /* snip */
    #define LOCATION_BLOCK(LOC) /* snip */
    #define in_system_header_at(LOC) /* snip */
    #define in_system_header /* snip */

Created by `general_init` in `toplev.c`

Used in about 120 places.

Plan: TODO; perhaps move to a new `class frontend`?

`struct FILE * asm_out_file` (358 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Declared in `output.h`::

    /* File in which assembler code is being written.  */

    #ifdef BUFSIZ
    extern FILE *asm_out_file;
    #endif

Defined in `toplev.c`::

    /* Output files for assembler code (real compiler output)
       and debugging dumps.  */

    FILE *asm_out_file;

Used in about 1300 places, often (but not always) with fprintf; many of
these places are in the per-target `config` subdirectories.

Set in a few places (with save/restore pairs), but the main place is
`init_asm_output` in `toplev.c`.

Closed in `finalize` in `toplev.c`

TODO; might have to make this one be thread-local store for a shared build,
given how pervasively this is used.  Alternatively, a *lot* of new classes,
storing asm_out_file as MAYBE_STATIC.

`union tree_node *[CTI_MAX] c_global_trees` (348 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`c-family/c-common.h` has::

    extern GTY(()) tree c_global_trees[CTI_MAX];

and has about 60 macros that create implicit references to the array.

Definitions exist in:

* `ada/gcc-interface/utils.c`::

     static tree c_global_trees[CTI_MAX];

* `c-family/c-common.c`::

     tree c_global_trees[CTI_MAX];

`c-family/c-common.h` is included by the subdirectories `c`, `c-family`,
some `config` dirs, `cp`, `objc`, and by
`testsuite/g++.dg/plugin/header_plugin.c`.

Plan: similar to that for `cp_global_trees`: introduce a `class c_frontend`
to hold c_global_trees as *protected* MAYBE_STATIC data, with `cp_frontend`
as a subclass.

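A rough sketch of how the `c_frontend` / `cp_frontend` split might look.
Everything here is illustrative rather than real GCC code: `tree_node` is a
toy struct, `CTI_EXAMPLE` is a made-up enumerator, the accessor names are
invented, and `GLOBAL_STATE` / `MAYBE_STATIC` follow the convention used
throughout these notes. Only `CTI_MAX`, `CPTI_MAX` and
`CPTI_THIS_IDENTIFIER` are names taken from the excerpts above.

```cpp
#include <cassert>
#include <cstddef>

/* Assumption: GLOBAL_STATE selects the classic single-instance build.  */
#ifndef GLOBAL_STATE
#define GLOBAL_STATE 1
#endif

#if GLOBAL_STATE
#define MAYBE_STATIC static
#else
#define MAYBE_STATIC
#endif

/* Toy stand-ins for the real tree types and index enums.  */
struct tree_node { int code; };
typedef tree_node *tree;
enum c_tree_index { CTI_EXAMPLE /* hypothetical */, CTI_MAX };
enum cp_tree_index { CPTI_THIS_IDENTIFIER, CPTI_MAX };

class c_frontend
{
public:
  MAYBE_STATIC tree get_c_tree (c_tree_index i) { return c_global_trees[i]; }
  MAYBE_STATIC void set_c_tree (c_tree_index i, tree t) { c_global_trees[i] = t; }

protected:
  /* Protected, so that cp_frontend can still reach the C-family trees.  */
  MAYBE_STATIC tree c_global_trees[CTI_MAX];
};

class cp_frontend : public c_frontend
{
public:
  /* Would replace the "this_identifier" access macro.  */
  MAYBE_STATIC tree this_identifier ()
  { return cp_global_trees[CPTI_THIS_IDENTIFIER]; }
  MAYBE_STATIC void set_this_identifier (tree t)
  { cp_global_trees[CPTI_THIS_IDENTIFIER] = t; }

  /* Shows the subclass reading the inherited protected array.  */
  MAYBE_STATIC bool this_identifier_is (c_tree_index i)
  { return cp_global_trees[CPTI_THIS_IDENTIFIER] == c_global_trees[i]; }

private:
  MAYBE_STATIC tree cp_global_trees[CPTI_MAX];
};

#if GLOBAL_STATE
/* Static data members need a single out-of-class definition.  */
tree c_frontend::c_global_trees[CTI_MAX];
tree cp_frontend::cp_global_trees[CPTI_MAX];
#endif
```

In a GLOBAL_STATE build the accessors are static member functions over
static arrays, so a call such as `cp_frontend::this_identifier ()` needs no
object at all; in the other configuration the same source yields ordinary
per-instance state.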
TODO: What to do about the ada stuff?

`union tree_node *[4] sizetype_tab` (331 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`tree.h` declares::

    extern GTY(()) tree sizetype_tab[(int) stk_type_kind_last];

    #define sizetype sizetype_tab[(int) stk_sizetype]
    #define bitsizetype sizetype_tab[(int) stk_bitsizetype]
    #define ssizetype sizetype_tab[(int) stk_ssizetype]
    #define sbitsizetype sizetype_tab[(int) stk_sbitsizetype]

It is defined in `stor-layout.c`::

    /* Data type for the expressions representing sizes of data types.
       It is the first integer type laid out.  */
    tree sizetype_tab[(int) stk_type_kind_last];

It is used (via the macros) in about 500-600 places, in frontends and in
passes.

TODO: perhaps make it a field of context?

`struct vec sched_luids` (326 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`sched-int.h` declares::

    /* Mapping from INSN_UID to INSN_LUID.  In the end all other per insn
       data structures should be indexed by luid.  */
    extern vec sched_luids;
    #define INSN_LUID(INSN) (sched_luids[INSN_UID (INSN)])
    #define LUID_BY_UID(UID) (sched_luids[UID])

    #define SET_INSN_LUID(INSN, LUID) \
    (sched_luids[INSN_UID (INSN)] = (LUID))

Defined in `haifa-sched.c`::

    /* Mapping from instruction UID to its Logical UID.  */
    vec sched_luids = vNULL;

Released by sched_finish_luids in `haifa-sched.c`

TODO

`struct _IO_FILE * stdout` (311 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As per stderr: thread-safety?  likelihood of interleaved errors?

TODO

`struct vec h_i_d` (300 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`sched-int.h` declares::

    extern vec h_i_d;

    #define HID(INSN) (&h_i_d[INSN_UID (INSN)])

along with various other macros of the form `INSN_*` that add implicit uses
of the global.

TODO

`struct lang_hooks lang_hooks` (287 sites)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`langhooks.h` has::

    /* Each front end provides its own.  */
    extern struct lang_hooks lang_hooks;

Should this be a C++ class instead?

TODO