GCC Middle and Back End API Reference
|
Data Structures | |
struct | bbro_basic_block_data_def |
struct | trace |
Typedefs | |
typedef struct bbro_basic_block_data_def | bbro_basic_block_data |
Variables | |
struct target_bb_reorder | default_target_bb_reorder |
struct target_bb_reorder * | this_target_bb_reorder = &default_target_bb_reorder |
static const int | branch_threshold [N_ROUNDS] = {400, 200, 100, 0, 0} |
static const int | exec_threshold [N_ROUNDS] = {500, 200, 50, 0, 0} |
static int | array_size |
static bbro_basic_block_data * | bbd |
static int | max_entry_frequency |
static gcov_type | max_entry_count |
typedef struct bbro_basic_block_data_def bbro_basic_block_data |
Structure to hold needed information for each basic block.
|
static |
If any destination of a crossing edge does not have a label, add label; Convert any easy fall-through crossing edges to unconditional jumps.
References block_label(), control_flow_insn_p(), edge_def::dest, emit_barrier_after_bb(), emit_jump_insn_after(), edge_def::flags, single_succ_p(), and edge_def::src.
Referenced by partition_hot_cold_basic_blocks().
|
static |
Add REG_CROSSING_JUMP note to all crossing jump insns.
References add_reg_note(), find_reg_note(), edge_def::flags, edge_def::src, and basic_block_def::succs.
Referenced by partition_hot_cold_basic_blocks().
|
static |
Referenced by find_traces(), and find_traces_1_round().
|
static |
Compute and return the key (for the heap) of the basic block BB.
References cfun, bbro_basic_block_data_def::end_of_trace, edge_def::flags, basic_block_def::frequency, basic_block_def::index, optimize_function_for_size_p(), basic_block_def::preds, priority(), probably_never_executed_bb_p(), and edge_def::src.
|
static |
Return the trace number in which BB was visited.
References array_size, basic_block_def::index, and bbro_basic_block_data_def::visited.
Referenced by copy_bb(), find_traces_1_round(), and rotate_loop().
|
static |
Return true when the edge E from basic block BB is better than the temporary best edge (details are in function). The probability of edge E is PROB. The frequency of the successor is FREQ. The current best probability is BEST_PROB, the best frequency is BEST_FREQ. The edge is considered to be equivalent when PROB does not differ much from BEST_PROB; similarly for frequency.
References cfun, edge_def::dest, edge_def::flags, basic_block_def::index, optimize_function_for_size_p(), and basic_block_def::prev_bb.
Referenced by find_traces_1_round().
|
static |
Return true when the edge E is better than the temporary best edge CUR_BEST_EDGE. If SRC_INDEX_P is true, the function compares the src bb of E and CUR_BEST_EDGE; otherwise it will compare the dest bb. BEST_LEN is the trace length of src (or dest) bb in CUR_BEST_EDGE. TRACES record the information about traces. When optimizing for size, the edge with smaller index is better. When optimizing for speed, the edge with bigger probability or longer trace is better.
References cfun, edge_def::dest, basic_block_def::index, trace::length, optimize_function_for_size_p(), edge_def::probability, and edge_def::src.
Referenced by connect_traces().
|
static |
Referenced by reorder_basic_blocks().
|
static |
Connect traces in array TRACES, N_TRACES is the count of traces.
References basic_block_def::aux, cfun, connect_better_edge_p(), copy_bb(), copy_bb_p(), count, edge_def::count, current_pass, edge_def::dest, dump_file, bbro_basic_block_data_def::end_of_trace, trace::first, first, edge_def::flags, basic_block_def::index, last, trace::last, trace::length, max_entry_count, max_entry_frequency, optimize_edge_for_speed_p(), optimize_function_for_size_p(), basic_block_def::preds, edge_def::probability, si, edge_def::src, bbro_basic_block_data_def::start_of_trace, and basic_block_def::succs.
|
static |
Referenced by connect_traces(), find_traces_1_round(), rotate_loop(), and thread_prologue_and_epilogue_insns().
|
static |
Create a duplicate of the basic block OLD_BB and redirect edge E to it, add it to trace after BB, mark OLD_BB visited and update pass' data structures (TRACE is a number of trace which OLD_BB is duplicated to).
References array_size, basic_block_def::aux, bb_visited_trace(), edge_def::dest, dump_file, duplicate_block(), bbro_basic_block_data_def::end_of_trace, bbro_basic_block_data_def::heap, bbro_basic_block_data_def::in_trace, basic_block_def::index, mark_bb_visited(), bbro_basic_block_data_def::node, bbro_basic_block_data_def::start_of_trace, and bbro_basic_block_data_def::visited.
|
static |
Referenced by connect_traces(), find_traces_1_round(), and rotate_loop().
|
static |
Return true when BB can and should be copied. CODE_MAY_GROW is true when code size is allowed to grow by duplication.
References can_duplicate_block_p(), dump_file, basic_block_def::frequency, get_attr_min_length(), basic_block_def::index, optimize_bb_for_speed_p(), basic_block_def::preds, and basic_block_def::succs.
|
static |
References basic_block_def::aux, bitmap_bit_p(), bitmap_empty_p(), bitmap_set_bit(), can_duplicate_block_p(), candidates, cfg_layout_finalize(), cfg_layout_initialize(), clear_bb_flags(), computed_jump_p(), duplicate_block(), find_reg_note(), edge_def::flags, basic_block_def::flags, get_attr_min_length(), get_uncond_jump_length(), basic_block_def::index, basic_block_def::next_bb, basic_block_def::preds, single_pred_p(), single_succ(), single_succ_edge(), and single_succ_p().
|
static |
This function checks the destination block of a "crossing jump" to see if it has any crossing predecessors that begin with a code label and end with an unconditional jump. If so, it returns that predecessor block. (This is to avoid creating lots of new basic blocks that all contain unconditional jumps to the same destination).
References any_condjump_p(), edge_def::flags, basic_block_def::preds, and edge_def::src.
Referenced by fix_crossing_conditional_branches().
Find the basic blocks that are rarely executed and need to be moved to a separate section of the .o file (to cut down on paging and improve cache locality). Return a vector of all edges that cross.
References cfun, edge_def::dest, function::eh, fix_up_crossing_landing_pad(), edge_def::flags, eh_landing_pad_d::landing_pad, eh_status::lp_array, basic_block_def::preds, probably_never_executed_bb_p(), edge_def::src, basic_block_def::succs, and vNULL.
Referenced by partition_hot_cold_basic_blocks().
|
static |
Local function prototypes.
Referenced by reorder_basic_blocks().
|
static |
Find the traces for Software Trace Cache. Chain each trace through RBI()->next. Store the number of traces to N_TRACES and description of traces to TRACES.
References basic_block_def::aux, bb_to_key(), branch_threshold, basic_block_def::count, edge_def::dest, dump_file, exec_threshold, find_traces_1_round(), first, basic_block_def::frequency, bbro_basic_block_data_def::heap, basic_block_def::index, trace::last, max_entry_count, max_entry_frequency, and bbro_basic_block_data_def::node.
|
static |
One round of finding traces. Find traces for BRANCH_TH and EXEC_TH i.e. do not include basic blocks whose probability is lower than BRANCH_TH or whose frequency is lower than EXEC_TH into traces (or whose count is lower than COUNT_TH). Store the new traces into TRACES and modify the number of traces *N_TRACES. Set the round (which the trace belongs to) to ROUND. The function expects starting basic blocks to be in *HEAP and will delete *HEAP and store starting points for the next round into new *HEAP.
References basic_block_def::aux, bb_to_key(), bb_visited_trace(), better_edge_p(), block_ends_with_call_p(), cfun, copy_bb(), copy_bb_p(), edge_def::count, edge_def::dest, dump_file, bbro_basic_block_data_def::end_of_trace, trace::first, edge_def::flags, basic_block_def::frequency, bbro_basic_block_data_def::heap, bbro_basic_block_data_def::in_trace, basic_block_def::index, trace::last, trace::length, mark_bb_visited(), basic_block_def::next_bb, bbro_basic_block_data_def::node, optimize_edge_for_speed_p(), optimize_function_for_size_p(), basic_block_def::preds, prob, edge_def::probability, push_to_next_round_p(), rotate_loop(), trace::round, single_pred_p(), single_succ(), single_succ_edge(), single_succ_p(), bbro_basic_block_data_def::start_of_trace, and basic_block_def::succs.
Referenced by find_traces().
|
static |
Find all BB's with conditional jumps that are crossing edges; insert a new bb and make the conditional jump branch to the new bb instead (make the new bb same color so conditional branch won't be a 'crossing' edge). Insert an unconditional jump from the new bb to the original destination of the conditional jump.
References any_condjump_p(), basic_block_def::aux, block_label(), create_basic_block(), edge_def::dest, emit_barrier_after_bb(), emit_jump_insn(), emit_label(), find_jump_block(), edge_def::flags, gen_label_rtx(), last_bb, make_edge(), basic_block_def::prev_bb, redirect_edge_succ(), redirect_jump(), SET, and basic_block_def::succs.
Referenced by partition_hot_cold_basic_blocks().
|
static |
Find any unconditional branches that cross between hot and cold sections. Convert them into indirect jumps instead.
References any_condjump_p(), computed_jump_p(), cur_insn, delete_insn(), emit_indirect_jump(), emit_insn_before(), emit_move_insn(), end_sequence(), edge_def::flags, gen_reg_rtx(), get_insns(), start_sequence(), basic_block_def::succs, and tablejump_p().
Referenced by partition_hot_cold_basic_blocks().
|
static |
The landing pad OLD_LP, in block OLD_BB, has edges from both partitions. Duplicate the landing pad and split the edges so that no EH edge crosses partitions.
References basic_block_def::aux, block_label(), create_basic_block(), ei_next(), ei_safe_edge(), emit_barrier_after_bb(), emit_jump_insn(), emit_label(), expand_dw2_landing_pad_for_region(), find_reg_note(), gen_eh_landing_pad(), gen_label_rtx(), eh_landing_pad_d::index, eh_landing_pad_d::landing_pad, last_bb, make_edge(), eh_landing_pad_d::post_landing_pad, basic_block_def::preds, basic_block_def::prev_bb, redirect_edge_succ(), eh_landing_pad_d::region, single_succ(), and edge_def::src.
Referenced by find_rarely_executed_basic_blocks_and_crossing_edges().
|
static |
Find any bb's where the fall-through edge is a crossing edge (note that these bb's must also contain a conditional jump or end with a call instruction; we've already dealt with fall-through edges for blocks that didn't have a conditional jump or didn't end with call instruction in the call to add_labels_and_missing_jumps). Convert the fall-through edge to non-crossing edge by inserting a new bb to fall-through into. The new bb will contain an unconditional jump (crossing edge) to the original fall through destination.
References basic_block_def::aux, block_ends_with_call_p(), block_label(), can_throw_internal(), edge_def::dest, emit_barrier_after_bb(), edge_def::flags, force_nonfallthru(), invert_jump(), single_succ_edge(), basic_block_def::succs, and update_br_prob_note().
Referenced by partition_hot_cold_basic_blocks().
|
static |
Duplicate the blocks containing computed gotos. This basically unfactors computed gotos that were factored early on in the compilation process to speed up edge based data flow. We used to not unfactoring them again, which can seriously pessimize code with many computed jumps in the source code, such as interpreters. See e.g. PR15242.
References cfun, optimize_function_for_size_p(), and targetm.
|
static |
References cfun, current_function_decl, optimize_function_for_speed_p(), and user_defined_section_attribute.
|
static |
References targetm.
int get_uncond_jump_length | ( | void | ) |
Return the length of unconditional jump instruction.
References delete_insn(), emit_jump_insn(), emit_label_before(), gen_label_rtx(), get_attr_min_length(), get_insns(), and trace::length.
Referenced by duplicate_computed_gotos(), reorder_basic_blocks(), and thread_prologue_and_epilogue_insns().
void insert_section_boundary_note | ( | void | ) |
Determine which partition the first basic block in the function belongs to, then find the first basic block in the current function that belongs to a different section, and insert a NOTE_INSN_SWITCH_TEXT_SECTIONS note immediately before it in the instruction stream. When writing out the assembly code, encountering this note will make the compiler switch between the hot and cold text sections.
References emit_note_before().
Referenced by rest_of_pass_free_cfg().
rtl_opt_pass* make_pass_duplicate_computed_gotos | ( | ) |
rtl_opt_pass* make_pass_partition_blocks | ( | ) |
rtl_opt_pass* make_pass_reorder_blocks | ( | ) |
|
static |
Referenced by copy_bb(), and find_traces_1_round().
|
static |
This function marks BB that it was visited in trace number TRACE.
References bbro_basic_block_data_def::heap, basic_block_def::index, bbro_basic_block_data_def::node, and bbro_basic_block_data_def::visited.
|
static |
This function is the main 'entrance' for the optimization that partitions hot and cold basic blocks into separate sections of the .o file (to improve performance and cache locality). Ideally it would be called after all optimizations that rearrange the CFG have been called. However part of this optimization may introduce new register usage, so it must be called before register allocation has occurred. This means that this optimization is actually called well before the optimization that reorders basic blocks (see function above). This optimization checks the feedback information to determine which basic blocks are hot/cold, updates flags on the basic blocks to indicate which section they belong in. This information is later used for writing out sections in the .o file. Because hot and cold sections can be arbitrarily large (within the bounds of memory), far beyond the size of a single function, it is necessary to fix up all edges that cross section boundaries, to make sure the instructions used can actually span the required distance. The fixes are described below. Fall-through edges must be changed into jumps; it is not safe or legal to fall through across a section boundary. Whenever a fall-through edge crossing a section boundary is encountered, a new basic block is inserted (in the same section as the fall-through source), and the fall through edge is redirected to the new basic block. The new basic block contains an unconditional jump to the original fall-through target. (If the unconditional jump is insufficient to cross section boundaries, that is dealt with a little later, see below). In order to deal with architectures that have short conditional branches (which cannot span all of memory) we take any conditional jump that attempts to cross a section boundary and add a level of indirection: it becomes a conditional jump to a new basic block, in the same section. The new basic block contains an unconditional jump to the original target, in the other section. For those architectures whose unconditional branch is also incapable of reaching all of memory, those unconditional jumps are converted into indirect jumps, through a register. IMPORTANT NOTE: This optimization causes some messy interactions with the cfg cleanup optimizations; those optimizations want to merge blocks wherever possible, and to collapse indirect jump sequences (change "A jumps to B jumps to C" directly into "A jumps to C"). Those optimizations can undo the jump fixes that partitioning is required to make (see above), in order to ensure that jumps attempting to cross section boundaries are really able to cover whatever distance the jump requires (on many architectures conditional or unconditional jumps are not able to reach all of memory). Therefore tests have to be inserted into each such optimization to make sure that it does not undo stuff necessary to cross partition boundaries. This would be much less of a problem if we could perform this optimization later in the compilation, but unfortunately the fact that we may need to create indirect jumps (through registers) requires that this optimization be performed before register allocation. Hot and cold basic blocks are partitioned and put in separate sections of the .o file, to reduce paging and improve cache performance (hopefully). This can result in bits of code from the same function being widely separated in the .o file. However this is not obvious to the current bb structure. Therefore we must take care to ensure that: 1). There are no fall_thru edges that cross between sections; 2). For those architectures which have "short" conditional branches, all conditional branches that attempt to cross between sections are converted to unconditional branches; and, 3). For those architectures which have "short" unconditional branches, all unconditional branches that attempt to cross between sections are converted to indirect jumps. The code for fixing up fall_thru edges that cross between hot and cold basic blocks does so by creating new basic blocks containing unconditional branches to the appropriate label in the "other" section. The new basic block is then put in the same (hot or cold) section as the original conditional branch, and the fall_thru edge is modified to fall into the new basic block instead. By adding this level of indirection we end up with only unconditional branches crossing between hot and cold sections. Conditional branches are dealt with by adding a level of indirection. A new basic block is added in the same (hot/cold) section as the conditional branch, and the conditional branch is retargeted to the new basic block. The new basic block contains an unconditional branch to the original target of the conditional branch (in the other section). Unconditional branches are dealt with by converting them into indirect jumps.
References add_labels_and_missing_jumps(), add_reg_crossing_jump_notes(), cfun, clear_aux_for_blocks(), df_analyze(), DF_DEFER_INSN_RESCAN, df_finish_pass(), DF_LR_RUN_DCE, df_scan_alloc(), df_scan_blocks(), df_set_flags(), function::eh, find_rarely_executed_basic_blocks_and_crossing_edges(), fix_crossing_conditional_branches(), fix_crossing_unconditional_branches(), fix_up_fall_thru_edges(), and eh_status::lp_array.
|
static |
Check to see if bb should be pushed into the next round of trace collections or not. Reasons for pushing the block forward are 1). If the block is cold, we are doing partitioning, and there will be another round (cold partition blocks are not supposed to be collected into traces until the very last round); or 2). There will be another round, and the basic block is not "hot enough" for the current round of trace collection.
References cfun, basic_block_def::count, basic_block_def::frequency, and probably_never_executed_bb_p().
Referenced by find_traces_1_round().
|
static |
Reorder basic blocks. The main entry point to this file. FLAGS is the set of flags to pass to cfg_layout_initialize().
References array_size, connect_traces(), current_ir_type(), dump_file, dump_flags, dump_flow_info(), dump_reg_info(), bbro_basic_block_data_def::end_of_trace, find_traces(), get_uncond_jump_length(), bbro_basic_block_data_def::heap, bbro_basic_block_data_def::in_trace, IR_RTL_CFGLAYOUT, mark_dfs_back_edges(), bbro_basic_block_data_def::node, relink_block_chain(), set_edge_can_fallthru_flag(), bbro_basic_block_data_def::start_of_trace, and bbro_basic_block_data_def::visited.
Referenced by rest_of_handle_reorder_blocks().
|
static |
|
static |
Referenced by find_traces_1_round().
|
static |
Rotate loop whose back edge is BACK_EDGE in the tail of trace TRACE (with sequential number TRACE_N).
References any_condjump_p(), basic_block_def::aux, bb_visited_trace(), copy_bb(), copy_bb_p(), edge_def::count, edge_def::dest, find_reg_note(), trace::first, edge_def::flags, basic_block_def::index, single_succ(), single_succ_edge(), single_succ_p(), edge_def::src, bbro_basic_block_data_def::start_of_trace, and basic_block_def::succs.
|
static |
Set the flag EDGE_CAN_FALLTHRU for edges that can be fallthru.
References any_condjump_p(), edge_def::flags, invert_jump(), and basic_block_def::succs.
Referenced by reorder_basic_blocks().
|
static |
The current size of the following dynamic array.
Referenced by bb_visited_trace(), copy_bb(), and reorder_basic_blocks().
|
static |
The array which holds needed information for basic blocks.
Referenced by gen_inbound_check().
|
static |
Branch thresholds in thousandths (per mille) of the REG_BR_PROB_BASE.
Referenced by find_traces().
struct target_bb_reorder default_target_bb_reorder |
|
static |
Exec thresholds in thousandths (per mille) of the frequency of bb 0.
Referenced by find_traces().
|
static |
Referenced by connect_traces(), and find_traces().
|
static |
Maximum frequency and count of one of the entry blocks.
Referenced by connect_traces(), and find_traces().
struct target_bb_reorder* this_target_bb_reorder = &default_target_bb_reorder |