GCC Middle and Back End API Reference
Data Structures
struct expr_history_def_1
struct _expr
struct _def
struct _bnd
struct _fence
struct flist_tail_def
struct _list_node
struct _list_iterator
struct idata_def
struct vinsn_def
struct transformed_insns
struct _sel_insn_data
struct sel_global_bb_info_def
struct sel_region_bb_info_def
struct succ_iterator
struct succs_info
Typedefs
typedef void * tc_t
typedef struct _list_node * _list_t
typedef struct idata_def * idata_t
typedef struct vinsn_def * vinsn_t
typedef _list_t _xlist_t
typedef rtx insn_t
typedef _xlist_t ilist_t
typedef struct expr_history_def_1 expr_history_def
typedef struct _expr expr_def
typedef expr_def * expr_t
typedef struct _def * def_t
typedef _list_t av_set_t
typedef struct _bnd * bnd_t
typedef _list_t blist_t
typedef struct _fence * fence_t
typedef _list_t flist_t
typedef struct flist_tail_def * flist_tail_t
typedef _list_iterator _xlist_iterator
typedef _xlist_iterator ilist_iterator
typedef _list_iterator av_set_iterator
typedef _list_t def_list_t
typedef _list_iterator def_list_iterator
typedef struct _sel_insn_data sel_insn_data_def
typedef sel_insn_data_def * sel_insn_data_t
typedef enum deps_where_def deps_where_t
typedef sel_global_bb_info_def * sel_global_bb_info_t
typedef sel_region_bb_info_def * sel_region_bb_info_t
Enumerations
enum local_trans_type { TRANS_SUBSTITUTION, TRANS_SPECULATION }
enum deps_where_def { DEPS_IN_INSN, DEPS_IN_LHS, DEPS_IN_RHS, DEPS_IN_NOWHERE }
typedef struct _list_node* _list_t
List backend.
typedef _list_iterator _xlist_iterator
typedef _list_iterator av_set_iterator
Av set iterators.
typedef _list_iterator def_list_iterator
typedef _list_t def_list_t
Def list iterators.
typedef enum deps_where_def deps_where_t
typedef struct expr_history_def_1 expr_history_def
typedef struct flist_tail_def* flist_tail_t
typedef _xlist_iterator ilist_iterator
typedef struct _sel_insn_data sel_insn_data_def
typedef sel_insn_data_def* sel_insn_data_t
typedef void* tc_t
tc_t is short for target context, which is the state of the target backend.
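The opaque-pointer idea behind tc_t can be illustrated with a minimal standalone sketch. Everything prefixed with toy_ is hypothetical and is not part of GCC: the state structure, its fields, and the snapshot semantics chosen for the boolean argument are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Opaque handle, matching the typedef documented above.  */
typedef void *tc_t;

/* Hypothetical backend state; a real target decides what belongs here.  */
struct toy_backend_state
{
  int last_issued_unit;
  int cycle;
};

/* The live state that the (hypothetical) target hooks would consult.  */
static struct toy_backend_state toy_current;

/* Allocate a context.  When CLEAN_P is false, snapshot the live state;
   otherwise return a zeroed one.  These semantics are an assumption.  */
static tc_t
toy_create_context (bool clean_p)
{
  struct toy_backend_state *s = calloc (1, sizeof *s);
  if (s != NULL && !clean_p)
    *s = toy_current;
  return s;
}

/* Make TC the live state.  */
static void
toy_set_context (tc_t tc)
{
  assert (tc != NULL);
  toy_current = *(struct toy_backend_state *) tc;
}

/* Release a context created by toy_create_context.  */
static void
toy_delete_context (tc_t tc)
{
  free (tc);
}
```

The caller only ever sees a void pointer, so a scheduler written against such an interface can save one context per scheduling point without knowing what the backend stores in it.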
enum deps_where_def
enum local_trans_type
inline static
Returns true when E1 is an eligible successor edge, possibly skipping empty blocks. When E2P is not null, the resulting edge is written there. FLAGS are used to specify whether back edges and out-of-region edges should be considered.
Any successor of the block that is outside current region is ineligible, except when we're skipping to loop exits.
Skip empty blocks, but be careful not to leave the region.
Save the second edge for later checks.
BLOCK_TO_BB sets the topological order of the region here. It is important to use the real predecessor here, which is ip->bb, as we may well have e1->src outside the current region when skipping to loop exits.
This is true for all cases except the last one.
We are advancing forward in the region, as usual.
We are skipping to loop exits here.
This is a back edge. During pipelining we ignore back edges, but only when they lead to the same loop. A back edge can lead to the header of the outer loop, which will also be the preheader of the current loop.
A back edge should be requested explicitly.
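The flag-driven decision described above can be sketched in a simplified form. This is a toy model, not the GCC routine: it ignores empty-block skipping and the pipelining special case, and the TOY_SUCCS_* names and the topo_order array are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request flags, one bit per kind of edge.  */
enum
{
  TOY_SUCCS_NORMAL = 1, /* forward edges inside the region */
  TOY_SUCCS_BACK = 2,   /* back edges */
  TOY_SUCCS_OUT = 4     /* edges leaving the region */
};

/* TOPO_ORDER[b] is block b's position in the region's topological
   order, or -1 when b lies outside the region.  Return true when the
   edge SRC -> DEST matches one of the kinds requested in FLAGS.  */
static bool
toy_edge_eligible_p (const int *topo_order, int src, int dest, int flags)
{
  assert (topo_order[src] >= 0);

  /* Out-of-region successors must be requested explicitly.  */
  if (topo_order[dest] < 0)
    return (flags & TOY_SUCCS_OUT) != 0;

  /* Advancing forward in the region, as usual.  */
  if (topo_order[src] < topo_order[dest])
    return (flags & TOY_SUCCS_NORMAL) != 0;

  /* Otherwise this is a back edge, which must be requested explicitly.  */
  return (flags & TOY_SUCCS_BACK) != 0;
}
```

With a region ordered as blocks 0, 1, 2 and block 3 outside it, the edge 2 -> 0 is accepted only when TOY_SUCCS_BACK is passed, and 2 -> 3 only when TOY_SUCCS_OUT is passed.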
inline static
Referenced by _list_clear().
inline static
References _list_remove(), _list_iterator::can_remove_p, _list_iterator::lp, and _list_iterator::removed_p.
inline static
References _list_add().
inline static
Used through _FOR_EACH.
inline static
Referenced by _list_alloc().
inline static
When we're in the middle of a basic block, return the next insn immediately, but only when SUCCS_NORMAL is set.
First, try loop exits, if we have them.
If we have found a successor, then great.
If not, then try the next edge.
Consider bb as a possible loop header.
Get all loop exits recursively.
Move the iterator now, because we won't call succ_iter_next until the loop exits are exhausted.
bb is not a loop header, check as usual.
If loop_exits are non null, we have found an inner loop; do one more iteration to fetch an edge from these exits.
Otherwise, we've found an edge in a usual way. Break now.
inline static
We need to return a succ_iterator to avoid an 'uninitialized' warning during bootstrap.
Avoid 'uninitialized' warning.
inline static
_xlist_t functions.
void add_clean_fence_to_fences (flist_tail_t, insn_t, fence_t)
void add_dirty_fence_to_fences (flist_tail_t, insn_t, fence_t)
void alloc_sched_pools (void)
Av set functions.
Referenced by find_place_for_bookkeeping().
void av_set_clear (av_set_t *)
Referenced by choose_best_reg(), and move_exprs_to_boundary().
Referenced by move_exprs_to_boundary(), and moveup_expr_cached().
Referenced by choose_best_reg(), and moveup_expr_cached().
void av_set_iter_remove (av_set_iterator *)
Referenced by find_used_regs(), get_spec_check_type_for_insn(), moveup_expr_cached(), and update_data_sets().
void av_set_leave_one_nonspec (av_set_t *)
Referenced by move_exprs_to_boundary().
void av_set_split_usefulness (av_set_t, int, int)
void av_set_substract_cond_branches (av_set_t *)
bool bb_ends_ebb_p (basic_block)
Referenced by code_motion_path_driver_cleanup().
bool bb_header_p (insn_t)
inline static
Return the next block of BB not running into inconsistencies.
Referenced by code_motion_path_driver_cleanup().
void blist_remove (blist_t *)
bool bookkeeping_can_be_created_if_moved_through_p (insn_t)
Referenced by rtx_ok_for_substitution_p().
void clear_expr (expr_t)
void clear_outdated_rtx_info (basic_block)
bool considered_for_pipelining_p (struct loop *)
void copy_data_sets (basic_block, basic_block)
Expression transformation routines.
Referenced by count_occurrences_equiv().
tc_t create_target_context (bool)
Target context functions.
Referenced by rtx_ok_for_substitution_p().
void def_list_add (def_list_t *, insn_t, bool)
void exchange_data_sets (basic_block, basic_block)
unsigned expr_dest_regno (expr_t)
Referenced by choose_best_pseudo_reg(), find_place_for_bookkeeping(), and invoke_reorder_hooks().
basic_block fallthru_bb_of_jump (rtx)
int find_in_history_vect (vec<expr_history_def>, rtx, vinsn_t, bool)
void flist_clear (flist_t *)
Referenced by has_preds_in_current_region_p().
void flist_tail_init (flist_tail_t)
void free_bb_note_pool (void)
void free_data_for_scheduled_insn (insn_t)
void free_data_sets (basic_block)
void free_lv_sets (void)
void free_nop_and_exit_insns (void)
void free_nop_pool (void)
void free_nop_vinsn (void)
void free_regset_pool (void)
void free_sched_pools (void)
void free_succs_info (struct succs_info *)
Collect all loop exits recursively, skipping empty BBs between them. E.g. if BB is a loop header which has several loop exits, traverse all of them and if any of them turns out to be another loop header (after skipping empty BBs), add its loop exits to the resulting vector as well.
If bb is empty, and we're skipping to loop exits, then consider bb as a possible gate to the inner loop now.
This empty block could only lead outside the region.
And now check whether we should skip over inner loop.
Traverse all loop headers.
Add all loop exits for the current edge into the resulting vector.
Remove the original edge.
Decrease the loop counter so we won't skip anything.
References succ_iterator::current_exit, succ_iterator::ei, ei_next(), and succ_iterator::loop_exits.
int get_av_level (insn_t)
regset get_clear_regset_from_pool (void)
Referenced by count_occurrences_1().
Return exit edges of LOOP, filtering out edges with the same dest bb.
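Filtering edges by destination can be sketched as a small in-place pass. This is a minimal illustration under stated assumptions, not the GCC routine: struct toy_edge and toy_unique_dest_edges are hypothetical names, and blocks are represented by plain integer indices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edge between two blocks identified by index.  */
struct toy_edge
{
  int src;
  int dest;
};

/* Compact EDGES in place so that each destination block appears at
   most once; the first edge to a given destination wins.  Return the
   number of edges kept.  */
static size_t
toy_unique_dest_edges (struct toy_edge *edges, size_t n)
{
  size_t kept = 0;

  assert (edges != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t j;

      /* Look for an already kept edge with the same destination.  */
      for (j = 0; j < kept; j++)
        if (edges[j].dest == edges[i].dest)
          break;

      if (j == kept)
        edges[kept++] = edges[i];
    }
  return kept;
}
```

The quadratic scan is a deliberate simplification; loops rarely have more than a handful of exits, so a linear search over the kept prefix is usually adequate in a sketch like this.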
loop_p get_loop_nest_for_rgn (unsigned int)
Referenced by emit_bookkeeping_insn().
regset get_regset_from_pool (void)
Pool functions.
int get_seqno_by_preds (rtx)
Referenced by choose_best_insn().
Referenced by create_speculation_check().
bool in_current_region_p (basic_block)
Referenced by move_op_on_enter().
void init_fences (insn_t)
Fences functions.
void init_lv_sets (void)
Various initialization functions.
inline static
True when BB is a header of the inner loop.
If successor belongs to another loop.
Could be '=' here because of wrong loop depths.
References succ_iterator::bb, succ_iterator::bb_end, edge_iterator::container, succ_iterator::current_exit, succ_iterator::current_flags, succ_iterator::e1, succ_iterator::e2, succ_iterator::ei, succ_iterator::flags, edge_iterator::index, succ_iterator::loop_exits, and basic_block_def::succs.
void insert_in_history_vect (vec<expr_history_def> *, unsigned, enum local_trans_type, vinsn_t, vinsn_t, ds_t)
bool insn_at_boundary_p (insn_t)
bool insn_eligible_for_subst_p (insn_t)
sel_insn_data_def insn_sid (insn_t)
void make_region_from_loop_preheader (vec<basic_block> *&)
expr_t merge_with_other_exprs (av_set_t *, av_set_iterator *, expr_t)
void move_fence_to_fences (flist_t, flist_tail_t)
Referenced by advance_one_cycle().
void purge_empty_blocks (void)
void recompute_vinsn_lhs_rhs (vinsn_t)
Referenced by sel_target_adjust_priority().
void reset_target_context (tc_t, bool)
void return_nop_to_pool (insn_t, bool)
void return_regset_to_pool (regset)
void sel_add_loop_preheaders (bb_vec_t *)
static
References succ_iterator::current_exit, and succ_iterator::loop_exits.
bool sel_bb_empty_p (basic_block)
insn_t sel_bb_end (basic_block)
Referenced by moveup_set_expr().
bool sel_bb_end_p (insn_t)
insn_t sel_bb_head (basic_block)
Basic block and CFG functions.
Referenced by code_motion_process_successors(), estimate_insn_cost(), and move_exprs_to_boundary().
bool sel_bb_head_p (insn_t)
void sel_clear_has_dependence (void)
Dependence analysis functions.
basic_block sel_create_recovery_block (insn_t)
Referenced by try_replace_dest_reg().
void sel_extend_global_bb_info (void)
void sel_finish_bbs (void)
void sel_finish_global_and_expr (void)
void sel_finish_global_bb_info (void)
void sel_finish_pipelining (void)
void sel_init_bbs (bb_vec_t)
Referenced by code_motion_process_successors().
void sel_init_global_and_expr (bb_vec_t)
Referenced by code_motion_process_successors().
void sel_init_invalid_data_sets (insn_t)
void sel_init_pipelining (void)
bool sel_insn_has_single_succ_p (insn_t, int)
bool sel_is_loop_preheader_p (basic_block)
Referenced by code_motion_process_successors().
bool sel_num_cfg_preds_gt_1 (insn_t)
bool sel_redirect_edge_and_branch (edge, basic_block)
void sel_redirect_edge_and_branch_force (edge, basic_block)
void sel_register_cfg_hooks (void)
Referenced by code_motion_process_successors().
bool sel_remove_insn (insn_t, bool, bool)
void sel_save_haifa_priorities (void)
void sel_sched_region (int)
void sel_set_sched_flags (void)
Referenced by code_motion_process_successors().
void sel_setup_sched_infos (void)
Referenced by code_motion_process_successors().
basic_block sel_split_edge (edge)
void sel_unregister_cfg_hooks (void)
int sel_vinsn_cost (vinsn_t)
void set_target_context (tc_t)
void setup_nop_and_exit_insns (void)
void setup_nop_vinsn (void)
Referenced by code_motion_process_successors().
bool tidy_control_flow (basic_block, bool)
void vinsn_attach (vinsn_t)
bool vinsn_cond_branch_p (vinsn_t)
void vinsn_detach (vinsn_t)
bool vinsn_separable_p (vinsn_t)
Vinsns functions.
basic_block after_recovery
Some needed definitions.
Basic block just before the EXIT_BLOCK and after recovery, if we have created it.
sbitmap bbs_pipelined
Saves pipelined blocks. Bitmap is indexed by bb->index.
bitmap blocks_to_reschedule
Blocks that need to be rescheduled after pipelining.
bool bookkeeping_p
True if bookkeeping is enabled.
struct loop* current_loop_nest
The loop nest being pipelined.
Referenced by code_motion_process_successors().
bool enable_moveup_set_path_p
Various flags.
rtx exit_insn
An insn that is 'contained' in the EXIT block.
flist_t fences
A list of fences currently in the works.
Current fences.
Referenced by has_preds_in_current_region_p().
bitmap_head* forced_ebb_heads
Used in bb_in_ebb_p.
int global_level
A global level shows whether an insn is valid or not.
GLOBAL_LEVEL is used to discard information stored in the av_sets of basic block headers. The av_set of a bb header is valid if the header's level is equal to GLOBAL_LEVEL, and invalid if it is lower. This is primarily used to advance the scheduling window.
Referenced by equal_after_moveup_path_p().
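The level comparison described for global_level amounts to a generation counter. The following is a minimal sketch of that idea only; the toy_ names and the bitmask representation of a set are hypothetical and do not mirror the GCC data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global level.  */
static int toy_global_level = 1;

/* Hypothetical per-header cache: a set plus the level it was built at.  */
struct toy_bb_header
{
  int av_level;
  unsigned av_set;
};

/* A cached set is valid only if it was stored at the current level.  */
static bool
toy_av_set_valid_p (const struct toy_bb_header *h)
{
  return h->av_level == toy_global_level;
}

/* Store SET in H and stamp it with the current level.  */
static void
toy_store_av_set (struct toy_bb_header *h, unsigned set)
{
  h->av_set = set;
  h->av_level = toy_global_level;
}

/* Advancing the window invalidates every cached set in constant time,
   without visiting any header.  */
static void
toy_advance_window (void)
{
  toy_global_level++;
}
```

The design trades one integer per header for the ability to discard all cached sets at once, which is why a stale header only needs to be detected lazily when it is next consulted.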
int max_insns_to_rename
Maximum number of insns that are eligible for renaming.
rtx nop_pattern
A NOP pattern used as a placeholder for real insns.
bool pipelining_p
Instruction scheduling pass. Selective scheduler and pipeliner.
Copyright (C) 2006-2013 Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/.

Implementation of selective scheduling approach.
The below implementation follows the original approach with the following changes:

o the scheduler works after register allocation (but can be also tuned to work before RA);
o some instructions are not copied or register renamed;
o conditional jumps are not moved with code duplication;
o several jumps in one parallel group are not supported;
o when pipelining outer loops, code motion through inner loops is not supported;
o control and data speculation are supported;
o some improvements for better compile time/performance were made.

Terminology
===========

A vinsn, or virtual insn, is an insn with additional data characterizing insn pattern, such as LHS, RHS, register sets used/set/clobbered, etc. Vinsns also act as smart pointers to save memory by reusing them in different expressions. A vinsn is described by vinsn_t type.

An expression is a vinsn with additional data characterizing its properties at some point in the control flow graph. The data may be its usefulness, priority, speculative status, whether it was renamed/substituted, etc. An expression is described by expr_t type.

Availability set (av_set) is a set of expressions at a given control flow point. It is represented as av_set_t. The expressions in av sets are kept sorted in the terms of expr_greater_p function. This allows truncating the set while keeping the best expressions.

A fence is a point through which code motion is prohibited. On each step, we gather a parallel group of insns at a fence. It is possible to have multiple fences. A fence is represented via fence_t.

A boundary is the border between the fence group and the rest of the code. Currently, we never have more than one boundary per fence, as we finalize the fence group when a jump is scheduled. A boundary is represented via bnd_t.

High-level overview
===================

The scheduler finds regions to schedule, schedules each one, and finalizes. The regions are formed starting from innermost loops, so that when the inner loop is pipelined, its prologue can be scheduled together with yet unprocessed outer loop. The rest of acyclic regions are found using extend_rgns: the blocks that are not yet allocated to any regions are traversed in top-down order, and a block is added to a region to which all its predecessors belong; otherwise, the block starts its own region.

The main scheduling loop (sel_sched_region_2) consists of just scheduling on each fence and updating fences. For each fence, we fill a parallel group of insns (fill_insns) for as long as insns can be added. First, we compute available exprs (av-set) at the boundary of the current group. Second, we choose the best expression from it. If a stall is required to schedule any of the expressions, we advance the current cycle appropriately. So, the final group does not exactly correspond to a VLIW word. Third, we move the chosen expression to the boundary (move_op) and update the intermediate av sets and liveness sets. We quit fill_insns when either no insns are left for scheduling or we have scheduled enough insns so we feel like advancing a scheduling point.

Computing available expressions
===============================

The computation (compute_av_set) is a bottom-up traversal. At each insn, we're moving the union of its successors' sets through it via moveup_expr_set. The dependent expressions are removed. Local transformations (substitution, speculation) are applied to move more exprs. Then the expr corresponding to the current insn is added. The result is saved on each basic block header.

When traversing the CFG, we're moving down for no more than max_ws insns. Also, we do not move down to ineligible successors (is_ineligible_successor), which include moving along a back-edge, moving to already scheduled code, and moving to another fence. The first two restrictions are lifted during pipelining, which allows us to move insns along a back-edge. We always have an acyclic region for scheduling because we forbid motion through fences.

Choosing the best expression
============================

We sort the final availability set via sel_rank_for_schedule, then we remove expressions which are not yet ready (tick_check_p) or whose destination registers cannot be used. For some of them, we choose another register via find_best_reg. To do this, we run find_used_regs to calculate the set of registers which cannot be used. The find_used_regs function performs a traversal of code motion paths for an expr. We consider for renaming only registers which are from the same regclass as the original one and using which does not interfere with any live ranges. Finally, we convert the resulting set to the ready list format and use max_issue and reorder* hooks similarly to the Haifa scheduler.

Scheduling the best expression
==============================

We run the move_op routine to perform the same type of code motion paths traversal as in find_used_regs. (These are working via the same driver, code_motion_path_driver.) When moving down the CFG, we look for the original instruction that gave birth to a chosen expression. We undo the transformations performed on an expression via the history saved in it. When found, we remove the instruction or leave a reg-reg copy/speculation check if needed. On the way up, we insert bookkeeping copies at each join point. If a copy is not needed, it will be removed later during this traversal. We update the saved av sets and liveness sets on the way up, too.

Finalizing the schedule
=======================

When pipelining, we reschedule the blocks from which insns were pipelined to get a tighter schedule. On Itanium, we also perform bundling via the same routine from ia64.c.

Dependence analysis changes
===========================

We augmented the sched-deps.c with hooks that get called when a particular dependence is found in a particular part of an insn. Using these hooks, we can do several actions such as: determine whether an insn can be moved through another (has_dependence_p, moveup_expr); find out whether an insn can be scheduled on the current cycle (tick_check_p); find out registers that are set/used/clobbered by an insn and find out all the strange stuff that restricts its movement, like SCHED_GROUP_P or CANT_MOVE (done in init_global_and_expr_for_insn).

Initialization changes
======================

There are parts of haifa-sched.c, sched-deps.c, and sched-rgn.c that are reused in all of the schedulers. We have split up the initialization of data of such parts into different functions prefixed with scheduler type and postfixed with the type of data initialized: {,sel_,haifa_}sched_{init,finish}, sched_rgn_init/finish, sched_deps_init/finish, sched_init_{luids/bbs}, etc.

The same splitting is done with current_sched_info structure: dependence-related parts are in sched_deps_info, common part is in common_sched_info, and haifa/sel/etc part is in current_sched_info.

Target contexts
===============

As we now have multiple-point scheduling, this would not work with backends which save some of the scheduler state to use it in the target hooks. For this purpose, we introduce a concept of target contexts, which encapsulate such information. The backend should implement simple routines of allocating/freeing/setting such a context. The scheduler calls these as target hooks and handles the target context as an opaque pointer (similar to the DFA state type, state_t).

Various speedups
================

As the correct data dependence graph is not supported during scheduling (which is to be changed in mid-term), we cache as much of the dependence analysis results as possible to avoid reanalyzing. This includes: bitmap caches on each insn in stream of the region saying yes/no for a query with a pair of UIDs; hashtables with the previously done transformations on each insn in stream; a vector keeping a history of transformations on each expr.

Also, we try to minimize the dependence context used on each fence to check whether the given expression is ready for scheduling by removing from it insns that have definitely completed execution. The results of tick_check_p checks are also cached in a vector on each fence.

We keep a valid liveness set on each insn in a region to avoid the high cost of recomputation on large basic blocks.

Finally, we try to minimize the number of needed updates to the availability sets. The updates happen in two cases: when fill_insns terminates, we advance all fences and increase the stage number to show that the region has changed and the sets are to be recomputed; and when the next iteration of a loop in fill_insns happens (but this one reuses the saved av sets on bb headers.) Thus, we try to break the fill_insns loop only when a "significant" number of insns from the current scheduling window was scheduled. This should be made a target param.

TODO: correctly support the data dependence graph at all stages and get rid of all caches. This should speed up the scheduler.
TODO: implement moving cond jumps with bookkeeping copies on both targets.
TODO: tune the scheduler before RA so it does not create too many pseudos.

References:
S.-M. Moon and K. Ebcioglu. Parallelizing nonnumerical code with selective scheduling and software pipelining. ACM TOPLAS, Vol 19, No. 6, pages 853--898, Nov. 1997.
Andrey Belevantsev, Maxim Kuvyrkov, Vladimir Makarov, Dmitry Melnik, and Dmitry Zhurikhin. An interblock VLIW-targeted instruction scheduler for GCC. In Proceedings of GCC Developers' Summit 2006.
Arutyun Avetisyan, Andrey Belevantsev, and Dmitry Melnik. GCC Instruction Scheduler and Software Pipeliner on the Itanium Platform. EPIC-7 Workshop. http://rogue.colorado.edu/EPIC7/.
True when pipelining is enabled.
Referenced by choose_best_insn(), and compute_live_below_insn().
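The bottom-up availability computation described in the overview can be illustrated on a single straight-line block. This is a heavily simplified sketch, not the compute_av_set implementation: it has no CFG, no substitution or speculation, and no caching on block headers. The toy_ names, the register bitmasks, and the representation of a set as a bitmask of insn indices are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy insn: bitmasks over a small register file.  */
struct toy_insn
{
  uint32_t sets; /* registers written */
  uint32_t uses; /* registers read */
};

/* Nonzero when EXPR cannot be moved up through THROUGH.  */
static int
toy_depends_p (const struct toy_insn *expr, const struct toy_insn *through)
{
  return (expr->uses & through->sets) != 0     /* true dependence */
         || (expr->sets & through->uses) != 0  /* anti dependence */
         || (expr->sets & through->sets) != 0; /* output dependence */
}

/* Compute the set of insns available at insn START in a block of N
   insns, looking at most MAX_WS insns down.  Bit i of the result is
   set when insn i can be moved up to START.  */
static uint32_t
toy_compute_av_set (const struct toy_insn *insns, int n, int start,
                    int max_ws)
{
  uint32_t av;

  assert (n <= 32);
  if (start >= n || max_ws <= 0)
    return 0;

  /* Bottom-up: first take the set available just below this insn.  */
  av = toy_compute_av_set (insns, n, start + 1, max_ws - 1);

  /* Move it through this insn, dropping the dependent expressions.  */
  for (int i = start + 1; i < n; i++)
    if (((av >> i) & 1u) && toy_depends_p (&insns[i], &insns[start]))
      av &= ~(UINT32_C (1) << i);

  /* Finally add the expression for the insn itself.  */
  return av | (UINT32_C (1) << start);
}
```

For the three insns r1 = r0, r2 = r1, r3 = r0, the set at the first insn contains the first and third insns only, because the second reads the register the first one writes. Shrinking MAX_WS narrows the window and can hide the third insn as well.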
bool preheader_removed
Referenced by code_motion_process_successors().
vec<sel_insn_data_def> s_i_d
Referenced by dump_insn_vector().
alloc_pool sched_lists_pool
_list_t functions. All of the _*list_* functions are used through accessor macros, so we can't move them into sel-sched-ir.c.
regset sel_all_regs
vec<sel_global_bb_info_def> sel_global_bb_info
Per basic block data. This array is indexed by basic block index.
vec<sel_region_bb_info_def> sel_region_bb_info
Per basic block data. This array is indexed by basic block index.