GCC Middle and Back End API Reference
profile.c File Reference

Data Structures

struct  bb_info

Functions

static void find_spanning_tree (struct edge_list *)
static unsigned instrument_edges ()
static void instrument_values ()
void get_working_sets ()
gcov_working_set_tfind_working_set ()
static gcov_typeget_exec_counts ()
static bool is_edge_inconsistent ()
static void correct_negative_edge_counts ()
static bool is_inconsistent ()
static void set_bb_counts ()
static int read_profile_edge_counts ()
static int compute_frequency_overlap ()
static void compute_branch_probabilities ()
static void compute_value_histograms (histogram_values values, unsigned cfg_checksum, unsigned lineno_checksum)
static void output_location (char const *file_name, int line, gcov_position_t *offset, basic_block bb)
void branch_prob ()
static basic_block find_group ()
static void union_groups ()
static void find_spanning_tree ()
void init_branch_prob ()
void end_branch_prob ()

Variables

struct gcov_ctr_summaryprofile_info
static gcov_working_set_t gcov_working_sets [NUM_GCOV_WORKING_SETS]
static int total_num_blocks
static int total_num_edges
static int total_num_edges_ignored
static int total_num_edges_instrumented
static int total_num_blocks_created
static int total_num_passes
static int total_num_times_called
static int total_hist_br_prob [20]
static int total_num_branches

Function Documentation

void branch_prob ( void  )
   Instrument and/or analyze program behavior based on program the CFG.

   This function creates a representation of the control flow graph (of
   the function being compiled) that is suitable for the instrumentation
   of edges and/or converting measured edge counts to counts on the
   complete CFG.

   When FLAG_PROFILE_ARCS is nonzero, this function instruments the edges in
   the flow graph that are needed to reconstruct the dynamic behavior of the
   flow graph.  This data is written to the gcno file for gcov.

   When FLAG_BRANCH_PROBABILITIES is nonzero, this function reads auxiliary
   information from the gcda file containing edge count information from
   previous executions of the function being compiled.  In this case, the
   control flow graph is annotated with actual execution counts by
   compute_branch_probabilities().

   Main entry point of this file.  
     We can't handle cyclic regions constructed using abnormal edges.
     To avoid these we replace every source of abnormal edge by a fake
     edge from entry node and every destination by fake edge to exit.
     This keeps graph acyclic and our calculation exact for all normal
     edges except for exit and entrance ones.

     We also add fake exit edges for each call and asm statement in the
     basic, since it may not return.  
         Functions returning multiple times are not handled by extra edges.
         Instead we simply allow negative counts on edges from exit to the
         block past call and corresponding probabilities.  We can't go
         with the extra edges because that would result in flowgraph that
         needs to have fake edges outside the spanning tree.  
             It may happen that there are compiler generated statements
             without a locus at all.  Go through the basic block from the
             last to the first statement looking for a locus.  
             Edge with goto locus might get wrong coverage info unless
             it is the only edge out of BB.
             Don't do that when the locuses match, so
             if (blah) goto something;
             is not computed twice.  
             Avoid bbs that have both fake entry edge and also some
             exit edge.  One of those edges wouldn't be added to the
             spanning tree, but we can't instrument any of them.  
                 Don't split the bbs containing __builtin_setjmp_receiver
                 or __builtin_setjmp_dispatcher calls.  These are very
                 special and don't expect anything to be inserted before
                 them.  
     The basic blocks are expected to be numbered sequentially.  
         Mark edges we've replaced by fake edges above as ignored.  
     Create spanning tree from basic block graph, mark each edge that is
     on the spanning tree.  We insert as many abnormal and critical edges
     as possible to minimize number of edge splits necessary.  
     Fake edges that are not on the tree will not be instrumented, so
     mark them ignored.  
     Compute two different checksums. Note that we want to compute
     the checksum in only once place, since it depends on the shape
     of the control flow which can change during 
     various transformations.  
     Write the data from which gcov can reconstruct the basic block
     graph and function line numbers (the gcno file).  
         Basic block flags 
         Arcs 
                     On trees we don't have fallthru flags, but we can
                     recompute them from CFG shape.  
         Line numbers.  
         Initialize the output.  
             Notice GOTO expressions eliminated while constructing the CFG.  
                 A file of NULL indicates the end of run.  
     For each edge not on the spanning tree, add counting code.  
         Commit changes done by instrumentation.  

References gimple_has_location(), and gsi_stmt().

static void compute_branch_probabilities ( )
static
   Compute the branch probabilities for the various branches.
   Annotate them accordingly.  

   CFG_CHECKSUM is the precomputed checksum for the CFG.  
     Very simple sanity checks so we catch bugs in our profiling code.  
     Attach extra info block to each bb.  
     Avoid predicting entry on exit nodes.  
     For every block in the file,
     - if every exit/entrance edge has a known count, then set the block count
     - if the block count is known, and every exit/entrance edge but one has
     a known execution count, then set the count of the remaining edge

     As edge counts are set, decrement the succ/pred count, but don't delete
     the edge, that way we can easily tell when all edges are known, or only
     one edge is unknown.  
     The order that the basic blocks are iterated through is important.
     Since the code that finds spanning trees starts with block 0, low numbered
     edges are put on the spanning tree in preference to high numbered edges.
     Hence, most instrumented edges are at the end.  Graph solving works much
     faster if we propagate numbers from the end to the start.

     This takes an average of slightly more than 3 passes.  
                     One of the counts will be invalid, but it is zero,
                     so adding it in also doesn't hurt.  
                     Search for the invalid edge, and set its count.  
                     Calculate count for remaining edge by conservation.  
                     One of the counts will be invalid, but it is zero,
                     so adding it in also doesn't hurt.  
                     Search for the invalid edge, and set its count.  
                     Calculate count for remaining edge by conservation.  
     If the graph has been correctly solved, every block will have a
     succ and pred count of zero.  
     Check for inconsistent basic block counts 
            Inconsistency detected. Make it flow-consistent. 
            Set bb counts to the sum of the outgoing edge counts 
     For every edge, calculate its branch probability and add a reg_note
     to the branch insn to indicate this.  
             Function may return twice in the cased the called function is
             setjmp or calls fork, but we can't represent this by extra
             edge from the entry, since extra edge from the exit is
             already present.  We get negative frequency from the entry
             point.  
                 Find the branch edge.  It is possible that we do have fake
                 edges here.  
         As a last resort, distribute the probabilities evenly.
         Use simple heuristics that if there are normal edges,
         give all abnormals frequency of 0, otherwise distribute the
         frequency over abnormals (this is the case of noreturn
         calls).  
static int compute_frequency_overlap ( )
static
   Compare the static estimated profile to the actual profile, and
   return the "degree of overlap" measure between them.

   Degree of overlap is a number between 0 and OVERLAP_BASE. It is
   the sum of each basic block's minimum relative weights between
   two profiles. And overlap of OVERLAP_BASE means two profiles are
   identical.  

References changes, error(), get_exec_counts(), gcov_ctr_summary::run_max, gcov_ctr_summary::runs, gcov_ctr_summary::sum_all, and gcov_ctr_summary::sum_max.

static void compute_value_histograms ( histogram_values  values,
unsigned  cfg_checksum,
unsigned  lineno_checksum 
)
static
   Load value histograms values whose description is stored in VALUES array
   from .gcda file.  

   CFG_CHECKSUM is the precomputed checksum for the CFG.  
void end_branch_prob ( void  )
   Performs file-level cleanup after branch-prob processing
   is completed.  

Referenced by rest_of_type_compilation().

static basic_block find_group ( )
static
   Union find algorithm implementation for the basic blocks using
   aux fields.  
     Compress path.  

References edge_def::dest, edge_info::ignore, basic_block_def::index, edge_def::src, and union_groups().

static void find_spanning_tree ( struct edge_list )
static
   Forward declarations.  
static void find_spanning_tree ( )
static
   This function searches all of the edges in the program flow graph, and puts
   as many bad edges as possible onto the spanning tree.  Bad edges include
   abnormals edges, which can't be instrumented at the moment.  Since it is
   possible for fake edges to form a cycle, we will have to develop some
   better way in the future.  Also put critical edges to the tree, since they
   are more expensive to instrument.  
     We use aux field for standard union-find algorithm.  
     Add fake edge exit to entry we can't instrument.  
     First add all abnormal edges to the tree unless they form a cycle. Also
     add all edges to EXIT_BLOCK_PTR to avoid inserting profiling code behind
     setting return value from function.  
     Now insert all critical edges to the tree unless they form a cycle.  
     And now the rest.  
gcov_working_set_t* find_working_set ( )
   Given a the desired percentage of the full profile (sum_all from the
   summary), multiplied by 10 to avoid float in PCT_TIMES_10, returns
   the corresponding working set information. If an exact match for
   the percentage isn't found, the closest value is used.  

References edge_info::ignore, edge_info::on_tree, and basic_block_def::succs.

static gcov_type* get_exec_counts ( )
static
   Computes hybrid profile for all matching entries in da_file.  
   
   CFG_CHECKSUM is the precomputed checksum for the CFG.  
     Count the edges to be (possibly) instrumented.  

Referenced by compute_frequency_overlap().

void get_working_sets ( void  )
   Fill the working set information into the profile_info structure.  
         Multiply the percentage by 100 to avoid float.  
             Print out the percentage using int arithmatic to avoid float.  

References dump_file, HOST_WIDEST_INT, HOST_WIDEST_INT_PRINT_DEC, gcov_working_set_info::min_counter, and gcov_working_set_info::num_counters.

void init_branch_prob ( void  )
   Perform file-level initialization for branch-prob processing.  
static unsigned instrument_edges ( )
static
   Add edge instrumentation code to the entire insn chain.

   F is the first insn of the chain.
   NUM_BLOCKS is the number of basic blocks found in F.  

References edge_def::dest, dump_file, edge_def::flags, gimple_gen_edge_profiler(), edge_info::ignore, basic_block_def::index, edge_info::on_tree, and edge_def::src.

static bool is_edge_inconsistent ( )
static
static bool is_inconsistent ( )
static
   Check consistency.
   Return true if inconsistency is found.  
static void output_location ( char const *  file_name,
int  line,
gcov_position_t offset,
basic_block  bb 
)
static
   When passed NULL as file_name, initialize.
   When passed something else, output the necessary commands to change
   line to LINE and offset to FILE_NAME.  
         If this is a new source file, then output the
         file's name to the .bb file.  

References add_noreturn_fake_exit_edges(), flow_call_edges_add(), gsi_end_p(), gsi_last_nondebug_bb(), gsi_prev_nondebug(), last, basic_block_def::succs, and total_num_times_called.

static int read_profile_edge_counts ( )
static
   Reads profile data and returns total number of edge counts read 
     For each edge not on the spanning tree, set its execution count from
     the .da file.  
     The first count in the .da file is the number of times that the function
     was entered.  This is the exec_count for block zero.  

References edge_def::dest, dump_enabled_p(), dump_printf_loc(), error(), basic_block_def::index, and input_location.

static void set_bb_counts ( )
static
   Set each basic block count to the sum of its outgoing edge counts 

References edge_info::ignore, edge_info::on_tree, and basic_block_def::succs.

static void union_groups ( )
static
     ??? I don't have a place for the rank field.  OK.  Lets go w/o it,
     this code is unlikely going to be performance problem anyway.  

References edge_def::dest, basic_block_def::index, and edge_def::src.

Referenced by find_group().


Variable Documentation

gcov_working_set_t gcov_working_sets[NUM_GCOV_WORKING_SETS]
static
   Counter working set information computed from the current counter
   summary. Not initialized unless profile_info summary is non-NULL.  
struct gcov_ctr_summary* profile_info
   Counter summary from the last set of coverage counts read.  

Referenced by bb_in_region_p().

int total_hist_br_prob[20]
static
int total_num_blocks
static
   Collect statistics on the performance of this pass for the entire source
   file.  
int total_num_blocks_created
static
int total_num_branches
static
int total_num_edges
static
int total_num_edges_ignored
static
int total_num_edges_instrumented
static
int total_num_passes
static
int total_num_times_called
static

Referenced by output_location().