=== analyzing loop ===
Analyzing loop at loop-interchange.c:20
=== analyze_loop_nest ===
=== vect_analyze_loop_form ===
=== get_loop_niters ===
=== vect_analyze_data_refs ===
got vectype for stmt:
_5 = u[_4];
vector(2) double
=== vect_analyze_scalar_cycles ===
Analyze phi:
sum_40 = PHI <sum_42(5), sum_7(11)>
Access function of PHI:
{sum_42, +, _5}_3
step:
_5
, init:
sum_42
step unknown.
Analyze phi:
i_26 = PHI <0(5), i_25(11)>
Access function of PHI:
{0, +, 1}_3
step:
1
, init:
0
Detected induction.
Analyze phi:
ivtmp_24 = PHI <1335(5), ivtmp_19(11)>
Access function of PHI:
{1335, +, 4294967295}_3
step:
4294967295
, init:
1335
Detected induction.
Analyze phi:
ivtmp_45 = PHI <1335(5), ivtmp_46(11)>
Access function of PHI:
{1335, +, 4294967295}_3
step:
4294967295
, init:
1335
Detected induction.
Analyze phi:
sum_40 = PHI <sum_42(5), sum_7(11)>
detected reduction:
sum_7 = _5 + sum_40;
Detected reduction.
=== vect_pattern_recog ===
vect_is_simple_use: operand
j_17
def_stmt:
j_17 = PHI <0(2), j_10(10)>
type of def: external
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
=== vect_analyze_data_ref_accesses ===
=== vect_mark_stmts_to_be_vectorized ===
init: phi relevant?
sum_40 = PHI <sum_42(5), sum_7(11)>
init: phi relevant?
i_26 = PHI <0(5), i_25(11)>
init: phi relevant?
ivtmp_24 = PHI <1335(5), ivtmp_19(11)>
init: phi relevant?
ivtmp_45 = PHI <1335(5), ivtmp_46(11)>
init: stmt relevant?
# DEBUG sum => NULL
init: stmt relevant?
# DEBUG i => i_26
init: stmt relevant?
# DEBUG sum => NULL
init: stmt relevant?
# DEBUG j => 0
init: stmt relevant?
# DEBUG sum => sum_40
init: stmt relevant?
# DEBUG j => j_17
init: stmt relevant?
# DEBUG BEGIN_STMT
init: stmt relevant?
_2 = j_17 * 1335;
init: stmt relevant?
_4 = _2 + i_26;
init: stmt relevant?
_5 = u[_4];
init: stmt relevant?
sum_7 = _5 + sum_40;
vec_stmt_relevant_p: used out of loop.
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
vec_stmt_relevant_p: stmt live but not relevant.
mark relevant 1, live 1:
sum_7 = _5 + sum_40;
init: stmt relevant?
# DEBUG sum => sum_7
init: stmt relevant?
# DEBUG D#2 => j_17 + 1
init: stmt relevant?
# DEBUG j => D#2
init: stmt relevant?
# DEBUG sum => sum_7
init: stmt relevant?
# DEBUG j => D#2
init: stmt relevant?
# DEBUG i => NULL
init: stmt relevant?
# DEBUG sum => NULL
init: stmt relevant?
# DEBUG i => NULL
init: stmt relevant?
ivtmp_19 = ivtmp_24 - 1;
init: stmt relevant?
i_25 = i_26 + 1;
init: stmt relevant?
ivtmp_46 = ivtmp_45 - 1;
init: stmt relevant?
if (ivtmp_46 != 0)
worklist: examine stmt:
sum_7 = _5 + sum_40;
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
mark relevant 1, live 0:
_5 = u[_4];
vect_is_simple_use: operand
sum_40
def_stmt:
sum_40 = PHI <sum_42(5), sum_7(11)>
type of def: reduction
mark relevant 1, live 0:
sum_40 = PHI <sum_42(5), sum_7(11)>
worklist: examine stmt:
sum_40 = PHI <sum_42(5), sum_7(11)>
vect_is_simple_use: operand
sum_42
def_stmt:
sum_42 = PHI <0.0(2), sum_37(10)>
type of def: external
def_stmt is out of loop.
vect_is_simple_use: operand
sum_7
def_stmt:
sum_7 = _5 + sum_40;
type of def: reduction
reduc-stmt defining reduc-phi in the same nest.
worklist: examine stmt:
_5 = u[_4];
=== vect_analyze_data_ref_dependences ===
=== vect_determine_vectorization_factor ===
==> examining phi:
sum_40 = PHI <sum_42(5), sum_7(11)>
get vectype for scalar type:
double
vectype:
vector(2) double
nunits = 2
==> examining phi:
i_26 = PHI <0(5), i_25(11)>
==> examining phi:
ivtmp_24 = PHI <1335(5), ivtmp_19(11)>
==> examining phi:
ivtmp_45 = PHI <1335(5), ivtmp_46(11)>
==> examining statement:
# DEBUG sum => NULL
skip.
==> examining statement:
# DEBUG i => i_26
skip.
==> examining statement:
# DEBUG sum => NULL
skip.
==> examining statement:
# DEBUG j => 0
skip.
==> examining statement:
# DEBUG sum => sum_40
skip.
==> examining statement:
# DEBUG j => j_17
skip.
==> examining statement:
# DEBUG BEGIN_STMT
skip.
==> examining statement:
_2 = j_17 * 1335;
skip.
==> examining statement:
_4 = _2 + i_26;
skip.
==> examining statement:
_5 = u[_4];
get vectype for scalar type:
double
vectype:
vector(2) double
nunits = 2
==> examining statement:
sum_7 = _5 + sum_40;
get vectype for scalar type:
double
vectype:
vector(2) double
get vectype for scalar type:
double
vectype:
vector(2) double
nunits = 2
==> examining statement:
# DEBUG sum => sum_7
skip.
==> examining statement:
# DEBUG D#2 => j_17 + 1
skip.
==> examining statement:
# DEBUG j => D#2
skip.
==> examining statement:
# DEBUG sum => sum_7
skip.
==> examining statement:
# DEBUG j => D#2
skip.
==> examining statement:
# DEBUG i => NULL
skip.
==> examining statement:
# DEBUG sum => NULL
skip.
==> examining statement:
# DEBUG i => NULL
skip.
==> examining statement:
ivtmp_19 = ivtmp_24 - 1;
skip.
==> examining statement:
i_25 = i_26 + 1;
skip.
==> examining statement:
ivtmp_46 = ivtmp_45 - 1;
skip.
==> examining statement:
if (ivtmp_46 != 0)
skip.
vectorization factor = 2
=== vect_analyze_slp ===
=== vect_make_slp_decision ===
vectorization_factor = 2, niters = 1335
=== vect_analyze_data_refs_alignment ===
recording new base alignment for
&u
alignment: 16
misalignment: 0
based on:
_5 = u[_4];
vect_compute_data_ref_alignment:
Unknown alignment for access:
u[_4]
=== vect_prune_runtime_alias_test_list ===
=== vect_enhance_data_refs_alignment ===
Unknown misalignment, naturally aligned
vect_can_advance_ivs_p:
Analyze phi:
sum_40 = PHI <sum_42(5), sum_7(11)>
reduc or virtual phi. skip.
Analyze phi:
i_26 = PHI <0(5), i_25(11)>
Analyze phi:
ivtmp_24 = PHI <1335(5), ivtmp_19(11)>
Analyze phi:
ivtmp_45 = PHI <1335(5), ivtmp_46(11)>
vect_model_load_cost: aligned.
vect_get_data_access_cost: inside_cost = 12, outside_cost = 0.
vect_model_load_cost: unaligned supported by hardware.
vect_get_data_access_cost: inside_cost = 12, outside_cost = 0.
Vectorizing an unaligned access.
=== vect_analyze_loop_operations ===
examining phi:
sum_40 = PHI <sum_42(5), sum_7(11)>
examining phi:
i_26 = PHI <0(5), i_25(11)>
examining phi:
ivtmp_24 = PHI <1335(5), ivtmp_19(11)>
examining phi:
ivtmp_45 = PHI <1335(5), ivtmp_46(11)>
==> examining statement:
# DEBUG sum => NULL
irrelevant.
==> examining statement:
# DEBUG i => i_26
irrelevant.
==> examining statement:
# DEBUG sum => NULL
irrelevant.
==> examining statement:
# DEBUG j => 0
irrelevant.
==> examining statement:
# DEBUG sum => sum_40
irrelevant.
==> examining statement:
# DEBUG j => j_17
irrelevant.
==> examining statement:
# DEBUG BEGIN_STMT
irrelevant.
==> examining statement:
_2 = j_17 * 1335;
irrelevant.
==> examining statement:
_4 = _2 + i_26;
irrelevant.
==> examining statement:
_5 = u[_4];
num. args = 4 (not unary/binary/ternary op).
vect_is_simple_use: operand
u[_4]
not ssa-name.
use not simple.
can't use a fully-masked loop because the target doesn't have the appropriate masked load or store.
vect_model_load_cost: unaligned supported by hardware.
vect_model_load_cost: inside_cost = 12, prologue_cost = 0 .
==> examining statement:
sum_7 = _5 + sum_40;
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
vect_is_simple_use: operand
sum_40
def_stmt:
sum_40 = PHI <sum_42(5), sum_7(11)>
type of def: reduction
reduc op not supported by target.
vect_model_reduction_cost: inside_cost = 12, prologue_cost = 4, epilogue_cost = 28 .
==> examining statement:
# DEBUG sum => sum_7
irrelevant.
==> examining statement:
# DEBUG D#2 => j_17 + 1
irrelevant.
==> examining statement:
# DEBUG j => D#2
irrelevant.
==> examining statement:
# DEBUG sum => sum_7
irrelevant.
==> examining statement:
# DEBUG j => D#2
irrelevant.
==> examining statement:
# DEBUG i => NULL
irrelevant.
==> examining statement:
# DEBUG sum => NULL
irrelevant.
==> examining statement:
# DEBUG i => NULL
irrelevant.
==> examining statement:
ivtmp_19 = ivtmp_24 - 1;
irrelevant.
==> examining statement:
i_25 = i_26 + 1;
irrelevant.
==> examining statement:
ivtmp_46 = ivtmp_45 - 1;
irrelevant.
==> examining statement:
if (ivtmp_46 != 0)
irrelevant.
not using a fully-masked loop.
Cost model analysis:
Vector inside of loop cost: 24
Vector prologue cost: 4
Vector epilogue cost: 44
Scalar iteration cost: 16
Scalar outside cost: 0
Vector outside cost: 48
prologue iterations: 0
epilogue iterations: 1
Calculated minimum iters for profitability: 10
Runtime profitability threshold = 10
Static estimate profitability threshold = 10
epilog loop required
vect_can_advance_ivs_p:
Analyze phi:
sum_40 = PHI <sum_42(5), sum_7(11)>
reduc or virtual phi. skip.
Analyze phi:
i_26 = PHI <0(5), i_25(11)>
Analyze phi:
ivtmp_24 = PHI <1335(5), ivtmp_19(11)>
Analyze phi:
ivtmp_45 = PHI <1335(5), ivtmp_46(11)>
loop vectorized
=== vec_transform_loop ===
vect_can_advance_ivs_p:
Analyze phi:
sum_40 = PHI <sum_7(11), sum_42(5)>
reduc or virtual phi. skip.
Analyze phi:
i_26 = PHI <i_25(11), 0(5)>
Analyze phi:
ivtmp_24 = PHI <ivtmp_19(11), 1335(5)>
Analyze phi:
ivtmp_45 = PHI <ivtmp_46(11), 1335(5)>
vect_update_ivs_after_vectorizer: phi:
sum_40 = PHI <sum_7(11), sum_42(5)>
reduc or virtual phi. skip.
vect_update_ivs_after_vectorizer: phi:
i_26 = PHI <i_25(11), 0(5)>
vect_update_ivs_after_vectorizer: phi:
ivtmp_24 = PHI <ivtmp_19(11), 1335(5)>
vect_update_ivs_after_vectorizer: phi:
ivtmp_45 = PHI <ivtmp_46(11), 1335(5)>
------>vectorizing phi:
sum_40 = PHI <sum_7(11), sum_42(16)>
transform phi.
------>vectorizing phi:
i_26 = PHI <i_25(11), 0(16)>
------>vectorizing phi:
ivtmp_24 = PHI <ivtmp_19(11), 1335(16)>
------>vectorizing phi:
ivtmp_45 = PHI <ivtmp_46(11), 1335(16)>
------>vectorizing phi:
vect_sum_7.15_51 = PHI <(11), (16)>
------>vectorizing statement:
# DEBUG sum => NULL
------>vectorizing statement:
# DEBUG i => i_26
------>vectorizing statement:
# DEBUG sum => NULL
------>vectorizing statement:
# DEBUG j => 0
------>vectorizing statement:
# DEBUG sum => sum_40
------>vectorizing statement:
# DEBUG j => j_17
------>vectorizing statement:
# DEBUG BEGIN_STMT
------>vectorizing statement:
_2 = j_17 * 1335;
------>vectorizing statement:
_4 = _2 + i_26;
------>vectorizing statement:
_5 = u[_4];
transform statement.
transform load. ncopies = 1
create vector_type-pointer variable to type:
vector(2) double
vectorizing an array ref:
u
created
vectp_u.17_52
add new stmt:
vect__5.18_58 = MEM[(double *)vectp_u.16_56];
------>vectorizing statement:
sum_7 = _5 + sum_40;
transform statement.
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
vect_is_simple_use: operand
sum_40
def_stmt:
sum_40 = PHI <sum_7(11), sum_42(16)>
type of def: reduction
reduc op not supported by target.
transform reduction.
vect_get_vec_def_for_operand:
_5
vect_is_simple_use: operand
_5
def_stmt:
_5 = u[_4];
type of def: internal
def_stmt =
_5 = u[_4];
vect_get_vec_def_for_operand:
sum_40
vect_is_simple_use: operand
sum_40
def_stmt:
sum_40 = PHI <sum_7(11), sum_42(16)>
type of def: reduction
def_stmt =
sum_40 = PHI <sum_7(11), sum_42(16)>
add new stmt:
vect_sum_7.19_59 = vect__5.18_58 + vect_sum_7.15_51;
vect_is_simple_use: operand
sum_42
def_stmt:
sum_42 = PHI <0.0(2), sum_37(10)>
type of def: external
transform reduction: created def-use cycle:
vect_sum_7.15_51 = PHI <vect_sum_7.19_59(11), { 0.0, 0.0 }(16)>
vect_sum_7.19_59 = vect__5.18_58 + vect_sum_7.15_51;
Reduce using vector shifts
extract scalar result
------>vectorizing statement:
# DEBUG sum => sum_7
------>vectorizing statement:
# DEBUG D#2 => j_17 + 1
------>vectorizing statement:
# DEBUG j => D#2
------>vectorizing statement:
# DEBUG sum => sum_7
------>vectorizing statement:
# DEBUG j => D#2
------>vectorizing statement:
# DEBUG i => NULL
------>vectorizing statement:
# DEBUG sum => NULL
------>vectorizing statement:
# DEBUG i => NULL
------>vectorizing statement:
ivtmp_19 = ivtmp_24 - 1;
------>vectorizing statement:
i_25 = i_26 + 1;
------>vectorizing statement:
ivtmp_46 = ivtmp_45 - 1;
------>vectorizing statement:
vectp_u.16_57 = vectp_u.16_56 + 16;
------>vectorizing statement:
if (ivtmp_46 != 0)
New loop exit condition:
if (ivtmp_66 < 667)
LOOP VECTORIZED