Today I was asked this question. We have 2 cases with code blocks A, B and C. These code blocks don't share any resources except an iterator (int i).
Please give 3 possible reasons why case 1 could be faster than case 2, and 3 possible reasons why case 2 could be faster than case 1:
Case 1:
for (i=0; i<N; ++i){
    A;
    B;
    C;
}
Case 2:
for (i=0; i<N; ++i){
    A;
}
for (i=0; i<N; ++i){
    B;
}
for (i=0; i<N; ++i){
    C;
}
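To make the two cases concrete enough to profile, here is a minimal sketch in C. The bodies block_a, block_b and block_c, the array arguments and the size N are all hypothetical stand-ins for A, B and C (the question does not say what they do); the only thing taken from the question is that the blocks share nothing except i. The sketch only shows that both cases compute the same result; which one is faster has to be measured on the target machine.

```c
#include <string.h>

#define N 1000 /* hypothetical problem size, not from the question */

/* Hypothetical stand-ins for A, B and C: each writes only to its own array. */
static void block_a(int *a, int i) { a[i] = i * 2; }
static void block_b(int *b, int i) { b[i] = i + 7; }
static void block_c(int *c, int i) { c[i] = i * i; }

/* Case 1: a single loop runs A, B and C for each i. */
void run_fused(int *a, int *b, int *c, int n) {
    for (int i = 0; i < n; ++i) {
        block_a(a, i);
        block_b(b, i);
        block_c(c, i);
    }
}

/* Case 2: three separate loops, one per block. */
void run_split(int *a, int *b, int *c, int n) {
    for (int i = 0; i < n; ++i) {
        block_a(a, i);
    }
    for (int i = 0; i < n; ++i) {
        block_b(b, i);
    }
    for (int i = 0; i < n; ++i) {
        block_c(c, i);
    }
}

/* Returns 1 if both cases fill their arrays identically, 0 otherwise. */
int cases_agree(void) {
    static int a1[N], b1[N], c1[N];
    static int a2[N], b2[N], c2[N];
    run_fused(a1, b1, c1, N);
    run_split(a2, b2, c2, N);
    return memcmp(a1, a2, sizeof a1) == 0 &&
           memcmp(b1, b2, sizeof b1) == 0 &&
           memcmp(c1, c2, sizeof c1) == 0;
}
```

Because the blocks are independent, the two versions differ only in loop overhead and in the order memory and instructions are touched, so timing run_fused against run_split with your real blocks and a realistic N is the way to see which effect dominates.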
x is faster because 1) profiling showed as much, 2) profiling revealed as much, and 3) profiling barfed on case y. Alternative answer: depends on system cache size, code size of the functions, what the functions do (what data they access, which could make the previous point moot), etc. - rubenvb
loop fusion (case 1) and loop fission (case 2), two optimizations sometimes performed by compilers. - Luc Touraille
for loops in case 2? - Thomas Matthews