Tuesday, June 23, 2015

BASH Matrix Multiplication

tl;dr Bash is not the language for math-intensive operations.





REPS=$1;
FILE_1=$2;
FILE_2=$3
OUTFILENAME=$4;

readonly COLS=`head -1 $FILE_1 | wc -w`;
readonly ROWS=`cat $FILE_1 | wc -l`;
# echo "rows is $ROWS; cols is $COLS"
if [[ $ROWS != $COLS ]]; then
    echo "Expecting square matrices, " \
         "but rows = $ROWS, cols = $COLS\n";
    exit 1;
fi

# --------------------------------------------------
# SUBROUTINES
#
function outputMatrix()
{
    local matrixname=$1;
    local matrix;
    local elem;
    echo "matrix is '$matrixname'.";
    eval matrix=\( \${${matrixname}[@]} \);
    local i=0;
    for elem in "${matrix[@]}"; do
 echo -n "$elem ";
 if (( ++i == $COLS )); then
     echo '';
     i=0;
 fi
    done  
}

function multiply()
{
    declare -a product;
    local M=$1 N=$2;
    local i j k idx1 idx2 idx3;
    for ((i=0; i < $ROWS; i++ )); do
        for ((j=0; j<$COLS; j++)); do
            local result=0;
            for ((k=0; k<$ROWS; k++)); do
                idx1=$((i*$ROWS+k));
                idx2=$((k*$COLS+j));
                result=$((result+${M}[idx1]*${N}[idx2]));
            done
            idx3=$((i*$COLS+j));
            product[$idx3]=$result;
        done
    done
#    outputMatrix product;
}
# --------------------------------------------------
#

matrix1=($(cat $FILE_1));
matrix2=($(cat $FILE_2));


start=`date  +%s.%N`;
for ((repeat = 0; repeat < $REPS; repeat++)); do
    multiply matrix1 matrix2 
done
end=`date  +%s.%N`;
echo "$end - $start" | bc

My core multiplication routine is similar to the one at RosettaCode. Results are drastically inferior compared to Perl or JavaScript, a factor or 200x ~ 1000x as slow.


Notice that the 100x100 matrix has only a single data point, at 55,000 multiplications per second. That took 547 seconds; I wasn't going to wait around 2 hours for more repetitions.

Part of the problem is from flattening a two-dimensional matrix into a single one-dimensional array. Converting indices into a single array index overwhelms the actual matrix multiplication, resulting in 3N^3 + N^2 multiplications. These all happen in "user space", while Perl & JS have the opportunity to carry out array indexing in "language space", potentially more efficient.

Actually, simplifying the index calculations only improved runtime about 10%, not a whole lot. Possibly variable access overwhelms the difference between addition and multiplication.


    for ((i=0; i <; $ROWS; i++ )); do
        for ((j=0; j < $COLS; j++)); do
            local result=0;
            local idx1=$((i*$ROWS));
            local idx2=$j;

            for ((k=0; k < $ROWS; k++)); do
                result=$((result+${M}[idx1]*${N}[idx2]));
                ((idx1 += 1))
                ((idx2 += COLS))
            done
            product[$idx3]=$result;
            ((idx3 ++))
        done
    done



No comments: