We study the arithmetic complexity of iterated matrix multiplication. We show that any multilinear homogeneous depth 4 arithmetic formula computing the product of $d$ generic matrices of size $n \times n$, $\mathrm{IMM}_{n,d}$, has size $n^{\Omega(\sqrt{d})}$ as long as $d \le n^{1/10}$. This improves the result of Nisan and Wigderson (Computational Complexity, 1997) for depth 4 set-multilinear formulas.
We also study $\Sigma\Pi^{[O(d/t)]}\Sigma\Pi^{[t]}$ formulas, which are depth 4 formulas with the stated bounds on the fan-ins of the $\Pi$ gates. A recent depth reduction result of Tavenas (MFCS, 2013) shows that any $n$-variate degree $d = n^{O(1)}$ polynomial computable by a circuit of size $\mathrm{poly}(n)$ can also be computed by a depth 4 $\Sigma\Pi^{[O(d/t)]}\Sigma\Pi^{[t]}$ formula of top fan-in $n^{O(d/t)}$. We show that any such formula computing $\mathrm{IMM}_{n,d}$ has top fan-in $n^{\Omega(d/t)}$, proving the optimality of Tavenas' result. This also strengthens a result of Kayal, Saha, and Saptharishi (ECCC, 2013), which gives a similar lower bound for an explicit polynomial in VNP.