Interpolation is a key component of many image processing pipelines. In this document, we describe how to optimize an implementation of image B-spline interpolation for GPU architectures. The implementation is based on the one proposed by Briand and Monasse in 2018 and supports B-spline orders up to 11. The two main optimizations are (1) transposing the B-spline coefficients before prefiltering the rows and (2) dividing the columns into subregions so that more threads can be used. We also assess the impact of floating-point precision and of using high B-spline orders.
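
To illustrate optimization (1) only, the following is a minimal CPU sketch in NumPy, not the GPU implementation itself. It assumes the classical recursive cubic (order 3) prefilter with mirror boundary conditions and a truncated causal initialization, which may differ from the boundary handling of Briand and Monasse. The names `prefilter_axis0`, `prefilter_rows_via_transpose` and the constant `HORIZON` are hypothetical and chosen for this example. The point it shows is that prefiltering the rows gives the same result as transposing, prefiltering the columns, and transposing back, so that each recursion step works on one contiguous line of memory.

```python
import numpy as np

# Pole of the cubic (order 3) B-spline prefilter; higher orders have more poles.
POLE = np.sqrt(3.0) - 2.0
# Truncation length of the causal initialization (assumed value for this sketch).
HORIZON = 40


def prefilter_axis0(data, pole=POLE):
    """Recursive cubic B-spline prefilter along axis 0 (1D signal or 2D image).

    For a 2D array, all columns are filtered independently, and each step of
    the recursion reads and writes one whole contiguous row.
    """
    gain = (1.0 - pole) * (1.0 - 1.0 / pole)
    c = np.asarray(data, dtype=np.float64) * gain  # new array, input untouched
    n = c.shape[0]
    k = min(n, HORIZON)
    # Causal initialization: truncated geometric sum (approximate).
    c[0] = (pole ** np.arange(k)) @ c[:k]
    # Causal recursion.
    for i in range(1, n):
        c[i] += pole * c[i - 1]
    # Anticausal initialization for mirror boundaries.
    c[-1] = (pole / (pole * pole - 1.0)) * (pole * c[-2] + c[-1])
    # Anticausal recursion.
    for i in range(n - 2, -1, -1):
        c[i] = pole * (c[i + 1] - c[i])
    return c


def prefilter_rows_via_transpose(img):
    """Prefilter the rows by transposing, filtering columns, transposing back."""
    transposed = np.ascontiguousarray(img.T)  # explicit transposed copy
    return prefilter_axis0(transposed).T
```

In this sketch the transposition costs one extra copy of the coefficients; on a GPU the corresponding benefit would be that neighbouring threads access neighbouring memory locations during the row pass.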