PhD Lars Karlsson

Umeå University
Department of Computing Science
Assistant professor in parallel and multi-core computing

Contact information

UMIT Research Lab on the ground floor of the MIT building
+46-(0)90-786 70 24

Lars Karlsson
Dept. of Computing Science
Umeå University
SE-901 87 Umeå


Parallel and high-performance computing
Matrix and tensor computations
Efficient use of memory hierarchies
Fine-grained and dynamic scheduling
Multi-scale/hybrid parallelism





  1. Lars Karlsson, Daniel Kressner, and Bruno Lang. Optimally Packed Chains of Bulges in Multishift QR Algorithms. ACM Transactions on Mathematical Software (accepted 2013).
  2. Lars Karlsson, Bo Kågström, and Eddie Wadbro. Fine-Grained Bulge-Chasing Kernels for Strongly Scalable Parallel QR Algorithms. Parallel Computing (accepted 2013).
  3. Lars Karlsson and Bo Kågström. Parallel two-stage reduction to Hessenberg form on shared-memory architectures. Parallel Computing, volume 37, issue 12, December 2011, pages 771-782.
  4. Lars Karlsson and Bo Kågström. Efficient reduction from block Hessenberg form to Hessenberg form using shared memory. Proceedings of PARA 2010, Applied Parallel and Scientific Computing, LNCS, volume 7134, 2012, pages 258-268.
  5. Bo Kågström, Lars Karlsson, and Daniel Kressner. Computing Codimensions and Generic Canonical Forms for Generalized Matrix Products. Electronic Journal of Linear Algebra, volume 22, 2011, pages 277-309.
  6. Fred Gustavson, Lars Karlsson, and Bo Kågström. Parallel and Cache-Efficient In-Place Matrix Storage Format Conversion. ACM Transactions on Mathematical Software, volume 38, issue 3, April 2012, pages 17:1-17:32.
  7. Lars Karlsson. Blocked and Scalable Matrix Computations: Packed Cholesky, In-Place Transposition, and Two-Sided Transformations. Licentiate Thesis, Dept. of Computing Science, Umeå University, Sweden, 2009. Report UMINF 09.11, ISBN 978-91-7264-788-6.
  8. Lars Karlsson. Blocked In-Place Transposition with Application to Storage Format Conversion. Technical Report UMINF 09.01, Dept. of Computing Science, Umeå University, Sweden, 2009.
  9. Lars Karlsson and Bo Kågström. A Framework for Dynamic Node-Scheduling of Two-Sided Blocked Matrix Computations. In Proceedings of PARA 2008 (accepted), 2009.
  10. Fred Gustavson, Lars Karlsson, and Bo Kågström. Distributed SBP Cholesky Factorization Algorithms with Near-Optimal Scheduling. ACM Transactions on Mathematical Software, volume 36, issue 2, 2009, pages 11:1-11:25. (Also published as Report UMINF 07.19 and IBM Research Report RC24342.)
  11. Fred Gustavson, Lars Karlsson, and Bo Kågström. Three Algorithms for Cholesky Factorization on Distributed Memory using Packed Storage. In Applied Parallel Computing: State of the Art in Scientific Computing (PARA 2006), Lecture Notes in Computer Science, LNCS 4699, pages 550-559, Springer, 2007.


In-Place Matrix Transposition and Matrix Storage Format Conversion

Software to efficiently transpose a matrix in place or to convert between the standard column- and row-major matrix storage formats and the four standard blocked formats.

Codimensions of Generalized Matrix Products

Software to compute the codimension of a generalized matrix product given in canonical form.

Distributed SBP Cholesky Factorization

Prototype software to efficiently compute a dense Cholesky factorization on a distributed-memory machine using the Distributed Square Block Packed (Distributed SBP) storage format.