
Adds libxsmm support for micro-GEMMs #569

Open
zhihao-deng wants to merge 2 commits into master from zhihao/feature/libxsmm


@zhihao-deng

Contributor

Adds an optional libxsmm fast path for the small strided tensor-of-tensors (ToT) micro-GEMMs, enabled with -DTA_LIBXSMM=ON (default OFF). When it is on, TiledArray fetches libxsmm, builds it from source, and routes the following GEMM families through libxsmm's JIT, falling back to the vendor BLAS for any shape with max(M,N,K) > 64:

  • ce+ce: arena_strided_dgemm_ce_ce_{right,left}
  • ce+e: arena_strided_dgemm_ce_e
  • scale / "mixed": tot_x_t and t_x_tot regimes (Tensor scale path)
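The size cutoff described above can be sketched as a small dispatch predicate. This is a minimal illustration, not the PR's code: `GemmBackend`, `kMaxJitDim`, and `choose_backend` are hypothetical names, and only the threshold of 64 and the fallback to the vendor BLAS come from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not TiledArray's actual API.
enum class GemmBackend { LibxsmmJit, VendorBlas };

// Largest dimension still routed to the JIT, per the PR description.
constexpr std::int64_t kMaxJitDim = 64;

// Pick a backend for an M x N x K micro-GEMM.
// jit_enabled stands in for the build option combined with the runtime switch.
constexpr GemmBackend choose_backend(std::int64_t m, std::int64_t n,
                                     std::int64_t k, bool jit_enabled) {
  if (!jit_enabled) return GemmBackend::VendorBlas;
  // Any single dimension above the cutoff sends the whole call to BLAS.
  return std::max({m, n, k}) <= kMaxJitDim ? GemmBackend::LibxsmmJit
                                           : GemmBackend::VendorBlas;
}
```

With this predicate a 64x64x64 product would take the JIT path, while a 65x1x1 product would go to the vendor BLAS.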

A runtime switch, TA_LIBXSMM=0 (also off, false, or no), routes every micro-GEMM back through the vendor BLAS without rebuilding.
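One way such a runtime switch could be read is sketched below. The helper names are hypothetical, and three behaviours are assumptions rather than facts from the PR: matching is case-insensitive, an unset variable leaves the JIT path on, and the value is read once and cached.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: interpret a TA_LIBXSMM value; nullptr means "unset".
inline bool libxsmm_enabled_from_value(const char* raw) {
  if (raw == nullptr) return true;  // assumption: unset keeps the JIT path on
  std::string v(raw);
  // Assumption: values are compared case-insensitively.
  std::transform(v.begin(), v.end(), v.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return !(v == "0" || v == "off" || v == "false" || v == "no");
}

// Read the environment once; later calls reuse the cached answer.
inline bool libxsmm_enabled() {
  static const bool enabled =
      libxsmm_enabled_from_value(std::getenv("TA_LIBXSMM"));
  return enabled;
}
```

Splitting the parsing from the `std::getenv` call keeps the string handling testable without touching the process environment.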

The ta_test suites arena_strided_dgemm_suite and arena_einsum_unit_suite pass both with libxsmm ON (JIT fires) and with TA_LIBXSMM=0; the results are numerically equivalent to the vendor-BLAS path.

…SMM=ON)

Fetch+build libxsmm from source (no system install assumed) and route the
small strided tensor-of-tensors GEMMs (ce+e, ce+ce, scale) through its JIT,
falling back to vendor BLAS for shapes max(M,N,K)>64. Runtime toggle TA_LIBXSMM=0.
- Install libxsmm.a+headers into TA's prefix; split TiledArray_LIBXSMM into
  BUILD/INSTALL interfaces so the exported config has no build-tree leak.
- Guard 64->32-bit narrowing of lda/ldb/ldc in libxsmm_gemm_le64.
- Make the libxsmm sub-make parallelism configurable (LIBXSMM_BUILD_NJOBS).
- Add a TA_LIBXSMM=1 CTest gate + a direct scale_libxsmm_dgemm numerical test.
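The narrowing guard mentioned in the commit message could look roughly like the sketch below. The helper names are hypothetical, and the PR text does not say what happens when a leading dimension does not fit in 32 bits; here the helper simply reports failure so that a caller could choose another path.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: narrow a 64-bit leading dimension only if it fits
// in a signed 32-bit integer; negative values are rejected as well.
inline std::optional<std::int32_t> narrow_ld(std::int64_t ld) {
  if (ld < 0 || ld > std::numeric_limits<std::int32_t>::max()) {
    return std::nullopt;
  }
  return static_cast<std::int32_t>(ld);
}

// Narrow lda/ldb/ldc together. Returns false, leaving the outputs untouched,
// if any of the three would be truncated.
inline bool narrow_leading_dims(std::int64_t lda, std::int64_t ldb,
                                std::int64_t ldc, std::int32_t& a,
                                std::int32_t& b, std::int32_t& c) {
  const auto na = narrow_ld(lda);
  const auto nb = narrow_ld(ldb);
  const auto nc = narrow_ld(ldc);
  if (!na || !nb || !nc) return false;  // assumption: caller handles failure
  a = *na;
  b = *nb;
  c = *nc;
  return true;
}
```

The sketch needs C++17 for `std::optional`.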
Comment thread external/libxsmm.cmake
set(LIBXSMM_BUILD_BYPRODUCTS "${_LIBXSMM_INSTALL_DIR}/lib/libxsmm.a")
message(STATUS "custom target libxsmm is expected to build these byproducts: ${LIBXSMM_BUILD_BYPRODUCTS}")

ExternalProject_Add(libxsmm

Member


Does libxsmm have a CMake harness so that we can use FetchContent?

Contributor Author


I found that they support CMake now. Will adapt later.

Member


Actually, don't adapt yet; it may not be usable as a subproject. It is safer to use ExternalProject_Add (FetchContent is the better choice only for projects that we control directly).
