I: pbuilder: network access will be disabled during build I: Current time: Wed Jun 4 07:12:23 +14 2025 I: pbuilder-time-stamp: 1748970743 I: Building the build Environment I: extracting base tarball [/var/cache/pbuilder/trixie-reproducible-base.tgz] I: copying local configuration W: --override-config is not set; not updating apt.conf Read the manpage for details. I: mounting /proc filesystem I: mounting /sys filesystem I: creating /{dev,run}/shm I: mounting /dev/pts filesystem I: redirecting /dev/ptmx to /dev/pts/ptmx I: policy-rc.d already exists I: Copying source file I: copying [gemmlowp_0.0~git20211220.e844ffd-1.dsc] I: copying [./gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz] I: copying [./gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz] I: Extracting source gpgv: Signature made Fri Jun 24 05:56:40 2022 gpgv: using RSA key 638BC75EC1E5C589067E35DE62645EB35F686A8A gpgv: issuer "lumin@debian.org" gpgv: Can't check signature: No public key dpkg-source: warning: cannot verify inline signature for ./gemmlowp_0.0~git20211220.e844ffd-1.dsc: no acceptable signature found dpkg-source: info: extracting gemmlowp in gemmlowp-0.0~git20211220.e844ffd dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz dpkg-source: info: using patch list from debian/patches/series dpkg-source: info: applying 0001-cmake-build-fix.patch I: using fakeroot in build. I: Installing the build-deps I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/D01_modify_environment starting debug: Running on ionos5-amd64. I: Changing host+domainname to test build reproducibility I: Adding a custom variable just for the fun of it... 
I: Changing /bin/sh to bash '/bin/sh' -> '/bin/bash' lrwxrwxrwx 1 root root 9 Jun 3 17:12 /bin/sh -> /bin/bash I: Setting pbuilder2's login shell to /bin/bash I: Setting pbuilder2's GECOS to second user,second room,second work-phone,second home-phone,second other I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/D01_modify_environment finished I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/D02_print_environment starting I: set BASH=/bin/sh BASHOPTS=checkwinsize:cmdhist:complete_fullquote:extquote:force_fignore:globasciiranges:globskipdots:hostcomplete:interactive_comments:patsub_replacement:progcomp:promptvars:sourcepath BASH_ALIASES=() BASH_ARGC=() BASH_ARGV=() BASH_CMDS=() BASH_LINENO=([0]="12" [1]="0") BASH_LOADABLES_PATH=/usr/local/lib/bash:/usr/lib/bash:/opt/local/lib/bash:/usr/pkg/lib/bash:/opt/pkg/lib/bash:. BASH_SOURCE=([0]="/tmp/hooks/D02_print_environment" [1]="/tmp/hooks/D02_print_environment") BASH_VERSINFO=([0]="5" [1]="2" [2]="21" [3]="1" [4]="release" [5]="x86_64-pc-linux-gnu") BASH_VERSION='5.2.21(1)-release' BUILDDIR=/build/reproducible-path BUILDUSERGECOS='second user,second room,second work-phone,second home-phone,second other' BUILDUSERNAME=pbuilder2 BUILD_ARCH=amd64 DEBIAN_FRONTEND=noninteractive DEB_BUILD_OPTIONS='buildinfo=+all reproducible=+all parallel=42 ' DIRSTACK=() DISTRIBUTION=trixie EUID=0 FUNCNAME=([0]="Echo" [1]="main") GROUPS=() HOME=/root HOSTNAME=i-capture-the-hostname HOSTTYPE=x86_64 HOST_ARCH=amd64 IFS=' ' INVOCATION_ID=d87ef0efd2f84062b6f89f43886325a6 LANG=C LANGUAGE=et_EE:et LC_ALL=C MACHTYPE=x86_64-pc-linux-gnu MAIL=/var/mail/root OPTERR=1 OPTIND=1 OSTYPE=linux-gnu PATH=/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/i/capture/the/path PBCURRENTCOMMANDLINEOPERATION=build PBUILDER_OPERATION=build PBUILDER_PKGDATADIR=/usr/share/pbuilder PBUILDER_PKGLIBDIR=/usr/lib/pbuilder PBUILDER_SYSCONFDIR=/etc PIPESTATUS=([0]="0") POSIXLY_CORRECT=y PPID=3273419 PS4='+ ' PWD=/ SHELL=/bin/bash 
SHELLOPTS=braceexpand:errexit:hashall:interactive-comments:posix SHLVL=3 SUDO_COMMAND='/usr/bin/timeout -k 24.1h 24h /usr/bin/ionice -c 3 /usr/bin/nice -n 11 /usr/bin/unshare --uts -- /usr/sbin/pbuilder --build --configfile /srv/reproducible-results/rbuild-debian/r-b-build.JedN9w4Z/pbuilderrc_4mEd --distribution trixie --hookdir /etc/pbuilder/rebuild-hooks --debbuildopts -b --basetgz /var/cache/pbuilder/trixie-reproducible-base.tgz --buildresult /srv/reproducible-results/rbuild-debian/r-b-build.JedN9w4Z/b2 --logfile b2/build.log gemmlowp_0.0~git20211220.e844ffd-1.dsc' SUDO_GID=110 SUDO_UID=105 SUDO_USER=jenkins TERM=unknown TZ=/usr/share/zoneinfo/Etc/GMT-14 UID=0 USER=root _='I: set' http_proxy=http://213.165.73.152:3128 I: uname -a Linux i-capture-the-hostname 6.6.13+bpo-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.6.13-1~bpo12+1 (2024-02-15) x86_64 GNU/Linux I: ls -l /bin lrwxrwxrwx 1 root root 7 May 25 19:07 /bin -> usr/bin I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/D02_print_environment finished -> Attempting to satisfy build-dependencies -> Creating pbuilder-satisfydepends-dummy package Package: pbuilder-satisfydepends-dummy Version: 0.invalid.0 Architecture: amd64 Maintainer: Debian Pbuilder Team Description: Dummy package to satisfy dependencies with aptitude - created by pbuilder This package was created automatically by pbuilder to satisfy the build-dependencies of the package being currently built. Depends: debhelper-compat (= 13), cmake dpkg-deb: building package 'pbuilder-satisfydepends-dummy' in '/tmp/satisfydepends-aptitude/pbuilder-satisfydepends-dummy.deb'. Selecting previously unselected package pbuilder-satisfydepends-dummy. (Reading database ... 19901 files and directories currently installed.) Preparing to unpack .../pbuilder-satisfydepends-dummy.deb ... Unpacking pbuilder-satisfydepends-dummy (0.invalid.0) ... 
dpkg: pbuilder-satisfydepends-dummy: dependency problems, but configuring anyway as you requested: pbuilder-satisfydepends-dummy depends on debhelper-compat (= 13); however: Package debhelper-compat is not installed. pbuilder-satisfydepends-dummy depends on cmake; however: Package cmake is not installed. Setting up pbuilder-satisfydepends-dummy (0.invalid.0) ... Reading package lists... Building dependency tree... Reading state information... Initializing package states... Writing extended state information... Building tag database... pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) The following NEW packages will be installed: autoconf{a} automake{a} autopoint{a} autotools-dev{a} bsdextrautils{a} debhelper{a} dh-autoreconf{a} dh-strip-nondeterminism{a} dwz{a} file{a} gettext{a} gettext-base{a} groff-base{a} intltool-debian{a} libarchive-zip-perl{a} libdebhelper-perl{a} libelf1t64{a} libfile-stripnondeterminism-perl{a} libicu72{a} libmagic-mgc{a} libmagic1t64{a} libpipeline1{a} libsub-override-perl{a} libtool{a} libuchardet0{a} libxml2{a} m4{a} man-db{a} po-debconf{a} sensible-utils{a} The following packages are RECOMMENDED but will NOT be installed: curl libarchive-cpio-perl libltdl-dev libmail-sendmail-perl lynx wget 0 packages upgraded, 30 newly installed, 0 to remove and 0 not upgraded. Need to get 19.0 MB of archives. After unpacking 73.5 MB will be used. 
The following packages have unmet dependencies: pbuilder-satisfydepends-dummy : Depends: cmake but it is not installable The following actions will resolve these dependencies: Remove the following packages: 1) libssl3 [3.1.5-1 (now)] Install the following packages: 2) cmake [3.28.3-1 (testing)] 3) cmake-data [3.28.3-1 (testing)] 4) libarchive13t64 [3.7.2-2 (testing)] 5) libbrotli1 [1.1.0-2+b3 (testing)] 6) libcurl4t64 [8.7.1-3 (testing)] 7) libexpat1 [2.6.2-1 (testing)] 8) libjsoncpp25 [1.9.5-6+b2 (testing)] 9) libldap-2.5-0 [2.5.13+dfsg-5+b3 (testing)] 10) libnghttp2-14 [1.59.0-1 (testing)] 11) libproc2-0 [2:4.0.4-4 (testing)] 12) libpsl5t64 [0.21.2-1.1 (testing)] 13) librhash0 [1.4.3-3+b1 (testing)] 14) librtmp1 [2.4+20151223.gitfa8646d.1-2+b4 (testing)] 15) libsasl2-2 [2.1.28+dfsg1-4+b1 (testing)] 16) libsasl2-modules-db [2.1.28+dfsg1-4+b1 (testing)] 17) libssh2-1t64 [1.11.0-4.1+b2 (testing)] 18) libssl3t64 [3.2.1-3 (testing)] 19) libuv1t64 [1.48.0-1.1 (testing)] 20) procps [2:4.0.4-4 (testing)] The following NEW packages will be installed: autoconf{a} automake{a} autopoint{a} autotools-dev{a} bsdextrautils{a} cmake{a} cmake-data{a} debhelper{a} dh-autoreconf{a} dh-strip-nondeterminism{a} dwz{a} file{a} gettext{a} gettext-base{a} groff-base{a} intltool-debian{a} libarchive-zip-perl{a} libarchive13t64{a} libbrotli1{a} libcurl4t64{a} libdebhelper-perl{a} libelf1t64{a} libexpat1{a} libfile-stripnondeterminism-perl{a} libicu72{a} libjsoncpp25{a} libldap-2.5-0{a} libmagic-mgc{a} libmagic1t64{a} libnghttp2-14{a} libpipeline1{a} libproc2-0{a} libpsl5t64{a} librhash0{a} librtmp1{a} libsasl2-2{a} libsasl2-modules-db{a} libssh2-1t64{a} libssl3t64{a} libsub-override-perl{a} libtool{a} libuchardet0{a} libuv1t64{a} libxml2{a} m4{a} man-db{a} po-debconf{a} procps{a} sensible-utils{a} The following packages will be REMOVED: libssl3{a} The following packages are RECOMMENDED but will NOT be installed: ca-certificates curl libarchive-cpio-perl libldap-common libltdl-dev 
libmail-sendmail-perl libsasl2-modules lynx psmisc publicsuffix wget 0 packages upgraded, 49 newly installed, 1 to remove and 0 not upgraded. Need to get 37.1 MB of archives. After unpacking 131 MB will be used. Writing extended state information... Get: 1 http://deb.debian.org/debian trixie/main amd64 libssl3t64 amd64 3.2.1-3 [2244 kB] Get: 2 http://deb.debian.org/debian trixie/main amd64 libproc2-0 amd64 2:4.0.4-4 [64.6 kB] Get: 3 http://deb.debian.org/debian trixie/main amd64 procps amd64 2:4.0.4-4 [880 kB] Get: 4 http://deb.debian.org/debian trixie/main amd64 sensible-utils all 0.0.22 [22.4 kB] Get: 5 http://deb.debian.org/debian trixie/main amd64 libmagic-mgc amd64 1:5.45-3 [314 kB] Get: 6 http://deb.debian.org/debian trixie/main amd64 libmagic1t64 amd64 1:5.45-3 [105 kB] Get: 7 http://deb.debian.org/debian trixie/main amd64 file amd64 1:5.45-3 [42.9 kB] Get: 8 http://deb.debian.org/debian trixie/main amd64 gettext-base amd64 0.21-14+b1 [161 kB] Get: 9 http://deb.debian.org/debian trixie/main amd64 libuchardet0 amd64 0.0.8-1+b1 [68.8 kB] Get: 10 http://deb.debian.org/debian trixie/main amd64 groff-base amd64 1.23.0-3+b1 [1180 kB] Get: 11 http://deb.debian.org/debian trixie/main amd64 bsdextrautils amd64 2.39.3-6 [89.4 kB] Get: 12 http://deb.debian.org/debian trixie/main amd64 libpipeline1 amd64 1.5.7-2 [38.0 kB] Get: 13 http://deb.debian.org/debian trixie/main amd64 man-db amd64 2.12.0-3 [1401 kB] Get: 14 http://deb.debian.org/debian trixie/main amd64 m4 amd64 1.4.19-4 [287 kB] Get: 15 http://deb.debian.org/debian trixie/main amd64 autoconf all 2.71-3 [332 kB] Get: 16 http://deb.debian.org/debian trixie/main amd64 autotools-dev all 20220109.1 [51.6 kB] Get: 17 http://deb.debian.org/debian trixie/main amd64 automake all 1:1.16.5-1.3 [823 kB] Get: 18 http://deb.debian.org/debian trixie/main amd64 autopoint all 0.21-14 [496 kB] Get: 19 http://deb.debian.org/debian trixie/main amd64 libicu72 amd64 72.1-4+b1 [9395 kB] Get: 20 http://deb.debian.org/debian 
trixie/main amd64 libxml2 amd64 2.9.14+dfsg-1.3+b3 [692 kB] Get: 21 http://deb.debian.org/debian trixie/main amd64 libarchive13t64 amd64 3.7.2-2 [346 kB] Get: 22 http://deb.debian.org/debian trixie/main amd64 libbrotli1 amd64 1.1.0-2+b3 [305 kB] Get: 23 http://deb.debian.org/debian trixie/main amd64 libsasl2-modules-db amd64 2.1.28+dfsg1-4+b1 [19.7 kB] Get: 24 http://deb.debian.org/debian trixie/main amd64 libsasl2-2 amd64 2.1.28+dfsg1-4+b1 [57.0 kB] Get: 25 http://deb.debian.org/debian trixie/main amd64 libldap-2.5-0 amd64 2.5.13+dfsg-5+b3 [184 kB] Get: 26 http://deb.debian.org/debian trixie/main amd64 libnghttp2-14 amd64 1.59.0-1 [74.3 kB] Get: 27 http://deb.debian.org/debian trixie/main amd64 libpsl5t64 amd64 0.21.2-1.1 [56.8 kB] Get: 28 http://deb.debian.org/debian trixie/main amd64 librtmp1 amd64 2.4+20151223.gitfa8646d.1-2+b4 [58.5 kB] Get: 29 http://deb.debian.org/debian trixie/main amd64 libssh2-1t64 amd64 1.11.0-4.1+b2 [215 kB] Get: 30 http://deb.debian.org/debian trixie/main amd64 libcurl4t64 amd64 8.7.1-3 [441 kB] Get: 31 http://deb.debian.org/debian trixie/main amd64 libexpat1 amd64 2.6.2-1 [103 kB] Get: 32 http://deb.debian.org/debian trixie/main amd64 libjsoncpp25 amd64 1.9.5-6+b2 [81.9 kB] Get: 33 http://deb.debian.org/debian trixie/main amd64 librhash0 amd64 1.4.3-3+b1 [132 kB] Get: 34 http://deb.debian.org/debian trixie/main amd64 libuv1t64 amd64 1.48.0-1.1 [147 kB] Get: 35 http://deb.debian.org/debian trixie/main amd64 cmake-data all 3.28.3-1 [2128 kB] Get: 36 http://deb.debian.org/debian trixie/main amd64 cmake amd64 3.28.3-1 [10.5 MB] Get: 37 http://deb.debian.org/debian trixie/main amd64 libdebhelper-perl all 13.15.3 [88.0 kB] Get: 38 http://deb.debian.org/debian trixie/main amd64 libtool all 2.4.7-7 [517 kB] Get: 39 http://deb.debian.org/debian trixie/main amd64 dh-autoreconf all 20 [17.1 kB] Get: 40 http://deb.debian.org/debian trixie/main amd64 libarchive-zip-perl all 1.68-1 [104 kB] Get: 41 http://deb.debian.org/debian trixie/main amd64 
libsub-override-perl all 0.10-1 [10.6 kB] Get: 42 http://deb.debian.org/debian trixie/main amd64 libfile-stripnondeterminism-perl all 1.13.1-1 [19.4 kB] Get: 43 http://deb.debian.org/debian trixie/main amd64 dh-strip-nondeterminism all 1.13.1-1 [8620 B] Get: 44 http://deb.debian.org/debian trixie/main amd64 libelf1t64 amd64 0.191-1+b1 [189 kB] Get: 45 http://deb.debian.org/debian trixie/main amd64 dwz amd64 0.15-1+b1 [110 kB] Get: 46 http://deb.debian.org/debian trixie/main amd64 gettext amd64 0.21-14+b1 [1301 kB] Get: 47 http://deb.debian.org/debian trixie/main amd64 intltool-debian all 0.35.0+20060710.6 [22.9 kB] Get: 48 http://deb.debian.org/debian trixie/main amd64 po-debconf all 1.0.21+nmu1 [248 kB] Get: 49 http://deb.debian.org/debian trixie/main amd64 debhelper all 13.15.3 [901 kB] Fetched 37.1 MB in 0s (143 MB/s) debconf: delaying package configuration, since apt-utils is not installed dpkg: libssl3:amd64: dependency problems, but removing anyway as you requested: libkrb5-3:amd64 depends on libssl3 (>= 3.0.0). coreutils depends on libssl3 (>= 3.0.0). (Reading database ... 19901 files and directories currently installed.) Removing libssl3:amd64 (3.1.5-1) ... Selecting previously unselected package libssl3t64:amd64. 
(Reading database ... 19888 files and directories currently installed.) Preparing to unpack .../libssl3t64_3.2.1-3_amd64.deb ... Unpacking libssl3t64:amd64 (3.2.1-3) ... Setting up libssl3t64:amd64 (3.2.1-3) ... Selecting previously unselected package libproc2-0:amd64. (Reading database ... 19903 files and directories currently installed.) Preparing to unpack .../00-libproc2-0_2%3a4.0.4-4_amd64.deb ... Unpacking libproc2-0:amd64 (2:4.0.4-4) ... Selecting previously unselected package procps. Preparing to unpack .../01-procps_2%3a4.0.4-4_amd64.deb ... Unpacking procps (2:4.0.4-4) ... Selecting previously unselected package sensible-utils. Preparing to unpack .../02-sensible-utils_0.0.22_all.deb ... Unpacking sensible-utils (0.0.22) ... Selecting previously unselected package libmagic-mgc. Preparing to unpack .../03-libmagic-mgc_1%3a5.45-3_amd64.deb ... Unpacking libmagic-mgc (1:5.45-3) ... Selecting previously unselected package libmagic1t64:amd64. Preparing to unpack .../04-libmagic1t64_1%3a5.45-3_amd64.deb ... Unpacking libmagic1t64:amd64 (1:5.45-3) ... 
Selecting previously unselected package file. Preparing to unpack .../05-file_1%3a5.45-3_amd64.deb ... Unpacking file (1:5.45-3) ... Selecting previously unselected package gettext-base. Preparing to unpack .../06-gettext-base_0.21-14+b1_amd64.deb ... Unpacking gettext-base (0.21-14+b1) ... Selecting previously unselected package libuchardet0:amd64. Preparing to unpack .../07-libuchardet0_0.0.8-1+b1_amd64.deb ... Unpacking libuchardet0:amd64 (0.0.8-1+b1) ... Selecting previously unselected package groff-base. Preparing to unpack .../08-groff-base_1.23.0-3+b1_amd64.deb ... Unpacking groff-base (1.23.0-3+b1) ... Selecting previously unselected package bsdextrautils. Preparing to unpack .../09-bsdextrautils_2.39.3-6_amd64.deb ... Unpacking bsdextrautils (2.39.3-6) ... Selecting previously unselected package libpipeline1:amd64. Preparing to unpack .../10-libpipeline1_1.5.7-2_amd64.deb ... Unpacking libpipeline1:amd64 (1.5.7-2) ... Selecting previously unselected package man-db. Preparing to unpack .../11-man-db_2.12.0-3_amd64.deb ... Unpacking man-db (2.12.0-3) ... Selecting previously unselected package m4. Preparing to unpack .../12-m4_1.4.19-4_amd64.deb ... Unpacking m4 (1.4.19-4) ... Selecting previously unselected package autoconf. Preparing to unpack .../13-autoconf_2.71-3_all.deb ... Unpacking autoconf (2.71-3) ... Selecting previously unselected package autotools-dev. Preparing to unpack .../14-autotools-dev_20220109.1_all.deb ... Unpacking autotools-dev (20220109.1) ... Selecting previously unselected package automake. Preparing to unpack .../15-automake_1%3a1.16.5-1.3_all.deb ... Unpacking automake (1:1.16.5-1.3) ... Selecting previously unselected package autopoint. Preparing to unpack .../16-autopoint_0.21-14_all.deb ... Unpacking autopoint (0.21-14) ... Selecting previously unselected package libicu72:amd64. Preparing to unpack .../17-libicu72_72.1-4+b1_amd64.deb ... Unpacking libicu72:amd64 (72.1-4+b1) ... 
Selecting previously unselected package libxml2:amd64. Preparing to unpack .../18-libxml2_2.9.14+dfsg-1.3+b3_amd64.deb ... Unpacking libxml2:amd64 (2.9.14+dfsg-1.3+b3) ... Selecting previously unselected package libarchive13t64:amd64. Preparing to unpack .../19-libarchive13t64_3.7.2-2_amd64.deb ... Unpacking libarchive13t64:amd64 (3.7.2-2) ... Selecting previously unselected package libbrotli1:amd64. Preparing to unpack .../20-libbrotli1_1.1.0-2+b3_amd64.deb ... Unpacking libbrotli1:amd64 (1.1.0-2+b3) ... Selecting previously unselected package libsasl2-modules-db:amd64. Preparing to unpack .../21-libsasl2-modules-db_2.1.28+dfsg1-4+b1_amd64.deb ... Unpacking libsasl2-modules-db:amd64 (2.1.28+dfsg1-4+b1) ... Selecting previously unselected package libsasl2-2:amd64. Preparing to unpack .../22-libsasl2-2_2.1.28+dfsg1-4+b1_amd64.deb ... Unpacking libsasl2-2:amd64 (2.1.28+dfsg1-4+b1) ... Selecting previously unselected package libldap-2.5-0:amd64. Preparing to unpack .../23-libldap-2.5-0_2.5.13+dfsg-5+b3_amd64.deb ... Unpacking libldap-2.5-0:amd64 (2.5.13+dfsg-5+b3) ... Selecting previously unselected package libnghttp2-14:amd64. Preparing to unpack .../24-libnghttp2-14_1.59.0-1_amd64.deb ... Unpacking libnghttp2-14:amd64 (1.59.0-1) ... Selecting previously unselected package libpsl5t64:amd64. Preparing to unpack .../25-libpsl5t64_0.21.2-1.1_amd64.deb ... Unpacking libpsl5t64:amd64 (0.21.2-1.1) ... Selecting previously unselected package librtmp1:amd64. Preparing to unpack .../26-librtmp1_2.4+20151223.gitfa8646d.1-2+b4_amd64.deb ... Unpacking librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2+b4) ... Selecting previously unselected package libssh2-1t64:amd64. Preparing to unpack .../27-libssh2-1t64_1.11.0-4.1+b2_amd64.deb ... Unpacking libssh2-1t64:amd64 (1.11.0-4.1+b2) ... Selecting previously unselected package libcurl4t64:amd64. Preparing to unpack .../28-libcurl4t64_8.7.1-3_amd64.deb ... Unpacking libcurl4t64:amd64 (8.7.1-3) ... 
Selecting previously unselected package libexpat1:amd64. Preparing to unpack .../29-libexpat1_2.6.2-1_amd64.deb ... Unpacking libexpat1:amd64 (2.6.2-1) ... Selecting previously unselected package libjsoncpp25:amd64. Preparing to unpack .../30-libjsoncpp25_1.9.5-6+b2_amd64.deb ... Unpacking libjsoncpp25:amd64 (1.9.5-6+b2) ... Selecting previously unselected package librhash0:amd64. Preparing to unpack .../31-librhash0_1.4.3-3+b1_amd64.deb ... Unpacking librhash0:amd64 (1.4.3-3+b1) ... Selecting previously unselected package libuv1t64:amd64. Preparing to unpack .../32-libuv1t64_1.48.0-1.1_amd64.deb ... Unpacking libuv1t64:amd64 (1.48.0-1.1) ... Selecting previously unselected package cmake-data. Preparing to unpack .../33-cmake-data_3.28.3-1_all.deb ... Unpacking cmake-data (3.28.3-1) ... Selecting previously unselected package cmake. Preparing to unpack .../34-cmake_3.28.3-1_amd64.deb ... Unpacking cmake (3.28.3-1) ... Selecting previously unselected package libdebhelper-perl. Preparing to unpack .../35-libdebhelper-perl_13.15.3_all.deb ... Unpacking libdebhelper-perl (13.15.3) ... Selecting previously unselected package libtool. Preparing to unpack .../36-libtool_2.4.7-7_all.deb ... Unpacking libtool (2.4.7-7) ... Selecting previously unselected package dh-autoreconf. Preparing to unpack .../37-dh-autoreconf_20_all.deb ... Unpacking dh-autoreconf (20) ... Selecting previously unselected package libarchive-zip-perl. Preparing to unpack .../38-libarchive-zip-perl_1.68-1_all.deb ... Unpacking libarchive-zip-perl (1.68-1) ... Selecting previously unselected package libsub-override-perl. Preparing to unpack .../39-libsub-override-perl_0.10-1_all.deb ... Unpacking libsub-override-perl (0.10-1) ... Selecting previously unselected package libfile-stripnondeterminism-perl. Preparing to unpack .../40-libfile-stripnondeterminism-perl_1.13.1-1_all.deb ... Unpacking libfile-stripnondeterminism-perl (1.13.1-1) ... Selecting previously unselected package dh-strip-nondeterminism. 
Preparing to unpack .../41-dh-strip-nondeterminism_1.13.1-1_all.deb ... Unpacking dh-strip-nondeterminism (1.13.1-1) ... Selecting previously unselected package libelf1t64:amd64. Preparing to unpack .../42-libelf1t64_0.191-1+b1_amd64.deb ... Unpacking libelf1t64:amd64 (0.191-1+b1) ... Selecting previously unselected package dwz. Preparing to unpack .../43-dwz_0.15-1+b1_amd64.deb ... Unpacking dwz (0.15-1+b1) ... Selecting previously unselected package gettext. Preparing to unpack .../44-gettext_0.21-14+b1_amd64.deb ... Unpacking gettext (0.21-14+b1) ... Selecting previously unselected package intltool-debian. Preparing to unpack .../45-intltool-debian_0.35.0+20060710.6_all.deb ... Unpacking intltool-debian (0.35.0+20060710.6) ... Selecting previously unselected package po-debconf. Preparing to unpack .../46-po-debconf_1.0.21+nmu1_all.deb ... Unpacking po-debconf (1.0.21+nmu1) ... Selecting previously unselected package debhelper. Preparing to unpack .../47-debhelper_13.15.3_all.deb ... Unpacking debhelper (13.15.3) ... Setting up libexpat1:amd64 (2.6.2-1) ... Setting up libpipeline1:amd64 (1.5.7-2) ... Setting up libicu72:amd64 (72.1-4+b1) ... Setting up bsdextrautils (2.39.3-6) ... Setting up libmagic-mgc (1:5.45-3) ... Setting up libarchive-zip-perl (1.68-1) ... Setting up libdebhelper-perl (13.15.3) ... Setting up libbrotli1:amd64 (1.1.0-2+b3) ... Setting up libuv1t64:amd64 (1.48.0-1.1) ... Setting up libmagic1t64:amd64 (1:5.45-3) ... Setting up libpsl5t64:amd64 (0.21.2-1.1) ... Setting up libnghttp2-14:amd64 (1.59.0-1) ... Setting up gettext-base (0.21-14+b1) ... Setting up m4 (1.4.19-4) ... Setting up file (1:5.45-3) ... Setting up libelf1t64:amd64 (0.191-1+b1) ... Setting up libsasl2-modules-db:amd64 (2.1.28+dfsg1-4+b1) ... Setting up autotools-dev (20220109.1) ... Setting up librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2+b4) ... Setting up libproc2-0:amd64 (2:4.0.4-4) ... Setting up autopoint (0.21-14) ... Setting up libjsoncpp25:amd64 (1.9.5-6+b2) ... 
Setting up libsasl2-2:amd64 (2.1.28+dfsg1-4+b1) ... Setting up autoconf (2.71-3) ... Setting up dwz (0.15-1+b1) ... Setting up sensible-utils (0.0.22) ... Setting up librhash0:amd64 (1.4.3-3+b1) ... Setting up libuchardet0:amd64 (0.0.8-1+b1) ... Setting up procps (2:4.0.4-4) ... Setting up libsub-override-perl (0.10-1) ... Setting up cmake-data (3.28.3-1) ... Setting up libssh2-1t64:amd64 (1.11.0-4.1+b2) ... Setting up libxml2:amd64 (2.9.14+dfsg-1.3+b3) ... Setting up automake (1:1.16.5-1.3) ... update-alternatives: using /usr/bin/automake-1.16 to provide /usr/bin/automake (automake) in auto mode Setting up libfile-stripnondeterminism-perl (1.13.1-1) ... Setting up gettext (0.21-14+b1) ... Setting up libtool (2.4.7-7) ... Setting up libldap-2.5-0:amd64 (2.5.13+dfsg-5+b3) ... Setting up intltool-debian (0.35.0+20060710.6) ... Setting up dh-autoreconf (20) ... Setting up dh-strip-nondeterminism (1.13.1-1) ... Setting up groff-base (1.23.0-3+b1) ... Setting up libarchive13t64:amd64 (3.7.2-2) ... Setting up libcurl4t64:amd64 (8.7.1-3) ... Setting up po-debconf (1.0.21+nmu1) ... Setting up man-db (2.12.0-3) ... Not building database; man-db/auto-update is not 'true'. Setting up cmake (3.28.3-1) ... Setting up debhelper (13.15.3) ... Processing triggers for libc-bin (2.37-18) ... Reading package lists... Building dependency tree... Reading state information... Reading extended state information... Initializing package states... Writing extended state information... Building tag database... -> Finished parsing the build-deps Reading package lists... Building dependency tree... Reading state information... fakeroot is already the newest version (1.33-1). 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. 
I: Building the package I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/A99_set_merged_usr starting Not re-configuring usrmerge for trixie I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/A99_set_merged_usr finished hostname: Name or service not known I: Running cd /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/ && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/i/capture/the/path" HOME="/nonexistent/second-build" dpkg-buildpackage -us -uc -b && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/i/capture/the/path" HOME="/nonexistent/second-build" dpkg-genchanges -S > ../gemmlowp_0.0~git20211220.e844ffd-1_source.changes dpkg-buildpackage: info: source package gemmlowp dpkg-buildpackage: info: source version 0.0~git20211220.e844ffd-1 dpkg-buildpackage: info: source distribution unstable dpkg-buildpackage: info: source changed by Mo Zhou dpkg-source --before-build . dpkg-buildpackage: info: host architecture amd64 fakeroot debian/rules clean dh clean -Scmake debian/rules override_dh_auto_clean make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' rm -f CMakeLists.txt dh_auto_clean make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' dh_clean -O-Scmake debian/rules build dh build -Scmake dh_update_autotools_config -O-Scmake dh_autoreconf -O-Scmake debian/rules override_dh_auto_configure make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' ln -s contrib/CMakeLists.txt . dh_auto_configure -- \ -DCMAKE_C_FLAGS="-g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" \ -DCMAKE_CXX_FLAGS="-g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" cd obj-x86_64-linux-gnu && DEB_PYTHON_INSTALL_LAYOUT=deb cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON -DFETCHCONTENT_FULLY_DISCONNECTED=ON -DCMAKE_INSTALL_RUNSTATEDIR=/run -DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON "-GUnix Makefiles" -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INSTALL_LIBDIR=lib/x86_64-linux-gnu "-DCMAKE_C_FLAGS=-g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" "-DCMAKE_CXX_FLAGS=-g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2" .. 
-- The C compiler identification is GNU 13.2.0 -- The CXX compiler identification is GNU 13.2.0 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/c++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Configuring done (1.4s) -- Generating done (0.0s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_EXPORT_NO_PACKAGE_REGISTRY CMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY CMAKE_FIND_USE_PACKAGE_REGISTRY FETCHCONTENT_FULLY_DISCONNECTED -- Build files have been written to: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd' dh_auto_build -O-Scmake cd obj-x86_64-linux-gnu && make -j42 "INSTALL=install --strip-program=true" VERBOSE=1 make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -S"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" -B"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu//CMakeFiles/progress.marks" make -f CMakeFiles/Makefile2 all make[2]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/depend make -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/depend make -f 
CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/depend make -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/eight_bit_int_gemm.dir/DependInfo.cmake" "--color=" make -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/benchmark.dir/DependInfo.cmake" "--color=" make -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" 
"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/benchmark_all_sizes.dir/DependInfo.cmake" "--color=" make -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_math_helpers.dir/DependInfo.cmake" "--color=" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_blocking_counter.dir/DependInfo.cmake" "--color=" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd 
"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_allocator.dir/DependInfo.cmake" "--color=" make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_fixedpoint.dir/DependInfo.cmake" "--color=" make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/build make -f CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/build make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/build make[3]: Leaving directory 
'/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/build make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/build make -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/build make -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/build make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 17%] Building CXX object CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o [ 17%] Building CXX object CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o [ 17%] Building CXX object CMakeFiles/benchmark.dir/test/benchmark.cc.o [ 23%] Building CXX object CMakeFiles/test_allocator.dir/test/test_allocator.cc.o [ 29%] Building CXX object CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -MF CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o.d -o CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_blocking_counter.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/benchmark.dir/test/benchmark.cc.o -MF CMakeFiles/benchmark.dir/test/benchmark.cc.o.d -o CMakeFiles/benchmark.dir/test/benchmark.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/benchmark.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -DBENCHMARK_8bit -DBENCHMARK_QUICK -MD -MT CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -MF CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o.d -o CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/benchmark_all_sizes.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -MF CMakeFiles/test_allocator.dir/test/test_allocator.cc.o.d -o CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_allocator.cc" [ 35%] Building CXX object CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -MF CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o.d -o CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/eight_bit_int_gemm/eight_bit_int_gemm.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -MF CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o.d -o CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_math_helpers.cc" [ 41%] Building CXX object CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -MF CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o.d -o CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_fixedpoint.cc" [ 47%] Linking CXX executable test_allocator /usr/bin/cmake -E cmake_link_script CMakeFiles/test_allocator.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -o test_allocator [ 52%] Linking CXX executable test_blocking_counter /usr/bin/cmake -E cmake_link_script CMakeFiles/test_blocking_counter.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -o test_blocking_counter -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 52%] Built target test_allocator [ 52%] Built target test_blocking_counter [ 58%] Linking CXX executable test_math_helpers /usr/bin/cmake -E cmake_link_script CMakeFiles/test_math_helpers.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -o test_math_helpers make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 58%] Built target test_math_helpers [ 64%] Linking CXX executable benchmark_all_sizes /usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark_all_sizes.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -o benchmark_all_sizes -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 64%] Built target benchmark_all_sizes [ 70%] Linking CXX executable benchmark /usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/benchmark.dir/test/benchmark.cc.o -o benchmark -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 70%] Built target benchmark [ 76%] Linking CXX executable test_fixedpoint /usr/bin/cmake -E cmake_link_script CMakeFiles/test_fixedpoint.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -o test_fixedpoint make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 76%] Built target test_fixedpoint [ 82%] Linking CXX static library libeight_bit_int_gemm.a /usr/bin/cmake -P CMakeFiles/eight_bit_int_gemm.dir/cmake_clean_target.cmake /usr/bin/cmake -E cmake_link_script CMakeFiles/eight_bit_int_gemm.dir/link.txt --verbose=1 /usr/bin/ar qc libeight_bit_int_gemm.a CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o /usr/bin/ranlib libeight_bit_int_gemm.a make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 82%] Built target eight_bit_int_gemm make -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/depend make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_gemmlowp.dir/DependInfo.cmake" "--color=" make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/build make[3]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 88%] Building CXX object 
CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test_data.cc" [ 94%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test.cc.o -c "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc" /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In file included from /usr/include/stdio.h:906, from /usr/include/c++/13/cstdio:42, from /usr/include/c++/13/ext/string_conversions.h:45, from /usr/include/c++/13/bits/basic_string.h:4109, from /usr/include/c++/13/string:54, from /usr/include/c++/13/bits/locale_classes.h:40, from 
/usr/include/c++/13/bits/ios_base.h:41, from /usr/include/c++/13/ios:44, from /usr/include/c++/13/ostream:40, from /usr/include/c++/13/iostream:41, from /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.h:26, from /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:15: In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = 
gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 4>, gemmlowp::KernelSideFormat, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 4>, gemmlowp::KernelSideFormat, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, 
sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 
and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; 
tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, 
gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=] 123 | snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | 
__va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=] 123 
| snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 
and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ [100%] Linking CXX executable test_gemmlowp /usr/bin/cmake -E cmake_link_script CMakeFiles/test_gemmlowp.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_gemmlowp.dir/test/test.cc.o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -o test_gemmlowp libeight_bit_int_gemm.a -lpthread make[3]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [100%] Built target test_gemmlowp make[2]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -E cmake_progress_start "/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles" 0 make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' dh_auto_test -O-Scmake cd obj-x86_64-linux-gnu && make -j42 test ARGS\+=--verbose ARGS\+=-j42 make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' Running tests... 
/usr/bin/ctest --force-new-ctest-process --verbose -j42 UpdateCTestConfiguration from :/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Parse Config file:/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl UpdateCTestConfiguration from :/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Parse Config file:/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Test project /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu Constructing a list of tests Done constructing a list of tests Updating test list for fixtures Added 0 tests to meet fixture requirements Checking test dependency graph... Checking test dependency graph end test 1 Start 1: test_math_helpers 1: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_math_helpers 1: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 1: Test timeout computed to be: 1500 test 2 Start 2: test_blocking_counter 2: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_blocking_counter 2: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 2: Test timeout computed to be: 1500 test 3 Start 3: test_allocator 3: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_allocator 3: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 3: Test timeout computed to be: 1500 test 4 Start 4: test_fixedpoint 4: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_fixedpoint 4: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 4: Test timeout computed to be: 1500 
test 5 Start 5: test_gemmlowp 5: Test command: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_gemmlowp 5: Working Directory: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 5: Test timeout computed to be: 1500 1/5 Test #1: test_math_helpers ................ Passed 0.01 sec 2/5 Test #3: test_allocator ................... Passed 0.01 sec 5: TestWithSmallData: PASS 5: number of matrix entries: 8 5: median value: 136 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, 
shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 3/5 Test #2: test_blocking_counter ............ 
Passed 0.01 sec 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 
12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 4: PASS (Scalar int32) 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 4: PASS (Scalar int16) 4/5 Test #4: test_fixedpoint .................. Passed 0.33 sec 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, 
mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 
2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: 
PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: 
PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 
ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 
ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 
64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> 
ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor 
x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 
RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, 
shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> 
ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public 
Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, 
mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: 
PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 
70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, 
mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor 
x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 
7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x 
ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor 
-> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 
70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public 
Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: 
PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, 
public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 
12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor 
-> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: 
PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, 
mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 
8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor 
-> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: 
PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 
1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 
RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor 
-> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 
ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 
10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: 
PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor 
x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, 
public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, 
public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, 
offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 5x7x1 RowMajor 
x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, 
shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, 
mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: 
PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 
ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, 
mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, 
mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 
8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x 
RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> 
RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 
64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 
14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: 
PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: 
PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x 
RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 
18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, 
shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, 
mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, 
shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, 
public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x 
RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x 
ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: 
PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, 
public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, 
mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 
37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x 
ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor 
-> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, 
mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor 
-> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 
32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 
ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 
0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 
70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, 
offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 
14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, 
mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, 
mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 
1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, 
mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 
ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 
10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 
RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 
64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, 
mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 
1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: 
PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor 
x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 
300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 
4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor 
x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 
ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 
10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> 
RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 
321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x 
RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, 
shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 
20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, 
shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: 
PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: 
PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 
3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 
8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 
ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 
RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x 
RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, 
offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 
78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x 
ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, 
EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 
0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: 
PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), 
offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 
DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 
123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, 
mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 
DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 
3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor 
x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), 
offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), 
offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 
16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor 
x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 
-75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 
18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 24 5: TestWithRealData: PASS with Lhs: 8 bit, Rhs: 8 bit 5: number of matrix entries: 49152 5: median value: 104 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: 
Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestWithRealData: PASS with (legacy, no longer requantizing) Lhs: 7 bit, Rhs: 5 bit 5: number of matrix entries: 49152 5: median value: 104 5: median unsigned diff: 0 (tolerating 2) 5: max unsigned diff: 0 (tolerating 10) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0.2) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestWithSmallDataPerChannelQuantization: PASS 5: number of matrix entries: 18 5: median value: 127 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 
5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestWithLargeDataPerChannelQuantization: PASS 5: number of matrix entries: 550 5: median value: 7 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestMultithreadedPerChannelQuantization: PASS 5: number of matrix entries: 1280 5: median value: 0 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: All tests passed. 5/5 Test #5: test_gemmlowp .................... 
Passed 114.30 sec
100% tests passed, 0 tests failed out of 5
Total Test time (real) = 114.31 sec
make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
create-stamp debian/debhelper-build-stamp
fakeroot debian/rules binary
dh binary -Scmake
dh_testroot -O-Scmake
dh_prep -O-Scmake
dh_auto_install --destdir=debian/libgemmlowp-dev/ -O-Scmake
cd obj-x86_64-linux-gnu && make -j42 install DESTDIR=/build/reproducible-path/gemmlowp-0.0\~git20211220.e844ffd/debian/libgemmlowp-dev AM_UPDATE_INFO_DIR=no "INSTALL=install --strip-program=true"
make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
/usr/bin/cmake -S"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd" -B"/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" --check-build-system CMakeFiles/Makefile.cmake 0
make -f CMakeFiles/Makefile2 preinstall
make[2]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
make[2]: Nothing to be done for 'preinstall'.
make[2]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
Install the project...
/usr/bin/cmake -P cmake_install.cmake -- Install configuration: "None" -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/eight_bit_int_gemm/eight_bit_int_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/base.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemv.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_operations_common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_single_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_transform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_32.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_64.h -- 
Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_transform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_32.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_64.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_32.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_64.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/bit_depth.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/gemmlowp.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/map.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/output_stages.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/instrumentation.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/profiler.h -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/pthread_everywhere.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/allocator.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/block_params.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/common.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/compute.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/detect_platform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/dispatch_gemm_shape.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_default.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_reference.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_sse.h -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/multi_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/platform.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_common_neon_sse.h -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/single_thread_gemm.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/unpack.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_avx.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_msa.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_neon.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_sse.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_wasmsimd.h -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/libeight_bit_int_gemm.a -- Installing: /build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/cmake/gemmlowp/gemmlowp-config.cmake -- Installing: 
/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/cmake/gemmlowp/gemmlowp-config-none.cmake
make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu'
dh_install -O-Scmake
debian/rules override_dh_installdocs
make[1]: Entering directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd'
mkdir -p debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/
install meta/README debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/
dh_installdocs
make[1]: Leaving directory '/build/reproducible-path/gemmlowp-0.0~git20211220.e844ffd'
dh_installchangelogs -O-Scmake
dh_installexamples -O-Scmake
dh_installinit -O-Scmake
dh_perl -O-Scmake
dh_link -O-Scmake
dh_strip_nondeterminism -O-Scmake
dh_compress -O-Scmake
dh_fixperms -O-Scmake
dh_missing -O-Scmake
dh_dwz -a -O-Scmake
dh_strip -a -O-Scmake
dh_makeshlibs -a -O-Scmake
dh_shlibdeps -a -O-Scmake
dh_installdeb -O-Scmake
dh_gencontrol -O-Scmake
dh_md5sums -O-Scmake
dh_builddeb -O-Scmake
dpkg-deb: building package 'libgemmlowp-dev' in '../libgemmlowp-dev_0.0~git20211220.e844ffd-1_amd64.deb'.
dpkg-genbuildinfo --build=binary -O../gemmlowp_0.0~git20211220.e844ffd-1_amd64.buildinfo
dpkg-genchanges --build=binary -O../gemmlowp_0.0~git20211220.e844ffd-1_amd64.changes
dpkg-genchanges: info: binary-only upload (no source code included)
dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)
dpkg-genchanges: info: including full source code in upload
I: copying local configuration
I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/B01_cleanup starting
I: user script /srv/workspace/pbuilder/3273419/tmp/hooks/B01_cleanup finished
I: unmounting dev/ptmx filesystem
I: unmounting dev/pts filesystem
I: unmounting dev/shm filesystem
I: unmounting proc filesystem
I: unmounting sys filesystem
I: cleaning the build env
I: removing directory /srv/workspace/pbuilder/3273419 and its subdirectories
I: Current time: Wed Jun 4 07:16:49 +14 2025
I: pbuilder-time-stamp: 1748971009