I: pbuilder: network access will be disabled during build I: Current time: Thu May 23 03:49:08 +14 2024 I: pbuilder-time-stamp: 1716385748 I: Building the build Environment I: extracting base tarball [/var/cache/pbuilder/bookworm-reproducible-base.tgz] I: copying local configuration W: --override-config is not set; not updating apt.conf Read the manpage for details. I: mounting /proc filesystem I: mounting /sys filesystem I: creating /{dev,run}/shm I: mounting /dev/pts filesystem I: redirecting /dev/ptmx to /dev/pts/ptmx I: policy-rc.d already exists I: Copying source file I: copying [gemmlowp_0.0~git20211220.e844ffd-1.dsc] I: copying [./gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz] I: copying [./gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz] I: Extracting source gpgv: Signature made Fri Jun 24 19:56:40 2022 +14 gpgv: using RSA key 638BC75EC1E5C589067E35DE62645EB35F686A8A gpgv: issuer "lumin@debian.org" gpgv: Can't check signature: No public key dpkg-source: warning: cannot verify inline signature for ./gemmlowp_0.0~git20211220.e844ffd-1.dsc: no acceptable signature found dpkg-source: info: extracting gemmlowp in gemmlowp-0.0~git20211220.e844ffd dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz dpkg-source: info: using patch list from debian/patches/series dpkg-source: info: applying 0001-cmake-build-fix.patch I: using fakeroot in build. I: Installing the build-deps I: user script /srv/workspace/pbuilder/41679/tmp/hooks/D01_modify_environment starting debug: Running on ionos15-amd64. I: Changing host+domainname to test build reproducibility I: Adding a custom variable just for the fun of it... 
I: Changing /bin/sh to bash '/bin/sh' -> '/bin/bash' lrwxrwxrwx 1 root root 9 May 23 03:49 /bin/sh -> /bin/bash I: Setting pbuilder2's login shell to /bin/bash I: Setting pbuilder2's GECOS to second user,second room,second work-phone,second home-phone,second other I: user script /srv/workspace/pbuilder/41679/tmp/hooks/D01_modify_environment finished I: user script /srv/workspace/pbuilder/41679/tmp/hooks/D02_print_environment starting I: set BASH=/bin/sh BASHOPTS=checkwinsize:cmdhist:complete_fullquote:extquote:force_fignore:globasciiranges:globskipdots:hostcomplete:interactive_comments:patsub_replacement:progcomp:promptvars:sourcepath BASH_ALIASES=() BASH_ARGC=() BASH_ARGV=() BASH_CMDS=() BASH_LINENO=([0]="12" [1]="0") BASH_LOADABLES_PATH=/usr/local/lib/bash:/usr/lib/bash:/opt/local/lib/bash:/usr/pkg/lib/bash:/opt/pkg/lib/bash:. BASH_SOURCE=([0]="/tmp/hooks/D02_print_environment" [1]="/tmp/hooks/D02_print_environment") BASH_VERSINFO=([0]="5" [1]="2" [2]="15" [3]="1" [4]="release" [5]="x86_64-pc-linux-gnu") BASH_VERSION='5.2.15(1)-release' BUILDDIR=/build BUILDUSERGECOS='second user,second room,second work-phone,second home-phone,second other' BUILDUSERNAME=pbuilder2 BUILD_ARCH=amd64 DEBIAN_FRONTEND=noninteractive DEB_BUILD_OPTIONS='buildinfo=+all reproducible=+all parallel=16' DIRSTACK=() DISTRIBUTION=bookworm EUID=0 FUNCNAME=([0]="Echo" [1]="main") GROUPS=() HOME=/root HOSTNAME=i-capture-the-hostname HOSTTYPE=x86_64 HOST_ARCH=amd64 IFS=' ' INVOCATION_ID=438b4086a69d4b80aabea9e519e89f71 LANG=C LANGUAGE=et_EE:et LC_ALL=C MACHTYPE=x86_64-pc-linux-gnu MAIL=/var/mail/root OPTERR=1 OPTIND=1 OSTYPE=linux-gnu PATH=/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/i/capture/the/path PBCURRENTCOMMANDLINEOPERATION=build PBUILDER_OPERATION=build PBUILDER_PKGDATADIR=/usr/share/pbuilder PBUILDER_PKGLIBDIR=/usr/lib/pbuilder PBUILDER_SYSCONFDIR=/etc PIPESTATUS=([0]="0") POSIXLY_CORRECT=y PPID=41679 PS4='+ ' PWD=/ SHELL=/bin/bash 
SHELLOPTS=braceexpand:errexit:hashall:interactive-comments:posix SHLVL=3 SUDO_COMMAND='/usr/bin/timeout -k 24.1h 24h /usr/bin/ionice -c 3 /usr/bin/nice -n 11 /usr/bin/unshare --uts -- /usr/sbin/pbuilder --build --configfile /srv/reproducible-results/rbuild-debian/r-b-build.gxG9dw9N/pbuilderrc_x4tA --distribution bookworm --hookdir /etc/pbuilder/rebuild-hooks --debbuildopts -b --basetgz /var/cache/pbuilder/bookworm-reproducible-base.tgz --buildresult /srv/reproducible-results/rbuild-debian/r-b-build.gxG9dw9N/b2 --logfile b2/build.log --extrapackages usrmerge gemmlowp_0.0~git20211220.e844ffd-1.dsc' SUDO_GID=111 SUDO_UID=106 SUDO_USER=jenkins TERM=unknown TZ=/usr/share/zoneinfo/Etc/GMT-14 UID=0 USER=root _='I: set' http_proxy=http://85.184.249.68:3128 I: uname -a Linux i-capture-the-hostname 6.1.0-0.deb11.5-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.12-1~bpo11+1 (2023-03-05) x86_64 GNU/Linux I: ls -l /bin total 5632 -rwxr-xr-x 1 root root 1265648 Feb 13 2023 bash -rwxr-xr-x 3 root root 39224 Sep 19 2022 bunzip2 -rwxr-xr-x 3 root root 39224 Sep 19 2022 bzcat lrwxrwxrwx 1 root root 6 Sep 19 2022 bzcmp -> bzdiff -rwxr-xr-x 1 root root 2225 Sep 19 2022 bzdiff lrwxrwxrwx 1 root root 6 Sep 19 2022 bzegrep -> bzgrep -rwxr-xr-x 1 root root 4893 Nov 28 2021 bzexe lrwxrwxrwx 1 root root 6 Sep 19 2022 bzfgrep -> bzgrep -rwxr-xr-x 1 root root 3775 Sep 19 2022 bzgrep -rwxr-xr-x 3 root root 39224 Sep 19 2022 bzip2 -rwxr-xr-x 1 root root 14568 Sep 19 2022 bzip2recover lrwxrwxrwx 1 root root 6 Sep 19 2022 bzless -> bzmore -rwxr-xr-x 1 root root 1297 Sep 19 2022 bzmore -rwxr-xr-x 1 root root 44016 Sep 21 2022 cat -rwxr-xr-x 1 root root 68656 Sep 21 2022 chgrp -rwxr-xr-x 1 root root 64496 Sep 21 2022 chmod -rwxr-xr-x 1 root root 72752 Sep 21 2022 chown -rwxr-xr-x 1 root root 151152 Sep 21 2022 cp -rwxr-xr-x 1 root root 125640 Jan 6 2023 dash -rwxr-xr-x 1 root root 121904 Sep 21 2022 date -rwxr-xr-x 1 root root 89240 Sep 21 2022 dd -rwxr-xr-x 1 root root 102200 Sep 21 2022 df -rwxr-xr-x 1 
root root 151344 Sep 21 2022 dir -rwxr-xr-x 1 root root 88656 Mar 24 2023 dmesg lrwxrwxrwx 1 root root 8 Dec 20 2022 dnsdomainname -> hostname lrwxrwxrwx 1 root root 8 Dec 20 2022 domainname -> hostname -rwxr-xr-x 1 root root 43856 Sep 21 2022 echo -rwxr-xr-x 1 root root 41 Jan 25 2023 egrep -rwxr-xr-x 1 root root 35664 Sep 21 2022 false -rwxr-xr-x 1 root root 41 Jan 25 2023 fgrep -rwxr-xr-x 1 root root 85600 Mar 24 2023 findmnt -rwsr-xr-x 1 root root 35128 Mar 23 2023 fusermount -rwxr-xr-x 1 root root 203152 Jan 25 2023 grep -rwxr-xr-x 2 root root 2346 Apr 10 2022 gunzip -rwxr-xr-x 1 root root 6447 Apr 10 2022 gzexe -rwxr-xr-x 1 root root 98136 Apr 10 2022 gzip -rwxr-xr-x 1 root root 22680 Dec 20 2022 hostname -rwxr-xr-x 1 root root 72824 Sep 21 2022 ln -rwxr-xr-x 1 root root 53024 Mar 24 2023 login -rwxr-xr-x 1 root root 151344 Sep 21 2022 ls -rwxr-xr-x 1 root root 207168 Mar 24 2023 lsblk -rwxr-xr-x 1 root root 97552 Sep 21 2022 mkdir -rwxr-xr-x 1 root root 72912 Sep 21 2022 mknod -rwxr-xr-x 1 root root 43952 Sep 21 2022 mktemp -rwxr-xr-x 1 root root 59712 Mar 24 2023 more -rwsr-xr-x 1 root root 59704 Mar 24 2023 mount -rwxr-xr-x 1 root root 18744 Mar 24 2023 mountpoint -rwxr-xr-x 1 root root 142968 Sep 21 2022 mv lrwxrwxrwx 1 root root 8 Dec 20 2022 nisdomainname -> hostname lrwxrwxrwx 1 root root 14 Apr 3 2023 pidof -> /sbin/killall5 -rwxr-xr-x 1 root root 43952 Sep 21 2022 pwd lrwxrwxrwx 1 root root 4 Feb 13 2023 rbash -> bash -rwxr-xr-x 1 root root 52112 Sep 21 2022 readlink -rwxr-xr-x 1 root root 72752 Sep 21 2022 rm -rwxr-xr-x 1 root root 56240 Sep 21 2022 rmdir -rwxr-xr-x 1 root root 27560 Nov 3 2022 run-parts -rwxr-xr-x 1 root root 126424 Jan 6 2023 sed lrwxrwxrwx 1 root root 9 May 23 03:49 sh -> /bin/bash -rwxr-xr-x 1 root root 43888 Sep 21 2022 sleep -rwxr-xr-x 1 root root 85008 Sep 21 2022 stty -rwsr-xr-x 1 root root 72000 Mar 24 2023 su -rwxr-xr-x 1 root root 39824 Sep 21 2022 sync -rwxr-xr-x 1 root root 531984 Apr 7 2023 tar -rwxr-xr-x 1 root root 
14520 Nov 3 2022 tempfile -rwxr-xr-x 1 root root 109616 Sep 21 2022 touch -rwxr-xr-x 1 root root 35664 Sep 21 2022 true -rwxr-xr-x 1 root root 14568 Mar 23 2023 ulockmgr_server -rwsr-xr-x 1 root root 35128 Mar 24 2023 umount -rwxr-xr-x 1 root root 43888 Sep 21 2022 uname -rwxr-xr-x 2 root root 2346 Apr 10 2022 uncompress -rwxr-xr-x 1 root root 151344 Sep 21 2022 vdir -rwxr-xr-x 1 root root 72024 Mar 24 2023 wdctl lrwxrwxrwx 1 root root 8 Dec 20 2022 ypdomainname -> hostname -rwxr-xr-x 1 root root 1984 Apr 10 2022 zcat -rwxr-xr-x 1 root root 1678 Apr 10 2022 zcmp -rwxr-xr-x 1 root root 6460 Apr 10 2022 zdiff -rwxr-xr-x 1 root root 29 Apr 10 2022 zegrep -rwxr-xr-x 1 root root 29 Apr 10 2022 zfgrep -rwxr-xr-x 1 root root 2081 Apr 10 2022 zforce -rwxr-xr-x 1 root root 8103 Apr 10 2022 zgrep -rwxr-xr-x 1 root root 2206 Apr 10 2022 zless -rwxr-xr-x 1 root root 1842 Apr 10 2022 zmore -rwxr-xr-x 1 root root 4577 Apr 10 2022 znew I: user script /srv/workspace/pbuilder/41679/tmp/hooks/D02_print_environment finished -> Attempting to satisfy build-dependencies -> Creating pbuilder-satisfydepends-dummy package Package: pbuilder-satisfydepends-dummy Version: 0.invalid.0 Architecture: amd64 Maintainer: Debian Pbuilder Team Description: Dummy package to satisfy dependencies with aptitude - created by pbuilder This package was created automatically by pbuilder to satisfy the build-dependencies of the package being currently built. Depends: debhelper-compat (= 13), cmake dpkg-deb: building package 'pbuilder-satisfydepends-dummy' in '/tmp/satisfydepends-aptitude/pbuilder-satisfydepends-dummy.deb'. Selecting previously unselected package pbuilder-satisfydepends-dummy. (Reading database ... 19596 files and directories currently installed.) Preparing to unpack .../pbuilder-satisfydepends-dummy.deb ... Unpacking pbuilder-satisfydepends-dummy (0.invalid.0) ... 
dpkg: pbuilder-satisfydepends-dummy: dependency problems, but configuring anyway as you requested: pbuilder-satisfydepends-dummy depends on debhelper-compat (= 13); however: Package debhelper-compat is not installed. pbuilder-satisfydepends-dummy depends on cmake; however: Package cmake is not installed. Setting up pbuilder-satisfydepends-dummy (0.invalid.0) ... Reading package lists... Building dependency tree... Reading state information... Initializing package states... Writing extended state information... Building tag database... pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) The following NEW packages will be installed: autoconf{a} automake{a} autopoint{a} autotools-dev{a} bsdextrautils{a} cmake{a} cmake-data{a} debhelper{a} dh-autoreconf{a} dh-strip-nondeterminism{a} dwz{a} file{a} gettext{a} gettext-base{a} groff-base{a} intltool-debian{a} libarchive-zip-perl{a} libarchive13{a} libbrotli1{a} libcurl4{a} libdebhelper-perl{a} libelf1{a} libexpat1{a} libfile-stripnondeterminism-perl{a} libicu72{a} libjsoncpp25{a} libldap-2.5-0{a} libmagic-mgc{a} libmagic1{a} libnghttp2-14{a} libpipeline1{a} libproc2-0{a} libpsl5{a} librhash0{a} librtmp1{a} libsasl2-2{a} libsasl2-modules-db{a} libssh2-1{a} libsub-override-perl{a} libtool{a} libuchardet0{a} libuv1{a} libxml2{a} m4{a} man-db{a} po-debconf{a} procps{a} sensible-utils{a} The following packages are RECOMMENDED but will NOT be installed: ca-certificates curl libarchive-cpio-perl libldap-common libltdl-dev libmail-sendmail-perl libsasl2-modules lynx psmisc publicsuffix wget 0 packages upgraded, 48 newly installed, 0 to remove and 0 not upgraded. Need to get 32.3 MB of archives. After unpacking 120 MB will be used. Writing extended state information... 
Get: 1 http://deb.debian.org/debian bookworm/main amd64 libproc2-0 amd64 2:4.0.2-3 [62.8 kB] Get: 2 http://deb.debian.org/debian bookworm/main amd64 procps amd64 2:4.0.2-3 [709 kB] Get: 3 http://deb.debian.org/debian bookworm/main amd64 sensible-utils all 0.0.17+nmu1 [19.0 kB] Get: 4 http://deb.debian.org/debian bookworm/main amd64 libmagic-mgc amd64 1:5.44-3 [305 kB] Get: 5 http://deb.debian.org/debian bookworm/main amd64 libmagic1 amd64 1:5.44-3 [104 kB] Get: 6 http://deb.debian.org/debian bookworm/main amd64 file amd64 1:5.44-3 [42.5 kB] Get: 7 http://deb.debian.org/debian bookworm/main amd64 gettext-base amd64 0.21-12 [160 kB] Get: 8 http://deb.debian.org/debian bookworm/main amd64 libuchardet0 amd64 0.0.7-1 [67.8 kB] Get: 9 http://deb.debian.org/debian bookworm/main amd64 groff-base amd64 1.22.4-10 [916 kB] Get: 10 http://deb.debian.org/debian bookworm/main amd64 bsdextrautils amd64 2.38.1-5+b1 [86.6 kB] Get: 11 http://deb.debian.org/debian bookworm/main amd64 libpipeline1 amd64 1.5.7-1 [38.5 kB] Get: 12 http://deb.debian.org/debian bookworm/main amd64 man-db amd64 2.11.2-2 [1386 kB] Get: 13 http://deb.debian.org/debian bookworm/main amd64 m4 amd64 1.4.19-3 [287 kB] Get: 14 http://deb.debian.org/debian bookworm/main amd64 autoconf all 2.71-3 [332 kB] Get: 15 http://deb.debian.org/debian bookworm/main amd64 autotools-dev all 20220109.1 [51.6 kB] Get: 16 http://deb.debian.org/debian bookworm/main amd64 automake all 1:1.16.5-1.3 [823 kB] Get: 17 http://deb.debian.org/debian bookworm/main amd64 autopoint all 0.21-12 [495 kB] Get: 18 http://deb.debian.org/debian bookworm/main amd64 libicu72 amd64 72.1-3 [9376 kB] Get: 19 http://deb.debian.org/debian bookworm/main amd64 libxml2 amd64 2.9.14+dfsg-1.1+b3 [687 kB] Get: 20 http://deb.debian.org/debian bookworm/main amd64 libarchive13 amd64 3.6.2-1 [343 kB] Get: 21 http://deb.debian.org/debian bookworm/main amd64 libbrotli1 amd64 1.0.9-2+b6 [275 kB] Get: 22 http://deb.debian.org/debian bookworm/main amd64 
libsasl2-modules-db amd64 2.1.28+dfsg-10 [20.3 kB] Get: 23 http://deb.debian.org/debian bookworm/main amd64 libsasl2-2 amd64 2.1.28+dfsg-10 [59.7 kB] Get: 24 http://deb.debian.org/debian bookworm/main amd64 libldap-2.5-0 amd64 2.5.13+dfsg-5 [183 kB] Get: 25 http://deb.debian.org/debian bookworm/main amd64 libnghttp2-14 amd64 1.52.0-1 [72.3 kB] Get: 26 http://deb.debian.org/debian bookworm/main amd64 libpsl5 amd64 0.21.2-1 [58.7 kB] Get: 27 http://deb.debian.org/debian bookworm/main amd64 librtmp1 amd64 2.4+20151223.gitfa8646d.1-2+b2 [60.8 kB] Get: 28 http://deb.debian.org/debian bookworm/main amd64 libssh2-1 amd64 1.10.0-3+b1 [179 kB] Get: 29 http://deb.debian.org/debian bookworm/main amd64 libcurl4 amd64 7.88.1-8 [386 kB] Get: 30 http://deb.debian.org/debian bookworm/main amd64 libexpat1 amd64 2.5.0-1 [99.3 kB] Get: 31 http://deb.debian.org/debian bookworm/main amd64 libjsoncpp25 amd64 1.9.5-4 [78.6 kB] Get: 32 http://deb.debian.org/debian bookworm/main amd64 librhash0 amd64 1.4.3-3 [134 kB] Get: 33 http://deb.debian.org/debian bookworm/main amd64 libuv1 amd64 1.44.2-1 [140 kB] Get: 34 http://deb.debian.org/debian bookworm/main amd64 cmake-data all 3.25.1-1 [2026 kB] Get: 35 http://deb.debian.org/debian bookworm/main amd64 cmake amd64 3.25.1-1 [8692 kB] Get: 36 http://deb.debian.org/debian bookworm/main amd64 libdebhelper-perl all 13.11.4 [81.2 kB] Get: 37 http://deb.debian.org/debian bookworm/main amd64 libtool all 2.4.7-5 [517 kB] Get: 38 http://deb.debian.org/debian bookworm/main amd64 dh-autoreconf all 20 [17.1 kB] Get: 39 http://deb.debian.org/debian bookworm/main amd64 libarchive-zip-perl all 1.68-1 [104 kB] Get: 40 http://deb.debian.org/debian bookworm/main amd64 libsub-override-perl all 0.09-4 [9304 B] Get: 41 http://deb.debian.org/debian bookworm/main amd64 libfile-stripnondeterminism-perl all 1.13.1-1 [19.4 kB] Get: 42 http://deb.debian.org/debian bookworm/main amd64 dh-strip-nondeterminism all 1.13.1-1 [8620 B] Get: 43 http://deb.debian.org/debian 
bookworm/main amd64 libelf1 amd64 0.188-2.1 [174 kB] Get: 44 http://deb.debian.org/debian bookworm/main amd64 dwz amd64 0.15-1 [109 kB] Get: 45 http://deb.debian.org/debian bookworm/main amd64 gettext amd64 0.21-12 [1300 kB] Get: 46 http://deb.debian.org/debian bookworm/main amd64 intltool-debian all 0.35.0+20060710.6 [22.9 kB] Get: 47 http://deb.debian.org/debian bookworm/main amd64 po-debconf all 1.0.21+nmu1 [248 kB] Get: 48 http://deb.debian.org/debian bookworm/main amd64 debhelper all 13.11.4 [942 kB] Fetched 32.3 MB in 0s (74.2 MB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package libproc2-0:amd64. (Reading database ... 19596 files and directories currently installed.) Preparing to unpack .../00-libproc2-0_2%3a4.0.2-3_amd64.deb ... Unpacking libproc2-0:amd64 (2:4.0.2-3) ... Selecting previously unselected package procps. Preparing to unpack .../01-procps_2%3a4.0.2-3_amd64.deb ... Unpacking procps (2:4.0.2-3) ... Selecting previously unselected package sensible-utils. Preparing to unpack .../02-sensible-utils_0.0.17+nmu1_all.deb ... Unpacking sensible-utils (0.0.17+nmu1) ... Selecting previously unselected package libmagic-mgc. Preparing to unpack .../03-libmagic-mgc_1%3a5.44-3_amd64.deb ... Unpacking libmagic-mgc (1:5.44-3) ... Selecting previously unselected package libmagic1:amd64. Preparing to unpack .../04-libmagic1_1%3a5.44-3_amd64.deb ... 
Unpacking libmagic1:amd64 (1:5.44-3) ... Selecting previously unselected package file. Preparing to unpack .../05-file_1%3a5.44-3_amd64.deb ... Unpacking file (1:5.44-3) ... Selecting previously unselected package gettext-base. Preparing to unpack .../06-gettext-base_0.21-12_amd64.deb ... Unpacking gettext-base (0.21-12) ... Selecting previously unselected package libuchardet0:amd64. Preparing to unpack .../07-libuchardet0_0.0.7-1_amd64.deb ... Unpacking libuchardet0:amd64 (0.0.7-1) ... Selecting previously unselected package groff-base. Preparing to unpack .../08-groff-base_1.22.4-10_amd64.deb ... Unpacking groff-base (1.22.4-10) ... Selecting previously unselected package bsdextrautils. Preparing to unpack .../09-bsdextrautils_2.38.1-5+b1_amd64.deb ... Unpacking bsdextrautils (2.38.1-5+b1) ... Selecting previously unselected package libpipeline1:amd64. Preparing to unpack .../10-libpipeline1_1.5.7-1_amd64.deb ... Unpacking libpipeline1:amd64 (1.5.7-1) ... Selecting previously unselected package man-db. Preparing to unpack .../11-man-db_2.11.2-2_amd64.deb ... Unpacking man-db (2.11.2-2) ... Selecting previously unselected package m4. Preparing to unpack .../12-m4_1.4.19-3_amd64.deb ... Unpacking m4 (1.4.19-3) ... Selecting previously unselected package autoconf. Preparing to unpack .../13-autoconf_2.71-3_all.deb ... Unpacking autoconf (2.71-3) ... Selecting previously unselected package autotools-dev. Preparing to unpack .../14-autotools-dev_20220109.1_all.deb ... Unpacking autotools-dev (20220109.1) ... Selecting previously unselected package automake. Preparing to unpack .../15-automake_1%3a1.16.5-1.3_all.deb ... Unpacking automake (1:1.16.5-1.3) ... Selecting previously unselected package autopoint. Preparing to unpack .../16-autopoint_0.21-12_all.deb ... Unpacking autopoint (0.21-12) ... Selecting previously unselected package libicu72:amd64. Preparing to unpack .../17-libicu72_72.1-3_amd64.deb ... Unpacking libicu72:amd64 (72.1-3) ... 
Selecting previously unselected package libxml2:amd64. Preparing to unpack .../18-libxml2_2.9.14+dfsg-1.1+b3_amd64.deb ... Unpacking libxml2:amd64 (2.9.14+dfsg-1.1+b3) ... Selecting previously unselected package libarchive13:amd64. Preparing to unpack .../19-libarchive13_3.6.2-1_amd64.deb ... Unpacking libarchive13:amd64 (3.6.2-1) ... Selecting previously unselected package libbrotli1:amd64. Preparing to unpack .../20-libbrotli1_1.0.9-2+b6_amd64.deb ... Unpacking libbrotli1:amd64 (1.0.9-2+b6) ... Selecting previously unselected package libsasl2-modules-db:amd64. Preparing to unpack .../21-libsasl2-modules-db_2.1.28+dfsg-10_amd64.deb ... Unpacking libsasl2-modules-db:amd64 (2.1.28+dfsg-10) ... Selecting previously unselected package libsasl2-2:amd64. Preparing to unpack .../22-libsasl2-2_2.1.28+dfsg-10_amd64.deb ... Unpacking libsasl2-2:amd64 (2.1.28+dfsg-10) ... Selecting previously unselected package libldap-2.5-0:amd64. Preparing to unpack .../23-libldap-2.5-0_2.5.13+dfsg-5_amd64.deb ... Unpacking libldap-2.5-0:amd64 (2.5.13+dfsg-5) ... Selecting previously unselected package libnghttp2-14:amd64. Preparing to unpack .../24-libnghttp2-14_1.52.0-1_amd64.deb ... Unpacking libnghttp2-14:amd64 (1.52.0-1) ... Selecting previously unselected package libpsl5:amd64. Preparing to unpack .../25-libpsl5_0.21.2-1_amd64.deb ... Unpacking libpsl5:amd64 (0.21.2-1) ... Selecting previously unselected package librtmp1:amd64. Preparing to unpack .../26-librtmp1_2.4+20151223.gitfa8646d.1-2+b2_amd64.deb ... Unpacking librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2+b2) ... Selecting previously unselected package libssh2-1:amd64. Preparing to unpack .../27-libssh2-1_1.10.0-3+b1_amd64.deb ... Unpacking libssh2-1:amd64 (1.10.0-3+b1) ... Selecting previously unselected package libcurl4:amd64. Preparing to unpack .../28-libcurl4_7.88.1-8_amd64.deb ... Unpacking libcurl4:amd64 (7.88.1-8) ... Selecting previously unselected package libexpat1:amd64. 
Preparing to unpack .../29-libexpat1_2.5.0-1_amd64.deb ... Unpacking libexpat1:amd64 (2.5.0-1) ... Selecting previously unselected package libjsoncpp25:amd64. Preparing to unpack .../30-libjsoncpp25_1.9.5-4_amd64.deb ... Unpacking libjsoncpp25:amd64 (1.9.5-4) ... Selecting previously unselected package librhash0:amd64. Preparing to unpack .../31-librhash0_1.4.3-3_amd64.deb ... Unpacking librhash0:amd64 (1.4.3-3) ... Selecting previously unselected package libuv1:amd64. Preparing to unpack .../32-libuv1_1.44.2-1_amd64.deb ... Unpacking libuv1:amd64 (1.44.2-1) ... Selecting previously unselected package cmake-data. Preparing to unpack .../33-cmake-data_3.25.1-1_all.deb ... Unpacking cmake-data (3.25.1-1) ... Selecting previously unselected package cmake. Preparing to unpack .../34-cmake_3.25.1-1_amd64.deb ... Unpacking cmake (3.25.1-1) ... Selecting previously unselected package libdebhelper-perl. Preparing to unpack .../35-libdebhelper-perl_13.11.4_all.deb ... Unpacking libdebhelper-perl (13.11.4) ... Selecting previously unselected package libtool. Preparing to unpack .../36-libtool_2.4.7-5_all.deb ... Unpacking libtool (2.4.7-5) ... Selecting previously unselected package dh-autoreconf. Preparing to unpack .../37-dh-autoreconf_20_all.deb ... Unpacking dh-autoreconf (20) ... Selecting previously unselected package libarchive-zip-perl. Preparing to unpack .../38-libarchive-zip-perl_1.68-1_all.deb ... Unpacking libarchive-zip-perl (1.68-1) ... Selecting previously unselected package libsub-override-perl. Preparing to unpack .../39-libsub-override-perl_0.09-4_all.deb ... Unpacking libsub-override-perl (0.09-4) ... Selecting previously unselected package libfile-stripnondeterminism-perl. Preparing to unpack .../40-libfile-stripnondeterminism-perl_1.13.1-1_all.deb ... Unpacking libfile-stripnondeterminism-perl (1.13.1-1) ... Selecting previously unselected package dh-strip-nondeterminism. Preparing to unpack .../41-dh-strip-nondeterminism_1.13.1-1_all.deb ... 
Unpacking dh-strip-nondeterminism (1.13.1-1) ... Selecting previously unselected package libelf1:amd64. Preparing to unpack .../42-libelf1_0.188-2.1_amd64.deb ... Unpacking libelf1:amd64 (0.188-2.1) ... Selecting previously unselected package dwz. Preparing to unpack .../43-dwz_0.15-1_amd64.deb ... Unpacking dwz (0.15-1) ... Selecting previously unselected package gettext. Preparing to unpack .../44-gettext_0.21-12_amd64.deb ... Unpacking gettext (0.21-12) ... Selecting previously unselected package intltool-debian. Preparing to unpack .../45-intltool-debian_0.35.0+20060710.6_all.deb ... Unpacking intltool-debian (0.35.0+20060710.6) ... Selecting previously unselected package po-debconf. Preparing to unpack .../46-po-debconf_1.0.21+nmu1_all.deb ... Unpacking po-debconf (1.0.21+nmu1) ... Selecting previously unselected package debhelper. Preparing to unpack .../47-debhelper_13.11.4_all.deb ... Unpacking debhelper (13.11.4) ... Setting up libexpat1:amd64 (2.5.0-1) ... Setting up libpipeline1:amd64 (1.5.7-1) ... Setting up libpsl5:amd64 (0.21.2-1) ... Setting up libicu72:amd64 (72.1-3) ... Setting up bsdextrautils (2.38.1-5+b1) ... Setting up libmagic-mgc (1:5.44-3) ... Setting up libarchive-zip-perl (1.68-1) ... Setting up libdebhelper-perl (13.11.4) ... Setting up libbrotli1:amd64 (1.0.9-2+b6) ... Setting up libnghttp2-14:amd64 (1.52.0-1) ... Setting up libmagic1:amd64 (1:5.44-3) ... Setting up gettext-base (0.21-12) ... Setting up m4 (1.4.19-3) ... Setting up file (1:5.44-3) ... Setting up libsasl2-modules-db:amd64 (2.1.28+dfsg-10) ... Setting up autotools-dev (20220109.1) ... Setting up libuv1:amd64 (1.44.2-1) ... Setting up librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2+b2) ... Setting up libproc2-0:amd64 (2:4.0.2-3) ... Setting up autopoint (0.21-12) ... Setting up libjsoncpp25:amd64 (1.9.5-4) ... Setting up libsasl2-2:amd64 (2.1.28+dfsg-10) ... Setting up autoconf (2.71-3) ... Setting up sensible-utils (0.0.17+nmu1) ... Setting up librhash0:amd64 (1.4.3-3) ... 
Setting up libuchardet0:amd64 (0.0.7-1) ... Setting up procps (2:4.0.2-3) ... Setting up libsub-override-perl (0.09-4) ... Setting up libssh2-1:amd64 (1.10.0-3+b1) ... Setting up cmake-data (3.25.1-1) ... Setting up libelf1:amd64 (0.188-2.1) ... Setting up libxml2:amd64 (2.9.14+dfsg-1.1+b3) ... Setting up automake (1:1.16.5-1.3) ... update-alternatives: using /usr/bin/automake-1.16 to provide /usr/bin/automake (automake) in auto mode Setting up libfile-stripnondeterminism-perl (1.13.1-1) ... Setting up gettext (0.21-12) ... Setting up libtool (2.4.7-5) ... Setting up libarchive13:amd64 (3.6.2-1) ... Setting up libldap-2.5-0:amd64 (2.5.13+dfsg-5) ... Setting up intltool-debian (0.35.0+20060710.6) ... Setting up dh-autoreconf (20) ... Setting up dh-strip-nondeterminism (1.13.1-1) ... Setting up dwz (0.15-1) ... Setting up groff-base (1.22.4-10) ... Setting up libcurl4:amd64 (7.88.1-8) ... Setting up po-debconf (1.0.21+nmu1) ... Setting up man-db (2.11.2-2) ... Not building database; man-db/auto-update is not 'true'. Setting up cmake (3.25.1-1) ... Setting up debhelper (13.11.4) ... Processing triggers for libc-bin (2.36-9) ... Reading package lists... Building dependency tree... Reading state information... Reading extended state information... Initializing package states... Writing extended state information... Building tag database... -> Finished parsing the build-deps Reading package lists... Building dependency tree... Reading state information... usrmerge is already the newest version (35). fakeroot is already the newest version (1.31-1.2). 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. I: Building the package I: user script /srv/workspace/pbuilder/41679/tmp/hooks/A99_set_merged_usr starting Re-configuring usrmerge... removed '/etc/unsupported-skip-usrmerge-conversion' The system has been successfully converted. 
I: user script /srv/workspace/pbuilder/41679/tmp/hooks/A99_set_merged_usr finished hostname: Name or service not known I: Running cd /build/gemmlowp-0.0~git20211220.e844ffd/ && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/i/capture/the/path" HOME="/nonexistent/second-build" dpkg-buildpackage -us -uc -b && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/i/capture/the/path" HOME="/nonexistent/second-build" dpkg-genchanges -S > ../gemmlowp_0.0~git20211220.e844ffd-1_source.changes dpkg-buildpackage: info: source package gemmlowp dpkg-buildpackage: info: source version 0.0~git20211220.e844ffd-1 dpkg-buildpackage: info: source distribution unstable dpkg-buildpackage: info: source changed by Mo Zhou dpkg-source --before-build . dpkg-buildpackage: info: host architecture amd64 fakeroot debian/rules clean dh clean -Scmake debian/rules override_dh_auto_clean make[1]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd' rm -f CMakeLists.txt dh_auto_clean make[1]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd' dh_clean -O-Scmake debian/rules build dh build -Scmake dh_update_autotools_config -O-Scmake dh_autoreconf -O-Scmake debian/rules override_dh_auto_configure make[1]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd' ln -s contrib/CMakeLists.txt . dh_auto_configure -- \ -DCMAKE_C_FLAGS="-g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" \ -DCMAKE_CXX_FLAGS="-g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" cd obj-x86_64-linux-gnu && cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON -DFETCHCONTENT_FULLY_DISCONNECTED=ON -DCMAKE_INSTALL_RUNSTATEDIR=/run -DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON "-GUnix Makefiles" -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INSTALL_LIBDIR=lib/x86_64-linux-gnu "-DCMAKE_C_FLAGS=-g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" "-DCMAKE_CXX_FLAGS=-g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" .. -- The C compiler identification is GNU 12.2.0 -- The CXX compiler identification is GNU 12.2.0 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/c++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Configuring done -- Generating done CMake Warning: Manually-specified variables were not used by the project: CMAKE_EXPORT_NO_PACKAGE_REGISTRY CMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY CMAKE_FIND_USE_PACKAGE_REGISTRY FETCHCONTENT_FULLY_DISCONNECTED -- Build files have been written to: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu make[1]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd' dh_auto_build -O-Scmake cd obj-x86_64-linux-gnu && make -j16 "INSTALL=install --strip-program=true" VERBOSE=1 make[1]: Entering directory 
'/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -S"/build/gemmlowp-0.0~git20211220.e844ffd" -B"/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu//CMakeFiles/progress.marks" make -f CMakeFiles/Makefile2 all make[2]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/depend make -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/depend make -f CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/depend make -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/depend make -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/depend make -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/depend make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/eight_bit_int_gemm.dir/DependInfo.cmake" --color= make -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/depend make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" 
"/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/benchmark.dir/DependInfo.cmake" --color= make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/benchmark_all_sizes.dir/DependInfo.cmake" --color= make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_math_helpers.dir/DependInfo.cmake" --color= make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_blocking_counter.dir/DependInfo.cmake" --color= make[3]: Entering directory 
'/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_allocator.dir/DependInfo.cmake" --color= make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_fixedpoint.dir/DependInfo.cmake" --color= make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/build make -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/build make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/build make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/eight_bit_int_gemm.dir/build.make 
CMakeFiles/eight_bit_int_gemm.dir/build make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/build make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/build make -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/build make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 11%] Building CXX object CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o [ 11%] Building CXX object CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o [ 17%] Building CXX object CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o [ 23%] Building CXX object CMakeFiles/test_allocator.dir/test/test_allocator.cc.o [ 41%] Building CXX object CMakeFiles/benchmark.dir/test/benchmark.cc.o [ 41%] Building CXX object CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o [ 41%] Building CXX object CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -MF CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o.d -o CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/test_blocking_counter.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -MF CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o.d -o CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/test_math_helpers.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -MF CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o.d -o CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/eight_bit_int_gemm/eight_bit_int_gemm.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -MF CMakeFiles/test_allocator.dir/test/test_allocator.cc.o.d -o CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/test_allocator.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -MF CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o.d -o CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/test_fixedpoint.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -DBENCHMARK_8bit -DBENCHMARK_QUICK -std=gnu++11 -MD -MT CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -MF CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o.d -o CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/benchmark_all_sizes.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/benchmark.dir/test/benchmark.cc.o -MF CMakeFiles/benchmark.dir/test/benchmark.cc.o.d -o CMakeFiles/benchmark.dir/test/benchmark.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/benchmark.cc" [ 47%] Linking CXX executable test_allocator /usr/bin/cmake -E cmake_link_script CMakeFiles/test_allocator.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -o test_allocator [ 52%] Linking CXX executable test_blocking_counter /usr/bin/cmake -E cmake_link_script CMakeFiles/test_blocking_counter.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -o test_blocking_counter -lpthread make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 52%] Built target test_allocator make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 52%] Built target test_blocking_counter [ 58%] Linking CXX executable test_math_helpers /usr/bin/cmake -E cmake_link_script CMakeFiles/test_math_helpers.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -o test_math_helpers make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 58%] Built target test_math_helpers [ 64%] Linking CXX executable benchmark_all_sizes /usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark_all_sizes.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -o benchmark_all_sizes -lpthread [ 70%] Linking CXX executable benchmark /usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/benchmark.dir/test/benchmark.cc.o -o benchmark -lpthread make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 70%] Built target benchmark_all_sizes make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 70%] Built target benchmark [ 76%] Linking CXX executable test_fixedpoint /usr/bin/cmake -E cmake_link_script CMakeFiles/test_fixedpoint.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -o test_fixedpoint make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 76%] Built target test_fixedpoint [ 82%] Linking CXX static library libeight_bit_int_gemm.a /usr/bin/cmake -P CMakeFiles/eight_bit_int_gemm.dir/cmake_clean_target.cmake /usr/bin/cmake -E cmake_link_script CMakeFiles/eight_bit_int_gemm.dir/link.txt --verbose=1 /usr/bin/ar qc libeight_bit_int_gemm.a CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o /usr/bin/ranlib libeight_bit_int_gemm.a make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 82%] Built target eight_bit_int_gemm make -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/depend make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' cd "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" 
"/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles/test_gemmlowp.dir/DependInfo.cmake" --color= make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/build make[3]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [ 88%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test.cc.o [ 94%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc" /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -c "/build/gemmlowp-0.0~git20211220.e844ffd/test/test_data.cc" /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In file included from /usr/include/stdio.h:906, from /usr/include/c++/12/cstdio:42, from /usr/include/c++/12/ext/string_conversions.h:43, from /usr/include/c++/12/bits/basic_string.h:3960, from /usr/include/c++/12/string:53, from /usr/include/c++/12/bits/locale_classes.h:40, from /usr/include/c++/12/bits/ios_base.h:41, from /usr/include/c++/12/ios:42, from /usr/include/c++/12/ostream:38, from /usr/include/c++/12/iostream:39, from /build/gemmlowp-0.0~git20211220.e844ffd/test/test.h:26, from /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:15: In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of 
size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 4>, gemmlowp::KernelSideFormat, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': 
/build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 4>, gemmlowp::KernelSideFormat, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at 
/build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = 
gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 3>, 
gemmlowp::KernelSideFormat, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 2>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ 
/build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::ReferenceKernel, 1>, gemmlowp::KernelSideFormat, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=] 123 | snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, 
size_t, const char*, ...)', inlined from 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | 
__glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=] 123 | snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::SingleThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:123:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]': /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=] 163 | snprintf(buf, 
sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name()); | ^~ In function 'int snprintf(char*, size_t, const char*, ...)', inlined from 'static const char* gemmlowp::MultiThreadGemmWrapper::Name() [with Kernel = gemmlowp::DefaultKernel, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams, gemmlowp::OperandRange<0, 255> >]' at /build/gemmlowp-0.0~git20211220.e844ffd/test/test.cc:163:13: /usr/include/x86_64-linux-gnu/bits/stdio2.h:54:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256 54 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ [100%] Linking CXX executable test_gemmlowp /usr/bin/cmake -E cmake_link_script CMakeFiles/test_gemmlowp.dir/link.txt --verbose=1 /usr/bin/c++ -g -O2 -ffile-prefix-map=/build/gemmlowp-0.0~git20211220.e844ffd=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_gemmlowp.dir/test/test.cc.o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -o test_gemmlowp libeight_bit_int_gemm.a -lpthread make[3]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' [100%] Built target test_gemmlowp make[2]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -E cmake_progress_start "/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/CMakeFiles" 0 make[1]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' dh_auto_test -O-Scmake cd obj-x86_64-linux-gnu && make -j16 test ARGS\+=--verbose ARGS\+=-j16 make[1]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' Running tests... 
/usr/bin/ctest --force-new-ctest-process --verbose -j16 UpdateCTestConfiguration from :/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Parse Config file:/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl UpdateCTestConfiguration from :/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Parse Config file:/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/DartConfiguration.tcl Test project /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu Constructing a list of tests Done constructing a list of tests Updating test list for fixtures Added 0 tests to meet fixture requirements Checking test dependency graph... Checking test dependency graph end test 1 Start 1: test_math_helpers 1: Test command: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_math_helpers 1: Working Directory: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 1: Test timeout computed to be: 1500 test 2 Start 2: test_blocking_counter 2: Test command: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_blocking_counter 2: Working Directory: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 2: Test timeout computed to be: 1500 test 3 Start 3: test_allocator 3: Test command: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_allocator 3: Working Directory: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 3: Test timeout computed to be: 1500 test 4 Start 4: test_fixedpoint 4: Test command: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_fixedpoint 4: Working Directory: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 4: Test timeout computed to be: 1500 test 5 Start 5: test_gemmlowp 5: Test command: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu/test_gemmlowp 5: Working Directory: /build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu 5: Test timeout computed to 
be: 1500 1/5 Test #1: test_math_helpers ................ Passed 0.01 sec 2/5 Test #3: test_allocator ................... Passed 0.01 sec 5: TestWithSmallData: PASS 5: number of matrix entries: 8 5: median value: 136 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 
10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor 
x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, 
mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 
8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor 
-> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 3/5 Test #2: test_blocking_counter ............ 
Passed 0.03 sec 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 4: PASS (Scalar int32) 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 4: PASS (Scalar int16) 4/5 Test #4: test_fixedpoint .................. Passed 0.41 sec 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: 
PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, 
mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor 
-> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x 
ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x 
RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x 
RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: 
PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 
18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, 
shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, 
mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, 
mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public 
Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> 
ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor 
x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: 
PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public 
Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, 
mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, 
shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 
57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 
ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x 
ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x 
RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 
5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 
1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x 
RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 
10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 
300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 
1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, 
mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 
8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor 
-> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 
RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, 
mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 
1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: 
PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, 
shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 
ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor 
-> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> 
RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x 
RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 
16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: 
PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 
14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, 
mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, 
public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, 
shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 512x512x512 
RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor 
x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x 
RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor 
-> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 
300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 
8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, 
mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 
1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, 
offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 
ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor 
-> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, 
shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 
8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> 
ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 
RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 
WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, 
shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 
RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: 
PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 
cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: 
PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x 
ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 
ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 
RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 
ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 
64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 
64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), 
offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public 
Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, 
offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, 
public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x 
ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 
8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: 
PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, 
mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: 
PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor 
-> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public 
Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 
0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, 
mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: 
PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor 
x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 
70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 
70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, 
mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 
RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x 
ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, 
SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 
1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 
16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 
WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 
cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 
4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor 
x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 
1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, 
Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 
0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12 5: PASS: 
7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, 
Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x 
ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> 
ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x 
ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 
5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, 
shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: 
PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, 
public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 
14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> 
RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, 
shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: 
PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 
7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 
70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, 
offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> 
RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor 
x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 128x128x1 RowMajor x 
ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x 
RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, 
shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, 
EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, 
mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, 
shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 
10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: 
PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: 
PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 
8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, 
mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor 
-> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 
0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x 
RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, 
shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x 
ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: 
PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x 
RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, 
mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x1 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, 
shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x 
ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 
123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 
RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x 
ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> 
ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, 
shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 
70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, 
mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 6 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 8 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10 5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, 
shift 8 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, 
shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: 
PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: 
PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12 5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 
ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 
RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x 
ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14 5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 16x16x16 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, 
offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 
64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12 5: PASS: 37x55x73 
RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16 5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16 5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22 5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24 5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 
RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, 
EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18 5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 
-75/-91/74980, mult 123, shift 18 5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 
DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 
cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, 
mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 
-75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 
cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: 
reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 
5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: 
PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 
cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 
2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, 
MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> 
ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 
-75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, 
shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 24 5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 
cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/0/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/10/0, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/10, mult 1, shift 12 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/10/10, mult 10, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 256/1/17, mult 4, shift 16 5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 18 5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), 
offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20 5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 24 5: TestWithRealData: PASS with Lhs: 8 bit, Rhs: 8 bit 5: number of matrix entries: 49152 5: median value: 104 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 
0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestWithRealData: PASS with (legacy, no longer requantizing) Lhs: 7 bit, Rhs: 5 bit 5: number of matrix entries: 49152 5: median value: 104 5: median unsigned diff: 0 (tolerating 2) 5: max unsigned diff: 0 (tolerating 10) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0.2) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestOutputStages: PASS with ResultOrder=RowMajor 5: TestOutputStages: PASS with ResultOrder=ColMajor 5: TestWithSmallDataPerChannelQuantization: PASS 5: number of matrix entries: 18 5: median value: 127 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestWithLargeDataPerChannelQuantization: PASS 5: number 
of matrix entries: 550 5: median value: 7 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: TestMultithreadedPerChannelQuantization: PASS 5: number of matrix entries: 1280 5: median value: 0 5: median unsigned diff: 0 (tolerating 0) 5: max unsigned diff: 0 (tolerating 0) 5: median signed diff: 0 (tolerating 0) 5: mean signed diff: 0 (tolerating 0) 5: No error: 100.00 % of entries 5: Error in 1..1 range: 0.00 % of entries 5: Error in 2..3 range: 0.00 % of entries 5: Error in 4..7 range: 0.00 % of entries 5: Error in 8..15 range: 0.00 % of entries 5: Error in 16..31 range: 0.00 % of entries 5: Error in 32..63 range: 0.00 % of entries 5: Error in 64..127 range: 0.00 % of entries 5: Error in 128..255 range: 0.00 % of entries 5: All tests passed. 5/5 Test #5: test_gemmlowp .................... 
Passed 176.79 sec 100% tests passed, 0 tests failed out of 5 Total Test time (real) = 176.80 sec make[1]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' create-stamp debian/debhelper-build-stamp fakeroot debian/rules binary dh binary -Scmake dh_testroot -O-Scmake dh_prep -O-Scmake dh_auto_install --destdir=debian/libgemmlowp-dev/ -O-Scmake cd obj-x86_64-linux-gnu && make -j16 install DESTDIR=/build/gemmlowp-0.0\~git20211220.e844ffd/debian/libgemmlowp-dev AM_UPDATE_INFO_DIR=no "INSTALL=install --strip-program=true" make[1]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' /usr/bin/cmake -S"/build/gemmlowp-0.0~git20211220.e844ffd" -B"/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu" --check-build-system CMakeFiles/Makefile.cmake 0 make -f CMakeFiles/Makefile2 preinstall make[2]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' make[2]: Nothing to be done for 'preinstall'. make[2]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' Install the project... 
/usr/bin/cmake -P cmake_install.cmake -- Install configuration: "None" -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/eight_bit_int_gemm/eight_bit_int_gemm.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/base.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_common.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemm.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemv.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_operations_common.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_single_thread_gemm.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_common.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_gemm.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_transform.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_32.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_64.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_gemm.h -- Installing: 
/build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_transform.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_32.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_64.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_32.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_64.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/bit_depth.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/gemmlowp.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/map.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/public/output_stages.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/instrumentation.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/profiler.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/pthread_everywhere.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/allocator.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/block_params.h -- Installing: 
/build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/common.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/compute.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/detect_platform.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/dispatch_gemm_shape.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_avx.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_default.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_msa.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_neon.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_reference.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_sse.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/multi_thread_gemm.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_avx.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_msa.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_neon.h -- Installing: 
/build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_sse.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_avx.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_msa.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_neon.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_sse.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/platform.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_common_neon_sse.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_msa.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_neon.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_sse.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/single_thread_gemm.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/unpack.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_avx.h -- Installing: 
/build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_msa.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_neon.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_sse.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_wasmsimd.h -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/libeight_bit_int_gemm.a -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/cmake/gemmlowp/gemmlowp-config.cmake -- Installing: /build/gemmlowp-0.0~git20211220.e844ffd/debian/libgemmlowp-dev/usr/lib/x86_64-linux-gnu/cmake/gemmlowp/gemmlowp-config-none.cmake make[1]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd/obj-x86_64-linux-gnu' dh_install -O-Scmake debian/rules override_dh_installdocs make[1]: Entering directory '/build/gemmlowp-0.0~git20211220.e844ffd' mkdir -p debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/ install meta/README debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/ dh_installdocs make[1]: Leaving directory '/build/gemmlowp-0.0~git20211220.e844ffd' dh_installchangelogs -O-Scmake dh_installexamples -O-Scmake dh_installinit -O-Scmake dh_perl -O-Scmake dh_link -O-Scmake dh_strip_nondeterminism -O-Scmake dh_compress -O-Scmake dh_fixperms -O-Scmake dh_missing -O-Scmake dh_dwz -a -O-Scmake dh_strip -a -O-Scmake dh_makeshlibs -a -O-Scmake dh_shlibdeps -a -O-Scmake dh_installdeb -O-Scmake dh_gencontrol -O-Scmake dh_md5sums -O-Scmake dh_builddeb -O-Scmake dpkg-deb: building package 'libgemmlowp-dev' in '../libgemmlowp-dev_0.0~git20211220.e844ffd-1_amd64.deb'. 
dpkg-genbuildinfo --build=binary -O../gemmlowp_0.0~git20211220.e844ffd-1_amd64.buildinfo
dpkg-genchanges --build=binary -O../gemmlowp_0.0~git20211220.e844ffd-1_amd64.changes
dpkg-genchanges: info: binary-only upload (no source code included)
dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)
dpkg-genchanges: info: including full source code in upload
I: copying local configuration
I: user script /srv/workspace/pbuilder/41679/tmp/hooks/B01_cleanup starting
I: user script /srv/workspace/pbuilder/41679/tmp/hooks/B01_cleanup finished
I: unmounting dev/ptmx filesystem
I: unmounting dev/pts filesystem
I: unmounting dev/shm filesystem
I: unmounting proc filesystem
I: unmounting sys filesystem
I: cleaning the build env
I: removing directory /srv/workspace/pbuilder/41679 and its subdirectories
I: Current time: Thu May 23 03:55:16 +14 2024
I: pbuilder-time-stamp: 1716386116