I: pbuilder: network access will be disabled during build
I: Current time: Sat Jan 6 15:47:24 -12 2024
I: pbuilder-time-stamp: 1704599244
I: Building the build Environment
I: extracting base tarball [/var/cache/pbuilder/bullseye-reproducible-base.tgz]
I: copying local configuration
W: --override-config is not set; not updating apt.conf Read the manpage for details.
I: mounting /proc filesystem
I: mounting /sys filesystem
I: creating /{dev,run}/shm
I: mounting /dev/pts filesystem
I: redirecting /dev/ptmx to /dev/pts/ptmx
I: policy-rc.d already exists
I: Copying source file
I: copying [python-streamz_0.6.2-1.dsc]
I: copying [./python-streamz_0.6.2.orig.tar.gz]
I: copying [./python-streamz_0.6.2-1.debian.tar.xz]
I: Extracting source
gpgv: unknown type of key resource 'trustedkeys.kbx'
gpgv: keyblock resource '/tmp/dpkg-verify-sig.IQ_WMuNx/trustedkeys.kbx': General error
gpgv: Signature made Mon Jan 18 02:42:11 2021 -12
gpgv: using RSA key 3E99A526F5DCC0CBBF1CEEA600BAE74B343369F1
gpgv: issuer "npatra974@gmail.com"
gpgv: Can't check signature: No public key
dpkg-source: warning: failed to verify signature on ./python-streamz_0.6.2-1.dsc
dpkg-source: info: extracting python-streamz in python-streamz-0.6.2
dpkg-source: info: unpacking python-streamz_0.6.2.orig.tar.gz
dpkg-source: info: unpacking python-streamz_0.6.2-1.debian.tar.xz
dpkg-source: info: using patch list from debian/patches/series
dpkg-source: info: applying disable-unsupported-tests.patch
I: Not using root during the build.
I: Installing the build-deps I: user script /srv/workspace/pbuilder/19490/tmp/hooks/D02_print_environment starting I: set BUILDDIR='/build/reproducible-path' BUILDUSERGECOS='first user,first room,first work-phone,first home-phone,first other' BUILDUSERNAME='pbuilder1' BUILD_ARCH='armhf' DEBIAN_FRONTEND='noninteractive' DEB_BUILD_OPTIONS='buildinfo=+all reproducible=+all,-fixfilepath parallel=3 ' DISTRIBUTION='bullseye' HOME='/root' HOST_ARCH='armhf' IFS=' ' INVOCATION_ID='c72a6c93ca644858be96dd859cdf3a64' LANG='C' LANGUAGE='en_US:en' LC_ALL='C' MAIL='/var/mail/root' OPTIND='1' PATH='/usr/sbin:/usr/bin:/sbin:/bin:/usr/games' PBCURRENTCOMMANDLINEOPERATION='build' PBUILDER_OPERATION='build' PBUILDER_PKGDATADIR='/usr/share/pbuilder' PBUILDER_PKGLIBDIR='/usr/lib/pbuilder' PBUILDER_SYSCONFDIR='/etc' PPID='19490' PS1='# ' PS2='> ' PS4='+ ' PWD='/' SHELL='/bin/bash' SHLVL='2' SUDO_COMMAND='/usr/bin/timeout -k 18.1h 18h /usr/bin/ionice -c 3 /usr/bin/nice /usr/sbin/pbuilder --build --configfile /srv/reproducible-results/rbuild-debian/r-b-build.y4SizMDJ/pbuilderrc_zis7 --distribution bullseye --hookdir /etc/pbuilder/first-build-hooks --debbuildopts -b --basetgz /var/cache/pbuilder/bullseye-reproducible-base.tgz --buildresult /srv/reproducible-results/rbuild-debian/r-b-build.y4SizMDJ/b1 --logfile b1/build.log python-streamz_0.6.2-1.dsc' SUDO_GID='114' SUDO_UID='108' SUDO_USER='jenkins' TERM='unknown' TZ='/usr/share/zoneinfo/Etc/GMT+12' USER='root' _='/usr/bin/systemd-run' http_proxy='http://10.0.0.15:3142/' I: uname -a Linux jtx1c 5.10.0-26-arm64 #1 SMP Debian 5.10.197-1 (2023-09-29) aarch64 GNU/Linux I: ls -l /bin total 3580 -rwxr-xr-x 1 root root 816764 Mar 27 2022 bash -rwxr-xr-x 3 root root 26052 Jul 20 2020 bunzip2 -rwxr-xr-x 3 root root 26052 Jul 20 2020 bzcat lrwxrwxrwx 1 root root 6 Jul 20 2020 bzcmp -> bzdiff -rwxr-xr-x 1 root root 2225 Jul 20 2020 bzdiff lrwxrwxrwx 1 root root 6 Jul 20 2020 bzegrep -> bzgrep -rwxr-xr-x 1 root root 4877 Sep 4 2019 bzexe lrwxrwxrwx 1 
root root 6 Jul 20 2020 bzfgrep -> bzgrep -rwxr-xr-x 1 root root 3775 Jul 20 2020 bzgrep -rwxr-xr-x 3 root root 26052 Jul 20 2020 bzip2 -rwxr-xr-x 1 root root 9636 Jul 20 2020 bzip2recover lrwxrwxrwx 1 root root 6 Jul 20 2020 bzless -> bzmore -rwxr-xr-x 1 root root 1297 Jul 20 2020 bzmore -rwxr-xr-x 1 root root 26668 Sep 22 2020 cat -rwxr-xr-x 1 root root 43104 Sep 22 2020 chgrp -rwxr-xr-x 1 root root 38984 Sep 22 2020 chmod -rwxr-xr-x 1 root root 43112 Sep 22 2020 chown -rwxr-xr-x 1 root root 92616 Sep 22 2020 cp -rwxr-xr-x 1 root root 75524 Dec 10 2020 dash -rwxr-xr-x 1 root root 75880 Sep 22 2020 date -rwxr-xr-x 1 root root 55436 Sep 22 2020 dd -rwxr-xr-x 1 root root 59912 Sep 22 2020 df -rwxr-xr-x 1 root root 96764 Sep 22 2020 dir -rwxr-xr-x 1 root root 55012 Jan 20 2022 dmesg lrwxrwxrwx 1 root root 8 Nov 6 2019 dnsdomainname -> hostname lrwxrwxrwx 1 root root 8 Nov 6 2019 domainname -> hostname -rwxr-xr-x 1 root root 22508 Sep 22 2020 echo -rwxr-xr-x 1 root root 28 Jan 24 2023 egrep -rwxr-xr-x 1 root root 22496 Sep 22 2020 false -rwxr-xr-x 1 root root 28 Jan 24 2023 fgrep -rwxr-xr-x 1 root root 47492 Jan 20 2022 findmnt -rwsr-xr-x 1 root root 26076 Feb 26 2021 fusermount -rwxr-xr-x 1 root root 124508 Jan 24 2023 grep -rwxr-xr-x 2 root root 2346 Apr 9 2022 gunzip -rwxr-xr-x 1 root root 6447 Apr 9 2022 gzexe -rwxr-xr-x 1 root root 64212 Apr 9 2022 gzip -rwxr-xr-x 1 root root 13784 Nov 6 2019 hostname -rwxr-xr-x 1 root root 43180 Sep 22 2020 ln -rwxr-xr-x 1 root root 35068 Feb 7 2020 login -rwxr-xr-x 1 root root 96764 Sep 22 2020 ls -rwxr-xr-x 1 root root 99940 Jan 20 2022 lsblk -rwxr-xr-x 1 root root 51408 Sep 22 2020 mkdir -rwxr-xr-x 1 root root 43184 Sep 22 2020 mknod -rwxr-xr-x 1 root root 30780 Sep 22 2020 mktemp -rwxr-xr-x 1 root root 34408 Jan 20 2022 more -rwsr-xr-x 1 root root 34400 Jan 20 2022 mount -rwxr-xr-x 1 root root 9824 Jan 20 2022 mountpoint -rwxr-xr-x 1 root root 88524 Sep 22 2020 mv lrwxrwxrwx 1 root root 8 Nov 6 2019 nisdomainname -> hostname 
lrwxrwxrwx 1 root root 14 Dec 16 2021 pidof -> /sbin/killall5 -rwxr-xr-x 1 root root 26652 Sep 22 2020 pwd lrwxrwxrwx 1 root root 4 Mar 27 2022 rbash -> bash -rwxr-xr-x 1 root root 30740 Sep 22 2020 readlink -rwxr-xr-x 1 root root 43104 Sep 22 2020 rm -rwxr-xr-x 1 root root 30732 Sep 22 2020 rmdir -rwxr-xr-x 1 root root 14144 Sep 27 2020 run-parts -rwxr-xr-x 1 root root 76012 Dec 22 2018 sed lrwxrwxrwx 1 root root 4 Nov 6 21:26 sh -> dash -rwxr-xr-x 1 root root 22532 Sep 22 2020 sleep -rwxr-xr-x 1 root root 55360 Sep 22 2020 stty -rwsr-xr-x 1 root root 46704 Jan 20 2022 su -rwxr-xr-x 1 root root 22532 Sep 22 2020 sync -rwxr-xr-x 1 root root 340872 Feb 16 2021 tar -rwxr-xr-x 1 root root 9808 Sep 27 2020 tempfile -rwxr-xr-x 1 root root 67696 Sep 22 2020 touch -rwxr-xr-x 1 root root 22496 Sep 22 2020 true -rwxr-xr-x 1 root root 9636 Feb 26 2021 ulockmgr_server -rwsr-xr-x 1 root root 22108 Jan 20 2022 umount -rwxr-xr-x 1 root root 22520 Sep 22 2020 uname -rwxr-xr-x 2 root root 2346 Apr 9 2022 uncompress -rwxr-xr-x 1 root root 96764 Sep 22 2020 vdir -rwxr-xr-x 1 root root 38512 Jan 20 2022 wdctl lrwxrwxrwx 1 root root 8 Nov 6 2019 ypdomainname -> hostname -rwxr-xr-x 1 root root 1984 Apr 9 2022 zcat -rwxr-xr-x 1 root root 1678 Apr 9 2022 zcmp -rwxr-xr-x 1 root root 5898 Apr 9 2022 zdiff -rwxr-xr-x 1 root root 29 Apr 9 2022 zegrep -rwxr-xr-x 1 root root 29 Apr 9 2022 zfgrep -rwxr-xr-x 1 root root 2081 Apr 9 2022 zforce -rwxr-xr-x 1 root root 8049 Apr 9 2022 zgrep -rwxr-xr-x 1 root root 2206 Apr 9 2022 zless -rwxr-xr-x 1 root root 1842 Apr 9 2022 zmore -rwxr-xr-x 1 root root 4577 Apr 9 2022 znew I: user script /srv/workspace/pbuilder/19490/tmp/hooks/D02_print_environment finished -> Attempting to satisfy build-dependencies -> Creating pbuilder-satisfydepends-dummy package Package: pbuilder-satisfydepends-dummy Version: 0.invalid.0 Architecture: armhf Maintainer: Debian Pbuilder Team Description: Dummy package to satisfy dependencies with aptitude - created by pbuilder This 
package was created automatically by pbuilder to satisfy the build-dependencies of the package being currently built. Depends: debhelper-compat (= 13), dh-python, python3-all, python3-setuptools, python3-six, python3-toolz, python3-tornado, python3-pytest, python3-requests, python3-dask, python3-distributed, python3-numpy, python3-pandas, python3-flaky dpkg-deb: building package 'pbuilder-satisfydepends-dummy' in '/tmp/satisfydepends-aptitude/pbuilder-satisfydepends-dummy.deb'. Selecting previously unselected package pbuilder-satisfydepends-dummy. (Reading database ... 19448 files and directories currently installed.) Preparing to unpack .../pbuilder-satisfydepends-dummy.deb ... Unpacking pbuilder-satisfydepends-dummy (0.invalid.0) ... dpkg: pbuilder-satisfydepends-dummy: dependency problems, but configuring anyway as you requested: pbuilder-satisfydepends-dummy depends on debhelper-compat (= 13); however: Package debhelper-compat is not installed. pbuilder-satisfydepends-dummy depends on dh-python; however: Package dh-python is not installed. pbuilder-satisfydepends-dummy depends on python3-all; however: Package python3-all is not installed. pbuilder-satisfydepends-dummy depends on python3-setuptools; however: Package python3-setuptools is not installed. pbuilder-satisfydepends-dummy depends on python3-six; however: Package python3-six is not installed. pbuilder-satisfydepends-dummy depends on python3-toolz; however: Package python3-toolz is not installed. pbuilder-satisfydepends-dummy depends on python3-tornado; however: Package python3-tornado is not installed. pbuilder-satisfydepends-dummy depends on python3-pytest; however: Package python3-pytest is not installed. pbuilder-satisfydepends-dummy depends on python3-requests; however: Package python3-requests is not installed. pbuilder-satisfydepends-dummy depends on python3-dask; however: Package python3-dask is not installed. 
pbuilder-satisfydepends-dummy depends on python3-distributed; however: Package python3-distributed is not installed. pbuilder-satisfydepends-dummy depends on python3-numpy; however: Package python3-numpy is not installed. pbuilder-satisfydepends-dummy depends on python3-pandas; however: Package python3-pandas is not installed. pbuilder-satisfydepends-dummy depends on python3-flaky; however: Package python3-flaky is not installed. Setting up pbuilder-satisfydepends-dummy (0.invalid.0) ... Reading package lists... Building dependency tree... Reading state information... Initializing package states... Writing extended state information... Building tag database... pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) The following NEW packages will be installed: autoconf{a} automake{a} autopoint{a} autotools-dev{a} bsdextrautils{a} ca-certificates{a} debhelper{a} dh-autoreconf{a} dh-python{a} dh-strip-nondeterminism{a} dwz{a} file{a} gettext{a} gettext-base{a} groff-base{a} intltool-debian{a} libarchive-zip-perl{a} libblas3{a} libdebhelper-perl{a} libelf1{a} libexpat1{a} libfile-stripnondeterminism-perl{a} libgfortran5{a} libicu67{a} liblapack3{a} libmagic-mgc{a} libmagic1{a} libmpdec3{a} libpipeline1{a} libpython3-stdlib{a} libpython3.9-minimal{a} libpython3.9-stdlib{a} libreadline8{a} libsigsegv2{a} libsub-override-perl{a} libtool{a} libuchardet0{a} libxml2{a} libyaml-0-2{a} m4{a} man-db{a} media-types{a} openssl{a} po-debconf{a} python3{a} python3-all{a} python3-attr{a} python3-certifi{a} python3-chardet{a} python3-click{a} python3-cloudpickle{a} python3-colorama{a} python3-dask{a} python3-dateutil{a} python3-distributed{a} python3-distutils{a} python3-flaky{a} python3-fsspec{a} python3-heapdict{a} python3-idna{a} python3-importlib-metadata{a} python3-iniconfig{a} python3-lib2to3{a} python3-minimal{a} python3-more-itertools{a} 
python3-msgpack{a} python3-numpy{a} python3-packaging{a} python3-pandas{a} python3-pandas-lib{a} python3-pkg-resources{a} python3-pluggy{a} python3-psutil{a} python3-py{a} python3-pyparsing{a} python3-pytest{a} python3-requests{a} python3-setuptools{a} python3-six{a} python3-sortedcontainers{a} python3-tblib{a} python3-toml{a} python3-toolz{a} python3-tornado{a} python3-tz{a} python3-urllib3{a} python3-yaml{a} python3-zict{a} python3-zipp{a} python3.9{a} python3.9-minimal{a} readline-common{a} sensible-utils{a} The following packages are RECOMMENDED but will NOT be installed: curl libarchive-cpio-perl libltdl-dev libmail-sendmail-perl lynx python3-bottleneck python3-bs4 python3-html5lib python3-jinja2 python3-lxml python3-matplotlib python3-numexpr python3-odf python3-openpyxl python3-partd python3-pygments python3-scipy python3-tables python3-xlwt wget 0 packages upgraded, 93 newly installed, 0 to remove and 0 not upgraded. Need to get 38.1 MB of archives. After unpacking 149 MB will be used. Writing extended state information... 
Get: 1 http://deb.debian.org/debian bullseye/main armhf bsdextrautils armhf 2.36.1-8+deb11u1 [139 kB] Get: 2 http://deb.debian.org/debian bullseye/main armhf libuchardet0 armhf 0.0.7-1 [65.0 kB] Get: 3 http://deb.debian.org/debian bullseye/main armhf groff-base armhf 1.22.4-6 [847 kB] Get: 4 http://deb.debian.org/debian bullseye/main armhf libpipeline1 armhf 1.5.3-1 [30.1 kB] Get: 5 http://deb.debian.org/debian bullseye/main armhf man-db armhf 2.9.4-2 [1319 kB] Get: 6 http://deb.debian.org/debian bullseye/main armhf libpython3.9-minimal armhf 3.9.2-1 [790 kB] Get: 7 http://deb.debian.org/debian bullseye/main armhf libexpat1 armhf 2.2.10-2+deb11u5 [78.4 kB] Get: 8 http://deb.debian.org/debian bullseye/main armhf python3.9-minimal armhf 3.9.2-1 [1630 kB] Get: 9 http://deb.debian.org/debian bullseye/main armhf python3-minimal armhf 3.9.2-3 [38.2 kB] Get: 10 http://deb.debian.org/debian bullseye/main armhf media-types all 4.0.0 [30.3 kB] Get: 11 http://deb.debian.org/debian bullseye/main armhf libmpdec3 armhf 2.5.1-1 [74.9 kB] Get: 12 http://deb.debian.org/debian bullseye/main armhf readline-common all 8.1-1 [73.7 kB] Get: 13 http://deb.debian.org/debian bullseye/main armhf libreadline8 armhf 8.1-1 [147 kB] Get: 14 http://deb.debian.org/debian bullseye/main armhf libpython3.9-stdlib armhf 3.9.2-1 [1608 kB] Get: 15 http://deb.debian.org/debian bullseye/main armhf python3.9 armhf 3.9.2-1 [466 kB] Get: 16 http://deb.debian.org/debian bullseye/main armhf libpython3-stdlib armhf 3.9.2-3 [21.4 kB] Get: 17 http://deb.debian.org/debian bullseye/main armhf python3 armhf 3.9.2-3 [37.9 kB] Get: 18 http://deb.debian.org/debian bullseye/main armhf sensible-utils all 0.0.14 [14.8 kB] Get: 19 http://deb.debian.org/debian bullseye/main armhf openssl armhf 1.1.1w-0+deb11u1 [835 kB] Get: 20 http://deb.debian.org/debian bullseye/main armhf ca-certificates all 20210119 [158 kB] Get: 21 http://deb.debian.org/debian bullseye/main armhf libmagic-mgc armhf 1:5.39-3+deb11u1 [273 kB] Get: 22 
http://deb.debian.org/debian bullseye/main armhf libmagic1 armhf 1:5.39-3+deb11u1 [120 kB] Get: 23 http://deb.debian.org/debian bullseye/main armhf file armhf 1:5.39-3+deb11u1 [68.2 kB] Get: 24 http://deb.debian.org/debian bullseye/main armhf gettext-base armhf 0.21-4 [171 kB] Get: 25 http://deb.debian.org/debian bullseye/main armhf libsigsegv2 armhf 2.13-1 [34.0 kB] Get: 26 http://deb.debian.org/debian bullseye/main armhf m4 armhf 1.4.18-5 [192 kB] Get: 27 http://deb.debian.org/debian bullseye/main armhf autoconf all 2.69-14 [313 kB] Get: 28 http://deb.debian.org/debian bullseye/main armhf autotools-dev all 20180224.1+nmu1 [77.1 kB] Get: 29 http://deb.debian.org/debian bullseye/main armhf automake all 1:1.16.3-2 [814 kB] Get: 30 http://deb.debian.org/debian bullseye/main armhf autopoint all 0.21-4 [510 kB] Get: 31 http://deb.debian.org/debian bullseye/main armhf libdebhelper-perl all 13.3.4 [189 kB] Get: 32 http://deb.debian.org/debian bullseye/main armhf libtool all 2.4.6-15 [513 kB] Get: 33 http://deb.debian.org/debian bullseye/main armhf dh-autoreconf all 20 [17.1 kB] Get: 34 http://deb.debian.org/debian bullseye/main armhf libarchive-zip-perl all 1.68-1 [104 kB] Get: 35 http://deb.debian.org/debian bullseye/main armhf libsub-override-perl all 0.09-2 [10.2 kB] Get: 36 http://deb.debian.org/debian bullseye/main armhf libfile-stripnondeterminism-perl all 1.12.0-1 [26.3 kB] Get: 37 http://deb.debian.org/debian bullseye/main armhf dh-strip-nondeterminism all 1.12.0-1 [15.4 kB] Get: 38 http://deb.debian.org/debian bullseye/main armhf libelf1 armhf 0.183-1 [161 kB] Get: 39 http://deb.debian.org/debian bullseye/main armhf dwz armhf 0.13+20210201-1 [179 kB] Get: 40 http://deb.debian.org/debian bullseye/main armhf libicu67 armhf 67.1-7 [8319 kB] Get: 41 http://deb.debian.org/debian bullseye/main armhf libxml2 armhf 2.9.10+dfsg-6.7+deb11u4 [602 kB] Get: 42 http://deb.debian.org/debian bullseye/main armhf gettext armhf 0.21-4 [1243 kB] Get: 43 http://deb.debian.org/debian 
bullseye/main armhf intltool-debian all 0.35.0+20060710.5 [26.8 kB] Get: 44 http://deb.debian.org/debian bullseye/main armhf po-debconf all 1.0.21+nmu1 [248 kB] Get: 45 http://deb.debian.org/debian bullseye/main armhf debhelper all 13.3.4 [1049 kB] Get: 46 http://deb.debian.org/debian bullseye/main armhf python3-lib2to3 all 3.9.2-1 [77.8 kB] Get: 47 http://deb.debian.org/debian bullseye/main armhf python3-distutils all 3.9.2-1 [143 kB] Get: 48 http://deb.debian.org/debian bullseye/main armhf dh-python all 4.20201102+nmu1 [99.4 kB] Get: 49 http://deb.debian.org/debian bullseye/main armhf libblas3 armhf 3.9.0-3+deb11u1 [109 kB] Get: 50 http://deb.debian.org/debian bullseye/main armhf libgfortran5 armhf 10.2.1-6 [237 kB] Get: 51 http://deb.debian.org/debian bullseye/main armhf liblapack3 armhf 3.9.0-3+deb11u1 [1652 kB] Get: 52 http://deb.debian.org/debian bullseye/main armhf libyaml-0-2 armhf 0.2.2-1 [42.0 kB] Get: 53 http://deb.debian.org/debian bullseye/main armhf python3-all armhf 3.9.2-3 [1056 B] Get: 54 http://deb.debian.org/debian bullseye/main armhf python3-attr all 20.3.0-1 [52.9 kB] Get: 55 http://deb.debian.org/debian bullseye/main armhf python3-certifi all 2020.6.20-1 [151 kB] Get: 56 http://deb.debian.org/debian bullseye/main armhf python3-pkg-resources all 52.0.0-4 [190 kB] Get: 57 http://deb.debian.org/debian bullseye/main armhf python3-chardet all 4.0.0-1 [99.0 kB] Get: 58 http://deb.debian.org/debian bullseye/main armhf python3-colorama all 0.4.4-1 [28.5 kB] Get: 59 http://deb.debian.org/debian bullseye/main armhf python3-click all 7.1.2-1 [75.7 kB] Get: 60 http://deb.debian.org/debian bullseye/main armhf python3-cloudpickle all 1.6.0-1 [21.6 kB] Get: 61 http://deb.debian.org/debian bullseye/main armhf python3-fsspec all 0.8.4-1 [65.5 kB] Get: 62 http://deb.debian.org/debian bullseye/main armhf python3-toolz all 0.9.0-1.1 [42.0 kB] Get: 63 http://deb.debian.org/debian bullseye/main armhf python3-yaml armhf 5.3.1-5 [129 kB] Get: 64 
http://deb.debian.org/debian bullseye/main armhf python3-dask all 2021.01.0+dfsg-1 [672 kB] Get: 65 http://deb.debian.org/debian bullseye/main armhf python3-six all 1.16.0-2 [17.5 kB] Get: 66 http://deb.debian.org/debian bullseye/main armhf python3-dateutil all 2.8.1-6 [79.2 kB] Get: 67 http://deb.debian.org/debian bullseye/main armhf python3-msgpack armhf 1.0.0-6+b1 [64.6 kB] Get: 68 http://deb.debian.org/debian bullseye/main armhf python3-psutil armhf 5.8.0-1 [183 kB] Get: 69 http://deb.debian.org/debian bullseye/main armhf python3-sortedcontainers all 2.1.0-2 [31.4 kB] Get: 70 http://deb.debian.org/debian bullseye/main armhf python3-tblib all 1.7.0-1 [13.9 kB] Get: 71 http://deb.debian.org/debian bullseye/main armhf python3-heapdict all 1.0.1-1 [5288 B] Get: 72 http://deb.debian.org/debian bullseye/main armhf python3-zict all 2.0.0-1 [9400 B] Get: 73 http://deb.debian.org/debian bullseye/main armhf python3-distributed all 2021.01.0+ds.1-2.1+deb11u1 [474 kB] Get: 74 http://deb.debian.org/debian bullseye/main armhf python3-flaky all 3.7.0-1 [20.1 kB] Get: 75 http://deb.debian.org/debian bullseye/main armhf python3-idna all 2.10-1 [37.4 kB] Get: 76 http://deb.debian.org/debian bullseye/main armhf python3-more-itertools all 4.2.0-3 [42.7 kB] Get: 77 http://deb.debian.org/debian bullseye/main armhf python3-zipp all 1.0.0-3 [6060 B] Get: 78 http://deb.debian.org/debian bullseye/main armhf python3-importlib-metadata all 1.6.0-2 [10.3 kB] Get: 79 http://deb.debian.org/debian bullseye/main armhf python3-iniconfig all 1.1.1-1 [6308 B] Get: 80 http://deb.debian.org/debian bullseye/main armhf python3-numpy armhf 1:1.19.5-1 [2981 kB] Get: 81 http://deb.debian.org/debian bullseye/main armhf python3-pyparsing all 2.4.7-1 [109 kB] Get: 82 http://deb.debian.org/debian bullseye/main armhf python3-packaging all 20.9-2 [33.5 kB] Get: 83 http://deb.debian.org/debian bullseye/main armhf python3-tz all 2021.1-1 [34.8 kB] Get: 84 http://deb.debian.org/debian bullseye/main armhf 
python3-pandas-lib armhf 1.1.5+dfsg-2 [3026 kB] Get: 85 http://deb.debian.org/debian bullseye/main armhf python3-pandas all 1.1.5+dfsg-2 [2096 kB] Get: 86 http://deb.debian.org/debian bullseye/main armhf python3-pluggy all 0.13.0-6 [22.3 kB] Get: 87 http://deb.debian.org/debian bullseye/main armhf python3-py all 1.10.0-1 [94.2 kB] Get: 88 http://deb.debian.org/debian bullseye/main armhf python3-toml all 0.10.1-1 [15.9 kB] Get: 89 http://deb.debian.org/debian bullseye/main armhf python3-pytest all 6.0.2-2 [211 kB] Get: 90 http://deb.debian.org/debian bullseye/main armhf python3-urllib3 all 1.26.5-1~exp1 [114 kB] Get: 91 http://deb.debian.org/debian bullseye/main armhf python3-requests all 2.25.1+dfsg-2 [69.3 kB] Get: 92 http://deb.debian.org/debian bullseye/main armhf python3-setuptools all 52.0.0-4 [366 kB] Get: 93 http://deb.debian.org/debian bullseye/main armhf python3-tornado armhf 6.1.0-1+b1 [337 kB] Fetched 38.1 MB in 5s (8271 kB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package bsdextrautils. (Reading database ... 19448 files and directories currently installed.) Preparing to unpack .../0-bsdextrautils_2.36.1-8+deb11u1_armhf.deb ... Unpacking bsdextrautils (2.36.1-8+deb11u1) ... Selecting previously unselected package libuchardet0:armhf. Preparing to unpack .../1-libuchardet0_0.0.7-1_armhf.deb ... Unpacking libuchardet0:armhf (0.0.7-1) ... 
Selecting previously unselected package groff-base. Preparing to unpack .../2-groff-base_1.22.4-6_armhf.deb ... Unpacking groff-base (1.22.4-6) ... Selecting previously unselected package libpipeline1:armhf. Preparing to unpack .../3-libpipeline1_1.5.3-1_armhf.deb ... Unpacking libpipeline1:armhf (1.5.3-1) ... Selecting previously unselected package man-db. Preparing to unpack .../4-man-db_2.9.4-2_armhf.deb ... Unpacking man-db (2.9.4-2) ... Selecting previously unselected package libpython3.9-minimal:armhf. Preparing to unpack .../5-libpython3.9-minimal_3.9.2-1_armhf.deb ... Unpacking libpython3.9-minimal:armhf (3.9.2-1) ... Selecting previously unselected package libexpat1:armhf. Preparing to unpack .../6-libexpat1_2.2.10-2+deb11u5_armhf.deb ... Unpacking libexpat1:armhf (2.2.10-2+deb11u5) ... Selecting previously unselected package python3.9-minimal. Preparing to unpack .../7-python3.9-minimal_3.9.2-1_armhf.deb ... Unpacking python3.9-minimal (3.9.2-1) ... Setting up libpython3.9-minimal:armhf (3.9.2-1) ... Setting up libexpat1:armhf (2.2.10-2+deb11u5) ... Setting up python3.9-minimal (3.9.2-1) ... Selecting previously unselected package python3-minimal. (Reading database ... 20315 files and directories currently installed.) Preparing to unpack .../0-python3-minimal_3.9.2-3_armhf.deb ... Unpacking python3-minimal (3.9.2-3) ... Selecting previously unselected package media-types. 
Preparing to unpack .../1-media-types_4.0.0_all.deb ... Unpacking media-types (4.0.0) ... Selecting previously unselected package libmpdec3:armhf. Preparing to unpack .../2-libmpdec3_2.5.1-1_armhf.deb ... Unpacking libmpdec3:armhf (2.5.1-1) ... Selecting previously unselected package readline-common. Preparing to unpack .../3-readline-common_8.1-1_all.deb ... Unpacking readline-common (8.1-1) ... Selecting previously unselected package libreadline8:armhf. Preparing to unpack .../4-libreadline8_8.1-1_armhf.deb ... Unpacking libreadline8:armhf (8.1-1) ... Selecting previously unselected package libpython3.9-stdlib:armhf. Preparing to unpack .../5-libpython3.9-stdlib_3.9.2-1_armhf.deb ... Unpacking libpython3.9-stdlib:armhf (3.9.2-1) ... Selecting previously unselected package python3.9. Preparing to unpack .../6-python3.9_3.9.2-1_armhf.deb ... Unpacking python3.9 (3.9.2-1) ... Selecting previously unselected package libpython3-stdlib:armhf. Preparing to unpack .../7-libpython3-stdlib_3.9.2-3_armhf.deb ... Unpacking libpython3-stdlib:armhf (3.9.2-3) ... Setting up python3-minimal (3.9.2-3) ... Selecting previously unselected package python3. (Reading database ... 20736 files and directories currently installed.) Preparing to unpack .../00-python3_3.9.2-3_armhf.deb ... Unpacking python3 (3.9.2-3) ... Selecting previously unselected package sensible-utils. Preparing to unpack .../01-sensible-utils_0.0.14_all.deb ... 
Unpacking sensible-utils (0.0.14) ... Selecting previously unselected package openssl. Preparing to unpack .../02-openssl_1.1.1w-0+deb11u1_armhf.deb ... Unpacking openssl (1.1.1w-0+deb11u1) ... Selecting previously unselected package ca-certificates. Preparing to unpack .../03-ca-certificates_20210119_all.deb ... Unpacking ca-certificates (20210119) ... Selecting previously unselected package libmagic-mgc. Preparing to unpack .../04-libmagic-mgc_1%3a5.39-3+deb11u1_armhf.deb ... Unpacking libmagic-mgc (1:5.39-3+deb11u1) ... Selecting previously unselected package libmagic1:armhf. Preparing to unpack .../05-libmagic1_1%3a5.39-3+deb11u1_armhf.deb ... Unpacking libmagic1:armhf (1:5.39-3+deb11u1) ... Selecting previously unselected package file. Preparing to unpack .../06-file_1%3a5.39-3+deb11u1_armhf.deb ... Unpacking file (1:5.39-3+deb11u1) ... Selecting previously unselected package gettext-base. Preparing to unpack .../07-gettext-base_0.21-4_armhf.deb ... Unpacking gettext-base (0.21-4) ... Selecting previously unselected package libsigsegv2:armhf. Preparing to unpack .../08-libsigsegv2_2.13-1_armhf.deb ... Unpacking libsigsegv2:armhf (2.13-1) ... Selecting previously unselected package m4. Preparing to unpack .../09-m4_1.4.18-5_armhf.deb ... Unpacking m4 (1.4.18-5) ... Selecting previously unselected package autoconf. Preparing to unpack .../10-autoconf_2.69-14_all.deb ... Unpacking autoconf (2.69-14) ... Selecting previously unselected package autotools-dev. Preparing to unpack .../11-autotools-dev_20180224.1+nmu1_all.deb ... Unpacking autotools-dev (20180224.1+nmu1) ... Selecting previously unselected package automake. Preparing to unpack .../12-automake_1%3a1.16.3-2_all.deb ... Unpacking automake (1:1.16.3-2) ... Selecting previously unselected package autopoint. Preparing to unpack .../13-autopoint_0.21-4_all.deb ... Unpacking autopoint (0.21-4) ... Selecting previously unselected package libdebhelper-perl. 
Preparing to unpack .../14-libdebhelper-perl_13.3.4_all.deb ... Unpacking libdebhelper-perl (13.3.4) ... Selecting previously unselected package libtool. Preparing to unpack .../15-libtool_2.4.6-15_all.deb ... Unpacking libtool (2.4.6-15) ... Selecting previously unselected package dh-autoreconf. Preparing to unpack .../16-dh-autoreconf_20_all.deb ... Unpacking dh-autoreconf (20) ... Selecting previously unselected package libarchive-zip-perl. Preparing to unpack .../17-libarchive-zip-perl_1.68-1_all.deb ... Unpacking libarchive-zip-perl (1.68-1) ... Selecting previously unselected package libsub-override-perl. Preparing to unpack .../18-libsub-override-perl_0.09-2_all.deb ... Unpacking libsub-override-perl (0.09-2) ... Selecting previously unselected package libfile-stripnondeterminism-perl. Preparing to unpack .../19-libfile-stripnondeterminism-perl_1.12.0-1_all.deb ... Unpacking libfile-stripnondeterminism-perl (1.12.0-1) ... Selecting previously unselected package dh-strip-nondeterminism. Preparing to unpack .../20-dh-strip-nondeterminism_1.12.0-1_all.deb ... Unpacking dh-strip-nondeterminism (1.12.0-1) ... Selecting previously unselected package libelf1:armhf. Preparing to unpack .../21-libelf1_0.183-1_armhf.deb ... Unpacking libelf1:armhf (0.183-1) ... Selecting previously unselected package dwz. Preparing to unpack .../22-dwz_0.13+20210201-1_armhf.deb ... Unpacking dwz (0.13+20210201-1) ... Selecting previously unselected package libicu67:armhf. Preparing to unpack .../23-libicu67_67.1-7_armhf.deb ... Unpacking libicu67:armhf (67.1-7) ... Selecting previously unselected package libxml2:armhf. Preparing to unpack .../24-libxml2_2.9.10+dfsg-6.7+deb11u4_armhf.deb ... Unpacking libxml2:armhf (2.9.10+dfsg-6.7+deb11u4) ... Selecting previously unselected package gettext. Preparing to unpack .../25-gettext_0.21-4_armhf.deb ... Unpacking gettext (0.21-4) ... Selecting previously unselected package intltool-debian. 
Preparing to unpack .../26-intltool-debian_0.35.0+20060710.5_all.deb ... Unpacking intltool-debian (0.35.0+20060710.5) ... Selecting previously unselected package po-debconf. Preparing to unpack .../27-po-debconf_1.0.21+nmu1_all.deb ... Unpacking po-debconf (1.0.21+nmu1) ... Selecting previously unselected package debhelper. Preparing to unpack .../28-debhelper_13.3.4_all.deb ... Unpacking debhelper (13.3.4) ... Selecting previously unselected package python3-lib2to3. Preparing to unpack .../29-python3-lib2to3_3.9.2-1_all.deb ... Unpacking python3-lib2to3 (3.9.2-1) ... Selecting previously unselected package python3-distutils. Preparing to unpack .../30-python3-distutils_3.9.2-1_all.deb ... Unpacking python3-distutils (3.9.2-1) ... Selecting previously unselected package dh-python. Preparing to unpack .../31-dh-python_4.20201102+nmu1_all.deb ... Unpacking dh-python (4.20201102+nmu1) ... Selecting previously unselected package libblas3:armhf. Preparing to unpack .../32-libblas3_3.9.0-3+deb11u1_armhf.deb ... Unpacking libblas3:armhf (3.9.0-3+deb11u1) ... Selecting previously unselected package libgfortran5:armhf. Preparing to unpack .../33-libgfortran5_10.2.1-6_armhf.deb ... Unpacking libgfortran5:armhf (10.2.1-6) ... Selecting previously unselected package liblapack3:armhf. Preparing to unpack .../34-liblapack3_3.9.0-3+deb11u1_armhf.deb ... Unpacking liblapack3:armhf (3.9.0-3+deb11u1) ... Selecting previously unselected package libyaml-0-2:armhf. Preparing to unpack .../35-libyaml-0-2_0.2.2-1_armhf.deb ... Unpacking libyaml-0-2:armhf (0.2.2-1) ... Selecting previously unselected package python3-all. Preparing to unpack .../36-python3-all_3.9.2-3_armhf.deb ... Unpacking python3-all (3.9.2-3) ... Selecting previously unselected package python3-attr. Preparing to unpack .../37-python3-attr_20.3.0-1_all.deb ... Unpacking python3-attr (20.3.0-1) ... Selecting previously unselected package python3-certifi. Preparing to unpack .../38-python3-certifi_2020.6.20-1_all.deb ... 
Unpacking python3-certifi (2020.6.20-1) ... Selecting previously unselected package python3-pkg-resources. Preparing to unpack .../39-python3-pkg-resources_52.0.0-4_all.deb ... Unpacking python3-pkg-resources (52.0.0-4) ... Selecting previously unselected package python3-chardet. Preparing to unpack .../40-python3-chardet_4.0.0-1_all.deb ... Unpacking python3-chardet (4.0.0-1) ... Selecting previously unselected package python3-colorama. Preparing to unpack .../41-python3-colorama_0.4.4-1_all.deb ... Unpacking python3-colorama (0.4.4-1) ... Selecting previously unselected package python3-click. Preparing to unpack .../42-python3-click_7.1.2-1_all.deb ... Unpacking python3-click (7.1.2-1) ... Selecting previously unselected package python3-cloudpickle. Preparing to unpack .../43-python3-cloudpickle_1.6.0-1_all.deb ... Unpacking python3-cloudpickle (1.6.0-1) ... Selecting previously unselected package python3-fsspec. Preparing to unpack .../44-python3-fsspec_0.8.4-1_all.deb ... Unpacking python3-fsspec (0.8.4-1) ... Selecting previously unselected package python3-toolz. Preparing to unpack .../45-python3-toolz_0.9.0-1.1_all.deb ... Unpacking python3-toolz (0.9.0-1.1) ... Selecting previously unselected package python3-yaml. Preparing to unpack .../46-python3-yaml_5.3.1-5_armhf.deb ... Unpacking python3-yaml (5.3.1-5) ... Selecting previously unselected package python3-dask. Preparing to unpack .../47-python3-dask_2021.01.0+dfsg-1_all.deb ... Unpacking python3-dask (2021.01.0+dfsg-1) ... Selecting previously unselected package python3-six. Preparing to unpack .../48-python3-six_1.16.0-2_all.deb ... Unpacking python3-six (1.16.0-2) ... Selecting previously unselected package python3-dateutil. Preparing to unpack .../49-python3-dateutil_2.8.1-6_all.deb ... Unpacking python3-dateutil (2.8.1-6) ... Selecting previously unselected package python3-msgpack. Preparing to unpack .../50-python3-msgpack_1.0.0-6+b1_armhf.deb ... Unpacking python3-msgpack (1.0.0-6+b1) ... 
Selecting previously unselected package python3-psutil. Preparing to unpack .../51-python3-psutil_5.8.0-1_armhf.deb ... Unpacking python3-psutil (5.8.0-1) ... Selecting previously unselected package python3-sortedcontainers. Preparing to unpack .../52-python3-sortedcontainers_2.1.0-2_all.deb ... Unpacking python3-sortedcontainers (2.1.0-2) ... Selecting previously unselected package python3-tblib. Preparing to unpack .../53-python3-tblib_1.7.0-1_all.deb ... Unpacking python3-tblib (1.7.0-1) ... Selecting previously unselected package python3-heapdict. Preparing to unpack .../54-python3-heapdict_1.0.1-1_all.deb ... Unpacking python3-heapdict (1.0.1-1) ... Selecting previously unselected package python3-zict. Preparing to unpack .../55-python3-zict_2.0.0-1_all.deb ... Unpacking python3-zict (2.0.0-1) ... Selecting previously unselected package python3-distributed. Preparing to unpack .../56-python3-distributed_2021.01.0+ds.1-2.1+deb11u1_all.deb ... Unpacking python3-distributed (2021.01.0+ds.1-2.1+deb11u1) ... Selecting previously unselected package python3-flaky. Preparing to unpack .../57-python3-flaky_3.7.0-1_all.deb ... Unpacking python3-flaky (3.7.0-1) ... Selecting previously unselected package python3-idna. Preparing to unpack .../58-python3-idna_2.10-1_all.deb ... Unpacking python3-idna (2.10-1) ... Selecting previously unselected package python3-more-itertools. Preparing to unpack .../59-python3-more-itertools_4.2.0-3_all.deb ... Unpacking python3-more-itertools (4.2.0-3) ... Selecting previously unselected package python3-zipp. Preparing to unpack .../60-python3-zipp_1.0.0-3_all.deb ... Unpacking python3-zipp (1.0.0-3) ... Selecting previously unselected package python3-importlib-metadata. Preparing to unpack .../61-python3-importlib-metadata_1.6.0-2_all.deb ... Unpacking python3-importlib-metadata (1.6.0-2) ... Selecting previously unselected package python3-iniconfig. Preparing to unpack .../62-python3-iniconfig_1.1.1-1_all.deb ... 
Unpacking python3-iniconfig (1.1.1-1) ... Selecting previously unselected package python3-numpy. Preparing to unpack .../63-python3-numpy_1%3a1.19.5-1_armhf.deb ... Unpacking python3-numpy (1:1.19.5-1) ... Selecting previously unselected package python3-pyparsing. Preparing to unpack .../64-python3-pyparsing_2.4.7-1_all.deb ... Unpacking python3-pyparsing (2.4.7-1) ... Selecting previously unselected package python3-packaging. Preparing to unpack .../65-python3-packaging_20.9-2_all.deb ... Unpacking python3-packaging (20.9-2) ... Selecting previously unselected package python3-tz. Preparing to unpack .../66-python3-tz_2021.1-1_all.deb ... Unpacking python3-tz (2021.1-1) ... Selecting previously unselected package python3-pandas-lib:armhf. Preparing to unpack .../67-python3-pandas-lib_1.1.5+dfsg-2_armhf.deb ... Unpacking python3-pandas-lib:armhf (1.1.5+dfsg-2) ... Selecting previously unselected package python3-pandas. Preparing to unpack .../68-python3-pandas_1.1.5+dfsg-2_all.deb ... Unpacking python3-pandas (1.1.5+dfsg-2) ... Selecting previously unselected package python3-pluggy. Preparing to unpack .../69-python3-pluggy_0.13.0-6_all.deb ... Unpacking python3-pluggy (0.13.0-6) ... Selecting previously unselected package python3-py. Preparing to unpack .../70-python3-py_1.10.0-1_all.deb ... Unpacking python3-py (1.10.0-1) ... Selecting previously unselected package python3-toml. Preparing to unpack .../71-python3-toml_0.10.1-1_all.deb ... Unpacking python3-toml (0.10.1-1) ... Selecting previously unselected package python3-pytest. Preparing to unpack .../72-python3-pytest_6.0.2-2_all.deb ... Unpacking python3-pytest (6.0.2-2) ... Selecting previously unselected package python3-urllib3. Preparing to unpack .../73-python3-urllib3_1.26.5-1~exp1_all.deb ... Unpacking python3-urllib3 (1.26.5-1~exp1) ... Selecting previously unselected package python3-requests. Preparing to unpack .../74-python3-requests_2.25.1+dfsg-2_all.deb ... 
Unpacking python3-requests (2.25.1+dfsg-2) ... Selecting previously unselected package python3-setuptools. Preparing to unpack .../75-python3-setuptools_52.0.0-4_all.deb ... Unpacking python3-setuptools (52.0.0-4) ... Selecting previously unselected package python3-tornado. Preparing to unpack .../76-python3-tornado_6.1.0-1+b1_armhf.deb ... Unpacking python3-tornado (6.1.0-1+b1) ... Setting up media-types (4.0.0) ... Setting up libpipeline1:armhf (1.5.3-1) ... Setting up bsdextrautils (2.36.1-8+deb11u1) ... update-alternatives: using /usr/bin/write.ul to provide /usr/bin/write (write) in auto mode Setting up libicu67:armhf (67.1-7) ... Setting up libmagic-mgc (1:5.39-3+deb11u1) ... Setting up libarchive-zip-perl (1.68-1) ... Setting up libyaml-0-2:armhf (0.2.2-1) ... Setting up libdebhelper-perl (13.3.4) ... Setting up libmagic1:armhf (1:5.39-3+deb11u1) ... Setting up gettext-base (0.21-4) ... Setting up file (1:5.39-3+deb11u1) ... Setting up autotools-dev (20180224.1+nmu1) ... Setting up libblas3:armhf (3.9.0-3+deb11u1) ... update-alternatives: using /usr/lib/arm-linux-gnueabihf/blas/libblas.so.3 to provide /usr/lib/arm-linux-gnueabihf/libblas.so.3 (libblas.so.3-arm-linux-gnueabihf) in auto mode Setting up libsigsegv2:armhf (2.13-1) ... Setting up autopoint (0.21-4) ... Setting up libgfortran5:armhf (10.2.1-6) ... Setting up sensible-utils (0.0.14) ... Setting up libuchardet0:armhf (0.0.7-1) ... Setting up libmpdec3:armhf (2.5.1-1) ... Setting up libsub-override-perl (0.09-2) ... Setting up openssl (1.1.1w-0+deb11u1) ... Setting up libelf1:armhf (0.183-1) ... Setting up readline-common (8.1-1) ... Setting up libxml2:armhf (2.9.10+dfsg-6.7+deb11u4) ... Setting up libfile-stripnondeterminism-perl (1.12.0-1) ... Setting up liblapack3:armhf (3.9.0-3+deb11u1) ... 
update-alternatives: using /usr/lib/arm-linux-gnueabihf/lapack/liblapack.so.3 to provide /usr/lib/arm-linux-gnueabihf/liblapack.so.3 (liblapack.so.3-arm-linux-gnueabihf) in auto mode Setting up gettext (0.21-4) ... Setting up libtool (2.4.6-15) ... Setting up libreadline8:armhf (8.1-1) ... Setting up m4 (1.4.18-5) ... Setting up intltool-debian (0.35.0+20060710.5) ... Setting up ca-certificates (20210119) ... Updating certificates in /etc/ssl/certs... 129 added, 0 removed; done. Setting up autoconf (2.69-14) ... Setting up dh-strip-nondeterminism (1.12.0-1) ... Setting up dwz (0.13+20210201-1) ... Setting up groff-base (1.22.4-6) ... Setting up libpython3.9-stdlib:armhf (3.9.2-1) ... Setting up libpython3-stdlib:armhf (3.9.2-3) ... Setting up automake (1:1.16.3-2) ... update-alternatives: using /usr/bin/automake-1.16 to provide /usr/bin/automake (automake) in auto mode Setting up po-debconf (1.0.21+nmu1) ... Setting up man-db (2.9.4-2) ... Not building database; man-db/auto-update is not 'true'. Setting up dh-autoreconf (20) ... Setting up python3.9 (3.9.2-1) ... Setting up debhelper (13.3.4) ... Setting up python3 (3.9.2-3) ... Setting up python3-sortedcontainers (2.1.0-2) ... Setting up python3-psutil (5.8.0-1) ... Setting up python3-tz (2021.1-1) ... Setting up python3-cloudpickle (1.6.0-1) ... Setting up python3-six (1.16.0-2) ... Setting up python3-flaky (3.7.0-1) ... Setting up python3-pyparsing (2.4.7-1) ... Setting up python3-certifi (2020.6.20-1) ... Setting up python3-idna (2.10-1) ... Setting up python3-toml (0.10.1-1) ... Setting up python3-urllib3 (1.26.5-1~exp1) ... Setting up python3-toolz (0.9.0-1.1) ... Setting up python3-dateutil (2.8.1-6) ... Setting up python3-msgpack (1.0.0-6+b1) ... Setting up python3-lib2to3 (3.9.2-1) ... Setting up python3-pkg-resources (52.0.0-4) ... Setting up python3-distutils (3.9.2-1) ... Setting up dh-python (4.20201102+nmu1) ... Setting up python3-more-itertools (4.2.0-3) ... Setting up python3-heapdict (1.0.1-1) ... 
Setting up python3-iniconfig (1.1.1-1) ... Setting up python3-attr (20.3.0-1) ... Setting up python3-tornado (6.1.0-1+b1) ... Setting up python3-setuptools (52.0.0-4) ... Setting up python3-tblib (1.7.0-1) ... Setting up python3-py (1.10.0-1) ... Setting up python3-colorama (0.4.4-1) ... Setting up python3-fsspec (0.8.4-1) ... Setting up python3-all (3.9.2-3) ... Setting up python3-yaml (5.3.1-5) ... Setting up python3-zipp (1.0.0-3) ... Setting up python3-click (7.1.2-1) ... Setting up python3-packaging (20.9-2) ... Setting up python3-chardet (4.0.0-1) ... Setting up python3-requests (2.25.1+dfsg-2) ... Setting up python3-numpy (1:1.19.5-1) ... Setting up python3-zict (2.0.0-1) ... Setting up python3-importlib-metadata (1.6.0-2) ... Setting up python3-pandas-lib:armhf (1.1.5+dfsg-2) ... Setting up python3-dask (2021.01.0+dfsg-1) ... Setting up python3-distributed (2021.01.0+ds.1-2.1+deb11u1) ... Setting up python3-pandas (1.1.5+dfsg-2) ... Setting up python3-pluggy (0.13.0-6) ... Setting up python3-pytest (6.0.2-2) ... Processing triggers for libc-bin (2.31-13+deb11u6) ... Processing triggers for ca-certificates (20210119) ... Updating certificates in /etc/ssl/certs... 0 added, 0 removed; done. Running hooks in /etc/ca-certificates/update.d... done. Reading package lists... Building dependency tree... Reading state information... Reading extended state information... Initializing package states... Writing extended state information... Building tag database... 
-> Finished parsing the build-deps I: Building the package I: Running cd /build/reproducible-path/python-streamz-0.6.2/ && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games" HOME="/nonexistent/first-build" dpkg-buildpackage -us -uc -b && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games" HOME="/nonexistent/first-build" dpkg-genchanges -S > ../python-streamz_0.6.2-1_source.changes dpkg-buildpackage: info: source package python-streamz dpkg-buildpackage: info: source version 0.6.2-1 dpkg-buildpackage: info: source distribution unstable dpkg-buildpackage: info: source changed by Nilesh Patra dpkg-source --before-build . dpkg-buildpackage: info: host architecture armhf debian/rules clean dh clean --with python3 --buildsystem=pybuild dh_auto_clean -O--buildsystem=pybuild I: pybuild base:232: python3.9 setup.py clean running clean removing '/build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build' (and everything under it) 'build/bdist.linux-armhf' does not exist -- can't clean it 'build/scripts-3.9' does not exist -- can't clean it dh_autoreconf_clean -O--buildsystem=pybuild dh_clean -O--buildsystem=pybuild debian/rules binary dh binary --with python3 --buildsystem=pybuild dh_update_autotools_config -O--buildsystem=pybuild dh_autoreconf -O--buildsystem=pybuild dh_auto_configure -O--buildsystem=pybuild I: pybuild base:232: python3.9 setup.py config running config dh_auto_build -O--buildsystem=pybuild I: pybuild base:232: /usr/bin/python3 setup.py build running build running build_py creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/sources.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/collection.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/batch.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying 
streamz/utils.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/orderedweakset.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/__init__.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/dask.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/utils_test.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/plugins.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/sinks.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/graph.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/aggregations.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/utils.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/__init__.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_dask.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying 
streamz/tests/test_sinks.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_sources.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_graph.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_kafka.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/__init__.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/py3_test_core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_plugins.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_batch.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests package init file 'streamz/dataframe/tests/__init__.py' not found (or not a regular file) creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests copying streamz/dataframe/tests/test_dataframe_utils.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests copying streamz/dataframe/tests/test_dataframes.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests dh_auto_test -O--buildsystem=pybuild I: pybuild pybuild:284: rm /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests/test_dask.py I: pybuild base:232: cd 
/build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build; python3.9 -m pytest ============================= test session starts ============================== platform linux -- Python 3.9.2, pytest-6.0.2, py-1.10.0, pluggy-0.13.0 rootdir: /build/reproducible-path/python-streamz-0.6.2, configfile: setup.cfg plugins: flaky-3.7.0 collected 1534 items / 2 skipped / 1532 selected streamz/dataframe/tests/test_dataframe_utils.py .s.s [ 0%] streamz/dataframe/tests/test_dataframes.py ............................. [ 2%] ........................................................................ [ 6%] ...F...........F....F....F....F....F....F....F....F....F....F....F....F. [ 11%] ...F....F....F....F....F....F....F....F....F....F....F....F..FF........s [ 16%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 20%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 25%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 30%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 35%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 39%] sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss. [ 44%] ...xxxxxx....................ss......................................... [ 49%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 53%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 58%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 63%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 67%] .FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..F [ 72%] F..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF. [ 77%] .X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X [ 81%] ..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X.. 
[ 86%] FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X....FF...........xF. [ 91%] streamz/tests/test_batch.py .... [ 91%] streamz/tests/test_core.py ............................................. [ 94%] .............................................................F......... [ 98%] streamz/tests/test_plugins.py .... [ 99%] streamz/tests/test_sinks.py ..... [ 99%] streamz/tests/test_sources.py .xxxX.F.xxxxx [100%] =================================== FAILURES =================================== _______________________ test_dataframe_simple[1] _______________________ func = at 0xf4267418> @pytest.mark.parametrize('func', [ lambda df: df.query('x > 1 and x < 4', engine='python'), lambda df: df.x.value_counts().nlargest(2) ]) def test_dataframe_simple(func): df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]}) expected = func(df) a = DataFrame(example=df) L = func(a).stream.sink_to_list() a.emit(df) > assert_eq(L[0], expected) streamz/dataframe/tests/test_dataframes.py:191: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2 1 3 1 Name: x, dtype: int32, b = 2 1 3 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, 
pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-0-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267778> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = 
type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-1-2] __________ agg = at 0xf42676a0> grouper = at 0xf42677c0> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 
2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-2-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267808> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = 
pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: y, dtype: int32, b = 0 2 1 1 Name: y, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-3-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267850> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: 
x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) 
b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-1-0-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267778> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 b = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) 
assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-1-1-2] __________ agg = at 0xf42676a0> grouper = at 0xf42677c0> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-1-2-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267808> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-1-3-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267850> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = 
pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-0-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267778> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: 
x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-1-2] __________ agg = at 0xf42676a0> grouper = at 0xf42677c0> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = 
a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-2-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267808> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0 2 1 1, b = y 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == 
bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-3-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267850> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, 
check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-0-0-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267778> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert 
assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[dask-0-1-2] __________ agg = at 0xf42676a0> grouper = at 0xf42677c0> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: 
AssertionError __________ test_groupby_aggregate[dask-0-2-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267808> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: y, dtype: int32, b = 0 2 1 1 Name: y, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, 
"to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[dask-0-3-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267850> indexer = at 0xf4267898>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[dask-1-0-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267778> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 b = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-1-1-2] __________ agg = at 0xf42676a0> grouper = at 0xf42677c0> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': 
(np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-1-2-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267808> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), 
reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: 
AssertionError __________ test_groupby_aggregate[dask-1-3-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267850> indexer = at 0xf42678e0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() 
if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-0-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267778> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) 
assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-1-2] __________ agg = at 0xf42676a0> grouper = at 0xf42677c0> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-2-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267808> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0 2 1 1, b = y 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-3-2] __________ agg = at 0xf42676a0> grouper = at 0xf4267850> indexer = at 0xf4267928>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': 
(np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___________________________ test_value_counts[core] ____________________________ stream = def test_value_counts(stream): s = pd.Series(['a', 'b', 'a']) a = Series(example=s, stream=stream) b = a.value_counts() assert b._stream_type == 'updating' result = b.stream.gather().sink_to_list() a.emit(s) a.emit(s) > assert_eq(result[-1], 
pd.concat([s, s], axis=0).value_counts())

streamz/dataframe/tests/test_dataframes.py:317:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a = a    4
b    2
dtype: int32, b = a    4
b    2
dtype: int64
check_names = True, check_dtypes = True, check_divisions = True
check_index = True, kwargs = {}

    def assert_eq(
        a,
        b,
        check_names=True,
        check_dtypes=True,
        check_divisions=True,
        check_index=True,
        **kwargs,
    ):
        if check_divisions:
            assert_divisions(a)
            assert_divisions(b)
            if hasattr(a, "divisions") and hasattr(b, "divisions"):
                at = type(np.asarray(a.divisions).tolist()[0])  # numpy to python
                bt = type(np.asarray(b.divisions).tolist()[0])  # scalar conversion
                assert at == bt, (at, bt)
        assert_sane_keynames(a)
        assert_sane_keynames(b)
        a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes)
        b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes)
        if not check_index:
            a = a.reset_index(drop=True)
            b = b.reset_index(drop=True)
        if hasattr(a, "to_pandas"):
            a = a.to_pandas()
        if hasattr(b, "to_pandas"):
            b = b.to_pandas()
        if isinstance(a, pd.DataFrame):
            a = _maybe_sort(a)
            b = _maybe_sort(b)
            tm.assert_frame_equal(a, b, **kwargs)
        elif isinstance(a, pd.Series):
            a = _maybe_sort(a)
            b = _maybe_sort(b)
>           tm.assert_series_equal(a, b, check_names=check_names, **kwargs)
E           AssertionError: Attributes of Series are different
E
E           Attribute "dtype" are different
E           [left]:  int32
E           [right]: int64

/usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError
___________________________ test_value_counts[dask] ____________________________

stream =

    def test_value_counts(stream):
        s = pd.Series(['a', 'b', 'a'])
        a = Series(example=s, stream=stream)
        b = a.value_counts()
        assert b._stream_type == 'updating'
        result = b.stream.gather().sink_to_list()
        a.emit(s)
        a.emit(s)
>       assert_eq(result[-1], pd.concat([s, s], axis=0).value_counts())

streamz/dataframe/tests/test_dataframes.py:317:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
_ _

a = a    4
b    2
dtype: int32, b = a    4
b    2
dtype: int64
check_names = True, check_dtypes = True, check_divisions = True
check_index = True, kwargs = {}

    def assert_eq(
        a,
        b,
        check_names=True,
        check_dtypes=True,
        check_divisions=True,
        check_index=True,
        **kwargs,
    ):
        if check_divisions:
            assert_divisions(a)
            assert_divisions(b)
            if hasattr(a, "divisions") and hasattr(b, "divisions"):
                at = type(np.asarray(a.divisions).tolist()[0])  # numpy to python
                bt = type(np.asarray(b.divisions).tolist()[0])  # scalar conversion
                assert at == bt, (at, bt)
        assert_sane_keynames(a)
        assert_sane_keynames(b)
        a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes)
        b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes)
        if not check_index:
            a = a.reset_index(drop=True)
            b = b.reset_index(drop=True)
        if hasattr(a, "to_pandas"):
            a = a.to_pandas()
        if hasattr(b, "to_pandas"):
            b = b.to_pandas()
        if isinstance(a, pd.DataFrame):
            a = _maybe_sort(a)
            b = _maybe_sort(b)
            tm.assert_frame_equal(a, b, **kwargs)
        elif isinstance(a, pd.Series):
            a = _maybe_sort(a)
            b = _maybe_sort(b)
>           tm.assert_series_equal(a, b, check_names=check_names, **kwargs)
E           AssertionError: Attributes of Series are different
E
E           Attribute "dtype" are different
E           [left]:  int32
E           [right]: int64

/usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError
__ test_groupby_windowing_value[<lambda>0-<lambda>0-<lambda>0-10h-<lambda>2] ___

func = <function <lambda> at 0xf426a778>, value = Timedelta('0 days 10:00:00')
getter = <function <lambda> at 0xf426a898>
grouper = <function <lambda> at 0xf426a928>
indexer = <function <lambda> at 0xf426aa48>

    @pytest.mark.parametrize('func', [
        lambda x: x.sum(),
        lambda x: x.mean(),
        lambda x: x.count(),
        lambda x: x.var(ddof=1),
        lambda x: x.std(),
        pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail),
    ])
    @pytest.mark.parametrize('value', ['10h', '1d'])
    @pytest.mark.parametrize('getter', [
        lambda df: df,
        lambda df: df.x,
    ])
    @pytest.mark.parametrize('grouper', [
        lambda a: a.x % 4,
        lambda a: 'y',
        lambda a: a.index,
        lambda a: ['y']
    ])
    @pytest.mark.parametrize('indexer', [
lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int32 b = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-0-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a928> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int32 b = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def 
assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-0-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a928> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = 
pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int32 b = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: 
int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-0-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a928> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int32 b = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if 
hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-1-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a970> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, 
index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-1-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 
days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a970> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) 
assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-1-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a970> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, 
len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-1-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a970> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), 
lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = 
a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-2-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a9b8> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-2-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 
0xf426a9b8> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-2-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a9b8> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = 
pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) 
tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-2-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a9b8> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 
04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-3-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426aa00> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), 
marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if 
hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-3-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426aa00> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-3-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426aa00> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', 
lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, 
pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-3-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426aa00> indexer = at 0xf426aa48> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = 
True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[1-0-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a928> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, 
indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 b = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-0-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a928> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 b = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if 
hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-0-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a928> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 b = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-0-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a928> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: 
x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 b = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, 
check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-1-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a970> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], 
f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-1-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a970> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute 
"dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-1-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a970> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and 
hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-1-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a970> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-2-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a9b8> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), 
pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-2-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a9b8> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-2-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a9b8> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-2-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a9b8> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-3-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426aa00> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-3-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426aa00> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-3-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426aa00> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = 
pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-3-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426aa00> indexer = at 0xf426aa90> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) 
@pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): 
b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-0-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a928> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 b = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 
check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-0-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a928> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = 
pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 b = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError 
__ test_groupby_windowing_value[2-0-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a928> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 b = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-0-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a928> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = 
pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 b = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-1-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a970> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), 
pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if 
hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-1-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426a970> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 
check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-1-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a970> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = 
pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-1-1-1d-2] ___ func = at 
0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a970> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) 
assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-2-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426a9b8> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = 
df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-2-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 
0xf426a9b8> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-2-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a9b8> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, 
index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are 
different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-2-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426a9b8> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 
2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-3-0-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a898> grouper = at 0xf426aa00> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-3-0-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a898> grouper = at 0xf426aa00> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-3-1-10h-2] ___ func = at 0xf426a778>, value = Timedelta('0 days 10:00:00') getter = at 0xf426a8e0> grouper = at 0xf426aa00> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = 
pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-3-1-1d-2] ___ func = at 0xf426a778>, value = Timedelta('1 days 00:00:00') getter = at 0xf426a8e0> grouper = at 0xf426aa00> indexer = at 0xf426aad8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) 
@pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): 
b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[0-0-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if 
hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-0-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), 
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = 
_maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if 
check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return 
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), 
lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a 
= _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426adf0> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True,
**kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426adf0> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(),
lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if 
isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True,
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(),
lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() 
if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True,
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x:
x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if 
isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True,
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(),
lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = 
_maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if
check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(),
lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = 
_maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions:
assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L
= f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1),
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) 
tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a)
assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L =
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1),
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = 
_maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions:
assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L
= f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1),
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) 
tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426aec8> indexer = <function <lambda> at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions:
assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426af10> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L 
= f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: 
x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 2.0 1 1 b = x y -overlapped-index-name-0 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) 
b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-0-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): 
sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 b = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) 
@pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 2.0 1 1 b = x y -overlapped-index-name-0 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] 
> assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-1-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 b = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E 
[left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) 
assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: 
g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-0-4-2] ______ func = at 
0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = 
a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, 
check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 
2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda 
x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a 
= _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-1-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426ae38> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] 
> assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 1 1, b = x y 0 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, 
check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-0-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, 
getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 
0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 1 1, b = x y 0 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = 
a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) 
assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-1-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) 
@pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, 
check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) 
# scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-0-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-0-4-3] ______ func = at 0xf426ac40>, n = 4 
getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = 
a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = 
True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def 
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-1-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: 
x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426af58> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): 
sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 2.0 1 b = x -overlapped-index-name-0 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, 
lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of 
Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-0-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 b = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = 
type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426adf0> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 2.0 1 b = x -overlapped-index-name-0 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426adf0> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, 
check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-1-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426adf0> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 b = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-1-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426adf0> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: 
g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ 
test_groupby_windowing_n[2-1-0-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) 
if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-0-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, 
check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-1-0-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 
2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-0-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), 
pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) 
elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-1-1-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-1-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] 
> assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-1-1-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-1-4-3] ______ func = <function <lambda> at 0xf426ac40>, n = 4 getter = <function <lambda> at 0xf426ada8> grouper = <function <lambda> at 0xf426ae38> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, 
check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-0-1-2] ______ func = <function <lambda> at 0xf426abf8>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 1, b = x 0 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-0-1-3] ______ func = <function <lambda> at 0xf426ac40>, n = 1 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, 
getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-0-4-2] ______ func = <function <lambda> at 0xf426abf8>, n = 4 getter = <function <lambda> at 0xf426ad60> grouper = <function <lambda> at 0xf426ae80> 
indexer = <function <lambda> at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 2 1 1, b = x 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = 
a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426ae80> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) 
assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 1, b = x 0 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) 
@pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, 
**kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-1-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 2 1 1, b = x 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion 
assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426ae80> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-0-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-0-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 
0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = 
b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-0-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} 
def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-0-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ad60> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return 
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-1-1-2] ______ func = at 0xf426abf8>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: 
x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-1-1-3] ______ func = at 0xf426ac40>, n = 1 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy 
to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-1-4-2] ______ func = at 0xf426abf8>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert 
len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-1-4-3] ______ func = at 0xf426ac40>, n = 4 getter = at 0xf426ada8> grouper = at 0xf426aec8> indexer = at 0xf426afa0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', 
lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: 
int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ________________ test_groupby_aggregate_with_start_state[core] _________________ stream = def test_groupby_aggregate_with_start_state(stream): example = pd.DataFrame({'name': [], 'amount': []}) sdf = DataFrame(stream, example=example).groupby(['name']) output0 = sdf.amount.sum(start=None).stream.gather().sink_to_list() output1 = sdf.amount.mean(with_state=True, start=None).stream.gather().sink_to_list() output2 = sdf.amount.count(start=None).stream.gather().sink_to_list() df = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50, 100]}) stream.emit(df) out_df0 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50.0, 100.0]}) out_df1 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [1, 1]}) assert assert_eq(output0[0].reset_index(), out_df0) assert assert_eq(output1[0][1].reset_index(), out_df0) > assert assert_eq(output2[0].reset_index(), out_df1) streamz/dataframe/tests/test_dataframes.py:917: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = name amount 0 Alice 1 1 Tom 1 b = name amount 0 Alice 1 1 Tom 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if 
isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 1] (column name="amount") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ________________ test_groupby_aggregate_with_start_state[dask] _________________ stream = def test_groupby_aggregate_with_start_state(stream): example = pd.DataFrame({'name': [], 'amount': []}) sdf = DataFrame(stream, example=example).groupby(['name']) output0 = sdf.amount.sum(start=None).stream.gather().sink_to_list() output1 = sdf.amount.mean(with_state=True, start=None).stream.gather().sink_to_list() output2 = sdf.amount.count(start=None).stream.gather().sink_to_list() df = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50, 100]}) stream.emit(df) out_df0 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50.0, 100.0]}) out_df1 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [1, 1]}) assert assert_eq(output0[0].reset_index(), out_df0) assert assert_eq(output1[0][1].reset_index(), out_df0) > assert assert_eq(output2[0].reset_index(), out_df1) streamz/dataframe/tests/test_dataframes.py:917: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = name amount 0 Alice 1 1 Tom 1 b = name amount 0 Alice 1 1 Tom 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, 
check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 1] (column name="amount") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___________________________________ test_gc ____________________________________ def test_func(): with pristine_loop() as loop: cor = gen.coroutine(func) try: > loop.run_sync(cor, timeout=timeout) streamz/utils_test.py:69: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3/dist-packages/tornado/ioloop.py:530: in run_sync return future_cell[0].result() /usr/lib/python3/dist-packages/tornado/gen.py:775: in run yielded = self.gen.send(value) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ @gen_test() def test_gc(): sdf = sd.Random(freq='5ms', interval='100ms') a = DataFrame({'volatility': sdf.x.rolling('100ms').var(), 'sub': sdf.x - sdf.x.rolling('100ms').mean()}) n = len(sdf.stream.downstreams) a = DataFrame({'volatility': sdf.x.rolling('100ms').var(), 'sub': sdf.x - sdf.x.rolling('100ms').mean()}) yield gen.sleep(0.1) a = DataFrame({'volatility': sdf.x.rolling('100ms').var(), 'sub': sdf.x - sdf.x.rolling('100ms').mean()}) yield gen.sleep(0.1) a = DataFrame({'volatility': sdf.x.rolling('100ms').var(), 'sub': sdf.x - sdf.x.rolling('100ms').mean()}) yield gen.sleep(0.1) assert len(sdf.stream.downstreams) == n del a import gc; gc.collect() > assert len(sdf.stream.downstreams) == 0 E assert 2 == 0 E + where 2 = len() E + where = >.downstreams E + where > = Random - elements like:\nEmpty 
DataFrame\nColumns: [x, y, z]\nIndex: [].stream streamz/dataframe/tests/test_dataframes.py:573: AssertionError _________________________________ test_buffer __________________________________ def test_func(): with pristine_loop() as loop: cor = gen.coroutine(func) try: > loop.run_sync(cor, timeout=timeout) streamz/utils_test.py:69: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3/dist-packages/tornado/ioloop.py:530: in run_sync return future_cell[0].result() /usr/lib/python3/dist-packages/tornado/gen.py:775: in run yielded = self.gen.send(value) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ @gen_test() def test_buffer(): source = Stream(asynchronous=True) L = source.map(inc).buffer(10).map(inc).rate_limit(0.05).sink_to_list() start = time() for i in range(10): yield source.emit(i) stop = time() > assert stop - start < 0.01 E assert (1704599530.4124794 - 1704599530.3990235) < 0.01 streamz/tests/test_core.py:558: AssertionError _______________________ test_from_iterable_backpressure ________________________ def test_from_iterable_backpressure(): it = iter(range(5)) source = Source.from_iterable(it) L = source.rate_limit(0.1).sink_to_list() source.start() wait_for(lambda: L == [0], 1, period=0.01) > assert next(it) == 2 # 1 is in blocked _emit E assert 1 == 2 E + where 1 = next() streamz/tests/test_sources.py:151: AssertionError =============================== warnings summary =============================== .pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests/test_dataframes.py::test_windowing_n[1-1-4] .pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests/test_dataframes.py::test_windowing_n[1-1-5] /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/aggregations.py:99: RuntimeWarning: invalid value encountered in double_scalars result = result * n / (n - self.ddof) 
.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests/test_dataframes.py::test_gc_random /usr/lib/python3/dist-packages/distributed/deploy/spec.py:330: DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11. await asyncio.wait(tasks) -- Docs: https://docs.pytest.org/en/stable/warnings.html ===Flaky Test Report=== test_tcp failed (2 runs remaining out of 3). local variable 'sock2' referenced before assignment [] test_tcp failed (1 runs remaining out of 3). condition not reached within 2 seconds [, ] test_tcp failed; it passed 0 out of the required 1 times. condition not reached within 2 seconds [, ] test_tcp_async failed (2 runs remaining out of 3). Operation timed out after 60 seconds [, ] test_tcp_async failed (1 runs remaining out of 3). Operation timed out after 60 seconds [, ] test_tcp_async failed; it passed 0 out of the required 1 times. Operation timed out after 60 seconds [, ] ===End Flaky Test Report=== =========================== short test summary info ============================ FAILED streamz/dataframe/tests/test_dataframes.py::test_dataframe_simple[1] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-0-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_value_counts[core] - ... FAILED streamz/dataframe/tests/test_dataframes.py::test_value_counts[dask] - ... 
FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-0-10h-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-0-1d-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-4-3] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-4-2] 
FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-1-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-4-3] 
FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate_with_start_state[core] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate_with_start_state[dask] FAILED streamz/dataframe/tests/test_dataframes.py::test_gc - assert 2 == 0 FAILED streamz/tests/test_core.py::test_buffer - assert (1704599530.4124794 -... FAILED streamz/tests/test_sources.py::test_from_iterable_backpressure - asser... 
= 176 failed, 814 passed, 438 skipped, 15 xfailed, 97 xpassed, 3 warnings in 515.79s (0:08:35) =
E: pybuild pybuild:353: test: plugin distutils failed with: exit code=1: cd /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build; python3.9 -m pytest
dh_auto_test: error: pybuild --test --test-pytest -i python{version} -p 3.9 returned exit code 13
make: *** [debian/rules:10: binary] Error 25
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
I: copying local configuration
E: Failed autobuilding of package
I: unmounting dev/ptmx filesystem
I: unmounting dev/pts filesystem
I: unmounting dev/shm filesystem
I: unmounting proc filesystem
I: unmounting sys filesystem
I: cleaning the build env
I: removing directory /srv/workspace/pbuilder/19490 and its subdirectories