You’ll notice that printing a dataset still shows a preview of array values,
│ │ │ │ even if they are actually Dask arrays. We can do this quickly with Dask because
│ │ │ │ we only need to compute the first few values (typically from the first block).
│ │ │ │ To reveal the true nature of an array, print a DataArray:
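The behaviour described above can be sketched with a small chunked dataset; the array contents here are made up, and the exact repr text varies with the xarray version, but any Dask-backed variable behaves the same way:

```python
import numpy as np
import xarray as xr

# A small Dataset backed by a Dask array (assumes dask is installed)
ds = xr.Dataset({"temperature": (("x", "y"), np.ones((4, 6)))}).chunk({"x": 2})

# The Dataset repr stays cheap: no more than the first block is ever computed
print(ds)

# Printing the DataArray exposes the underlying dask.array representation
print(ds.temperature)
```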
Once you’ve manipulated a Dask array, you can still write a dataset too big to
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -114,15 +114,15 @@
│ │ │ │ │ You’ll notice that printing a dataset still shows a preview of array values,
│ │ │ │ │ even if they are actually Dask arrays. We can do this quickly with Dask because
│ │ │ │ │ we only need to compute the first few values (typically from the first block).
│ │ │ │ │ To reveal the true nature of an array, print a DataArray:
│ │ │ │ │ In [3]: ds.temperature
│ │ │ │ │ Out[3]:
│ │ │ │ │
│ │ │ │ │ -dask.array
│ │ │ │ │ Coordinates:
│ │ │ │ │ * time (time) datetime64[ns] 2015-01-01 2015-01-02 ... 2015-01-30
│ │ │ │ │ * longitude (longitude) int64 0 1 2 3 4 5 6 7 ... 173 174 175 176 177 178
│ │ │ │ │ 179
│ │ │ │ │ * latitude (latitude) float64 89.5 88.5 87.5 86.5 ... -87.5 -88.5 -89.5
│ │ │ │ │ Once you’ve manipulated a Dask array, you can still write a dataset too big
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/data-structures.html
│ │ │ │ @@ -892,18 +892,18 @@
│ │ │ │ a method call with an external function (e.g., ds.pipe(func)) instead of
│ │ │ │ simply calling it (e.g., func(ds)). This allows you to write pipelines for
│ │ │ │ transforming your data (using “method chaining”) instead of writing
│ │ │ │ hard-to-follow nested function calls:
│ │ │ │
# these lines are equivalent, but with pipe we can make the logic flow
│ │ │ │ # entirely from left to right
│ │ │ │ In [60]: plt.plot((2 * ds.temperature.sel(x=0)).mean("y"))
│ │ │ │ -Out[60]: [<matplotlib.lines.Line2D at 0xffff88655e80>]
│ │ │ │ +Out[60]: [<matplotlib.lines.Line2D at 0xffff689d4fa0>]
│ │ │ │
│ │ │ │ In [61]: (ds.temperature.sel(x=0).pipe(lambda x: 2 * x).mean("y").pipe(plt.plot))
│ │ │ │ -Out[61]: [<matplotlib.lines.Line2D at 0xffff886611c0>]
│ │ │ │ +Out[61]: [<matplotlib.lines.Line2D at 0xffff689df2e0>]
│ │ │ │
│ │ │ │
│ │ │ │
Both pipe and assign replicate the pandas methods of the same names
│ │ │ │ (DataFrame.pipe and
│ │ │ │ DataFrame.assign).
│ │ │ │
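The pipe equivalence above can be sketched end to end; the tiny dataset and the `double` helper below are hypothetical stand-ins for the temperature example:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"temperature": (("x", "y"), np.arange(6.0).reshape(2, 3))},
    coords={"x": [0, 1]},
)

def double(obj):
    # an external function we want to slot into a method chain
    return 2 * obj

# nested call: reads inside-out
nested = double(ds.temperature.sel(x=0)).mean("y")

# pipe: the same logic, flowing left to right
chained = ds.temperature.sel(x=0).pipe(double).mean("y")

print(float(nested), float(chained))
```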
With xarray, there is no performance penalty for creating new datasets, even if
│ │ │ │ variables are lazily loaded from a file on disk. Creating new objects instead
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -619,19 +619,19 @@
│ │ │ │ │ There is also the pipe() method that allows you to use a method call with an
│ │ │ │ │ external function (e.g., ds.pipe(func)) instead of simply calling it (e.g.,
│ │ │ │ │ func(ds)). This allows you to write pipelines for transforming your data (using
│ │ │ │ │ “method chaining”) instead of writing hard-to-follow nested function calls:
│ │ │ │ │ # these lines are equivalent, but with pipe we can make the logic flow
│ │ │ │ │ # entirely from left to right
│ │ │ │ │ In [60]: plt.plot((2 * ds.temperature.sel(x=0)).mean("y"))
│ │ │ │ │ -Out[60]: []
│ │ │ │ │ +Out[60]: []
│ │ │ │ │
│ │ │ │ │ In [61]: (ds.temperature.sel(x=0).pipe(lambda x: 2 * x).mean("y").pipe
│ │ │ │ │ (plt.plot))
│ │ │ │ │ -Out[61]: []
│ │ │ │ │ +Out[61]: []
│ │ │ │ │ Both pipe and assign replicate the pandas methods of the same names
│ │ │ │ │ (DataFrame.pipe and DataFrame.assign).
│ │ │ │ │ With xarray, there is no performance penalty for creating new datasets, even if
│ │ │ │ │ variables are lazily loaded from a file on disk. Creating new objects instead
│ │ │ │ │ of mutating existing objects often results in easier to understand code, so we
│ │ │ │ │ encourage using this approach.
│ │ │ │ │ **** Renaming variables¶ ****
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/ERA5-GRIB-example.html
│ │ │ │ @@ -523,15 +523,15 @@
│ │ │ │ /build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ 79
│ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │
│ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'
│ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'
│ │ │ │
│ │ │ │
│ │ │ │
Let’s create a simple plot of 2-m air temperature in degrees Celsius:
Write equations to calculate the vertical coordinate. These will only be evaluated when data is requested. Information about the ROMS vertical coordinate can be found [here](https://www.myroms.org/wiki/Vertical_S-coordinate).
│ │ │ │
In short, for Vtransform==2 as used in this example,
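the relations documented on the ROMS wiki can be sketched with plain numpy. All the numbers below (hc, h, s_rho, Cs_r, zeta) are invented example values, not taken from the tutorial dataset:

```python
import numpy as np

# Invented example values: critical depth hc, bathymetry h,
# stretched coordinate s_rho, stretching curve Cs_r, free surface zeta
hc = 20.0
h = np.array([100.0, 500.0])
s_rho, Cs_r = -0.5, -0.3
zeta = 0.5

# Vtransform == 2 (see the ROMS wiki page on the vertical S-coordinate)
Zo_rho = (hc * s_rho + Cs_r * h) / (hc + h)
z_rho = zeta + (zeta + h) * Zo_rho  # depths come out negative below the surface
print(z_rho)
```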
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -125,15 +125,15 @@
│ │ │ │ │ open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ │ 79
│ │ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │ │
│ │ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-
│ │ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-
│ │ │ │ │ build/.xarray_tutorial_data'
│ │ │ │ │ ***** Add a lazily calculated vertical coordinate¶ *****
│ │ │ │ │ Write equations to calculate the vertical coordinate. These will only be
│ │ │ │ │ evaluated when data is requested. Information about the ROMS vertical
│ │ │ │ │ coordinate can be found [here](https://www.myroms.org/wiki/Vertical_S-
│ │ │ │ │ coordinate).
│ │ │ │ │ In short, for Vtransform==2 as used in this example,
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/ROMS_ocean_model.ipynb.gz
│ │ │ │ ├── ROMS_ocean_model.ipynb
│ │ │ │ │ ├── Pretty-printed
│ │ │ │ │ │┄ Similarity: 0.9998774509803922%
│ │ │ │ │ │┄ Differences: {"'cells'": '{5: {\'outputs\': {0: {\'evalue\': "[Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'", \'traceback\': {insert: [(4, '
│ │ │ │ │ │┄ '"\\x1b[0;31mFileNotFoundError\\x1b[0m: [Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'")], delete: [4]}}}}}'}
│ │ │ │ │ │ @@ -69,22 +69,22 @@
│ │ │ │ │ │ {
│ │ │ │ │ │ "cell_type": "code",
│ │ │ │ │ │ "execution_count": 2,
│ │ │ │ │ │ "metadata": {},
│ │ │ │ │ │ "outputs": [
│ │ │ │ │ │ {
│ │ │ │ │ │ "ename": "FileNotFoundError",
│ │ │ │ │ │ - "evalue": "[Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'",
│ │ │ │ │ │ + "evalue": "[Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'",
│ │ │ │ │ │ "output_type": "error",
│ │ │ │ │ │ "traceback": [
│ │ │ │ │ │ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
│ │ │ │ │ │ "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
│ │ │ │ │ │ "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# load in the file\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mds\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mxr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtutorial\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mopen_dataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'ROMS_example.nc'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mchunks\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m{\u001b[0m\u001b[0;34m'ocean_time'\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;31m# This is a way to turn on chunking and lazy evaluation. Opening with mfdataset, or\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;31m# setting the chunking in the open_dataset would also achive this.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ "\u001b[0;32m/build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py\u001b[0m in \u001b[0;36mopen_dataset\u001b[0;34m(name, cache, cache_dir, github_url, branch, **kws)\u001b[0m\n\u001b[1;32m 76\u001b[0m \u001b[0;31m# May want to add an option to remove it.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 77\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0misdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 78\u001b[0;31m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmkdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 79\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 80\u001b[0m \u001b[0murl\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"/\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgithub_url\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"raw\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbranch\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfullname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ - "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'"
│ │ │ │ │ │ + "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'"
│ │ │ │ │ │ ]
│ │ │ │ │ │ }
│ │ │ │ │ │ ],
│ │ │ │ │ │ "source": [
│ │ │ │ │ │ "# load in the file\n",
│ │ │ │ │ │ "ds = xr.tutorial.open_dataset('ROMS_example.nc', chunks={'ocean_time': 1})\n",
│ │ │ │ │ │ "\n",
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/apply_ufunc_vectorize_1d.html
│ │ │ │ @@ -555,15 +555,15 @@
│ │ │ │ /build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ 79
│ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │
│ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'
│ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'
│ │ │ │
│ │ │ │
│ │ │ │
The function we will apply is np.interp, which expects 1D numpy arrays. This functionality is already implemented in xarray, so we use that capability to make sure we are not making mistakes.
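Under the hood this is plain 1D linear interpolation; a numpy-only sketch with made-up latitudes and temperatures:

```python
import numpy as np

# Made-up 1D samples: temperature at a handful of latitudes
lat = np.array([15.0, 30.0, 45.0, 60.0, 75.0])
temp = np.array([300.0, 290.0, 280.0, 270.0, 260.0])

# np.interp(x_new, x_known, y_known) expects 1D arrays with x in ascending order
newlat = np.array([20.0, 50.0])
interp_temp = np.interp(newlat, lat, temp)
print(interp_temp)
```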
│ │ │ │
│ │ │ │
[2]:
│ │ │ │
│ │ │ │
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -116,15 +116,15 @@
│ │ │ │ │ open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ │ 79
│ │ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │ │
│ │ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-
│ │ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-
│ │ │ │ │ build/.xarray_tutorial_data'
│ │ │ │ │ The function we will apply is np.interp which expects 1D numpy arrays. This
│ │ │ │ │ functionality is already implemented in xarray so we use that capability to
│ │ │ │ │ make sure we are not making mistakes.
│ │ │ │ │ [2]:
│ │ │ │ │ newlat = np.linspace(15, 75, 100)
│ │ │ │ │ air.interp(lat=newlat)
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/apply_ufunc_vectorize_1d.ipynb.gz
│ │ │ │ ├── apply_ufunc_vectorize_1d.ipynb
│ │ │ │ │ ├── Pretty-printed
│ │ │ │ │ │┄ Similarity: 0.9999510017421602%
│ │ │ │ │ │┄ Differences: {"'cells'": '{2: {\'outputs\': {0: {\'evalue\': "[Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'", \'traceback\': {insert: [(5, '
│ │ │ │ │ │┄ '"\\x1b[0;31mFileNotFoundError\\x1b[0m: [Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'")], delete: [5]}}}}}'}
│ │ │ │ │ │ @@ -39,23 +39,23 @@
│ │ │ │ │ │ "end_time": "2020-01-15T14:45:51.659160Z",
│ │ │ │ │ │ "start_time": "2020-01-15T14:45:50.528742Z"
│ │ │ │ │ │ }
│ │ │ │ │ │ },
│ │ │ │ │ │ "outputs": [
│ │ │ │ │ │ {
│ │ │ │ │ │ "ename": "FileNotFoundError",
│ │ │ │ │ │ - "evalue": "[Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'",
│ │ │ │ │ │ + "evalue": "[Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'",
│ │ │ │ │ │ "output_type": "error",
│ │ │ │ │ │ "traceback": [
│ │ │ │ │ │ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
│ │ │ │ │ │ "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
│ │ │ │ │ │ "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m air = (\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mxr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtutorial\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mload_dataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"air_temperature\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;34m.\u001b[0m\u001b[0mair\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msortby\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"lat\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# np.interp needs coordinate in ascending order\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;34m.\u001b[0m\u001b[0misel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtime\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mslice\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlon\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mslice\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ "\u001b[0;32m/build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py\u001b[0m in \u001b[0;36mload_dataset\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 111\u001b[0m \u001b[0mopen_dataset\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 112\u001b[0m \"\"\"\n\u001b[0;32m--> 113\u001b[0;31m \u001b[0;32mwith\u001b[0m \u001b[0mopen_dataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mds\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 114\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mds\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 115\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ "\u001b[0;32m/build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py\u001b[0m in \u001b[0;36mopen_dataset\u001b[0;34m(name, cache, cache_dir, github_url, branch, **kws)\u001b[0m\n\u001b[1;32m 76\u001b[0m \u001b[0;31m# May want to add an option to remove it.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 77\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0misdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 78\u001b[0;31m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmkdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 79\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 80\u001b[0m \u001b[0murl\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"/\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgithub_url\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"raw\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbranch\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfullname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ - "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'"
│ │ │ │ │ │ + "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'"
│ │ │ │ │ │ ]
│ │ │ │ │ │ }
│ │ │ │ │ │ ],
│ │ │ │ │ │ "source": [
│ │ │ │ │ │ "import xarray as xr\n",
│ │ │ │ │ │ "import numpy as np\n",
│ │ │ │ │ │ "\n",
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/area_weighted_temperature.html
│ │ │ │ @@ -555,15 +555,15 @@
│ │ │ │ /build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ 79
│ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │
│ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'
│ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'
│ │ │ │
We first have to come up with the weights: calculate the month lengths for each monthly data record, then calculate weights using groupby('time.season').
│ │ │ │
│ │ │ │ Finally, we just need to multiply our weights by the Dataset and sum along the time dimension. Creating a DataArray for the month length is as easy as using the days_in_month accessor on the time coordinate. The calendar type, in this case 'noleap', is automatically considered in this operation.
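A sketch of that weight computation on a hypothetical monthly series (the variable name tas and its values are made up; the pattern is the same for the tutorial data):

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", freq="MS", periods=12)
ds = xr.Dataset({"tas": ("time", np.arange(12.0))}, coords={"time": time})

# month lengths straight from the days_in_month accessor on the time coordinate
month_length = ds.time.dt.days_in_month

# normalize within each season so every season's weights sum to one
weights = (
    month_length.groupby("time.season") / month_length.groupby("time.season").sum()
)

# weighted seasonal mean: multiply, then sum along the time dimension
season_mean = (ds.tas * weights).groupby("time.season").sum(dim="time")
print(season_mean)
```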
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -85,15 +85,15 @@
│ │ │ │ │ open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ │ 79
│ │ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │ │
│ │ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-
│ │ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-
│ │ │ │ │ build/.xarray_tutorial_data'
│ │ │ │ │ ***** Now for the heavy lifting:¶ *****
│ │ │ │ │ We first have to come up with the weights, - calculate the month lengths for
│ │ │ │ │ each monthly data record - calculate weights using groupby('time.season')
│ │ │ │ │ Finally, we just need to multiply our weights by the Dataset and sum along the
│ │ │ │ │ time dimension. Creating a DataArray for the month length is as easy as using
│ │ │ │ │ the days_in_month accessor on the time coordinate. The calendar type, in this
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/monthly-means.ipynb.gz
│ │ │ │ ├── monthly-means.ipynb
│ │ │ │ │ ├── Pretty-printed
│ │ │ │ │ │┄ Similarity: 0.999810606060606%
│ │ │ │ │ │┄ Differences: {"'cells'": '{3: {\'outputs\': {0: {\'evalue\': "[Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'", \'traceback\': {insert: [(4, '
│ │ │ │ │ │┄ '"\\x1b[0;31mFileNotFoundError\\x1b[0m: [Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'")], delete: [4]}}}}}'}
│ │ │ │ │ │ @@ -47,22 +47,22 @@
│ │ │ │ │ │ "end_time": "2018-11-28T20:51:36.072316Z",
│ │ │ │ │ │ "start_time": "2018-11-28T20:51:36.016594Z"
│ │ │ │ │ │ }
│ │ │ │ │ │ },
│ │ │ │ │ │ "outputs": [
│ │ │ │ │ │ {
│ │ │ │ │ │ "ename": "FileNotFoundError",
│ │ │ │ │ │ - "evalue": "[Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'",
│ │ │ │ │ │ + "evalue": "[Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'",
│ │ │ │ │ │ "output_type": "error",
│ │ │ │ │ │ "traceback": [
│ │ │ │ │ │ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
│ │ │ │ │ │ "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
│ │ │ │ │ │ "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mds\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mxr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtutorial\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mopen_dataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'rasm'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mds\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ "\u001b[0;32m/build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py\u001b[0m in \u001b[0;36mopen_dataset\u001b[0;34m(name, cache, cache_dir, github_url, branch, **kws)\u001b[0m\n\u001b[1;32m 76\u001b[0m \u001b[0;31m# May want to add an option to remove it.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 77\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0misdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 78\u001b[0;31m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmkdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 79\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 80\u001b[0m \u001b[0murl\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"/\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgithub_url\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"raw\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbranch\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfullname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ - "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'"
│ │ │ │ │ │ + "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'"
│ │ │ │ │ │ ]
│ │ │ │ │ │ }
│ │ │ │ │ │ ],
│ │ │ │ │ │ "source": [
│ │ │ │ │ │ "ds = xr.tutorial.open_dataset('rasm').load()\n",
│ │ │ │ │ │ "ds"
│ │ │ │ │ │ ]
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/multidimensional-coords.html
│ │ │ │ @@ -527,15 +527,15 @@
│ │ │ │ /build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ 79
│ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │
│ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'
│ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'
│ │ │ │
│ │ │ │
│ │ │ │
In this example, the logical coordinates are x and y, while the physical coordinates are xc and yc, which represent the latitude and longitude of the data.
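A toy illustration of that split between logical and physical coordinates; the 2x2 grid, the xc/yc values, and the Tair numbers below are all invented:

```python
import numpy as np
import xarray as xr

# Logical dimensions x/y; physical 2D coordinates xc/yc (lon/lat per cell)
xc = np.array([[10.0, 20.0], [10.5, 20.5]])
yc = np.array([[40.0, 40.0], [41.0, 41.0]])
ds = xr.Dataset(
    {"Tair": (("y", "x"), np.array([[280.0, 282.0], [281.0, 283.0]]))},
    coords={"xc": (("y", "x"), xc), "yc": (("y", "x"), yc)},
)

# Index by the logical coordinates...
print(ds.Tair.isel(x=0, y=0).values)

# ...or mask by the physical ones with where()
print(ds.Tair.where(ds.yc > 40.5).values)
```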
│ │ │ │
│ │ │ │
[3]:
│ │ │ │
│ │ │ │
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -83,15 +83,15 @@
│ │ │ │ │ open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ │ 79
│ │ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │ │
│ │ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-
│ │ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-
│ │ │ │ │ build/.xarray_tutorial_data'
│ │ │ │ │ In this example, the logical coordinates are x and y, while the physical
│ │ │ │ │ coordinates are xc and yc, which represent the latitude and longitude of the
│ │ │ │ │ data.
│ │ │ │ │ [3]:
│ │ │ │ │ print(ds.xc.attrs)
│ │ │ │ │ print(ds.yc.attrs)
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/multidimensional-coords.ipynb.gz
│ │ │ │ ├── multidimensional-coords.ipynb
│ │ │ │ │ ├── Pretty-printed
│ │ │ │ │ │┄ Similarity: 0.9998697916666667%
│ │ │ │ │ │┄ Differences: {"'cells'": '{3: {\'outputs\': {0: {\'evalue\': "[Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'", \'traceback\': {insert: [(4, '
│ │ │ │ │ │┄ '"\\x1b[0;31mFileNotFoundError\\x1b[0m: [Errno 2] No such file or directory: '
│ │ │ │ │ │┄ '\'/nonexistent/second-build/.xarray_tutorial_data\'")], delete: [4]}}}}}'}
│ │ │ │ │ │ @@ -45,22 +45,22 @@
│ │ │ │ │ │ "end_time": "2018-11-28T20:50:13.629720Z",
│ │ │ │ │ │ "start_time": "2018-11-28T20:50:13.484542Z"
│ │ │ │ │ │ }
│ │ │ │ │ │ },
│ │ │ │ │ │ "outputs": [
│ │ │ │ │ │ {
│ │ │ │ │ │ "ename": "FileNotFoundError",
│ │ │ │ │ │ - "evalue": "[Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'",
│ │ │ │ │ │ + "evalue": "[Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'",
│ │ │ │ │ │ "output_type": "error",
│ │ │ │ │ │ "traceback": [
│ │ │ │ │ │ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
│ │ │ │ │ │ "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
│ │ │ │ │ │ "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mds\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mxr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtutorial\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mopen_dataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'rasm'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mds\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ "\u001b[0;32m/build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py\u001b[0m in \u001b[0;36mopen_dataset\u001b[0;34m(name, cache, cache_dir, github_url, branch, **kws)\u001b[0m\n\u001b[1;32m 76\u001b[0m \u001b[0;31m# May want to add an option to remove it.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 77\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0misdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 78\u001b[0;31m \u001b[0m_os\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmkdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlongdir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 79\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 80\u001b[0m \u001b[0murl\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"/\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgithub_url\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"raw\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbranch\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfullname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
│ │ │ │ │ │ - "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'"
│ │ │ │ │ │ + "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'"
│ │ │ │ │ │ ]
│ │ │ │ │ │ }
│ │ │ │ │ │ ],
│ │ │ │ │ │ "source": [
│ │ │ │ │ │ "ds = xr.tutorial.open_dataset('rasm').load()\n",
│ │ │ │ │ │ "ds"
│ │ │ │ │ │ ]
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/examples/visualization_gallery.html
│ │ │ │ @@ -533,15 +533,15 @@
│ │ │ │ /build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ 79
│ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │
│ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'
│ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'
│ │ │ │
(The suffix .zarr is optional–just a reminder that a zarr store lives
│ │ │ │ there.) If the directory does not exist, it will be created. If a zarr
│ │ │ │ store is already present at that path, an error will be raised, preventing it
│ │ │ │ from being overwritten. To override this behavior and overwrite an existing
│ │ │ │ store, add mode='w' when invoking to_zarr().
│ │ │ │ @@ -1113,15 +1113,15 @@
│ │ │ │ These options can be passed to the to_zarr method as variable encoding.
│ │ │ │ For example:
│ │ │ │
│ │ │ │ In [42]: import zarr
│ │ │ │
│ │ │ │ In [43]: compressor = zarr.Blosc(cname="zstd", clevel=3, shuffle=2)
│ │ │ │
│ │ │ │ In [44]: ds.to_zarr("foo.zarr", encoding={"foo": {"compressor": compressor}})
│ │ │ │ -Out[44]: <xarray.backends.zarr.ZarrStore at 0xffff57542400>
│ │ │ │ +Out[44]: <xarray.backends.zarr.ZarrStore at 0xffff4074e520>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
Note
│ │ │ │
Not all native zarr compression and filtering options have been tested with
│ │ │ │ xarray.
Finally, you can use region to write to limited regions of existing arrays
│ │ │ │ in an existing Zarr store. This is a good option for writing data in parallel
│ │ │ │ from independent processes.
│ │ │ │
To scale this up to writing large datasets, the first step is creating an
│ │ │ │ initial Zarr store without writing all of its array data. This can be done by
│ │ │ │ @@ -1218,33 +1218,33 @@
│ │ │ │
│ │ │ │ In [51]: ds = xr.Dataset({"foo": ("x", dummies)})
│ │ │ │
│ │ │ │ In [52]: path = "path/to/directory.zarr"
│ │ │ │
│ │ │ │ # Now we write the metadata without computing any array values
│ │ │ │ In [53]: ds.to_zarr(path, compute=False, consolidated=True)
│ │ │ │ -Out[53]: Delayed('_finalize_store-03d2aeb5-469e-4639-96ca-4f833ed6b714')
│ │ │ │ +Out[53]: Delayed('_finalize_store-aa004604-7456-43e7-a304-cbd78eb3f262')
│ │ │ │
│ │ │ │
│ │ │ │
Now, a Zarr store with the correct variable shapes and attributes exists that
│ │ │ │ can be filled out by subsequent calls to to_zarr. The region provides a
│ │ │ │ mapping from dimension names to Python slice objects indicating where the
│ │ │ │ data should be written (in index space, not coordinate space), e.g.,
│ │ │ │
# For convenience, we'll slice a single dataset, but in the real use-case
│ │ │ │ # we would create them separately, possibly even from separate processes.
│ │ │ │ In [54]: ds = xr.Dataset({"foo": ("x", np.arange(30))})
│ │ │ │
│ │ │ │ In [55]: ds.isel(x=slice(0, 10)).to_zarr(path, region={"x": slice(0, 10)})
│ │ │ │ -Out[55]: <xarray.backends.zarr.ZarrStore at 0xffff5752cc40>
│ │ │ │ +Out[55]: <xarray.backends.zarr.ZarrStore at 0xffff4074eac0>
│ │ │ │
│ │ │ │ In [56]: ds.isel(x=slice(10, 20)).to_zarr(path, region={"x": slice(10, 20)})
│ │ │ │ -Out[56]: <xarray.backends.zarr.ZarrStore at 0xffff57542ca0>
│ │ │ │ +Out[56]: <xarray.backends.zarr.ZarrStore at 0xffff4074e700>
│ │ │ │
│ │ │ │ In [57]: ds.isel(x=slice(20, 30)).to_zarr(path, region={"x": slice(20, 30)})
│ │ │ │ -Out[57]: <xarray.backends.zarr.ZarrStore at 0xffff886f31c0>
│ │ │ │ +Out[57]: <xarray.backends.zarr.ZarrStore at 0xffff68a71340>
│ │ │ │
│ │ │ │
│ │ │ │
Concurrent writes with region are safe as long as they modify distinct
│ │ │ │ chunks in the underlying Zarr arrays (or use an appropriate lock).
│ │ │ │
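The concurrency guarantee above can be sketched with nothing but the standard library: each worker fills a distinct, pre-agreed index-space slice of a shared buffer, so no locking is needed as long as the slices (like distinct Zarr chunks) never overlap. The names below (`write_region`, `store`) are illustrative stand-ins, not part of the xarray or zarr API.

```python
# Minimal sketch of the region-write pattern: workers write to disjoint
# index-space slices of a pre-sized store, mirroring how independent
# processes fill disjoint regions of a metadata-only Zarr store.
from threading import Thread

def write_region(store, region, values):
    """Write `values` into `store` at index-space slice `region`."""
    store[region] = values

# Stand-in for the compute=False, metadata-only store initialization.
store = [None] * 30
data = list(range(30))

threads = [
    Thread(target=write_region, args=(store, slice(lo, hi), data[lo:hi]))
    for lo, hi in [(0, 10), (10, 20), (20, 30)]
]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert store == data  # every region landed in its own disjoint slice
```

Because the three slices are disjoint, the writes commute and the result is the same regardless of scheduling order.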
As a safety check to make it harder to inadvertently override existing values,
│ │ │ │ if you set region then all variables included in a Dataset must have
│ │ │ │ dimensions included in region. Other variables (typically coordinates)
│ │ │ │ ├── html2text {}
│ │ │ │ │ @@ -762,15 +762,15 @@
│ │ │ │ │ ....: "y": pd.date_range("2000-01-01", periods=5),
│ │ │ │ │ ....: "z": ("x", list("abcd")),
│ │ │ │ │ ....: },
│ │ │ │ │ ....: )
│ │ │ │ │ ....:
│ │ │ │ │
│ │ │ │ │ In [39]: ds.to_zarr("path/to/directory.zarr")
│ │ │ │ │ -Out[39]:
│ │ │ │ │ +Out[39]:
│ │ │ │ │ (The suffix .zarr is optional; just a reminder that a zarr store lives there.)
│ │ │ │ │ If the directory does not exist, it will be created. If a zarr store is already
│ │ │ │ │ present at that path, an error will be raised, preventing it from being
│ │ │ │ │ overwritten. To override this behavior and overwrite an existing store, add
│ │ │ │ │ mode='w' when invoking to_zarr().
│ │ │ │ │ To store variable length strings, convert them to object arrays first with
│ │ │ │ │ dtype=object.
│ │ │ │ │ @@ -806,15 +806,15 @@
│ │ │ │ │ zarr. These are described in the zarr_documentation. These options can be
│ │ │ │ │ passed to the to_zarr method as variable encoding. For example:
│ │ │ │ │ In [42]: import zarr
│ │ │ │ │
│ │ │ │ │ In [43]: compressor = zarr.Blosc(cname="zstd", clevel=3, shuffle=2)
│ │ │ │ │
│ │ │ │ │ In [44]: ds.to_zarr("foo.zarr", encoding={"foo": {"compressor": compressor}})
│ │ │ │ │ -Out[44]:
│ │ │ │ │ +Out[44]:
│ │ │ │ │ Note
│ │ │ │ │ Not all native zarr compression and filtering options have been tested with
│ │ │ │ │ xarray.
│ │ │ │ │ **** Consolidated Metadata¶ ****
│ │ │ │ │ Xarray needs to read all of the zarr metadata when it opens a dataset. In some
│ │ │ │ │ storage mediums, such as with cloud object storage (e.g. amazon S3), this can
│ │ │ │ │ introduce significant overhead, because two separate HTTP calls to the object
│ │ │ │ │ @@ -854,28 +854,28 @@
│ │ │ │ │ ....: "y": [1, 2, 3, 4, 5],
│ │ │ │ │ ....: "t": pd.date_range("2001-01-01", periods=2),
│ │ │ │ │ ....: },
│ │ │ │ │ ....: )
│ │ │ │ │ ....:
│ │ │ │ │
│ │ │ │ │ In [46]: ds1.to_zarr("path/to/directory.zarr")
│ │ │ │ │ -Out[46]:
│ │ │ │ │ +Out[46]:
│ │ │ │ │
│ │ │ │ │ In [47]: ds2 = xr.Dataset(
│ │ │ │ │ ....: {"foo": (("x", "y", "t"), np.random.rand(4, 5, 2))},
│ │ │ │ │ ....: coords={
│ │ │ │ │ ....: "x": [10, 20, 30, 40],
│ │ │ │ │ ....: "y": [1, 2, 3, 4, 5],
│ │ │ │ │ ....: "t": pd.date_range("2001-01-03", periods=2),
│ │ │ │ │ ....: },
│ │ │ │ │ ....: )
│ │ │ │ │ ....:
│ │ │ │ │
│ │ │ │ │ In [48]: ds2.to_zarr("path/to/directory.zarr", append_dim="t")
│ │ │ │ │ -Out[48]:
│ │ │ │ │ +Out[48]:
│ │ │ │ │ Finally, you can use region to write to limited regions of existing arrays in
│ │ │ │ │ an existing Zarr store. This is a good option for writing data in parallel from
│ │ │ │ │ independent processes.
│ │ │ │ │ To scale this up to writing large datasets, the first step is creating an
│ │ │ │ │ initial Zarr store without writing all of its array data. This can be done by
│ │ │ │ │ first creating a Dataset with dummy values stored in dask, and then calling
│ │ │ │ │ to_zarr with compute=False to write only metadata (including attrs) to Zarr:
│ │ │ │ │ @@ -887,31 +887,31 @@
│ │ │ │ │
│ │ │ │ │ In [51]: ds = xr.Dataset({"foo": ("x", dummies)})
│ │ │ │ │
│ │ │ │ │ In [52]: path = "path/to/directory.zarr"
│ │ │ │ │
│ │ │ │ │ # Now we write the metadata without computing any array values
│ │ │ │ │ In [53]: ds.to_zarr(path, compute=False, consolidated=True)
│ │ │ │ │ -Out[53]: Delayed('_finalize_store-03d2aeb5-469e-4639-96ca-4f833ed6b714')
│ │ │ │ │ +Out[53]: Delayed('_finalize_store-aa004604-7456-43e7-a304-cbd78eb3f262')
│ │ │ │ │ Now, a Zarr store with the correct variable shapes and attributes exists that
│ │ │ │ │ can be filled out by subsequent calls to to_zarr. The region provides a mapping
│ │ │ │ │ from dimension names to Python slice objects indicating where the data should
│ │ │ │ │ be written (in index space, not coordinate space), e.g.,
│ │ │ │ │ # For convenience, we'll slice a single dataset, but in the real use-case
│ │ │ │ │ # we would create them separately, possibly even from separate processes.
│ │ │ │ │ In [54]: ds = xr.Dataset({"foo": ("x", np.arange(30))})
│ │ │ │ │
│ │ │ │ │ In [55]: ds.isel(x=slice(0, 10)).to_zarr(path, region={"x": slice(0, 10)})
│ │ │ │ │ -Out[55]:
│ │ │ │ │ +Out[55]:
│ │ │ │ │
│ │ │ │ │ In [56]: ds.isel(x=slice(10, 20)).to_zarr(path, region={"x": slice(10, 20)})
│ │ │ │ │ -Out[56]:
│ │ │ │ │ +Out[56]:
│ │ │ │ │
│ │ │ │ │ In [57]: ds.isel(x=slice(20, 30)).to_zarr(path, region={"x": slice(20, 30)})
│ │ │ │ │ -Out[57]:
│ │ │ │ │ +Out[57]:
│ │ │ │ │ Concurrent writes with region are safe as long as they modify distinct chunks
│ │ │ │ │ in the underlying Zarr arrays (or use an appropriate lock).
│ │ │ │ │ As a safety check to make it harder to inadvertently override existing values,
│ │ │ │ │ if you set region then all variables included in a Dataset must have dimensions
│ │ │ │ │ included in region. Other variables (typically coordinates) need to be
│ │ │ │ │ explicitly dropped and/or written in separate calls to to_zarr with mode='a'.
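The safety check just described can be illustrated with a small stdlib sketch (this is not xarray's actual validation code): every variable written with `region` must have all of its dimensions covered by the region mapping, so stray variables such as coordinates are caught before they could silently overwrite existing values.

```python
# Illustrative sketch of the region safety check: reject any variable
# whose dimensions are not all covered by the region mapping.
def check_region(variables, region):
    """variables maps name -> tuple of dimension names."""
    bad = [name for name, dims in variables.items()
           if not set(dims) <= set(region)]
    if bad:
        raise ValueError(
            f"variables {bad} have dimensions outside region "
            f"{sorted(region)}; drop them or write them separately"
        )

check_region({"foo": ("x",)}, {"x": slice(0, 10)})  # passes

try:
    # 'lat' has dimension 'y', which the region does not cover
    check_region({"foo": ("x",), "lat": ("y",)}, {"x": slice(0, 10)})
except ValueError as err:
    caught = err
```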
│ │ │ │ │ ***** GRIB format via cfgrib¶ *****
│ │ │ ├── ./usr/share/doc/python-xarray-doc/html/plotting.html
│ │ │ │ @@ -326,15 +326,15 @@
│ │ │ │ /build/reproducible-path/python-xarray-0.16.2/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws)
│ │ │ │ 76 # May want to add an option to remove it.
│ │ │ │ 77 if not _os.path.isdir(longdir):
│ │ │ │ ---> 78 _os.mkdir(longdir)
│ │ │ │ 79
│ │ │ │ 80 url = "/".join((github_url, "raw", branch, fullname))
│ │ │ │
│ │ │ │ -FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/first-build/.xarray_tutorial_data'
│ │ │ │ +FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/second-build/.xarray_tutorial_data'
│ │ │ │
│ │ │ │ In [6]: airtemps
│ │ │ │ ---------------------------------------------------------------------------
│ │ │ │ NameError Traceback (most recent call last)
│ │ │ │ <ipython-input-6-eb57b540ddce> in <module>
│ │ │ │ ----> 1 airtemps
│ │ │ │
│ │ │ │ @@ -852,15 +852,15 @@
│ │ │ │ --> 171 ref_var = variables[ref_name]
│ │ │ │ 172
│ │ │ │ 173 if var_name is None:
│ │ │ │
│ │ │ │ KeyError: 'lat'
│ │ │ │
│ │ │ │ In [51]: b.plot()
│ │ │ │ -Out[51]: [<matplotlib.lines.Line2D at 0xffff56f524c0>]
│ │ │ │ +Out[51]: [<matplotlib.lines.Line2D at 0xffff40184d00>]
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
Since this is a thin wrapper around matplotlib, all the functionality of
│ │ │ │ @@ -1314,56 +1314,56 @@
│ │ │ │ Data variables:
│ │ │ │ A (x, y, z, w) float64 -0.104 0.02719 -0.0425 ... -0.1175 -0.0183
│ │ │ │ B (x, y, z, w) float64 0.0 0.0 0.0 0.0 ... 1.369 1.408 1.387 1.417
│ │ │ │
│ │ │ │
│ │ │ │
Suppose we want to scatter A against B
│ │ │ │
In [95]: ds.plot.scatter(x="A", y="B")
│ │ │ │ -Out[95]: <matplotlib.collections.PathCollection at 0xffff570bd760>
│ │ │ │ +Out[95]: <matplotlib.collections.PathCollection at 0xffff402f9580>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
The hue kwarg lets you vary the color by variable value
│ │ │ │
In [96]: ds.plot.scatter(x="A", y="B", hue="w")
│ │ │ │ Out[96]:
│ │ │ │ -[<matplotlib.collections.PathCollection at 0xffff6c7d9760>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff6c7d9f70>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff6c3df6d0>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff57521cd0>]
│ │ │ │ +[<matplotlib.collections.PathCollection at 0xffff682bbe80>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff688c7100>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff6acc2df0>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff6812a580>]
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
When hue is specified, a colorbar is added for numeric hue DataArrays by
│ │ │ │ default and a legend is added for non-numeric hue DataArrays (as above).
│ │ │ │ You can force a legend instead of a colorbar by setting hue_style='discrete'.
│ │ │ │ Additionally, the boolean kwarg add_guide can be used to prevent the display of a legend or colorbar (as appropriate).
│ │ │ │
In [97]: ds = ds.assign(w=[1, 2, 3, 5])
│ │ │ │
│ │ │ │ In [98]: ds.plot.scatter(x="A", y="B", hue="w", hue_style="discrete")
│ │ │ │ Out[98]:
│ │ │ │ -[<matplotlib.collections.PathCollection at 0xffff6c08c6a0>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff6c0e0250>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff6c076b50>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff6c0b6af0>]
│ │ │ │ +[<matplotlib.collections.PathCollection at 0xffff68277040>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff582932b0>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff5836ff10>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff690cc430>]
│ │ │ │
│ │ │ │
│ │ │ │
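The colorbar-vs-legend rule described above can be summarized in a small stand-in function (a simplification, not xarray's actual code): numeric hue values get a colorbar, anything else gets a legend, and `hue_style` / `add_guide` override the default.

```python
# Simplified sketch of xarray's guide selection for scatter plots:
# numeric hue -> colorbar, non-numeric hue -> legend, with overrides.
from numbers import Number

def pick_guide(hue_values, hue_style=None, add_guide=True):
    if not add_guide:
        return None                     # add_guide=False suppresses any guide
    if hue_style == "discrete":
        return "legend"                 # forced legend, even for numbers
    numeric = all(isinstance(v, Number) for v in hue_values)
    return "colorbar" if numeric else "legend"

assert pick_guide([1, 2, 3]) == "colorbar"
assert pick_guide(["a", "b"]) == "legend"
assert pick_guide([1, 2], hue_style="discrete") == "legend"
assert pick_guide([1, 2], add_guide=False) is None
```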
│ │ │ │
The markersize kwarg lets you vary the point’s size by variable value. You can additionally pass size_norm to control how the variable’s values are mapped to point sizes.
│ │ │ │
In [99]: ds.plot.scatter(x="A", y="B", hue="z", hue_style="discrete", markersize="z")
│ │ │ │ Out[99]:
│ │ │ │ -[<matplotlib.collections.PathCollection at 0xffff56f20520>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff57555970>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff56f8dc70>,
│ │ │ │ - <matplotlib.collections.PathCollection at 0xffff570e7760>]
│ │ │ │ +[<matplotlib.collections.PathCollection at 0xffff585f61c0>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff40241640>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff68d013a0>,
│ │ │ │ + <matplotlib.collections.PathCollection at 0xffff40241e80>]
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
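The size mapping described above amounts to rescaling the variable's values linearly onto a range of point sizes, which is roughly what `markersize` (optionally shaped by `size_norm`) does. The function name and default size range below are illustrative only.

```python
# Sketch of markersize mapping: rescale values linearly onto a
# point-size range [smin, smax]; constant data gets the midpoint size.
def sizes_for(values, smin=20.0, smax=100.0):
    lo, hi = min(values), max(values)
    if lo == hi:
        return [(smin + smax) / 2 for _ in values]
    return [smin + (v - lo) / (hi - lo) * (smax - smin) for v in values]

print(sizes_for([1, 2, 3, 5]))  # [20.0, 40.0, 60.0, 100.0]
```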
Faceting is also possible
│ │ │ │
In [100]: ds.plot.scatter(x="A", y="B", col="x", row="z", hue="w", hue_style="discrete")
│ │ │ │ -Out[100]: <xarray.plot.facetgrid.FacetGrid at 0xffff56f5dd30>
│ │ │ │ +Out[100]: <xarray.plot.facetgrid.FacetGrid at 0xffff40100160>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
For more advanced scatter plots, we recommend converting the relevant data variables to a pandas DataFrame and using the extensive plotting capabilities of seaborn.
In [109]: import xarray.plot as xplt
│ │ │ │
│ │ │ │ In [110]: da = xr.DataArray(range(5))
│ │ │ │
│ │ │ │ In [111]: fig, axes = plt.subplots(ncols=2, nrows=2)
│ │ │ │
│ │ │ │ In [112]: da.plot(ax=axes[0, 0])
│ │ │ │ -Out[112]: [<matplotlib.lines.Line2D at 0xffff56cea4f0>]
│ │ │ │ +Out[112]: [<matplotlib.lines.Line2D at 0xffff2b6d8df0>]
│ │ │ │
│ │ │ │ In [113]: da.plot.line(ax=axes[0, 1])
│ │ │ │ -Out[113]: [<matplotlib.lines.Line2D at 0xffff6c0ef160>]
│ │ │ │ +Out[113]: [<matplotlib.lines.Line2D at 0xffff82dc2af0>]
│ │ │ │
│ │ │ │ In [114]: xplt.plot(da, ax=axes[1, 0])
│ │ │ │ -Out[114]: [<matplotlib.lines.Line2D at 0xffff56cd5e80>]
│ │ │ │ +Out[114]: [<matplotlib.lines.Line2D at 0xffff2b6d8430>]
│ │ │ │
│ │ │ │ In [115]: xplt.line(da, ax=axes[1, 1])
│ │ │ │ -Out[115]: [<matplotlib.lines.Line2D at 0xffff56cd5940>]
│ │ │ │ +Out[115]: [<matplotlib.lines.Line2D at 0xffff2b663dc0>]
│ │ │ │
│ │ │ │ In [116]: plt.tight_layout()
│ │ │ │
│ │ │ │ In [117]: plt.draw()
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │ @@ -1542,15 +1542,15 @@
│ │ │ │
│ │ │ │
The plot will produce an image corresponding to the values of the array.
│ │ │ │ Hence the top left pixel will be a different color than the others.
│ │ │ │ Before reading on, you may want to look at the coordinates and
│ │ │ │ think carefully about what the limits, labels, and orientation for
│ │ │ │ each of the axes should be.
│ │ │ │
In [122]: a.plot()
│ │ │ │ -Out[122]: <matplotlib.collections.QuadMesh at 0xffff56d273a0>
│ │ │ │ +Out[122]: <matplotlib.collections.QuadMesh at 0xffff2b731700>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
It may seem strange that
│ │ │ │ the values on the y axis are decreasing with -0.5 on the top. This is because
│ │ │ │ the pixels are centered over their coordinates, and the
│ │ │ │ axis labels and ranges correspond to the values of the
│ │ │ │ @@ -1572,81 +1572,81 @@
│ │ │ │ .....: np.arange(20).reshape(4, 5),
│ │ │ │ .....: dims=["y", "x"],
│ │ │ │ .....: coords={"lat": (("y", "x"), lat), "lon": (("y", "x"), lon)},
│ │ │ │ .....: )
│ │ │ │ .....:
│ │ │ │
│ │ │ │ In [127]: da.plot.pcolormesh("lon", "lat")
│ │ │ │ -Out[127]: <matplotlib.collections.QuadMesh at 0xffffae3e6c70>
│ │ │ │ +Out[127]: <matplotlib.collections.QuadMesh at 0xffff82de7610>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
Note that in this case, xarray still follows the pixel centered convention.
│ │ │ │ This might be undesirable in some cases, for example when your data is defined
│ │ │ │ on a polar projection (GH781). This is why the default is to not follow
│ │ │ │ this convention when plotting on a map:
│ │ │ │
In [128]: import cartopy.crs as ccrs
│ │ │ │
│ │ │ │ In [129]: ax = plt.subplot(projection=ccrs.PlateCarree())
│ │ │ │
│ │ │ │ In [130]: da.plot.pcolormesh("lon", "lat", ax=ax)
│ │ │ │ -Out[130]: <matplotlib.collections.QuadMesh at 0xffff6c3a67f0>
│ │ │ │ +Out[130]: <matplotlib.collections.QuadMesh at 0xffff402ea850>
│ │ │ │
│ │ │ │ In [131]: ax.scatter(lon, lat, transform=ccrs.PlateCarree())
│ │ │ │ -Out[131]: <matplotlib.collections.PathCollection at 0xffff56c8f5b0>
│ │ │ │ +Out[131]: <matplotlib.collections.PathCollection at 0xffff82f86130>
│ │ │ │
│ │ │ │ In [132]: ax.coastlines()
│ │ │ │ -Out[132]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffffae50cf10>
│ │ │ │ +Out[132]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff582934f0>
│ │ │ │
│ │ │ │ In [133]: ax.gridlines(draw_labels=True)
│ │ │ │ -Out[133]: <cartopy.mpl.gridliner.Gridliner at 0xffffae50ca30>
│ │ │ │ +Out[133]: <cartopy.mpl.gridliner.Gridliner at 0xffff58293af0>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
You can however decide to infer the cell boundaries and use the
│ │ │ │ infer_intervals keyword:
│ │ │ │
In [134]: ax = plt.subplot(projection=ccrs.PlateCarree())
│ │ │ │
│ │ │ │ In [135]: da.plot.pcolormesh("lon", "lat", ax=ax, infer_intervals=True)
│ │ │ │ -Out[135]: <matplotlib.collections.QuadMesh at 0xffffae57ea00>
│ │ │ │ +Out[135]: <matplotlib.collections.QuadMesh at 0xffff82fd67c0>
│ │ │ │
│ │ │ │ In [136]: ax.scatter(lon, lat, transform=ccrs.PlateCarree())
│ │ │ │ -Out[136]: <matplotlib.collections.PathCollection at 0xffffae57e3d0>
│ │ │ │ +Out[136]: <matplotlib.collections.PathCollection at 0xffff2b376730>
│ │ │ │
│ │ │ │ In [137]: ax.coastlines()
│ │ │ │ -Out[137]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff5695ad60>
│ │ │ │ +Out[137]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff2b3ac790>
│ │ │ │
│ │ │ │ In [138]: ax.gridlines(draw_labels=True)
│ │ │ │ -Out[138]: <cartopy.mpl.gridliner.Gridliner at 0xffff5695ad30>
│ │ │ │ +Out[138]: <cartopy.mpl.gridliner.Gridliner at 0xffff2b3acc70>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
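What interval inference does can be sketched in plain Python: given cell-center coordinates, reconstruct the cell edges by taking midpoints between neighbours and extrapolating half a step at each end. xarray's own implementation differs in details (and also handles 2D coordinate arrays), so treat this as an illustration only.

```python
# Sketch of interval (cell-boundary) inference from cell centers:
# edges are midpoints between neighbours, extended half a step at each end.
def infer_breaks(centers):
    mids = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    first = centers[0] - (mids[0] - centers[0])
    last = centers[-1] + (centers[-1] - mids[-1])
    return [first] + mids + [last]

print(infer_breaks([0, 1, 2, 3]))  # [-0.5, 0.5, 1.5, 2.5, 3.5]
```

For n centers this yields n + 1 edges, which is exactly the shape pcolormesh-style plotting needs, and it explains the half-cell overhang (e.g. -0.5 at the axis edge) seen in the pixel-centered plots above.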
Note
│ │ │ │
The data model of xarray does not support datasets with cell boundaries
│ │ │ │ yet. If you want to use these coordinates, you’ll have to make the plots
│ │ │ │ outside the xarray framework.
│ │ │ │
│ │ │ │
One can also make line plots with multidimensional coordinates. In this case, hue must be a dimension name, not a coordinate name.
│ │ │ │
In [139]: f, ax = plt.subplots(2, 1)
│ │ │ │
│ │ │ │ In [140]: da.plot.line(x="lon", hue="y", ax=ax[0])
│ │ │ │ Out[140]:
│ │ │ │ -[<matplotlib.lines.Line2D at 0xffff57525280>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff56c54dc0>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff56d1a8b0>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff570f30a0>]
│ │ │ │ +[<matplotlib.lines.Line2D at 0xffff82ef1d90>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff58589b20>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff40067fa0>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff82f865e0>]
│ │ │ │
│ │ │ │ In [141]: da.plot.line(x="lon", hue="x", ax=ax[1])
│ │ │ │ Out[141]:
│ │ │ │ -[<matplotlib.lines.Line2D at 0xffff56882160>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff56882400>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff56882460>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff568825b0>,
│ │ │ │ - <matplotlib.lines.Line2D at 0xffff56882670>]
│ │ │ │ +[<matplotlib.lines.Line2D at 0xffff2b292220>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff2b292b80>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff2b292be0>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff2b292d30>,
│ │ │ │ + <matplotlib.lines.Line2D at 0xffff2b292df0>]
│ │ │ │
Visualizing your datasets is quick and convenient:
│ │ │ │
In [37]: data.plot()
│ │ │ │ -Out[37]: <matplotlib.collections.QuadMesh at 0xffff567e1e50>
│ │ │ │ +Out[37]: <matplotlib.collections.QuadMesh at 0xffff2b1f8610>
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
Note the automatic labeling with names and units. Our effort in adding metadata attributes has paid off! Many aspects of these figures are customizable: see Plotting.