861 KB   /srv/reproducible-results/rbuild-debian/tmp.urzYok0Vhj/b1/keras_2.2.4-1_i386.changes vs. /srv/reproducible-results/rbuild-debian/tmp.urzYok0Vhj/b2/keras_2.2.4-1_i386.changes
278 B    Files
@@ -1,3 +1,3 @@
-·7f6b7a2fa2214f99f7c5c7294c40b752·322836·doc·optional·keras-doc_2.2.4-1_all.deb
+·31f921a89170d7aa5d28ac846cf8592d·322844·doc·optional·keras-doc_2.2.4-1_all.deb
 ·1cf5883a8b6bf008e851f6f6f9f9e130·202000·python·optional·python3-keras_2.2.4-1_all.deb
860 KB   keras-doc_2.2.4-1_all.deb
452 B    file list
@@ -1,3 +1,3 @@
 -rw-r--r--···0········0········0········4·2019-01-17·22:44:17.000000·debian-binary
--rw-r--r--···0········0········0·····3820·2019-01-17·22:44:17.000000·control.tar.xz
+-rw-r--r--···0········0········0·····3816·2019-01-17·22:44:17.000000·control.tar.xz
--rw-r--r--···0········0········0···318824·2019-01-17·22:44:17.000000·data.tar.xz
+-rw-r--r--···0········0········0···318836·2019-01-17·22:44:17.000000·data.tar.xz
98.0 B   control.tar.xz
70.0 B   control.tar
48.0 B   ./md5sums
30.0 B   ./md5sums
Files differ
860 KB   data.tar.xz
860 KB   data.tar
15.2 KB  ./usr/share/doc/keras-doc/html/callbacks.html
Ordering differences only
@@ -197,17 +197,19 @@
 ····<a·class="current"·href="callbacks.html">Callbacks</a>
 ····<ul·class="subnav">
 ············
 ····<li·class="toctree-l2"><a·href="#usage-of-callbacks">Usage·of·callbacks</a></li>
 ····
 ········<ul>
 ········
-············<li><a·class="toctree-l3"·href="#csvlogger">CSVLogger</a></li>
+············<li><a·class="toctree-l3"·href="#callback">Callback</a></li>
 ········
-············<li><a·class="toctree-l3"·href="#lambdacallback">LambdaCallback</a></li>
+············<li><a·class="toctree-l3"·href="#baselogger">BaseLogger</a></li>
+········
+············<li><a·class="toctree-l3"·href="#terminateonnan">TerminateOnNaN</a></li>
 ········
 ············<li><a·class="toctree-l3"·href="#progbarlogger">ProgbarLogger</a></li>
 ········
 ············<li><a·class="toctree-l3"·href="#history">History</a></li>
 ········
 ············<li><a·class="toctree-l3"·href="#modelcheckpoint">ModelCheckpoint</a></li>
 ········
@@ -217,19 +219,17 @@
 ········
 ············<li><a·class="toctree-l3"·href="#learningratescheduler">LearningRateScheduler</a></li>
 ········
 ············<li><a·class="toctree-l3"·href="#tensorboard">TensorBoard</a></li>
 ········
 ············<li><a·class="toctree-l3"·href="#reducelronplateau">ReduceLROnPlateau</a></li>
 ········
-············<li><a·class="toctree-l3"·href="#callback">Callback</a></li>
-········
-············<li><a·class="toctree-l3"·href="#baselogger">BaseLogger</a></li>
+············<li><a·class="toctree-l3"·href="#csvlogger">CSVLogger</a></li>
 ········
-············<li><a·class="toctree-l3"·href="#terminateonnan">TerminateOnNaN</a></li>
+············<li><a·class="toctree-l3"·href="#lambdacallback">LambdaCallback</a></li>
 ········
 ········</ul>
 ····
 ····<li·class="toctree-l2"><a·href="#create-a-callback">Create·a·callback</a></li>
 ····
 ········<ul>
@@ -328,88 +328,63 @@
 </div>
 ··········<div·role="main">
 ············<div·class="section">
 ··············
 ················<h2·id="usage-of-callbacks">Usage·of·callbacks</h2>
 <p>A·callback·is·a·set·of·functions·to·be·applied·at·given·stages·of·the·training·procedure.·You·can·use·callbacks·to·get·a·view·on·internal·states·and·statistics·of·the·model·during·training.·You·can·pass·a·list·of·callbacks·(as·the·keyword·argument·<code>callbacks</code>)·to·the·<code>.fit()</code>·method·of·the·<code>Sequential</code>·or·<code>Model</code>·classes.·The·relevant·methods·of·the·callbacks·will·then·be·called·at·each·stage·of·the·training.·</p>
 <hr·/>
-<p><span·style="float:right;"><a·href="https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L1138">[source]</a></span></p>
-<h3·id="csvlogger">CSVLogger</h3>
-<pre><code·class="python">keras.callbacks.CSVLogger(filename,·separator=',',·append=False)
-</code></pre>
-<p>Callback·that·streams·epoch·results·to·a·csv·file.</p>
-<p>Supports·all·values·that·can·be·represented·as·a·string,
-including·1D·iterables·such·as·np.ndarray.</p>
-<p><strong>Example</strong></p>
-<pre><code·class="python">csv_logger·=·CSVLogger('training.log')
-model.fit(X_train,·Y_train,·callbacks=[csv_logger])
+<p><span·style="float:right;"><a·href="https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L148">[source]</a></span></p>
+<h3·id="callback">Callback</h3>
+<pre><code·class="python">keras.callbacks.Callback()
 </code></pre>
-<p><strong>Arguments</strong></p>
+<p>Abstract·base·class·used·to·build·new·callbacks.</p>
+<p><strong>Properties</strong></p>
 <ul>
-<li><strong>filename</strong>:·filename·of·the·csv·file,·e.g.·'run/log.csv'.</li>
-<li><strong>separator</strong>:·string·used·to·separate·elements·in·the·csv·file.</li>
-<li><strong>append</strong>:·True:·append·if·file·exists·(useful·for·continuing
-····training).·False:·overwrite·existing·file,</li>
+<li><strong>params</strong>:·dict.·Training·parameters
+····(eg.·verbosity,·batch·size,·number·of·epochs...).</li>
+<li><strong>model</strong>:·instance·of·<code>keras.models.Model</code>.
+····Reference·of·the·model·being·trained.</li>
 </ul>
+<p>The·<code>logs</code>·dictionary·that·callback·methods
+take·as·argument·will·contain·keys·for·quantities·relevant·to
+the·current·batch·or·epoch.</p>
+<p>Currently,·the·<code>.fit()</code>·method·of·the·<code>Sequential</code>·model·class
+will·include·the·following·quantities·in·the·<code>logs</code>·that
+it·passes·to·its·callbacks:</p>
+<p>on_epoch_end:·logs·include·<code>acc</code>·and·<code>loss</code>,·and
+optionally·include·<code>val_loss</code>
+(if·validation·is·enabled·in·<code>fit</code>),·and·<code>val_acc</code>
+(if·validation·and·accuracy·monitoring·are·enabled).
+on_batch_begin:·logs·include·<code>size</code>,
+the·number·of·samples·in·the·current·batch.
+on_batch_end:·logs·include·<code>loss</code>,·and·optionally·<code>acc</code>
+(if·accuracy·monitoring·is·enabled).</p>
 <hr·/>
-<p><span·style="float:right;"><a·href="https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L1226">[source]</a></span></p>
-<h3·id="lambdacallback">LambdaCallback</h3>
-<pre><code·class="python">keras.callbacks.LambdaCallback(on_epoch_begin=None,·on_epoch_end=None,·on_batch_begin=None,·on_batch_end=None,·on_train_begin=None,·on_train_end=None)
+<p><span·style="float:right;"><a·href="https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L204">[source]</a></span></p>
+<h3·id="baselogger">BaseLogger</h3>
+<pre><code·class="python">keras.callbacks.BaseLogger(stateful_metrics=None)
 </code></pre>
-<p>Callback·for·creating·simple,·custom·callbacks·on-the-fly.</p>
-<p>This·callback·is·constructed·with·anonymous·functions·that·will·be·called
-at·the·appropriate·time.·Note·that·the·callbacks·expects·positional
-arguments,·as:</p>
-<ul>
-<li><code>on_epoch_begin</code>·and·<code>on_epoch_end</code>·expect·two·positional·arguments:
-<code>epoch</code>,·<code>logs</code></li>
-<li><code>on_batch_begin</code>·and·<code>on_batch_end</code>·expect·two·positional·arguments:
-<code>batch</code>,·<code>logs</code></li>
-<li><code>on_train_begin</code>·and·<code>on_train_end</code>·expect·one·positional·argument:
-<code>logs</code></li>
-</ul>
+<p>Callback·that·accumulates·epoch·averages·of·metrics.</p>
+<p>This·callback·is·automatically·applied·to·every·Keras·model.</p>
 <p><strong>Arguments</strong></p>
 <ul>
-<li><strong>on_epoch_begin</strong>:·called·at·the·beginning·of·every·epoch.</li>
-<li><strong>on_epoch_end</strong>:·called·at·the·end·of·every·epoch.</li>
-<li><strong>on_batch_begin</strong>:·called·at·the·beginning·of·every·batch.</li>
-<li><strong>on_batch_end</strong>:·called·at·the·end·of·every·batch.</li>
-<li><strong>on_train_begin</strong>:·called·at·the·beginning·of·model·training.</li>
-<li><strong>on_train_end</strong>:·called·at·the·end·of·model·training.</li>
+<li><strong>stateful_metrics</strong>:·Iterable·of·string·names·of·metrics·that
+····should·<em>not</em>·be·averaged·over·an·epoch.
+····Metrics·in·this·list·will·be·logged·as-is·in·<code>on_epoch_end</code>.
+····All·others·will·be·averaged·in·<code>on_epoch_end</code>.</li>
 </ul>
-<p><strong>Example</strong></p>
-<pre><code·class="python">#·Print·the·batch·number·at·the·beginning·of·every·batch.
-batch_print_callback·=·LambdaCallback(
-····on_batch_begin=lambda·batch,logs:·print(batch))
-#·Stream·the·epoch·loss·to·a·file·in·JSON·format.·The·file·content
-#·is·not·well-formed·JSON·but·rather·has·a·JSON·object·per·line.
-import·json
-json_log·=·open('loss_log.json',·mode='wt',·buffering=1)
-json_logging_callback·=·LambdaCallback(
-····on_epoch_end=lambda·epoch,·logs:·json_log.write(
-········json.dumps({'epoch':·epoch,·'loss':·logs['loss']})·+·'\n'),
-····on_train_end=lambda·logs:·json_log.close()
-)
-#·Terminate·some·processes·after·having·finished·model·training.
-processes·=·...
-cleanup_callback·=·LambdaCallback(
+<hr·/>
+<p><span·style="float:right;"><a·href="https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L251">[source]</a></span></p>
+<h3·id="terminateonnan">TerminateOnNaN</h3>
+<pre><code·class="python">keras.callbacks.TerminateOnNaN()
Max diff block lines reached; 7166/15463 bytes (46.34%) of diff not shown.
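
Note: the hunk above reorders the documentation of the Keras callbacks API (the two builds emit the sections in a different order; the API itself is unchanged). As a hedged illustration of what that documentation describes, here is a minimal runnable sketch against the Keras 2.2.x API: callbacks are passed as a list to .fit(), and a Callback subclass receives the logs dict on each hook. The model, the random data, and the LossReporter class are illustrative stand-ins, not part of the package diff.

# Minimal sketch of the callbacks API documented in the diffed
# callbacks.html; assumes Keras 2.2.x is installed.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import Callback, CSVLogger, TerminateOnNaN

class LossReporter(Callback):
    # Custom callback: on_epoch_end receives the `logs` dict, which
    # carries `loss` (and `acc` when accuracy monitoring is enabled).
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        print('epoch %d: loss=%.4f' % (epoch, logs.get('loss', float('nan'))))

model = Sequential()
model.add(Dense(units=1, activation='sigmoid', input_dim=10))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Hypothetical stand-in data for the demonstration.
x = np.random.random((64, 10))
y = np.random.randint(0, 2, size=(64, 1))

# Callbacks are passed as a list to .fit(); their hooks fire at each stage.
model.fit(x, y, epochs=2, batch_size=16,
          callbacks=[LossReporter(), CSVLogger('training.log'), TerminateOnNaN()])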
844 KB   ./usr/share/doc/keras-doc/html/search/search_index.json
844 KB   /srv/reproducible-results/rbuild-debian/tmp.urzYok0Vhj/dbd-tmp-0BcyzAB/diffoscope_0n3fb150/tmpnh0m6lh2/0/104.json vs. /srv/reproducible-results/rbuild-debian/tmp.urzYok0Vhj/dbd-tmp-0BcyzAB/diffoscope_0n3fb150/tmphoep6mtb/0/104.json
Differences: { "replace": "OrderedDict([('config', OrderedDict([('lang', ['en']), ('prebuild_index', False), ('separator', '[\\\\s\\\\-]+')])), ('docs', [OrderedDict([('location', 'index.html'), ('text', 'Keras: The Python Deep Learning library You have just found Keras. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow , CNTK , or Theano . It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. Use Keras if you need a deep learning library that: Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility). Supports both convolutional networks and recurrent networks, as well as combinations of the two. Runs seamlessly on CPU and GPU. Read the documentation at Keras.io . Keras is compatible with: Python 2.7-3.6 . Guiding principles User friendliness. Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error. Modularity. A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models. Easy extensibility. New modules are simple to add (as new classes and functions), and existing modules provide ample examples. To be able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research. Work with Python . No separate models configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility. Getting started: 30 seconds to Keras The core data structure of Keras is a model , a way to organize layers. The simplest type of model is the Sequential model, a linear stack of layers. For more complex architectures, you should use the Keras functional API , which allows to build arbitrary graphs of layers. Here is the Sequential model: from keras.models import Sequential model = Sequential() Stacking layers is as easy as .add() : from keras.layers import Dense model.add(Dense(units=64, activation=\\'relu\\', input_dim=100)) model.add(Dense(units=10, activation=\\'softmax\\')) Once your model looks good, configure its learning process with .compile() : model.compile(loss=\\'categorical_crossentropy\\', optimizer=\\'sgd\\', metrics=[\\'accuracy\\']) If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code). model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9, nesterov=True)) You can now iterate on your training data in batches: # x_train and y_train are Numpy arrays --just like in the Scikit-Learn API. 
model.fit(x_train, y_train, epochs=5, batch_size=32) Alternatively, you can feed batches to your model manually: model.train_on_batch(x_batch, y_batch) Evaluate your performance in one line: loss_and_metrics = model.evaluate(x_test, y_test, batch_size=128) Or generate predictions on new data: classes = model.predict(x_test, batch_size=128) Building a question answering system, an image classification model, a Neural Turing Machine, or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful? For a more in-depth tutorial about Keras, you can check out: Getting started with the Sequential model Getting started with the functional API In the examples folder of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc. Installation Before installing Keras, please install one of its backend engines: TensorFlow, Theano, or CNTK. We recommend the TensorFlow backend. TensorFlow installation instructions . Theano installation instructions . CNTK installation instructions . You may also consider installing the following optional dependencies : cuDNN (recommended if you plan on running Keras on GPU). HDF5 and h5py (required if you plan on saving Keras models to disk). graphviz and pydot (used by visualization utilities to plot model graphs). Then, you can install Keras itself. There are two ways to install Keras: Install Keras from PyPI (recommended): sudo pip install keras If you are using a virtualenv, you may want to avoid using sudo: pip install keras Alternatively: install Keras from the GitHub source: First, clone Keras using git : git clone https://github.com/keras-team/keras.git Then, cd to the Keras folder and run the install command: cd keras sudo python setup.py install Configuring your Keras backend By default, Keras will use TensorFlow as its tensor manipulation library. Follow these instructions to configure the Keras backend. Support You can ask questions and join the development discussion: On the Keras Google group . On the Keras Slack channel . Use this link to request an invitation to the channel. You can also post bug reports and feature requests (only) in GitHub issues . Make sure to read our guidelines first. Why this name, Keras? Keras (\u03ba\u03ad\u03c1\u03b1\u03c2) means horn in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the Odyssey , where dream spirits ( Oneiroi , singular Oneiros ) are divided between those who deceive men with false visions, who arrive to Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It\\'s a play on the words \u03ba\u03ad\u03c1\u03b1\u03c2 (horn) / \u03ba\u03c1\u03b1\u03af\u03bd\u03c9 (fulfill), and \u1f10\u03bb\u03ad\u03c6\u03b1\u03c2 (ivory) / \u1f10\u03bb\u03b5\u03c6\u03b1\u03af\u03c1\u03bf\u03bc\u03b1\u03b9 (deceive). Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System). \"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. 
The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them.\" Homer, Odyssey 19. 562 ff (Shewring translation).'), ('title', 'Home')]), OrderedDict([('location', 'index.html#keras-the-python-deep-learning-library'), ('text', ''), ('title', 'Keras: The Python Deep Learning library')]), OrderedDict([('location', 'index.html#you-have-just-found-keras'), ('text', 'Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow , CNTK , or Theano . It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. Use Keras if you need a deep learning library that: Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility). Supports both convolutional networks and recurrent networks, as well as combinations of the two. Runs seamlessly on CPU and GPU. Read the documentation at Keras.io . Keras is compatible with: Python 2.7-3.6 .'), ('title', 'You have just found Keras.')]), OrderedDict([('location', 'index.html#guiding-principles'), ('text', 'User friendliness. Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error. Modularity. A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models. Easy extensibility. New modules are simple to add (as new classes and functions), and existing modules provide ample examples. To be able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research. Work with Python . No separate models configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility.'), ('title', 'Guiding principles')]), OrderedDict([('location', 'index.html#getting-started-30-seconds-to-keras'), ('text', \"The core data structure of Keras is a model , a way to organize layers. The simplest type of model is the Sequential model, a linear stack of layers. For more complex architectures, you should use the Keras functional API , which allows to build arbitrary graphs of layers. Here is the Sequential model: from keras.models import Sequential model = Sequential() Stacking layers is as easy as .add() : from keras.layers import Dense model.add(Dense(units=64, activation='relu', input_dim=100)) model.add(Dense(units=10, activation='softmax')) Once your model looks good, configure its learning process with .compile() : model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy']) If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code). 
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9, nesterov=True)) You can now iterate on your training data in batches: # x_train and y_train are Numpy arrays --just like in the Scikit-Learn API. model.fit(x_train, y_train, epochs=5, batch_size=32) Alternatively, you can feed batches to your model manually: model.train_on_batch(x_batch, y_batch) Evaluate your performance in one line: loss_and_metrics = model.evaluate(x_test, y_test, batch_size=128) Or generate predictions on new data: classes = model.predict(x_test, batch_size=128) Building a question answering system, an image classification model, a Neural Turing Machine, or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful? For a more in-depth tutorial about Keras, you can check out: Getting started with the Sequential model Getting started with the functional API In the examples folder of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, etc.\"), ('title', 'Getting started: 30 seconds to Keras')]), OrderedDict([('location', 'index.html#installation'), ('text', 'Before installing Keras, please install one of its backend engines: TensorFlow, Theano, or CNTK. We recommend the TensorFlow backend. TensorFlow installation instructions . Theano installation instructions . CNTK installation instructions . You may also consider installing the following optional dependencies : cuDNN (recommended if you plan on running Keras on GPU). HDF5 and h5py (required if you plan on saving Keras models to disk). graphviz and pydot (used by visualization utilities to plot model graphs). Then, you can install Keras itself. There are two ways to install Keras: Install Keras from PyPI (recommended): sudo pip install keras If you are using a virtualenv, you may want to avoid using sudo: pip install keras Alternatively: install Keras from the GitHub source: First, clone Keras using git : git clone https://github.com/keras-team/keras.git Then, cd to the Keras folder and run the install command: cd keras sudo python setup.py install'), ('title', 'Installation')]), OrderedDict([('location', 'index.html#configuring-your-keras-backend'), ('text', 'By default, Keras will use TensorFlow as its tensor manipulation library. Follow these instructions to configure the Keras backend.'), ('title', 'Configuring your Keras backend')]), OrderedDict([('location', 'index.html#support'), ('text', 'You can ask questions and join the development discussion: On the Keras Google group . On the Keras Slack channel . Use this link to request an invitation to the channel. You can also post bug reports and feature requests (only) in GitHub issues . Make sure to read our guidelines first.'), ('title', 'Support')]), OrderedDict([('location', 'index.html#why-this-name-keras'), ('text', 'Keras (\u03ba\u03ad\u03c1\u03b1\u03c2) means horn in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the Odyssey , where dream spirits ( Oneiroi , singular Oneiros ) are divided between those who deceive men with false visions, who arrive to Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. 
It\\'s a play on the words \u03ba\u03ad\u03c1\u03b1\u03c2 (horn) / \u03ba\u03c1\u03b1\u03af\u03bd\u03c9 (fulfill), and \u1f10\u03bb\u03ad\u03c6\u03b1\u03c2 (ivory) / \u1f10\u03bb\u03b5\u03c6\u03b1\u03af\u03c1\u03bf\u03bc\u03b1\u03b9 (deceive). Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System). \"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them.\" Homer, Odyssey 19. 562 ff (Shewring translation).'), ('title', 'Why this name, Keras?')]), OrderedDict([('location', 'activations.html'), ('text', 'Usage of activations Activations can either be used through an Activation layer, or through the activation argument supported by all forward layers: from keras.layers import Activation, Dense model.add(Dense(64)) model.add(Activation(\\'tanh\\')) This is equivalent to: model.add(Dense(64, activation=\\'tanh\\')) You can also pass an element-wise TensorFlow/Theano/CNTK function as an activation: from keras import backend as K model.add(Dense(64, activation=K.tanh)) Available activations softmax keras.activations.softmax(x, axis=-1) Softmax activation function. Arguments x : Input tensor. axis : Integer, axis along which the softmax normalization is applied. Returns Tensor, output of softmax transformation. Raises ValueError : In case dim(x) == 1 . elu keras.activations.elu(x, alpha=1.0) Exponential linear unit. Arguments x : Input tensor. alpha : A scalar, slope of negative section. Returns The exponential linear activation: x if x > 0 and alpha * (exp(x)-1) if x < 0 . References [Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)](https://arxiv.org/abs/1511.07289) selu keras.activations.selu(x) Scaled Exponential Linear Unit (SELU). SELU is equal to: scale * elu(x, alpha) , where alpha and scale are pre-defined constants. The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers as long as the weights are initialized correctly (see lecun_normal initialization) and the number of inputs is \"large enough\" (see references for more information). Arguments x : A tensor or variable to compute the activation function for. Returns The scaled exponential unit activation: scale * elu(x, alpha) . Note To be used together with the initialization \"lecun_normal\". To be used together with the dropout variant \"AlphaDropout\". References Self-Normalizing Neural Networks softplus keras.activations.softplus(x) Softplus activation function. Arguments x : Input tensor. Returns The softplus activation: log(exp(x) + 1) . softsign keras.activations.softsign(x) Softsign activation function. Arguments x : Input tensor. Returns The softplus activation: x / (abs(x) + 1) . relu keras.activations.relu(x, alpha=0.0, max_value=None, threshold=0.0) Rectified Linear Unit. With default values, it returns element-wise max(x, 0) . Otherwise, it follows: f(x) = max_value for x >= max_value , f(x) = x for threshold <= x < max_value , f(x) = alpha * (x - threshold) otherwise. Arguments x : Input tensor. alpha : float. Slope of the negative part. Defaults to zero. max_value : float. 
Saturation threshold. threshold : float. Threshold value for thresholded activation. Returns A tensor. tanh keras.activations.tanh(x) Hyperbolic tangent activation function. sigmoid keras.activations.sigmoid(x) Sigmoid activation function. hard_sigmoid keras.activations.hard_sigmoid(x) Hard sigmoid activation function. Faster to compute than sigmoid activation. Arguments x : Input tensor. Returns Hard sigmoid activation: 0 if x < -2.5 1 if x > 2.5 0.2 * x + 0.5 if -2.5 <= x <= 2.5 . exponential keras.activations.exponential(x) Exponential (base e) activation function. linear keras.activations.linear(x) Linear (i.e. identity) activation function. On \"Advanced Activations\" Activations that are more complex than a simple TensorFlow/Theano/CNTK function (eg. learnable activations, which maintain a state) are available as Advanced Activation layers , and can be found in the module keras.layers.advanced_activations . These include PReLU and LeakyReLU .'), ('title', 'Activations')]), OrderedDict([('location', 'activations.html#usage-of-activations'), ('text', \"Activations can either be used through an Activation layer, or through the activation argument supported by all forward layers: from keras.layers import Activation, Dense model.add(Dense(64)) model.add(Activation('tanh')) This is equivalent to: model.add(Dense(64, activation='tanh')) You can also pass an element-wise TensorFlow/Theano/CNTK function as an activation: from keras import backend as K model.add(Dense(64, activation=K.tanh))\"), ('title', 'Usage of activations')]), OrderedDict([('location', 'activations.html#available-activations'), ('text', ''), ('title', 'Available activations')]), OrderedDict([('location', 'activations.html#softmax'), ('text', 'keras.activations.softmax(x, axis=-1) Softmax activation function. Arguments x : Input tensor. axis : Integer, axis along which the softmax normalization is applied. Returns Tensor, output of softmax transformation. Raises ValueError : In case dim(x) == 1 .'), ('title', 'softmax')]), OrderedDict([('location', 'activations.html#elu'), ('text', 'keras.activations.elu(x, alpha=1.0) Exponential linear unit. Arguments x : Input tensor. alpha : A scalar, slope of negative section. Returns The exponential linear activation: x if x > 0 and alpha * (exp(x)-1) if x < 0 . References [Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)](https://arxiv.org/abs/1511.07289)'), ('title', 'elu')]), OrderedDict([('location', 'activations.html#selu'), ('text', 'keras.activations.selu(x) Scaled Exponential Linear Unit (SELU). SELU is equal to: scale * elu(x, alpha) , where alpha and scale are pre-defined constants. The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers as long as the weights are initialized correctly (see lecun_normal initialization) and the number of inputs is \"large enough\" (see references for more information). Arguments x : A tensor or variable to compute the activation function for. Returns The scaled exponential unit activation: scale * elu(x, alpha) . Note To be used together with the initialization \"lecun_normal\". To be used together with the dropout variant \"AlphaDropout\". References Self-Normalizing Neural Networks'), ('title', 'selu')]), OrderedDict([('location', 'activations.html#softplus'), ('text', 'keras.activations.softplus(x) Softplus activation function. Arguments x : Input tensor. 
Returns The softplus activation: log(exp(x) + 1) .'), ('title', 'softplus')]), OrderedDict([('location', 'activations.html#softsign'), ('text', 'keras.activations.softsign(x) Softsign activation function. Arguments x : Input tensor. Returns The softplus activation: x / (abs(x) + 1) .'), ('title', 'softsign')]), OrderedDict([('location', 'activations.html#relu'), ('text', 'keras.activations.relu(x, alpha=0.0, max_value=None, threshold=0.0) Rectified Linear Unit. With default values, it returns element-wise max(x, 0) . Otherwise, it follows: f(x) = max_value for x >= max_value , f(x) = x for threshold <= x < max_value , f(x) = alpha * (x - threshold) otherwise. Arguments x : Input tensor. alpha : float. Slope of the negative part. Defaults to zero. max_value : float. Saturation threshold. threshold : float. Threshold value for thresholded activation. Returns A tensor.'), ('title', 'relu')]), OrderedDict([('location', 'activations.html#tanh'), ('text', 'keras.activations.tanh(x) Hyperbolic tangent activation function.'), ('title', 'tanh')]), OrderedDict([('location', 'activations.html#sigmoid'), ('text', 'keras.activations.sigmoid(x) Sigmoid activation function.'), ('title', 'sigmoid')]), OrderedDict([('location', 'activations.html#hard_sigmoid'), ('text', 'keras.activations.hard_sigmoid(x) Hard sigmoid activation function. Faster to compute than sigmoid activation. Arguments x : Input tensor. Returns Hard sigmoid activation: 0 if x < -2.5 1 if x > 2.5 0.2 * x + 0.5 if -2.5 <= x <= 2.5 .'), ('title', 'hard_sigmoid')]), OrderedDict([('location', 'activations.html#exponential'), ('text', 'keras.activations.exponential(x) Exponential (base e) activation function.'), ('title', 'exponential')]), OrderedDict([('location', 'activations.html#linear'), ('text', 'keras.activations.linear(x) Linear (i.e. identity) activation function.'), ('title', 'linear')]), OrderedDict([('location', 'activations.html#on-advanced-activations'), ('text', 'Activations that are more complex than a simple TensorFlow/Theano/CNTK function (eg. learnable activations, which maintain a state) are available as Advanced Activation layers , and can be found in the module keras.layers.advanced_activations . These include PReLU and LeakyReLU .'), ('title', 'On \"Advanced Activations\"')]), OrderedDict([('location', 'applications.html'), ('text', 'Applications Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning. Weights are downloaded automatically when instantiating a model. They are stored at ~/.keras/models/ . Available models Models for image classification with weights trained on ImageNet: Xception VGG16 VGG19 ResNet50 InceptionV3 InceptionResNetV2 MobileNet DenseNet NASNet MobileNetV2 All of these architectures are compatible with all the backends (TensorFlow, Theano, and CNTK), and upon instantiation the models will be built according to the image data format set in your Keras configuration file at ~/.keras/keras.json . For instance, if you have set image_data_format=channels_last , then any model loaded from this repository will get built according to the TensorFlow data format convention, \"Height-Width-Depth\". Note that: - For Keras < 2.2.0 , The Xception model is only available for TensorFlow, due to its reliance on SeparableConvolution layers. - For Keras < 2.1.5 , The MobileNet model is only available for TensorFlow, due to its reliance on DepthwiseConvolution layers. 
Usage examples for image classification models Classify ImageNet classes with ResNet50 from keras.applications.resnet50 import ResNet50 from keras.preprocessing import image from keras.applications.resnet50 import preprocess_input, decode_predictions import numpy as np model = ResNet50(weights=\\'imagenet\\') img_path = \\'elephant.jpg\\' img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) preds = model.predict(x) # decode the results into a list of tuples (class, description, probability) # (one such list for each sample in the batch) print(\\'Predicted:\\', decode_predictions(preds, top=3)[0]) # Predicted: [(u\\'n02504013\\', u\\'Indian_elephant\\', 0.82658225), (u\\'n01871265\\', u\\'tusker\\', 0.1122357), (u\\'n02504458\\', u\\'African_elephant\\', 0.061040461)] Extract features with VGG16 from keras.applications.vgg16 import VGG16 from keras.preprocessing import image from keras.applications.vgg16 import preprocess_input import numpy as np model = VGG16(weights=\\'imagenet\\', include_top=False) img_path = \\'elephant.jpg\\' img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) features = model.predict(x) Extract features from an arbitrary intermediate layer with VGG19 from keras.applications.vgg19 import VGG19 from keras.preprocessing import image from keras.applications.vgg19 import preprocess_input from keras.models import Model import numpy as np base_model = VGG19(weights=\\'imagenet\\') model = Model(inputs=base_model.input, outputs=base_model.get_layer(\\'block4_pool\\').output) img_path = \\'elephant.jpg\\' img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) block4_pool_features = model.predict(x) Fine-tune InceptionV3 on a new set of classes from keras.applications.inception_v3 import InceptionV3 from keras.preprocessing import image from keras.models import Model from keras.layers import Dense, GlobalAveragePooling2D from keras import backend as K # create the base pre-trained model base_model = InceptionV3(weights=\\'imagenet\\', include_top=False) # add a global spatial average pooling layer x = base_model.output x = GlobalAveragePooling2D()(x) # let\\'s add a fully-connected layer x = Dense(1024, activation=\\'relu\\')(x) # and a logistic layer -- let\\'s say we have 200 classes predictions = Dense(200, activation=\\'softmax\\')(x) # this is the model we will train model = Model(inputs=base_model.input, outputs=predictions) # first: train only the top layers (which were randomly initialized) # i.e. freeze all convolutional InceptionV3 layers for layer in base_model.layers: layer.trainable = False # compile the model (should be done *after* setting layers to non-trainable) model.compile(optimizer=\\'rmsprop\\', loss=\\'categorical_crossentropy\\') # train the model on the new data for a few epochs model.fit_generator(...) # at this point, the top layers are well trained and we can start fine-tuning # convolutional layers from inception V3. We will freeze the bottom N layers # and train the remaining top layers. # let\\'s visualize layer names and layer indices to see how many layers # we should freeze: for i, layer in enumerate(base_model.layers): print(i, layer.name) # we chose to train the top 2 inception blocks, i.e. 
we will freeze # the first 249 layers and unfreeze the rest: for layer in model.layers[:249]: layer.trainable = False for layer in model.layers[249:]: layer.trainable = True # we need to recompile the model for these modifications to take effect # we use SGD with a low learning rate from keras.optimizers import SGD model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss=\\'categorical_crossentropy\\') # we train our model again (this time fine-tuning the top 2 inception blocks # alongside the top Dense layers model.fit_generator(...) Build InceptionV3 over a custom input tensor from keras.applications.inception_v3 import InceptionV3 from keras.layers import Input # this could also be the output a different Keras model or layer input_tensor = Input(shape=(224, 224, 3)) # this assumes K.image_data_format() == \\'channels_last\\' model = InceptionV3(input_tensor=input_tensor, weights=\\'imagenet\\', include_top=True) Documentation for individual models Model Size Top-1 Accuracy Top-5 Accuracy Parameters Depth Xception 88 MB 0.790 0.945 22,910,480 126 VGG16 528 MB 0.713 0.901 138,357,544 23 VGG19 549 MB 0.713 0.900 143,667,240 26 ResNet50 99 MB 0.749 0.921 25,636,712 168 InceptionV3 92 MB 0.779 0.937 23,851,784 159 InceptionResNetV2 215 MB 0.803 0.953 55,873,736 572 MobileNet 16 MB 0.704 0.895 4,253,864 88 MobileNetV2 14 MB 0.713 0.901 3,538,984 88 DenseNet121 33 MB 0.750 0.923 8,062,504 121 DenseNet169 57 MB 0.762 0.932 14,307,880 169 DenseNet201 80 MB 0.773 0.936 20,242,984 201 NASNetMobile 23 MB 0.744 0.919 5,326,716 - NASNetLarge 343 MB 0.825 0.960 88,949,818 - The top-1 and top-5 accuracy refers to the model\\'s performance on the ImageNet validation dataset. Xception keras.applications.xception.Xception(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) Xception V1 model, with weights pre-trained on ImageNet. On ImageNet, this model gets to a top-1 validation accuracy of 0.790 and a top-5 validation accuracy of 0.945. Note that this model only supports the data format \\'channels_last\\' (height, width, channels). The default input size for this model is 299x299. Arguments include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or \\'imagenet\\' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) . It should have exactly 3 inputs channels, and width and height should be no smaller than 71. E.g. (150, 150, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References Xception: Deep Learning with Depthwise Separable Convolutions License These weights are trained by ourselves and are released under the MIT license. 
VGG16 keras.applications.vgg16.VGG16(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) VGG16 model, with weights pre-trained on ImageNet. This model can be built both with \\'channels_first\\' data format (channels, height, width) or \\'channels_last\\' data format (height, width, channels). The default input size for this model is 224x224. Arguments include_top: whether to include the 3 fully-connected layers at the top of the network. weights: one of None (random initialization) or \\'imagenet\\' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References Very Deep Convolutional Networks for Large-Scale Image Recognition : please cite this paper if you use the VGG models in your work. License These weights are ported from the ones released by VGG at Oxford under the Creative Commons Attribution License . VGG19 keras.applications.vgg19.VGG19(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) VGG19 model, with weights pre-trained on ImageNet. This model can be built both with \\'channels_first\\' data format (channels, height, width) or \\'channels_last\\' data format (height, width, channels). The default input size for this model is 224x224. Arguments include_top: whether to include the 3 fully-connected layers at the top of the network. weights: one of None (random initialization) or \\'imagenet\\' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. 
References Very Deep Convolutional Networks for Large-Scale Image Recognition License These weights are ported from the ones released by VGG at Oxford under the Creative Commons Attribution License . ResNet50 keras.applications.resnet50.ResNet50(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) ResNet50 model, with weights pre-trained on ImageNet. This model and can be built both with \\'channels_first\\' data format (channels, height, width) or \\'channels_last\\' data format (height, width, channels). The default input size for this model is 224x224. Arguments include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or \\'imagenet\\' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References Deep Residual Learning for Image Recognition License These weights are ported from the ones released by Kaiming He under the MIT license . InceptionV3 keras.applications.inception_v3.InceptionV3(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) Inception V3 model, with weights pre-trained on ImageNet. This model and can be built both with \\'channels_first\\' data format (channels, height, width) or \\'channels_last\\' data format (height, width, channels). The default input size for this model is 299x299. Arguments include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or \\'imagenet\\' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) (with \\'channels_last\\' data format) or (3, 299, 299) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. 
classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References Rethinking the Inception Architecture for Computer Vision License These weights are released under the Apache License . InceptionResNetV2 keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) Inception-ResNet V2 model, with weights pre-trained on ImageNet. This model and can be built both with \\'channels_first\\' data format (channels, height, width) or \\'channels_last\\' data format (height, width, channels). The default input size for this model is 299x299. Arguments include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or \\'imagenet\\' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) (with \\'channels_last\\' data format) or (3, 299, 299) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning License These weights are released under the Apache License . MobileNet keras.applications.mobilenet.MobileNet(input_shape=None, alpha=1.0, depth_multiplier=1, dropout=1e-3, include_top=True, weights=\\'imagenet\\', input_tensor=None, pooling=None, classes=1000) MobileNet model, with weights pre-trained on ImageNet. Note that this model only supports the data format \\'channels_last\\' (height, width, channels). The default input size for this model is 224x224. Arguments input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. alpha: controls the width of the network. If alpha < 1.0, proportionally decreases the number of filters in each layer. If alpha > 1.0, proportionally increases the number of filters in each layer. If alpha = 1, default number of filters from the paper are used at each layer. depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier) dropout: dropout rate include_top: whether to include the fully-connected layer at the top of the network. weights: None (random initialization) or \\'imagenet\\' (ImageNet weights) input_tensor: optional Keras tensor (i.e. 
output of layers.Input() ) to use as image input for the model. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications License These weights are released under the Apache License . DenseNet keras.applications.densenet.DenseNet121(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) keras.applications.densenet.DenseNet169(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) keras.applications.densenet.DenseNet201(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) DenseNet models, with weights pre-trained on ImageNet. This model and can be built both with \\'channels_first\\' data format (channels, height, width) or \\'channels_last\\' data format (height, width, channels). The default input size for this model is 224x224. Arguments blocks: numbers of building blocks for the four dense layers. include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization), \\'imagenet\\' (pre-training on ImageNet), or the path to the weights file to be loaded. input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. max means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified. Returns A Keras model instance. References Densely Connected Convolutional Networks (CVPR 2017 Best Paper Award) License These weights are released under the BSD 3-clause License . NASNet keras.applications.nasnet.NASNetLarge(input_shape=None, include_top=True, weights=\\'imagenet\\', input_tensor=None, pooling=None, classes=1000) keras.applications.nasnet.NASNetMobile(input_shape=None, include_top=True, weights=\\'imagenet\\', input_tensor=None, pooling=None, classes=1000) Neural Architecture Search Network (NASNet) models, with weights pre-trained on ImageNet. The default input size for the NASNetLarge model is 331x331 and for the NASNetMobile model is 224x224. 
DenseNet keras.applications.densenet.DenseNet121(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) keras.applications.densenet.DenseNet169(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) keras.applications.densenet.DenseNet201(include_top=True, weights=\\'imagenet\\', input_tensor=None, input_shape=None, pooling=None, classes=1000) DenseNet models, with weights pre-trained on ImageNet. These models can be built with either the \\'channels_first\\' data format (channels, height, width) or the \\'channels_last\\' data format (height, width, channels). The default input size for these models is 224x224. Arguments blocks: numbers of building blocks for the four dense blocks. include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization), \\'imagenet\\' (pre-training on ImageNet), or the path to the weights file to be loaded. input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format)). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. max means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified. Returns A Keras model instance. References Densely Connected Convolutional Networks (CVPR 2017 Best Paper Award) License These weights are released under the BSD 3-clause License . NASNet keras.applications.nasnet.NASNetLarge(input_shape=None, include_top=True, weights=\\'imagenet\\', input_tensor=None, pooling=None, classes=1000) keras.applications.nasnet.NASNetMobile(input_shape=None, include_top=True, weights=\\'imagenet\\', input_tensor=None, pooling=None, classes=1000) Neural Architecture Search Network (NASNet) models, with weights pre-trained on ImageNet. The default input size for the NASNetLarge model is 331x331 and for the NASNetMobile model is 224x224. Arguments input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with \\'channels_last\\' data format) or (3, 224, 224) (with \\'channels_first\\' data format) for NASNetMobile, or (331, 331, 3) (with \\'channels_last\\' data format) or (3, 331, 331) (with \\'channels_first\\' data format) for NASNetLarge). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. include_top: whether to include the fully-connected layer at the top of the network. weights: None (random initialization) or \\'imagenet\\' (ImageNet weights). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. \\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified. Returns A Keras Model instance. References Learning Transferable Architectures for Scalable Image Recognition License These weights are released under the Apache License . MobileNetV2 keras.applications.mobilenetv2.MobileNetV2(input_shape=None, alpha=1.0, depth_multiplier=1, include_top=True, weights=\\'imagenet\\', input_tensor=None, pooling=None, classes=1000) MobileNetV2 model, with weights pre-trained on ImageNet. Note that this model only supports the data format \\'channels_last\\' (height, width, channels). The default input size for this model is 224x224. Arguments input_shape: optional shape tuple, to be specified if you would like to use a model with an input image resolution that is not (224, 224, 3). It should have exactly 3 input channels, as in (224, 224, 3). You can also omit this option if you would like to infer input_shape from an input_tensor. If you choose to include both input_tensor and input_shape, then input_shape will be used if they match; if the shapes do not match, an error will be raised. E.g. (160, 160, 3) would be one valid value. alpha: controls the width of the network. This is known as the width multiplier in the MobileNetV2 paper. If alpha < 1.0, proportionally decreases the number of filters in each layer. If alpha > 1.0, proportionally increases the number of filters in each layer. If alpha = 1, the default number of filters from the paper is used at each layer. depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier). include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization), \\'imagenet\\' (pre-training on ImageNet), or the path to the weights file to be loaded. input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. \\'avg\\' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.
\\'max\\' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified. Returns A Keras model instance. Raises ValueError: in case of an invalid argument for weights , an invalid input shape, or an invalid depth_multiplier, alpha or rows value when weights=\\'imagenet\\' . References MobileNetV2: Inverted Residuals and Linear Bottlenecks License These weights are released under the Apache License .'), ('title', 'Applications')]), OrderedDict([('location', 'applications.html#applications'), ('text', 'Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning. Weights are downloaded automatically when instantiating a model. They are stored at ~/.keras/models/ .'), ('title', 'Applications')]), OrderedDict([('location', 'applications.html#available-models'), ('text', ''), ('title', 'Available models')]), OrderedDict([('location', 'applications.html#models-for-image-classification-with-weights-trained-on-imagenet'), ('text', 'Xception VGG16 VGG19 ResNet50 InceptionV3 InceptionResNetV2 MobileNet DenseNet NASNet MobileNetV2 All of these architectures are compatible with all the backends (TensorFlow, Theano, and CNTK), and upon instantiation the models will be built according to the image data format set in your Keras configuration file at ~/.keras/keras.json . For instance, if you have set image_data_format=channels_last , then any model loaded from this repository will get built according to the TensorFlow data format convention, \"Height-Width-Depth\". Note that: - For Keras < 2.2.0 , the Xception model is only available for TensorFlow, due to its reliance on SeparableConvolution layers. - For Keras < 2.1.5 , the MobileNet model is only available for TensorFlow, due to its reliance on DepthwiseConvolution layers.'), ('title', 'Models for image classification with weights trained on ImageNet:')]),
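A hedged sketch of the data-format behaviour described in the entry above: the input shape a model expects follows keras.backend.image_data_format(), which reads the setting from ~/.keras/keras.json.

from keras import backend as K
from keras.applications.vgg16 import VGG16

# Pick an explicit input shape matching the configured convention.
if K.image_data_format() == 'channels_last':
    input_shape = (224, 224, 3)   # Height-Width-Depth (TensorFlow convention)
else:
    input_shape = (3, 224, 224)   # Depth-Height-Width (Theano convention)
model = VGG16(weights='imagenet', include_top=False, input_shape=input_shape)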
OrderedDict([('location', 'applications.html#usage-examples-for-image-classification-models'), ('text', ''), ('title', 'Usage examples for image classification models')]), OrderedDict([('location', 'applications.html#classify-imagenet-classes-with-resnet50'), ('text', \"from keras.applications.resnet50 import ResNet50 from keras.preprocessing import image from keras.applications.resnet50 import preprocess_input, decode_predictions import numpy as np model = ResNet50(weights='imagenet') img_path = 'elephant.jpg' img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) preds = model.predict(x) # decode the results into a list of tuples (class, description, probability) # (one such list for each sample in the batch) print('Predicted:', decode_predictions(preds, top=3)[0]) # Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]\"), ('title', 'Classify ImageNet classes with ResNet50')]), OrderedDict([('location', 'applications.html#extract-features-with-vgg16'), ('text', \"from keras.applications.vgg16 import VGG16 from keras.preprocessing import image from keras.applications.vgg16 import preprocess_input import numpy as np model = VGG16(weights='imagenet', include_top=False) img_path = 'elephant.jpg' img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) features = model.predict(x)\"), ('title', 'Extract features with VGG16')]), OrderedDict([('location', 'applications.html#extract-features-from-an-arbitrary-intermediate-layer-with-vgg19'), ('text', \"from keras.applications.vgg19 import VGG19 from keras.preprocessing import image from keras.applications.vgg19 import preprocess_input from keras.models import Model import numpy as np base_model = VGG19(weights='imagenet') model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output) img_path = 'elephant.jpg' img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) block4_pool_features = model.predict(x)\"), ('title', 'Extract features from an arbitrary intermediate layer with VGG19')]),
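A small sketch extending the single-image examples above to a batch; the file names are hypothetical. Each application module ships its own matching preprocess_input, so the images are stacked first and preprocessed once.

from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from keras.preprocessing import image
import numpy as np

model = ResNet50(weights='imagenet')
paths = ['elephant.jpg', 'tiger.jpg']  # hypothetical image files
batch = np.stack([image.img_to_array(image.load_img(p, target_size=(224, 224)))
                  for p in paths])
preds = model.predict(preprocess_input(batch))
for ranked in decode_predictions(preds, top=1):  # one list per sample
    print(ranked)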
OrderedDict([('location', 'applications.html#fine-tune-inceptionv3-on-a-new-set-of-classes'), ('text', \"from keras.applications.inception_v3 import InceptionV3 from keras.preprocessing import image from keras.models import Model from keras.layers import Dense, GlobalAveragePooling2D from keras import backend as K # create the base pre-trained model base_model = InceptionV3(weights='imagenet', include_top=False) # add a global spatial average pooling layer x = base_model.output x = GlobalAveragePooling2D()(x) # let's add a fully-connected layer x = Dense(1024, activation='relu')(x) # and a logistic layer -- let's say we have 200 classes predictions = Dense(200, activation='softmax')(x) # this is the model we will train model = Model(inputs=base_model.input, outputs=predictions) # first: train only the top layers (which were randomly initialized) # i.e. freeze all convolutional InceptionV3 layers for layer in base_model.layers: layer.trainable = False # compile the model (should be done *after* setting layers to non-trainable) model.compile(optimizer='rmsprop', loss='categorical_crossentropy') # train the model on the new data for a few epochs model.fit_generator(...) # at this point, the top layers are well trained and we can start fine-tuning # convolutional layers from inception V3. We will freeze the bottom N layers # and train the remaining top layers. # let's visualize layer names and layer indices to see how many layers # we should freeze: for i, layer in enumerate(base_model.layers): print(i, layer.name) # we chose to train the top 2 inception blocks, i.e. we will freeze # the first 249 layers and unfreeze the rest: for layer in model.layers[:249]: layer.trainable = False for layer in model.layers[249:]: layer.trainable = True # we need to recompile the model for these modifications to take effect # we use SGD with a low learning rate from keras.optimizers import SGD model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy') # we train our model again (this time fine-tuning the top 2 inception blocks # alongside the top Dense layers) model.fit_generator(...)\"), ('title', 'Fine-tune InceptionV3 on a new set of classes')]), OrderedDict([('location', 'applications.html#build-inceptionv3-over-a-custom-input-tensor'), ('text', \"from keras.applications.inception_v3 import InceptionV3 from keras.layers import Input # this could also be the output of a different Keras model or layer input_tensor = Input(shape=(224, 224, 3)) # this assumes K.image_data_format() == 'channels_last' model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)\"), ('title', 'Build InceptionV3 over a custom input tensor')]), OrderedDict([('location', 'applications.html#documentation-for-individual-models'), ('text', \"
Model              Size    Top-1 Accuracy  Top-5 Accuracy  Parameters   Depth
Xception           88 MB   0.790           0.945           22,910,480   126
VGG16              528 MB  0.713           0.901           138,357,544  23
VGG19              549 MB  0.713           0.900           143,667,240  26
ResNet50           99 MB   0.749           0.921           25,636,712   168
InceptionV3        92 MB   0.779           0.937           23,851,784   159
InceptionResNetV2  215 MB  0.803           0.953           55,873,736   572
MobileNet          16 MB   0.704           0.895           4,253,864    88
MobileNetV2        14 MB   0.713           0.901           3,538,984    88
DenseNet121        33 MB   0.750           0.923           8,062,504    121
DenseNet169        57 MB   0.762           0.932           14,307,880   169
DenseNet201        80 MB   0.773           0.936           20,242,984   201
NASNetMobile       23 MB   0.744           0.919           5,326,716    -
NASNetLarge        343 MB  0.825           0.960           88,949,818   -
The top-1 and top-5 accuracies refer to the model's performance on the ImageNet validation dataset.\"), ('title', 'Documentation for individual models')]), OrderedDict([('location', 'applications.html#xception'), ('text', \"keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) Xception V1 model, with weights pre-trained on ImageNet. On ImageNet, this model gets to a top-1 validation accuracy of 0.790 and a top-5 validation accuracy of 0.945. Note that this model only supports the data format 'channels_last' (height, width, channels). The default input size for this model is 299x299.\"), ('title', 'Xception')]), OrderedDict([('location', 'applications.html#arguments'), ('text', \"include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e.
output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) . It should have exactly 3 inputs channels, and width and height should be no smaller than 71. E.g. (150, 150, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references'), ('text', 'Xception: Deep Learning with Depthwise Separable Convolutions'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license'), ('text', 'These weights are trained by ourselves and are released under the MIT license.'), ('title', 'License')]), OrderedDict([('location', 'applications.html#vgg16'), ('text', \"keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) VGG16 model, with weights pre-trained on ImageNet. This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels). The default input size for this model is 224x224.\"), ('title', 'VGG16')]), OrderedDict([('location', 'applications.html#arguments_1'), ('text', \"include_top: whether to include the 3 fully-connected layers at the top of the network. weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. 
classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_1'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_1'), ('text', 'Very Deep Convolutional Networks for Large-Scale Image Recognition : please cite this paper if you use the VGG models in your work.'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_1'), ('text', 'These weights are ported from the ones released by VGG at Oxford under the Creative Commons Attribution License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#vgg19'), ('text', \"keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) VGG19 model, with weights pre-trained on ImageNet. This model can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels). The default input size for this model is 224x224.\"), ('title', 'VGG19')]), OrderedDict([('location', 'applications.html#arguments_2'), ('text', \"include_top: whether to include the 3 fully-connected layers at the top of the network. weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_2'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_2'), ('text', 'Very Deep Convolutional Networks for Large-Scale Image Recognition'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_2'), ('text', 'These weights are ported from the ones released by VGG at Oxford under the Creative Commons Attribution License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#resnet50'), ('text', \"keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) ResNet50 model, with weights pre-trained on ImageNet. This model and can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels). 
The default input size for this model is 224x224.\"), ('title', 'ResNet50')]), OrderedDict([('location', 'applications.html#arguments_3'), ('text', \"include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_3'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_3'), ('text', 'Deep Residual Learning for Image Recognition'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_3'), ('text', 'These weights are ported from the ones released by Kaiming He under the MIT license .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#inceptionv3'), ('text', \"keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) Inception V3 model, with weights pre-trained on ImageNet. This model and can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels). The default input size for this model is 299x299.\"), ('title', 'InceptionV3')]), OrderedDict([('location', 'applications.html#arguments_4'), ('text', \"include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) (with 'channels_last' data format) or (3, 299, 299) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. 
classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_4'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_4'), ('text', 'Rethinking the Inception Architecture for Computer Vision'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_4'), ('text', 'These weights are released under the Apache License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#inceptionresnetv2'), ('text', \"keras.applications.inception_resnet_v2.InceptionResNetV2(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) Inception-ResNet V2 model, with weights pre-trained on ImageNet. This model and can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels). The default input size for this model is 299x299.\"), ('title', 'InceptionResNetV2')]), OrderedDict([('location', 'applications.html#arguments_5'), ('text', \"include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization) or 'imagenet' (pre-training on ImageNet). input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (299, 299, 3) (with 'channels_last' data format) or (3, 299, 299) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_5'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_5'), ('text', 'Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_5'), ('text', 'These weights are released under the Apache License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#mobilenet'), ('text', \"keras.applications.mobilenet.MobileNet(input_shape=None, alpha=1.0, depth_multiplier=1, dropout=1e-3, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000) MobileNet model, with weights pre-trained on ImageNet. Note that this model only supports the data format 'channels_last' (height, width, channels). 
The default input size for this model is 224x224.\"), ('title', 'MobileNet')]), OrderedDict([('location', 'applications.html#arguments_6'), ('text', \"input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. alpha: controls the width of the network. If alpha < 1.0, proportionally decreases the number of filters in each layer. If alpha > 1.0, proportionally increases the number of filters in each layer. If alpha = 1, default number of filters from the paper are used at each layer. depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier) dropout: dropout rate include_top: whether to include the fully-connected layer at the top of the network. weights: None (random initialization) or 'imagenet' (ImageNet weights) input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_6'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_6'), ('text', 'MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_6'), ('text', 'These weights are released under the Apache License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#densenet'), ('text', \"keras.applications.densenet.DenseNet121(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) keras.applications.densenet.DenseNet169(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) keras.applications.densenet.DenseNet201(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000) DenseNet models, with weights pre-trained on ImageNet. This model and can be built both with 'channels_first' data format (channels, height, width) or 'channels_last' data format (height, width, channels). The default input size for this model is 224x224.\"), ('title', 'DenseNet')]), OrderedDict([('location', 'applications.html#arguments_7'), ('text', \"blocks: numbers of building blocks for the four dense layers. include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization), 'imagenet' (pre-training on ImageNet), or the path to the weights file to be loaded. input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. 
input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. pooling: optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. max means that global max pooling will be applied. classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_7'), ('text', 'A Keras model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_7'), ('text', 'Densely Connected Convolutional Networks (CVPR 2017 Best Paper Award)'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_7'), ('text', 'These weights are released under the BSD 3-clause License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#nasnet'), ('text', \"keras.applications.nasnet.NASNetLarge(input_shape=None, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000) keras.applications.nasnet.NASNetMobile(input_shape=None, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000) Neural Architecture Search Network (NASNet) models, with weights pre-trained on ImageNet. The default input size for the NASNetLarge model is 331x331 and for the NASNetMobile model is 224x224.\"), ('title', 'NASNet')]), OrderedDict([('location', 'applications.html#arguments_8'), ('text', \"input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format) for NASNetMobile or (331, 331, 3) (with 'channels_last' data format) or (3, 331, 331) (with 'channels_first' data format) for NASNetLarge. It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. include_top: whether to include the fully-connected layer at the top of the network. weights: None (random initialization) or 'imagenet' (ImageNet weights) input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. 
classes: optional number of classes to classify images into, only to be specified if include_top is True , and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_8'), ('text', 'A Keras Model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#references_8'), ('text', 'Learning Transferable Architectures for Scalable Image Recognition'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_8'), ('text', 'These weights are released under the Apache License .'), ('title', 'License')]), OrderedDict([('location', 'applications.html#mobilenetv2'), ('text', \"keras.applications.mobilenetv2.MobileNetV2(input_shape=None, alpha=1.0, depth_multiplier=1, include_top=True, weights='imagenet', input_tensor=None, pooling=None, classes=1000) MobileNetV2 model, with weights pre-trained on ImageNet. Note that this model only supports the data format 'channels_last' (height, width, channels). The default input size for this model is 224x224.\"), ('title', 'MobileNetV2')]), OrderedDict([('location', 'applications.html#arguments_9'), ('text', \"input_shape: optional shape tuple, to be specified if you would like to use a model with an input img resolution that is not (224, 224, 3). It should have exactly 3 inputs channels (224, 224, 3). You can also omit this option if you would like to infer input_shape from an input_tensor. If you choose to include both input_tensor and input_shape then input_shape will be used if they match, if the shapes do not match then we will throw an error. E.g. (160, 160, 3) would be one valid value. alpha: controls the width of the network. This is known as the width multiplier in the MobileNetV2 paper. If alpha < 1.0, proportionally decreases the number of filters in each layer. If alpha > 1.0, proportionally increases the number of filters in each layer. If alpha = 1, default number of filters from the paper are used at each layer. depth_multiplier: depth multiplier for depthwise convolution (also called the resolution multiplier) include_top: whether to include the fully-connected layer at the top of the network. weights: one of None (random initialization), 'imagenet' (pre-training on ImageNet), or the path to the weights file to be loaded. input_tensor: optional Keras tensor (i.e. output of layers.Input() ) to use as image input for the model. pooling: Optional pooling mode for feature extraction when include_top is False . None means that the output of the model will be the 4D tensor output of the last convolutional layer. 'avg' means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor. 'max' means that global max pooling will be applied. 
classes: optional number of classes to classify images into, only to be specified if include_top is True, and if no weights argument is specified.\"), ('title', 'Arguments')]), OrderedDict([('location', 'applications.html#returns_9'), ('text', 'A Keras model instance.'), ('title', 'Returns')]), OrderedDict([('location', 'applications.html#raises'), ('text', \"ValueError: in case of an invalid argument for weights , an invalid input shape, or an invalid depth_multiplier, alpha or rows value when weights='imagenet'\"), ('title', 'Raises')]), OrderedDict([('location', 'applications.html#references_9'), ('text', 'MobileNetV2: Inverted Residuals and Linear Bottlenecks'), ('title', 'References')]), OrderedDict([('location', 'applications.html#license_9'), ('text', 'These weights are released under the Apache License .'), ('title', 'License')]), OrderedDict([('location', 'backend.html'), ('text', 'Keras backends What is a \"backend\"? Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the \"backend engine\" of Keras. Rather than picking one single tensor library and tying the implementation of Keras to that library, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras. At this time, Keras has three backend implementations available: the TensorFlow backend, the Theano backend, and the CNTK backend. TensorFlow is an open-source symbolic tensor manipulation framework developed by Google. Theano is an open-source symbolic tensor manipulation framework developed by LISA Lab at Université de Montréal. CNTK is an open-source toolkit for deep learning developed by Microsoft. In the future, we are likely to add more backend options. Switching from one backend to another If you have run Keras at least once, you will find the Keras configuration file at: $HOME/.keras/keras.json If it isn\\'t there, you can create it. NOTE for Windows Users: Please replace $HOME with %USERPROFILE% . The default configuration file looks like this: { \"image_data_format\": \"channels_last\", \"epsilon\": 1e-07, \"floatx\": \"float32\", \"backend\": \"tensorflow\" } Simply change the field backend to \"theano\" , \"tensorflow\" , or \"cntk\" , and Keras will use the new configuration next time you run any Keras code. You can also define the environment variable KERAS_BACKEND , which will override what is defined in your config file: KERAS_BACKEND=tensorflow python -c \"from keras import backend\" Using TensorFlow backend. keras.json details The keras.json configuration file contains the following settings: { \"image_data_format\": \"channels_last\", \"epsilon\": 1e-07, \"floatx\": \"float32\", \"backend\": \"tensorflow\" } You can change these settings by editing $HOME/.keras/keras.json . image_data_format : String, either \"channels_last\" or \"channels_first\" . It specifies which data format convention Keras will follow. ( keras.backend.image_data_format() returns it.) For 2D data (e.g. an image), \"channels_last\" assumes (rows, cols, channels) while \"channels_first\" assumes (channels, rows, cols) . For 3D data, \"channels_last\" assumes (conv_dim1, conv_dim2, conv_dim3, channels) while \"channels_first\" assumes (channels, conv_dim1, conv_dim2, conv_dim3) .
epsilon : Float, a numeric fuzzing constant used to avoid dividing by zero in some operations. floatx : String, \"float16\" , \"float32\" , or \"float64\" . Default float precision. backend : String, \"tensorflow\" , \"theano\" , or \"cntk\" . Using the abstract Keras backend to write new code If you want the Keras modules you write to be compatible with both Theano ( th ) and TensorFlow ( tf ), you have to write them via the abstract Keras backend API. Here\\'s an intro. You can import the backend module via: from keras import backend as K The code below instantiates an input placeholder. It\\'s equivalent to tf.placeholder() or th.tensor.matrix() , th.tensor.tensor3() , etc. inputs = K.placeholder(shape=(2, 4, 5)) # also works: inputs = K.placeholder(shape=(None, 4, 5)) # also works: inputs = K.placeholder(ndim=3) The code below instantiates a variable. It\\'s equivalent to tf.Variable() or th.shared() . import numpy as np val = np.random.random((3, 4, 5)) var = K.variable(value=val) # all-zeros variable: var = K.zeros(shape=(3, 4, 5)) # all-ones: var = K.ones(shape=(3, 4, 5)) Most tensor operations you will need can be done as you would in TensorFlow or Theano: # Initializing Tensors with Random Numbers b = K.random_uniform_variable(shape=(3, 4), low=0, high=1) # Uniform distribution c = K.random_normal_variable(shape=(3, 4), mean=0, scale=1) # Gaussian distribution d = K.random_normal_variable(shape=(3, 4), mean=0, scale=1) # Tensor Arithmetic a = b + c * K.abs(d) c = K.dot(a, K.transpose(b)) a = K.sum(b, axis=1) a = K.softmax(b) a = K.concatenate([b, c], axis=-1) # etc... Backend functions learning_phase keras.backend.learning_phase() set_learning_phase keras.backend.set_learning_phase(value) get_uid keras.backend.get_uid(prefix=\\'\\') Provides a unique UID given a string prefix. Arguments prefix : string. Returns An integer. Example >>> keras.backend.get_uid(\\'dense\\') 1 >>> keras.backend.get_uid(\\'dense\\') 2 reset_uids keras.backend.reset_uids() is_sparse keras.backend.is_sparse(tensor) to_dense keras.backend.to_dense(tensor) name_scope keras.backend.name_scope() variable keras.backend.variable(value, dtype=None, name=None, constraint=None) Instantiates a variable and returns it. Arguments value : Numpy array, initial value of the tensor. dtype : Tensor type. name : Optional name string for the tensor. constraint : Optional projection function to be applied to the variable after an optimizer update. Returns A variable instance (with Keras metadata included). constant keras.backend.constant(value, dtype=None, shape=None, name=None) is_keras_tensor keras.backend.is_keras_tensor(x) Returns whether x is a Keras tensor. A \"Keras tensor\" is a tensor that was returned by a Keras layer, ( Layer class) or by Input . Arguments x : A candidate tensor. Returns A boolean: Whether the argument is a Keras tensor. Raises ValueError : In case x is not a symbolic tensor. Examples >>> from keras import backend as K >>> from keras.layers import Input, Dense >>> np_var = numpy.array([1, 2]) >>> K.is_keras_tensor(np_var) # A numpy array is not a symbolic tensor. ValueError >>> k_var = tf.placeholder(\\'float32\\', shape=(1,1)) >>> K.is_keras_tensor(k_var) # A variable indirectly created outside of keras is not a Keras tensor. False >>> keras_var = K.variable(np_var) >>> K.is_keras_tensor(keras_var) # A variable created with the keras backend is not a Keras tensor. 
False >>> keras_placeholder = K.placeholder(shape=(2, 4, 5)) >>> K.is_keras_tensor(keras_placeholder) # A placeholder is not a Keras tensor. False >>> keras_input = Input([10]) >>> K.is_keras_tensor(keras_input) # An Input is a Keras tensor. True >>> keras_layer_output = Dense(10)(keras_input) >>> K.is_keras_tensor(keras_layer_output) # Any Keras layer output is a Keras tensor. True is_tensor keras.backend.is_tensor(x) placeholder keras.backend.placeholder(shape=None, ndim=None, dtype=None, sparse=False, name=None) Instantiate an input data placeholder variable. is_placeholder keras.backend.is_placeholder(x) Returns whether x is a placeholder. Arguments x : A candidate placeholder. Returns Boolean. shape keras.backend.shape(x) Returns the shape of a tensor. Warning: type returned will be different for Theano backend (Theano tensor type) and TF backend (TF TensorShape). int_shape keras.backend.int_shape(x) Returns the shape of a Keras tensor or a Keras variable as a tuple of integers or None entries. Arguments x : Tensor or variable. Returns A tuple of integers (or None entries). ndim keras.backend.ndim(x) dtype keras.backend.dtype(x) eval keras.backend.eval(x) Returns the value of a tensor. zeros keras.backend.zeros(shape, dtype=None, name=None) Instantiates an all-zeros variable. ones keras.backend.ones(shape, dtype=None, name=None) Instantiates an all-ones variable. eye keras.backend.eye(size, dtype=None, name=None) Instantiates an identity matrix. ones_like keras.backend.ones_like(x, dtype=None, name=None) zeros_like keras.backend.zeros_like(x, dtype=None, name=None) identity keras.backend.identity(x, name=None) Returns a tensor with the same content as the input tensor. Arguments x : The input tensor. name : String, name for the variable to create. Returns A tensor of the same shape, type and content. random_uniform_variable keras.backend.random_uniform_variable(shape, low, high, dtype=None, name=None) random_normal_variable keras.backend.random_normal_variable(shape, mean, scale, dtype=None, name=None) count_params keras.backend.count_params(x) Returns the number of scalars in a tensor. Return: numpy integer. cast keras.backend.cast(x, dtype) update keras.backend.update(x, new_x) update_sub keras.backend.update_sub(x, decrement) moving_average_update keras.backend.moving_average_update(variable, value, momentum) dot keras.backend.dot(x, y) batch_dot keras.backend.batch_dot(x, y, axes=None) Batchwise dot product. batch_dot results in a tensor with less dimensions than the input. If the number of dimensions is reduced to 1, we use expand_dims to make sure that ndim is at least 2. Arguments x, y: tensors with ndim >= 2 - axes : list (or single) int with target dimensions Returns A tensor with shape equal to the concatenation of x\\'s shape (less the dimension that was summed over) and y\\'s shape (less the batch dimension and the dimension that was summed over). If the final rank is 1, we reshape it to (batch_size, 1). Examples Assume x = [[1, 2], [3, 4]] and y = [[5, 6], [7, 8]] batch_dot(x, y, axes=1) = [[17, 53]] which is the main diagonal of x.dot(y.T), although we never have to calculate the off-diagonal elements. Shape inference: Let x\\'s shape be (100, 20) and y\\'s shape be (100, 30, 20). If dot_axes is (1, 2), to find the output shape of resultant tensor, loop through each dimension in x\\'s shape and y\\'s shape: x.shape[0] : 100 : append to output shape x.shape[1] : 20 : do not append to output shape, dimension 1 of x has been summed over. 
(dot_axes[0] = 1) y.shape[0] : 100 : do not append to output shape, always ignore first dimension of y y.shape[1] : 30 : append to output shape y.shape[2] : 20 : do not append to output shape, dimension 2 of y has been summed over. (dot_axes[1] = 2) output_shape = (100, 30) transpose keras.backend.transpose(x) gather keras.backend.gather(reference, indices) Retrieves the elements of indices indices in the tensor reference . Arguments reference : A tensor. indices : An integer tensor of indices. Returns A tensor of same type as reference . max keras.backend.max(x, axis=None, keepdims=False) min keras.backend.min(x, axis=None, keepdims=False) sum keras.backend.sum(x, axis=None, keepdims=False) Sum of the values in a tensor, alongside the specified axis. prod keras.backend.prod(x, axis=None, keepdims=False) Multiply the values in a tensor, alongside the specified axis. cumsum keras.backend.cumsum(x, axis=0) Cumulative sum of the values in a tensor, alongside the specified axis. Arguments x : A tensor or variable. axis : An integer, the axis to compute the sum. Returns A tensor of the cumulative sum of values of x along axis . cumprod keras.backend.cumprod(x, axis=0) Cumulative product of the values in a tensor, alongside the specified axis. Arguments x : A tensor or variable. axis : An integer, the axis to compute the product. Returns A tensor of the cumulative product of values of x along axis . mean keras.backend.mean(x, axis=None, keepdims=False) Mean of a tensor, alongside the specified axis. std keras.backend.std(x, axis=None, keepdims=False) var keras.backend.var(x, axis=None, keepdims=False) any keras.backend.any(x, axis=None, keepdims=False) Bitwise reduction (logical OR). all keras.backend.all(x, axis=None, keepdims=False) Bitwise reduction (logical AND). argmax keras.backend.argmax(x, axis=-1) argmin keras.backend.argmin(x, axis=-1) square keras.backend.square(x) abs keras.backend.abs(x) sqrt keras.backend.sqrt(x) exp keras.backend.exp(x) log keras.backend.log(x) logsumexp keras.backend.logsumexp(x, axis=None, keepdims=False) Computes log(sum(exp(elements across dimensions of a tensor))). This function is more numerically stable than log(sum(exp(x))). It avoids overflows caused by taking the exp of large inputs and underflows caused by taking the log of small inputs. Arguments x : A tensor or variable. axis : An integer, the axis to reduce over. keepdims : A boolean, whether to keep the dimensions or not. If keepdims is False , the rank of the tensor is reduced by 1. If keepdims is True , the reduced dimension is retained with length 1. Returns The reduced tensor. round keras.backend.round(x) sign keras.backend.sign(x) pow keras.backend.pow(x, a) clip keras.backend.clip(x, min_value, max_value) equal keras.backend.equal(x, y) not_equal keras.backend.not_equal(x, y) greater keras.backend.greater(x, y) greater_equal keras.backend.greater_equal(x, y) less keras.backend.less(x, y) less_equal keras.backend.less_equal(x, y) maximum keras.backend.maximum(x, y) minimum keras.backend.minimum(x, y) sin keras.backend.sin(x) cos keras.backend.cos(x) normalize_batch_in_training keras.backend.normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=0.001) Computes mean and std for batch then apply batch_normalization on batch. batch_normalization keras.backend.batch_normalization(x, mean, var, beta, gamma, axis=-1, epsilon=0.001) Apply batch normalization on x given mean, var, beta and gamma. 
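The batch_dot shape-inference walkthrough above, rerun as a short sketch at the backend level; int_shape confirms the derived output shape.

from keras import backend as K

# x: (100, 20), y: (100, 30, 20); summing over the shared 20-dim axes
# (axis 1 of x, axis 2 of y) leaves (100, 30), as derived above.
x = K.ones(shape=(100, 20))
y = K.ones(shape=(100, 30, 20))
out = K.batch_dot(x, y, axes=[1, 2])
print(K.int_shape(out))  # (100, 30)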
concatenate keras.backend.concatenate(tensors, axis=-1) reshape keras.backend.reshape(x, shape) permute_dimensions keras.backend.permute_dimensions(x, pattern) Transpose dimensions. pattern should be a tuple or list of dimension indices, e.g. [0, 2, 1]. repeat_elements keras.backend.repeat_elements(x, rep, axis) Repeat the elements of a tensor along an axis, like np.repeat. If x has shape (s1, s2, s3) and axis=1, the output will have shape (s1, s2 * rep, s3). resize_images keras.backend.resize_images(x, height_factor, width_factor, data_format, interpolation=\\'nearest\\') Resize the images contained in a 4D tensor of shape - [batch, channels, height, width] (for \\'channels_first\\' data_format) - [batch, height, width, channels] (for \\'channels_last\\' data_format) by a factor of (height_factor, width_factor). Both factors should be positive integers. resize_volumes keras.backend.resize_volumes(x, depth_factor, height_factor, width_factor, data_format) Resize the volume contained in a 5D tensor of shape - [batch, channels, depth, height, width] (for \\'channels_first\\' data_format) - [batch, depth, height, width, channels] (for \\'channels_last\\' data_format) by a factor of (depth_factor, height_factor, width_factor). All three factors should be positive integers. repeat keras.backend.repeat(x, n) Repeat a 2D tensor. If x has shape (samples, dim) and n=2, the output will have shape (samples, 2, dim). arange keras.backend.arange(start, stop=None, step=1, dtype=\\'int32\\') Creates a 1-D tensor containing a sequence of integers. The function arguments use the same convention as Theano\\'s arange: if only one argument is provided, it is in fact the \"stop\" argument. The default type of the returned tensor is \\'int32\\' to match TensorFlow\\'s default. tile keras.backend.tile(x, n) flatten keras.backend.flatten(x) batch_flatten keras.backend.batch_flatten(x) Turn an n-D tensor into a 2D tensor where the first dimension is conserved. expand_dims keras.backend.expand_dims(x, axis=-1) Add a 1-sized dimension at index \"axis\". squeeze keras.backend.squeeze(x, axis) Remove a 1-sized dimension from the tensor at index \"axis\". temporal_padding keras.backend.temporal_padding(x, padding=(1, 1)) Pad the middle dimension of a 3D tensor with \"padding\" zeros left and right. Apologies for the inane API, but Theano makes this really hard. spatial_2d_padding keras.backend.spatial_2d_padding(x, padding=((1, 1), (1, 1)), data_format=None) Pad the 2nd and 3rd dimensions of a 4D tensor with \"padding[0]\" and \"padding[1]\" (resp.) zeros left and right. spatial_3d_padding keras.backend.spatial_3d_padding(x, padding=((1, 1), (1, 1), (1, 1)), data_format=None) Pad the 2nd, 3rd and 4th dimensions of a 5D tensor with \"padding[0]\", \"padding[1]\" and \"padding[2]\" (resp.) zeros left and right. stack keras.backend.stack(x, axis=0) one_hot keras.backend.one_hot(indices, num_classes) Input: nD integer tensor of shape (batch_size, dim1, dim2, ... dim(n-1)) Output: (n + 1)D one hot representation of the input with shape (batch_size, dim1, dim2, ... dim(n-1), num_classes) reverse keras.backend.reverse(x, axes) Reverse a tensor along the specified axes. slice keras.backend.slice(x, start, size) pattern_broadcast keras.backend.pattern_broadcast(x, broadcastable) get_value keras.backend.get_value(x) batch_get_value keras.backend.batch_get_value(xs) Returns the value of more than one tensor variable, as a list of Numpy arrays.
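A brief sketch of one_hot and eval from the entries above: the one-hot axis is appended after the input dimensions, and eval pulls the result back out as a Numpy array.

from keras import backend as K

# Integer class indices with shape (batch_size, dim) = (1, 3).
indices = K.constant([[0, 2, 1]], dtype='int32')
onehot = K.one_hot(indices, num_classes=3)
print(K.int_shape(onehot))  # (1, 3, 3): input shape plus num_classes
print(K.eval(onehot))       # Numpy array of one-hot rows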
set_value keras.backend.set_value(x, value) batch_set_value keras.backend.batch_set_value(tuples) print_tensor keras.backend.print_tensor(x, message=\\'\\') Print the message and the tensor when evaluated and return the same tensor. function keras.backend.function(inputs, outputs, updates=[]) gradients keras.backend.gradients(loss, variables) stop_gradient keras.backend.stop_gradient(variables) Returns variables but with zero gradient w.r.t. every other variable. Arguments variables : tensor or list of tensors to consider constant with respect to any other variable. Returns A single tensor or a list of tensors (depending on the passed argument) that has constant gradient with respect to any other variable. rnn keras.backend.rnn(step_function, inputs, initial_states, go_backwards=False, mask=None, constants=None, unroll=False, input_length=None) Iterates over the time dimension of a tensor. Arguments step_function : Parameters: inputs: Tensor with shape (samples, ...) (no time dimension), representing input for the batch of samples at a certain time step. states: List of tensors. Returns: outputs: Tensor with shape (samples, ...) (no time dimension), new_states: List of tensors, same length and shapes as \\'states\\'. inputs : Tensor of temporal data of shape (samples, time, ...) (at least 3D). initial_states : Tensor with shape (samples, ...) (no time dimension), containing the initial values for the states used in the step function. go_backwards : Boolean. If True, do the iteration over the time dimension in reverse order and return the reversed sequence. mask : Binary tensor with shape (samples, time), with a zero for every element that is masked. constants : A list of constant values passed at each step. unroll : Whether to unroll the RNN or to use a symbolic loop ( while_loop or scan depending on backend). input_length : Static number of timesteps in the input. Must be specified if using unroll . Returns A tuple (last_output, outputs, new_states). last_output: The latest output of the rnn, of shape (samples, ...) outputs: Tensor with shape (samples, time, ...) where each entry outputs[s, t] is the output of the step function at time t for sample s . new_states: List of tensors, latest states returned by the step function, of shape (samples, ...) . switch keras.backend.switch(condition, then_expression, else_expression) Switches between two operations depending on a scalar value. Note that both then_expression and else_expression should be symbolic tensors of the same shape . Arguments condition : scalar tensor ( int or bool ). then_expression : either a tensor, or a callable that returns a tensor. else_expression : either a tensor, or a callable that returns a tensor. Returns The selected tensor. in_train_phase keras.backend.in_train_phase(x, alt, training=None) Selects x in train phase, and alt otherwise. Note that alt should have the same shape as x . Returns Either x or alt based on the training flag. the training flag defaults to K.learning_phase() . in_test_phase keras.backend.in_test_phase(x, alt, training=None) Selects x in test phase, and alt otherwise. Note that alt should have the same shape as x . Returns Either x or alt based on K.learning_phase . elu keras.backend.elu(x, alpha=1.0) Exponential linear unit Arguments x : Tensor to compute the activation function for. 
alpha : scalar relu keras.backend.relu(x, alpha=0.0, max_value=None, threshold=0.0) softmax keras.backend.softmax(x, axis=-1) softplus keras.backend.softplus(x) softsign keras.backend.softsign(x) categorical_crossentropy keras.backend.categorical_crossentropy(target, output, from_logits=False, axis=-1) sparse_categorical_crossentropy keras.backend.sparse_categorical_crossentropy(target, output, from_logits=False, axis=-1) binary_crossentropy keras.backend.binary_crossentropy(target, output, from_logits=False) sigmoid keras.backend.sigmoid(x) hard_sigmoid keras.backend.hard_sigmoid(x) tanh keras.backend.tanh(x) dropout keras.backend.dropout(x, level, noise_shape=None, seed=None) Sets entries in x to zero at random, while scaling the entire tensor. Arguments x : tensor level : fraction of the entries in the tensor that will be set to 0. noise_shape : shape for randomly generated keep/drop flags, must be broadcastable to the shape of x seed : random seed to ensure determinism. l2_normalize keras.backend.l2_normalize(x, axis=None) in_top_k keras.backend.in_top_k(predictions, targets, k) Returns whether the targets are in the top k predictions . Arguments predictions : A tensor of shape (batch_size, classes) and type float32 . targets : A 1D tensor of length batch_size and type int32 or int64 . k : An int , number of top elements to consider. Returns A 1D tensor of length batch_size and type bool . output[i] is True if predictions[i, targets[i]] is within top- k values of predictions[i] . conv1d keras.backend.conv1d(x, kernel, strides=1, padding=\\'valid\\', data_format=None, dilation_rate=1) 1D convolution. Arguments kernel : kernel tensor. strides : stride integer. padding : string, \"same\" , \"causal\" or \"valid\" . data_format : string, one of \"channels_last\", \"channels_first\" dilation_rate : integer. conv2d keras.backend.conv2d(x, kernel, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1)) 2D convolution. Arguments kernel : kernel tensor. strides : strides tuple. padding : string, \"same\" or \"valid\". data_format : \"channels_last\" or \"channels_first\". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs. conv2d_transpose keras.backend.conv2d_transpose(x, kernel, output_shape, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1)) 2D deconvolution (transposed convolution). Arguments kernel : kernel tensor. output_shape : desired dimensions of output. strides : strides tuple. padding : string, \"same\" or \"valid\". data_format : \"channels_last\" or \"channels_first\". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs. dilation_rate : tuple of 2 integers. Raises ValueError : if using an even kernel size with padding \\'same\\'. separable_conv1d keras.backend.separable_conv1d(x, depthwise_kernel, pointwise_kernel, strides=1, padding=\\'valid\\', data_format=None, dilation_rate=1) 1D convolution with separable filters. Arguments x : input tensor depthwise_kernel : convolution kernel for the depthwise convolution. pointwise_kernel : kernel for the 1x1 convolution. strides : strides integer. padding : string, \"same\" or \"valid\" . data_format : string, \"channels_last\" or \"channels_first\" . dilation_rate : integer dilation rate. Returns Output tensor. Raises ValueError : if data_format is neither \"channels_last\" or \"channels_first\" . 
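A small sketch of the conv2d entry above at the backend level, assuming the \'channels_last\' layout; the kernel follows the (kernel_h, kernel_w, in_channels, out_channels) convention.

from keras import backend as K
import numpy as np

img = K.constant(np.random.random((1, 8, 8, 3)))     # (batch, rows, cols, channels)
kernel = K.constant(np.random.random((3, 3, 3, 4)))  # 3x3 kernel, 3 in, 4 out channels
out = K.conv2d(img, kernel, strides=(1, 1), padding='same',
               data_format='channels_last')
print(K.int_shape(out))  # (1, 8, 8, 4): 'same' padding preserves rows/cols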
separable_conv2d
keras.backend.separable_conv2d(x, depthwise_kernel, pointwise_kernel, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1))
2D convolution with separable filters.
Arguments:
- x: input tensor.
- depthwise_kernel: convolution kernel for the depthwise convolution.
- pointwise_kernel: kernel for the 1x1 convolution.
- strides: strides tuple (length 2).
- padding: string, "same" or "valid".
- data_format: string, "channels_last" or "channels_first".
- dilation_rate: tuple of integers, dilation rates for the separable convolution.
Returns: Output tensor.
Raises: ValueError if data_format is neither "channels_last" nor "channels_first".

depthwise_conv2d
keras.backend.depthwise_conv2d(x, depthwise_kernel, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1))
2D depthwise convolution (the depthwise step of a separable convolution, without the pointwise step).
Arguments:
- x: input tensor.
- depthwise_kernel: convolution kernel for the depthwise convolution.
- strides: strides tuple (length 2).
- padding: string, "same" or "valid".
- data_format: string, "channels_last" or "channels_first".
- dilation_rate: tuple of integers, dilation rates for the separable convolution.
Returns: Output tensor.
Raises: ValueError if data_format is neither "channels_last" nor "channels_first".

conv3d
keras.backend.conv3d(x, kernel, strides=(1, 1, 1), padding='valid', data_format=None, dilation_rate=(1, 1, 1))
3D convolution.
Arguments:
- kernel: kernel tensor.
- strides: strides tuple.
- padding: string, "same" or "valid".
- data_format: "channels_last" or "channels_first". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs.

conv3d_transpose
keras.backend.conv3d_transpose(x, kernel, output_shape, strides=(1, 1, 1), padding='valid', data_format=None)
3D deconvolution (transposed convolution).
Arguments:
- kernel: kernel tensor.
- output_shape: desired dimensions of output.
- strides: strides tuple.
- padding: string, "same" or "valid".
- data_format: "channels_last" or "channels_first". Whether to use Theano or TensorFlow data format in inputs/kernels/outputs.
Raises: ValueError if using an even kernel size with padding 'same'.

pool2d
keras.backend.pool2d(x, pool_size, strides=(1, 1), padding='valid', data_format=None, pool_mode='max')

pool3d
keras.backend.pool3d(x, pool_size, strides=(1, 1, 1), padding='valid', data_format=None, pool_mode='max')

bias_add
keras.backend.bias_add(x, bias, data_format=None)

random_normal
keras.backend.random_normal(shape, mean=0.0, stddev=1.0, dtype=None, seed=None)

random_uniform
keras.backend.random_uniform(shape, minval=0.0, maxval=1.0, dtype=None, seed=None)

random_binomial
keras.backend.random_binomial(shape, p=0.0, dtype=None, seed=None)

truncated_normal
keras.backend.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=None, seed=None)

ctc_interleave_blanks
keras.backend.ctc_interleave_blanks(Y)

ctc_create_skip_idxs
keras.backend.ctc_create_skip_idxs(Y)

ctc_update_log_p
keras.backend.ctc_update_log_p(skip_idxs, zeros, active, log_p_curr, log_p_prev)

ctc_path_probs
keras.backend.ctc_path_probs(predict, Y, alpha=0.0001)

ctc_cost
keras.backend.ctc_cost(predict, Y)
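The pooling signatures above follow the same conventions; a quick sketch (illustrative shapes, channels_last assumed):

    import numpy as np
    from keras import backend as K

    x = K.variable(np.random.random((1, 8, 8, 3)))
    y = K.pool2d(x, pool_size=(2, 2), strides=(2, 2),
                 padding='valid', data_format='channels_last', pool_mode='max')
    print(K.int_shape(y))  # (1, 4, 4, 3): each 2x2 window reduced to its maximum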
ctc_batch_cost
keras.backend.ctc_batch_cost(y_true, y_pred, input_length, label_length)
Runs the CTC loss algorithm on each batch element.
Arguments:
- y_true: tensor (samples, max_string_length) containing the truth labels.
- y_pred: tensor (samples, time_steps, num_categories) containing the prediction, or output of the softmax.
- input_length: tensor (samples, 1) containing the sequence length for each batch item in y_pred.
- label_length: tensor (samples, 1) containing the sequence length for each batch item in y_true.
Returns: Tensor with shape (samples, 1) containing the CTC loss of each element.

map_fn
keras.backend.map_fn(fn, elems, name=None, dtype=None)
Maps the function fn over the elements elems and returns the outputs.
Arguments:
- fn: callable that will be called upon each element in elems.
- elems: tensor, at least 2-dimensional.
- name: a string name for the map node in the graph.
Returns: Tensor with first dimension equal to that of elems and second depending on fn.

foldl
keras.backend.foldl(fn, elems, initializer=None, name=None)
Reduces elems using fn to combine them from left to right.
Arguments:
- fn: callable that will be called upon each element in elems and an accumulator, for instance lambda acc, x: acc + x.
- elems: tensor.
- initializer: the first value used (elems[0] in case of None).
- name: a string name for the foldl node in the graph.
Returns: Same type and shape as initializer.

foldr
keras.backend.foldr(fn, elems, initializer=None, name=None)
Reduces elems using fn to combine them from right to left.
Arguments:
- fn: callable that will be called upon each element in elems and an accumulator, for instance lambda acc, x: acc + x.
- elems: tensor.
- initializer: the first value used (elems[-1] in case of None).
- name: a string name for the foldr node in the graph.
Returns: Same type and shape as initializer.

local_conv1d
keras.backend.local_conv1d(inputs, kernel, kernel_size, strides, data_format=None)

local_conv2d
keras.backend.local_conv2d(inputs, kernel, kernel_size, strides, output_shape, data_format=None)

update_add
keras.backend.update_add(x, increment)

epsilon
keras.backend.epsilon()
Returns the value of the fuzz factor used in numeric expressions.
Returns: A float.
Example:
    >>> keras.backend.epsilon()
    1e-07

set_epsilon
keras.backend.set_epsilon(e)
Sets the value of the fuzz factor used in numeric expressions.
Arguments:
- e: float. New value of epsilon.
Example:
    >>> from keras import backend as K
    >>> K.epsilon()
    1e-07
    >>> K.set_epsilon(1e-05)
    >>> K.epsilon()
    1e-05

floatx
keras.backend.floatx()
Returns the default float type, as a string (e.g. 'float16', 'float32', 'float64').
Returns: String, the current default float type.
Example:
    >>> keras.backend.floatx()
    'float32'

set_floatx
keras.backend.set_floatx(floatx)
Sets the default float type.
Arguments:
- floatx: string, 'float16', 'float32', or 'float64'.
Example:
    >>> from keras import backend as K
    >>> K.floatx()
    'float32'
    >>> K.set_floatx('float16')
    >>> K.floatx()
    'float16'

cast_to_floatx
keras.backend.cast_to_floatx(x)
Casts a Numpy array to the default Keras float type.
Arguments:
- x: Numpy array.
Returns: The same Numpy array, cast to its new type.
Example:
    >>> from keras import backend as K
    >>> K.floatx()
    'float32'
    >>> arr = numpy.array([1.0, 2.0], dtype='float64')
    >>> arr.dtype
    dtype('float64')
    >>> new_arr = K.cast_to_floatx(arr)
    >>> new_arr
    array([ 1.,  2.], dtype=float32)
    >>> new_arr.dtype
    dtype('float32')

image_data_format
keras.backend.image_data_format()
Returns the default image data format convention ('channels_first' or 'channels_last').
Returns: A string, either 'channels_first' or 'channels_last'.
Example:
    >>> keras.backend.image_data_format()
    'channels_first'
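Returning to the fold helpers above, a small sketch of foldl (values illustrative; with initializer=None the reduction starts from elems[0], at least on the TensorFlow backend):

    import numpy as np
    from keras import backend as K

    elems = K.variable(np.array([[1., 2.], [3., 4.], [5., 6.]]))
    # Walk the first dimension left to right, summing rows into the accumulator.
    total = K.foldl(lambda acc, x: acc + x, elems)
    print(K.eval(total))  # [ 9. 12.]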
set_image_data_format
keras.backend.set_image_data_format(data_format)
Sets the value of the data format convention.
Arguments:
- data_format: string, 'channels_first' or 'channels_last'.
Example:
    >>> from keras import backend as K
    >>> K.image_data_format()
    'channels_first'
    >>> K.set_image_data_format('channels_last')
    >>> K.image_data_format()
    'channels_last'

backend
keras.backend.backend()
Publicly accessible method for determining the current backend.
Returns: String, the name of the backend Keras is currently using.
Example:
    >>> keras.backend.backend()
    'tensorflow'

Keras backends

What is a "backend"?
Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. Rather than picking one single tensor library and tying the implementation of Keras to that library, Keras handles the problem in a modular way, and several different backend engines can be plugged seamlessly into Keras. At this time, Keras has three backend implementations available: the TensorFlow backend, the Theano backend, and the CNTK backend. TensorFlow is an open-source symbolic tensor manipulation framework developed by Google. Theano is an open-source symbolic tensor manipulation framework developed by LISA Lab at Université de Montréal. CNTK is an open-source toolkit for deep learning developed by Microsoft. In the future, we are likely to add more backend options.

Switching from one backend to another
If you have run Keras at least once, you will find the Keras configuration file at:
    $HOME/.keras/keras.json
If it isn't there, you can create it. NOTE for Windows users: please replace $HOME with %USERPROFILE%. The default configuration file looks like this:
    {
        "image_data_format": "channels_last",
        "epsilon": 1e-07,
        "floatx": "float32",
        "backend": "tensorflow"
    }
Simply change the field backend to "theano", "tensorflow", or "cntk", and Keras will use the new configuration the next time you run any Keras code. You can also define the environment variable KERAS_BACKEND, which will override what is defined in your config file:
    KERAS_BACKEND=tensorflow python -c "from keras import backend"
    Using TensorFlow backend.
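The same override can be done from Python, as long as it happens before keras is first imported (a common pattern, shown as a sketch):

    import os
    os.environ['KERAS_BACKEND'] = 'theano'  # must be set before the first keras import
    import keras                            # expected to print: Using Theano backend.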
keras.json details
The keras.json configuration file contains the following settings:
    {
        "image_data_format": "channels_last",
        "epsilon": 1e-07,
        "floatx": "float32",
        "backend": "tensorflow"
    }
You can change these settings by editing $HOME/.keras/keras.json.
- image_data_format: string, either "channels_last" or "channels_first". It specifies which data format convention Keras will follow (keras.backend.image_data_format() returns it). For 2D data (e.g. an image), "channels_last" assumes (rows, cols, channels) while "channels_first" assumes (channels, rows, cols). For 3D data, "channels_last" assumes (conv_dim1, conv_dim2, conv_dim3, channels) while "channels_first" assumes (channels, conv_dim1, conv_dim2, conv_dim3).
- epsilon: float, a numeric fuzzing constant used to avoid dividing by zero in some operations.
- floatx: string, "float16", "float32", or "float64". Default float precision.
- backend: string, "tensorflow", "theano", or "cntk".

Using the abstract Keras backend to write new code
If you want the Keras modules you write to be compatible with both Theano (th) and TensorFlow (tf), you have to write them via the abstract Keras backend API. Here's an intro. You can import the backend module via:
    from keras import backend as K
The code below instantiates an input placeholder. It's equivalent to tf.placeholder() or th.tensor.matrix(), th.tensor.tensor3(), etc.
    inputs = K.placeholder(shape=(2, 4, 5))
    # also works:
    inputs = K.placeholder(shape=(None, 4, 5))
    # also works:
    inputs = K.placeholder(ndim=3)
The code below instantiates a variable. It's equivalent to tf.Variable() or th.shared().
    import numpy as np
    val = np.random.random((3, 4, 5))
    var = K.variable(value=val)
    # all-zeros variable:
    var = K.zeros(shape=(3, 4, 5))
    # all-ones:
    var = K.ones(shape=(3, 4, 5))
Most tensor operations you will need can be done as you would in TensorFlow or Theano:
    # Initializing tensors with random numbers
    b = K.random_uniform_variable(shape=(3, 4), low=0, high=1)   # uniform distribution
    c = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)  # Gaussian distribution
    d = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)
    # Tensor arithmetic
    a = b + c * K.abs(d)
    c = K.dot(a, K.transpose(b))
    a = K.sum(b, axis=1)
    a = K.softmax(b)
    a = K.concatenate([b, c], axis=-1)
    # etc...

Backend functions

learning_phase
keras.backend.learning_phase()

set_learning_phase
keras.backend.set_learning_phase(value)

get_uid
keras.backend.get_uid(prefix='')
Provides a unique UID given a string prefix.
Arguments:
- prefix: string.
Returns: An integer.
Example:
    >>> keras.backend.get_uid('dense')
    1
    >>> keras.backend.get_uid('dense')
    2

reset_uids
keras.backend.reset_uids()

is_sparse
keras.backend.is_sparse(tensor)

to_dense
keras.backend.to_dense(tensor)

name_scope
keras.backend.name_scope()
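A minimal sketch of the learning-phase flag set by set_learning_phase above, which K.in_train_phase and layers such as Dropout consult (behavior as of Keras 2.2.x; an assumption worth verifying against your backend):

    from keras import backend as K

    K.set_learning_phase(1)    # 1 = training, 0 = testing
    print(K.learning_phase())  # 1 once a phase has been set explicitly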
variable
keras.backend.variable(value, dtype=None, name=None, constraint=None)
Instantiates a variable and returns it.
Arguments:
- value: numpy array, initial value of the tensor.
- dtype: tensor type.
- name: optional name string for the tensor.
- constraint: optional projection function to be applied to the variable after an optimizer update.
Returns: A variable instance (with Keras metadata included).

constant
keras.backend.constant(value, dtype=None, shape=None, name=None)

is_keras_tensor
keras.backend.is_keras_tensor(x)
Returns whether x is a Keras tensor. A "Keras tensor" is a tensor that was returned by a Keras layer (Layer class) or by Input.
Arguments:
- x: a candidate tensor.
Returns: A boolean: whether the argument is a Keras tensor.
Raises: ValueError in case x is not a symbolic tensor.
Examples:
    >>> from keras import backend as K
    >>> from keras.layers import Input, Dense
    >>> np_var = numpy.array([1, 2])
    >>> K.is_keras_tensor(np_var)  # A numpy array is not a symbolic tensor.
    ValueError
    >>> k_var = tf.placeholder('float32', shape=(1,1))
    >>> K.is_keras_tensor(k_var)  # A variable indirectly created outside of keras is not a Keras tensor.
    False
    >>> keras_var = K.variable(np_var)
    >>> K.is_keras_tensor(keras_var)  # A variable created with the keras backend is not a Keras tensor.
    False
    >>> keras_placeholder = K.placeholder(shape=(2, 4, 5))
    >>> K.is_keras_tensor(keras_placeholder)  # A placeholder is not a Keras tensor.
    False
    >>> keras_input = Input([10])
    >>> K.is_keras_tensor(keras_input)  # An Input is a Keras tensor.
    True
    >>> keras_layer_output = Dense(10)(keras_input)
    >>> K.is_keras_tensor(keras_layer_output)  # Any Keras layer output is a Keras tensor.
    True

is_tensor
keras.backend.is_tensor(x)

placeholder
keras.backend.placeholder(shape=None, ndim=None, dtype=None, sparse=False, name=None)
Instantiates an input data placeholder variable.

is_placeholder
keras.backend.is_placeholder(x)
Returns whether x is a placeholder.
Arguments:
- x: a candidate placeholder.
Returns: Boolean.

shape
keras.backend.shape(x)
Returns the shape of a tensor. Warning: the type returned will be different for the Theano backend (Theano tensor type) and the TF backend (TF TensorShape).
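K.shape is symbolic, while int_shape (next entry) returns the static shape as a plain tuple; a sketch of the difference:

    from keras import backend as K

    x = K.placeholder(shape=(None, 4))
    print(K.int_shape(x))  # (None, 4) -- static shape as a Python tuple
    print(K.shape(x))      # a symbolic shape tensor (backend-specific type)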
int_shape
keras.backend.int_shape(x)
Returns the shape of a Keras tensor or a Keras variable as a tuple of integers or None entries.
Arguments:
- x: tensor or variable.
Returns: A tuple of integers (or None entries).

ndim
keras.backend.ndim(x)

dtype
keras.backend.dtype(x)

eval
keras.backend.eval(x)
Returns the value of a tensor.

zeros
keras.backend.zeros(shape, dtype=None, name=None)
Instantiates an all-zeros variable.

ones
keras.backend.ones(shape, dtype=None, name=None)
Instantiates an all-ones variable.

eye
keras.backend.eye(size, dtype=None, name=None)
Instantiates an identity matrix.

ones_like
keras.backend.ones_like(x, dtype=None, name=None)

zeros_like
keras.backend.zeros_like(x, dtype=None, name=None)

identity
keras.backend.identity(x, name=None)
Returns a tensor with the same content as the input tensor.
Arguments:
- x: the input tensor.
- name: string, name for the variable to create.
Returns: A tensor of the same shape, type and content.

random_uniform_variable
keras.backend.random_uniform_variable(shape, low, high, dtype=None, name=None)

random_normal_variable
keras.backend.random_normal_variable(shape, mean, scale, dtype=None, name=None)

count_params
keras.backend.count_params(x)
Returns the number of scalars in a tensor.
Returns: numpy integer.

cast
keras.backend.cast(x, dtype)

update
keras.backend.update(x, new_x)

update_sub
keras.backend.update_sub(x, decrement)

moving_average_update
keras.backend.moving_average_update(variable, value, momentum)

dot
keras.backend.dot(x, y)
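A quick sketch of dot plus eval (small arbitrary shapes):

    import numpy as np
    from keras import backend as K

    a = K.variable(np.arange(6).reshape(2, 3))  # [[0, 1, 2], [3, 4, 5]], cast to floatx
    b = K.variable(np.ones((3, 4)))
    print(K.int_shape(K.dot(a, b)))  # (2, 4)
    print(K.eval(K.dot(a, b))[0])    # [3. 3. 3. 3.] -- row 0 sums to 0+1+2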
batch_dot
keras.backend.batch_dot(x, y, axes=None)
Batchwise dot product. batch_dot results in a tensor with fewer dimensions than the input. If the number of dimensions is reduced to 1, we use expand_dims to make sure that ndim is at least 2.
Arguments:
- x, y: tensors with ndim >= 2.
- axes: list of (or single) int with target dimensions.
Returns: A tensor with shape equal to the concatenation of x's shape (less the dimension that was summed over) and y's shape (less the batch dimension and the dimension that was summed over). If the final rank is 1, we reshape it to (batch_size, 1).
Examples:
Assume x = [[1, 2], [3, 4]] and y = [[5, 6], [7, 8]]. Then batch_dot(x, y, axes=1) = [[17], [53]], which is the main diagonal of x.dot(y.T), although we never have to calculate the off-diagonal elements.
Shape inference: let x's shape be (100, 20) and y's shape be (100, 30, 20). If dot_axes is (1, 2), to find the output shape of the resultant tensor, loop through each dimension in x's shape and y's shape:
- x.shape[0]: 100; append to output shape.
- x.shape[1]: 20; do not append to output shape, dimension 1 of x has been summed over (dot_axes[0] = 1).
- y.shape[0]: 100; do not append to output shape, always ignore the first dimension of y.
- y.shape[1]: 30; append to output shape.
- y.shape[2]: 20; do not append to output shape, dimension 2 of y has been summed over (dot_axes[1] = 2).
output_shape = (100, 30)

transpose
keras.backend.transpose(x)

gather
keras.backend.gather(reference, indices)
Retrieves the elements at the given indices in the tensor reference.
Arguments:
- reference: a tensor.
- indices: an integer tensor of indices.
Returns: A tensor of the same type as reference.

max
keras.backend.max(x, axis=None, keepdims=False)

min
keras.backend.min(x, axis=None, keepdims=False)

sum
keras.backend.sum(x, axis=None, keepdims=False)
Sum of the values in a tensor, alongside the specified axis.

prod
keras.backend.prod(x, axis=None, keepdims=False)
Multiplies the values in a tensor, alongside the specified axis.

cumsum
keras.backend.cumsum(x, axis=0)
Cumulative sum of the values in a tensor, alongside the specified axis.
Arguments:
- x: a tensor or variable.
- axis: an integer, the axis to compute the sum.
Returns: A tensor of the cumulative sum of values of x along axis.
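A sketch of gather above as a row lookup (values illustrative):

    import numpy as np
    from keras import backend as K

    ref = K.variable(np.array([[10., 20.], [30., 40.], [50., 60.]]))
    idx = K.constant([2, 0], dtype='int32')
    print(K.eval(K.gather(ref, idx)))
    # [[50. 60.]
    #  [10. 20.]]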
cumprod
keras.backend.cumprod(x, axis=0)
Cumulative product of the values in a tensor, alongside the specified axis.
Arguments:
- x: a tensor or variable.
- axis: an integer, the axis to compute the product.
Returns: A tensor of the cumulative product of values of x along axis.

mean
keras.backend.mean(x, axis=None, keepdims=False)
Mean of a tensor, alongside the specified axis.

std
keras.backend.std(x, axis=None, keepdims=False)

var
keras.backend.var(x, axis=None, keepdims=False)

any
keras.backend.any(x, axis=None, keepdims=False)
Bitwise reduction (logical OR).

all
keras.backend.all(x, axis=None, keepdims=False)
Bitwise reduction (logical AND).

argmax
keras.backend.argmax(x, axis=-1)

argmin
keras.backend.argmin(x, axis=-1)

square
keras.backend.square(x)

abs
keras.backend.abs(x)

sqrt
keras.backend.sqrt(x)

exp
keras.backend.exp(x)

log
keras.backend.log(x)

logsumexp
keras.backend.logsumexp(x, axis=None, keepdims=False)
Computes log(sum(exp(elements across dimensions of a tensor))). This function is more numerically stable than log(sum(exp(x))). It avoids overflows caused by taking the exp of large inputs and underflows caused by taking the log of small inputs.
Arguments:
- x: a tensor or variable.
- axis: an integer, the axis to reduce over.
- keepdims: a boolean, whether to keep the dimensions or not. If keepdims is False, the rank of the tensor is reduced by 1. If keepdims is True, the reduced dimension is retained with length 1.
Returns: The reduced tensor.
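The stability claim for logsumexp is easy to demonstrate: in float32, exp(1000) overflows while the fused reduction does not (TensorFlow backend assumed):

    import numpy as np
    from keras import backend as K

    x = K.variable(np.array([1000., 1000.]))
    print(K.eval(K.logsumexp(x)))          # ~1000.69 (= 1000 + log(2)), stable
    print(K.eval(K.log(K.sum(K.exp(x)))))  # inf -- exp(1000) overflows float32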
round
keras.backend.round(x)

sign
keras.backend.sign(x)

pow
keras.backend.pow(x, a)

clip
keras.backend.clip(x, min_value, max_value)

equal
keras.backend.equal(x, y)

not_equal
keras.backend.not_equal(x, y)

greater
keras.backend.greater(x, y)

greater_equal
keras.backend.greater_equal(x, y)

less
keras.backend.less(x, y)

less_equal
keras.backend.less_equal(x, y)

maximum
keras.backend.maximum(x, y)

minimum
keras.backend.minimum(x, y)

sin
keras.backend.sin(x)

cos
keras.backend.cos(x)

normalize_batch_in_training
keras.backend.normalize_batch_in_training(x, gamma, beta, reduction_axes, epsilon=0.001)
Computes the mean and std for the batch, then applies batch_normalization on the batch.

batch_normalization
keras.backend.batch_normalization(x, mean, var, beta, gamma, axis=-1, epsilon=0.001)
Applies batch normalization on x given mean, var, beta and gamma.

concatenate
keras.backend.concatenate(tensors, axis=-1)

reshape
keras.backend.reshape(x, shape)

permute_dimensions
keras.backend.permute_dimensions(x, pattern)
Transposes dimensions. pattern should be a tuple or list of dimension indices, e.g. [0, 2, 1].

repeat_elements
keras.backend.repeat_elements(x, rep, axis)
Repeats the elements of a tensor along an axis, like np.repeat. If x has shape (s1, s2, s3) and axis=1, the output will have shape (s1, s2 * rep, s3).
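Shape effects of the two reordering helpers above, as a sketch:

    from keras import backend as K

    x = K.placeholder(shape=(2, 3, 5))
    print(K.int_shape(K.permute_dimensions(x, (0, 2, 1))))   # (2, 5, 3)
    print(K.int_shape(K.repeat_elements(x, rep=2, axis=1)))  # (2, 6, 5)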
resize_images
keras.backend.resize_images(x, height_factor, width_factor, data_format, interpolation='nearest')
Resizes the images contained in a 4D tensor of shape [batch, channels, height, width] (for 'channels_first' data_format) or [batch, height, width, channels] (for 'channels_last' data_format) by a factor of (height_factor, width_factor). Both factors should be positive integers.

resize_volumes
keras.backend.resize_volumes(x, depth_factor, height_factor, width_factor, data_format)
Resizes the volume contained in a 5D tensor of shape [batch, channels, depth, height, width] (for 'channels_first' data_format) or [batch, depth, height, width, channels] (for 'channels_last' data_format) by a factor of (depth_factor, height_factor, width_factor). All three factors should be positive integers.

repeat
keras.backend.repeat(x, n)
Repeats a 2D tensor. If x has shape (samples, dim) and n=2, the output will have shape (samples, 2, dim).

arange
keras.backend.arange(start, stop=None, step=1, dtype='int32')
Creates a 1-D tensor containing a sequence of integers. The function arguments use the same convention as Theano's arange: if only one argument is provided, it is in fact the "stop" argument. The default type of the returned tensor is 'int32' to match TensorFlow's default.

tile
keras.backend.tile(x, n)

flatten
keras.backend.flatten(x)

batch_flatten
keras.backend.batch_flatten(x)
Turns an nD tensor into a 2D tensor where the first dimension is conserved.

expand_dims
keras.backend.expand_dims(x, axis=-1)
Adds a 1-sized dimension at index "axis".

squeeze
keras.backend.squeeze(x, axis)
Removes a 1-dimension from the tensor at index "axis".

temporal_padding
keras.backend.temporal_padding(x, padding=(1, 1))
Pads the middle dimension of a 3D tensor with "padding" zeros left and right. Apologies for the inane API, but Theano makes this really hard.

spatial_2d_padding
keras.backend.spatial_2d_padding(x, padding=((1, 1), (1, 1)), data_format=None)
Pads the 2nd and 3rd dimensions of a 4D tensor with "padding[0]" and "padding[1]" (respectively) zeros left and right.

spatial_3d_padding
keras.backend.spatial_3d_padding(x, padding=((1, 1), (1, 1), (1, 1)), data_format=None)
Pads the 2nd, 3rd and 4th dimensions of a 5D tensor with "padding[0]", "padding[1]" and "padding[2]" (respectively) zeros left and right.

stack
keras.backend.stack(x, axis=0)
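A sketch of the reshaping helpers above (illustrative shapes):

    import numpy as np
    from keras import backend as K

    x = K.variable(np.zeros((32, 10, 4)))
    print(K.eval(K.batch_flatten(x)).shape)       # (32, 40) -- first dimension conserved
    print(K.int_shape(K.expand_dims(x, axis=1)))  # (32, 1, 10, 4)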
one_hot
keras.backend.one_hot(indices, num_classes)
Input: nD integer tensor of shape (batch_size, dim1, dim2, ... dim(n-1)). Output: (n + 1)D one-hot representation of the input with shape (batch_size, dim1, dim2, ... dim(n-1), num_classes).

reverse
keras.backend.reverse(x, axes)
Reverses a tensor along the specified axes.

slice
keras.backend.slice(x, start, size)

pattern_broadcast
keras.backend.pattern_broadcast(x, broadcastable)

get_value
keras.backend.get_value(x)

batch_get_value
keras.backend.batch_get_value(xs)
Returns the value of more than one tensor variable, as a list of Numpy arrays.
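A round-trip through set_value (documented earlier) and get_value, as a sketch:

    import numpy as np
    from keras import backend as K

    v = K.variable(np.zeros((2, 2)))
    K.set_value(v, np.ones((2, 2)))  # push a new numpy value into the variable
    print(K.get_value(v))            # [[1. 1.] [1. 1.]]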
Usage of callbacks
A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument callbacks) to the .fit() method of the Sequential or Model classes. The relevant methods of the callbacks will then be called at each stage of the training.

Callback
keras.callbacks.Callback()
Abstract base class used to build new callbacks.
Properties:
- params: dict. Training parameters (e.g. verbosity, batch size, number of epochs...).
- model: instance of keras.models.Model. Reference of the model being trained.
The logs dictionary that callback methods take as argument will contain keys for quantities relevant to the current batch or epoch. Currently, the .fit() method of the Sequential model class will include the following quantities in the logs that it passes to its callbacks:
- on_epoch_end: logs include acc and loss, and optionally include val_loss (if validation is enabled in fit) and val_acc (if validation and accuracy monitoring are enabled).
- on_batch_begin: logs include size, the number of samples in the current batch.
- on_batch_end: logs include loss, and optionally acc (if accuracy monitoring is enabled).

BaseLogger
keras.callbacks.BaseLogger(stateful_metrics=None)
Callback that accumulates epoch averages of metrics. This callback is automatically applied to every Keras model.
Arguments:
- stateful_metrics: iterable of string names of metrics that should not be averaged over an epoch. Metrics in this list will be logged as-is in on_epoch_end. All others will be averaged in on_epoch_end.

TerminateOnNaN
keras.callbacks.TerminateOnNaN()
Callback that terminates training when a NaN loss is encountered.

ProgbarLogger
keras.callbacks.ProgbarLogger(count_mode='samples', stateful_metrics=None)
Callback that prints metrics to stdout.
Arguments:
- count_mode: one of "steps" or "samples". Whether the progress bar should count samples seen or steps (batches) seen.
- stateful_metrics: iterable of string names of metrics that should not be averaged over an epoch. Metrics in this list will be logged as-is. All others will be averaged over time (e.g. loss, etc.).
Raises: ValueError in case of invalid count_mode.

History
keras.callbacks.History()
Callback that records events into a History object. This callback is automatically applied to every Keras model. The History object gets returned by the fit method of models.
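A minimal custom callback built on the base class above (the class name is illustrative):

    from keras.callbacks import Callback

    class LossHistory(Callback):
        # Collect the training loss reported at the end of every batch.
        def on_train_begin(self, logs=None):
            self.losses = []

        def on_batch_end(self, batch, logs=None):
            self.losses.append((logs or {}).get('loss'))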
ModelCheckpoint
keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
Save the model after every epoch. filepath can contain named formatting options, which will be filled with the value of epoch and keys in logs (passed in on_epoch_end). For example: if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, then the model checkpoints will be saved with the epoch number and the validation loss in the filename.
Arguments:
- filepath: string, path to save the model file.
- monitor: quantity to monitor.
- verbose: verbosity mode, 0 or 1.
- save_best_only: if save_best_only=True, the latest best model according to the quantity monitored will not be overwritten.
- mode: one of {auto, min, max}. If save_best_only=True, the decision to overwrite the current save file is made based on either the maximization or the minimization of the monitored quantity. For val_acc, this should be max; for val_loss it should be min, etc. In auto mode, the direction is automatically inferred from the name of the monitored quantity.
- save_weights_only: if True, then only the model's weights will be saved (model.save_weights(filepath)); else the full model is saved (model.save(filepath)).
- period: interval (number of epochs) between checkpoints.

EarlyStopping
keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0, mode='auto', baseline=None, restore_best_weights=False)
Stop training when a monitored quantity has stopped improving.
Arguments:
- monitor: quantity to be monitored.
- min_delta: minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement.
- patience: number of epochs with no improvement after which training will be stopped.
- verbose: verbosity mode.
- mode: one of {auto, min, max}. In min mode, training will stop when the quantity monitored has stopped decreasing; in max mode it will stop when the quantity monitored has stopped increasing; in auto mode, the direction is automatically inferred from the name of the monitored quantity.
- baseline: baseline value for the monitored quantity to reach. Training will stop if the model doesn't show improvement over the baseline.
- restore_best_weights: whether to restore model weights from the epoch with the best value of the monitored quantity. If False, the model weights obtained at the last step of training are used.

RemoteMonitor
keras.callbacks.RemoteMonitor(root='http://localhost:9000', path='/publish/epoch/end/', field='data', headers=None, send_as_json=False)
Callback used to stream events to a server. Requires the requests library. Events are sent to root + '/publish/epoch/end/' by default. Calls are HTTP POST, with a data argument which is a JSON-encoded dictionary of event data. If send_as_json is set to True, the content type of the request will be application/json. Otherwise the serialized JSON will be sent within a form.
Arguments:
- root: string; root url of the target server.
- path: string; path relative to root to which the events will be sent.
- field: string; JSON field under which the data will be stored. The field is used only if the payload is sent within a form (i.e. send_as_json is set to False).
- headers: dictionary; optional custom HTTP headers.
- send_as_json: boolean; whether the request should be sent as application/json.
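A typical pairing of the two checkpointing callbacks above, as a sketch (model, x_train and y_train are assumed to exist):

    from keras.callbacks import EarlyStopping, ModelCheckpoint

    callbacks = [
        EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
        ModelCheckpoint('weights.{epoch:02d}-{val_loss:.2f}.hdf5',
                        monitor='val_loss', save_best_only=True),
    ]
    # model.fit(x_train, y_train, validation_split=0.1, epochs=50, callbacks=callbacks)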
[source] LearningRateScheduler keras.callbacks.LearningRateScheduler(schedule, verbose=0) Learning rate scheduler. Arguments schedule : a function that takes an epoch index as input (integer, indexed from 0) and the current learning rate, and returns a new learning rate as output (float). verbose : int. 0: quiet, 1: update messages. [source] TensorBoard keras.callbacks.TensorBoard(log_dir=\\'./logs\\', histogram_freq=0, batch_size=32, write_graph=True, write_grads=False, write_images=False, embeddings_freq=0, embeddings_layer_names=None, embeddings_metadata=None, embeddings_data=None, update_freq=\\'epoch\\') TensorBoard basic visualizations. TensorBoard is a visualization tool provided with TensorFlow. This callback writes a log for TensorBoard, which allows you to visualize dynamic graphs of your training and test metrics, as well as activation histograms for the different layers in your model. If you have installed TensorFlow with pip, you should be able to launch TensorBoard from the command line: tensorboard --logdir=/full_path_to_your_logs When using a backend other than TensorFlow, TensorBoard will still work (if you have TensorFlow installed), but the only feature available will be the display of the losses and metrics plots. Arguments log_dir : the path of the directory where to save the log files to be parsed by TensorBoard. histogram_freq : frequency (in epochs) at which to compute activation and weight histograms for the layers of the model. If set to 0, histograms won\\'t be computed. Validation data (or split) must be specified for histogram visualizations. write_graph : whether to visualize the graph in TensorBoard. The log file can become quite large when write_graph is set to True. write_grads : whether to visualize gradient histograms in TensorBoard. histogram_freq must be greater than 0. batch_size : size of batch of inputs to feed to the network for histograms computation. write_images : whether to write model weights to visualize as image in TensorBoard. embeddings_freq : frequency (in epochs) at which selected embedding layers will be saved. If set to 0, embeddings won\\'t be computed. Data to be visualized in TensorBoard\\'s Embedding tab must be passed as embeddings_data . embeddings_layer_names : a list of names of layers to keep an eye on. If None or an empty list, all embedding layers will be watched. embeddings_metadata : a dictionary which maps layer name to a file name in which metadata for this embedding layer is saved. See the details about the metadata file format. If the same metadata file is used for all embedding layers, a string can be passed. embeddings_data : data to be embedded at layers specified in embeddings_layer_names . Numpy array (if the model has a single input) or list of Numpy arrays (if the model has multiple inputs). Learn more about embeddings (https://www.tensorflow.org/programmers_guide/embedding). update_freq : \\'batch\\' or \\'epoch\\' or integer. When using \\'batch\\' , writes the losses and metrics to TensorBoard after each batch. The same applies for \\'epoch\\' . If using an integer, let\\'s say 10000 , the callback will write the metrics and losses to TensorBoard every 10000 samples. Note that writing too frequently to TensorBoard can slow down your training. [source] ReduceLROnPlateau keras.callbacks.ReduceLROnPlateau(monitor=\\'val_loss\\', factor=0.1, patience=10, verbose=0, mode=\\'auto\\', min_delta=0.0001, cooldown=0, min_lr=0) Reduce learning rate when a metric has stopped improving. 
Models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates. This callback monitors a quantity and if no improvement is seen for a \\'patience\\' number of epochs, the learning rate is reduced. Example reduce_lr = ReduceLROnPlateau(monitor=\\'val_loss\\', factor=0.2, patience=5, min_lr=0.001) model.fit(X_train, Y_train, callbacks=[reduce_lr]) Arguments monitor : quantity to be monitored. factor : factor by which the learning rate will be reduced. new_lr = lr * factor patience : number of epochs with no improvement after which learning rate will be reduced. verbose : int. 0: quiet, 1: update messages. mode : one of {auto, min, max}. In min mode, lr will be reduced when the quantity monitored has stopped decreasing; in max mode it will be reduced when the quantity monitored has stopped increasing; in auto mode, the direction is automatically inferred from the name of the monitored quantity. min_delta : threshold for measuring the new optimum, to only focus on significant changes. cooldown : number of epochs to wait before resuming normal operation after lr has been reduced. min_lr : lower bound on the learning rate. [source] CSVLogger keras.callbacks.CSVLogger(filename, separator=\\',\\', append=False) Callback that streams epoch results to a csv file. Supports all values that can be represented as a string, including 1D iterables such as np.ndarray. Example csv_logger = CSVLogger(\\'training.log\\') model.fit(X_train, Y_train, callbacks=[csv_logger]) Arguments filename : filename of the csv file, e.g. \\'run/log.csv\\'. separator : string used to separate elements in the csv file. append : True: append if file exists (useful for continuing training). False: overwrite existing file. [source] LambdaCallback keras.callbacks.LambdaCallback(on_epoch_begin=None, on_epoch_end=None, on_batch_begin=None, on_batch_end=None, on_train_begin=None, on_train_end=None) Callback for creating simple, custom callbacks on-the-fly. This callback is constructed with anonymous functions that will be called at the appropriate time. Note that the callbacks expect positional arguments, as follows: on_epoch_begin and on_epoch_end expect two positional arguments: epoch , logs on_batch_begin and on_batch_end expect two positional arguments: batch , logs on_train_begin and on_train_end expect one positional argument: logs Arguments on_epoch_begin : called at the beginning of every epoch. on_epoch_end : called at the end of every epoch. on_batch_begin : called at the beginning of every batch. on_batch_end : called at the end of every batch. on_train_begin : called at the beginning of model training. on_train_end : called at the end of model training. Example # Print the batch number at the beginning of every batch. batch_print_callback = LambdaCallback( on_batch_begin=lambda batch,logs: print(batch)) # Stream the epoch loss to a file in JSON format. The file content # is not well-formed JSON but rather has a JSON object per line. import json json_log = open(\\'loss_log.json\\', mode=\\'wt\\', buffering=1) json_logging_callback = LambdaCallback( on_epoch_end=lambda epoch, logs: json_log.write( json.dumps({\\'epoch\\': epoch, \\'loss\\': logs[\\'loss\\']}) + \\'\\\\n\\'), on_train_end=lambda logs: json_log.close() ) # Terminate some processes after having finished model training. processes = ... 
cleanup_callback = LambdaCallback( on_train_end=lambda logs: [ p.terminate() for p in processes if p.is_alive()]) model.fit(..., callbacks=[batch_print_callback, json_logging_callback, cleanup_callback]) Create a callback You can create a custom callback by extending the base class keras.callbacks.Callback . A callback has access to its associated model through the class property self.model . Here\\'s a simple example saving a list of losses over each batch during training: class LossHistory(keras.callbacks.Callback): def on_train_begin(self, logs={}): self.losses = [] def on_batch_end(self, batch, logs={}): self.losses.append(logs.get(\\'loss\\')) Example: recording loss history class LossHistory(keras.callbacks.Callback): def on_train_begin(self, logs={}): self.losses = [] def on_batch_end(self, batch, logs={}): self.losses.append(logs.get(\\'loss\\')) model = Sequential() model.add(Dense(10, input_dim=784, kernel_initializer=\\'uniform\\')) model.add(Activation(\\'softmax\\')) model.compile(loss=\\'categorical_crossentropy\\', optimizer=\\'rmsprop\\') history = LossHistory() model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0, callbacks=[history]) print(history.losses) # outputs \\'\\'\\' [0.66047596406559383, 0.3547245744908703, ..., 0.25953155204159617, 0.25901699725311789] \\'\\'\\' Example: model checkpoints from keras.callbacks import ModelCheckpoint model = Sequential() model.add(Dense(10, input_dim=784, kernel_initializer=\\'uniform\\')) model.add(Activation(\\'softmax\\')) model.compile(loss=\\'categorical_crossentropy\\', optimizer=\\'rmsprop\\') \\'\\'\\' saves the model weights after each epoch if the validation loss decreased \\'\\'\\' checkpointer = ModelCheckpoint(filepath=\\'/tmp/weights.hdf5\\', verbose=1, save_best_only=True) model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0, validation_data=(X_test, Y_test), callbacks=[checkpointer])'), ('title', 'Callbacks')]), OrderedDict([('location', 'callbacks.html#usage-of-callbacks'), ('text', 'A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument callbacks ) to the .fit() method of the Sequential or Model classes. The relevant methods of the callbacks will then be called at each stage of the training. [source]'), ('title', 'Usage of callbacks')]), OrderedDict([('location', 'callbacks.html#callback'), ('text', 'keras.callbacks.Callback() Abstract base class used to build new callbacks. Properties params : dict. Training parameters (eg. verbosity, batch size, number of epochs...). model : instance of keras.models.Model . Reference of the model being trained. The logs dictionary that callback methods take as argument will contain keys for quantities relevant to the current batch or epoch. Currently, the .fit() method of the Sequential model class will include the following quantities in the logs that it passes to its callbacks: on_epoch_end: logs include acc and loss , and optionally include val_loss (if validation is enabled in fit ), and val_acc (if validation and accuracy monitoring are enabled). on_batch_begin: logs include size , the number of samples in the current batch. on_batch_end: logs include loss , and optionally acc (if accuracy monitoring is enabled). 
[source]'), ('title', 'Callback')]), OrderedDict([('location', 'callbacks.html#baselogger'), ('text', 'keras.callbacks.BaseLogger(stateful_metrics=None) Callback that accumulates epoch averages of metrics. This callback is automatically applied to every Keras model. Arguments stateful_metrics : Iterable of string names of metrics that should not be averaged over an epoch. Metrics in this list will be logged as-is in on_epoch_end . All others will be averaged in on_epoch_end . [source]'), ('title', 'BaseLogger')]), OrderedDict([('location', 'callbacks.html#terminateonnan'), ('text', 'keras.callbacks.TerminateOnNaN() Callback that terminates training when a NaN loss is encountered. [source]'), ('title', 'TerminateOnNaN')]), OrderedDict([('location', 'callbacks.html#progbarlogger'), ('text', 'keras.callbacks.ProgbarLogger(count_mode=\\'samples\\', stateful_metrics=None) Callback that prints metrics to stdout. Arguments count_mode : One of \"steps\" or \"samples\". Whether the progress bar should count samples seen or steps (batches) seen. stateful_metrics : Iterable of string names of metrics that should not be averaged over an epoch. Metrics in this list will be logged as-is. All others will be averaged over time (e.g. loss, etc). Raises ValueError : In case of invalid count_mode . [source]'), ('title', 'ProgbarLogger')]), OrderedDict([('location', 'callbacks.html#history'), ('text', 'keras.callbacks.History() Callback that records events into a History object. This callback is automatically applied to every Keras model. The History object gets returned by the fit method of models. [source]'), ('title', 'History')]), OrderedDict([('location', 'callbacks.html#modelcheckpoint'), ('text', \"keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1) Save the model after every epoch. filepath can contain named formatting options, which will be filled with the values of epoch and keys in logs (passed in on_epoch_end ). For example: if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5 , then the model checkpoints will be saved with the epoch number and the validation loss in the filename. Arguments filepath : string, path to save the model file. monitor : quantity to monitor. verbose : verbosity mode, 0 or 1. save_best_only : if save_best_only=True , the latest best model according to the quantity monitored will not be overwritten. mode : one of {auto, min, max}. If save_best_only=True , the decision to overwrite the current save file is made based on either the maximization or the minimization of the monitored quantity. For val_acc , this should be max , for val_loss this should be min , etc. In auto mode, the direction is automatically inferred from the name of the monitored quantity. save_weights_only : if True, then only the model's weights will be saved ( model.save_weights(filepath) ), else the full model is saved ( model.save(filepath) ). period : Interval (number of epochs) between checkpoints. [source]\"), ('title', 'ModelCheckpoint')]), OrderedDict([('location', 'callbacks.html#earlystopping'), ('text', \"keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0, mode='auto', baseline=None, restore_best_weights=False) Stop training when a monitored quantity has stopped improving. Arguments monitor : quantity to be monitored. min_delta : minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement. 
patience : number of epochs with no improvement after which training will be stopped. verbose : verbosity mode. mode : one of {auto, min, max}. In min mode, training will stop when the quantity monitored has stopped decreasing; in max mode it will stop when the quantity monitored has stopped increasing; in auto mode, the direction is automatically inferred from the name of the monitored quantity. baseline : Baseline value for the monitored quantity to reach. Training will stop if the model doesn't show improvement over the baseline. restore_best_weights : whether to restore model weights from the epoch with the best value of the monitored quantity. If False, the model weights obtained at the last step of training are used. [source]\"), ('title', 'EarlyStopping')]), OrderedDict([('location', 'callbacks.html#remotemonitor'), ('text', \"keras.callbacks.RemoteMonitor(root='http://localhost:9000', path='/publish/epoch/end/', field='data', headers=None, send_as_json=False) Callback used to stream events to a server. Requires the requests library. Events are sent to root + '/publish/epoch/end/' by default. Calls are HTTP POST, with a data argument which is a JSON-encoded dictionary of event data. If send_as_json is set to True, the content type of the request will be application/json. Otherwise the serialized JSON will be sent within a form. Arguments root : String; root URL of the target server. path : String; path relative to root to which the events will be sent. field : String; JSON field under which the data will be stored. The field is used only if the payload is sent within a form (i.e. send_as_json is set to False). headers : Dictionary; optional custom HTTP headers. send_as_json : Boolean; whether the request should be sent as application/json. [source]\"), ('title', 'RemoteMonitor')]), OrderedDict([('location', 'callbacks.html#learningratescheduler'), ('text', 'keras.callbacks.LearningRateScheduler(schedule, verbose=0) Learning rate scheduler. Arguments schedule : a function that takes an epoch index as input (integer, indexed from 0) and the current learning rate, and returns a new learning rate as output (float). verbose : int. 0: quiet, 1: update messages. [source]'), ('title', 'LearningRateScheduler')]), OrderedDict([('location', 'callbacks.html#tensorboard'), ('text', \"keras.callbacks.TensorBoard(log_dir='./logs', histogram_freq=0, batch_size=32, write_graph=True, write_grads=False, write_images=False, embeddings_freq=0, embeddings_layer_names=None, embeddings_metadata=None, embeddings_data=None, update_freq='epoch') TensorBoard basic visualizations. TensorBoard is a visualization tool provided with TensorFlow. This callback writes a log for TensorBoard, which allows you to visualize dynamic graphs of your training and test metrics, as well as activation histograms for the different layers in your model. If you have installed TensorFlow with pip, you should be able to launch TensorBoard from the command line: tensorboard --logdir=/full_path_to_your_logs When using a backend other than TensorFlow, TensorBoard will still work (if you have TensorFlow installed), but the only feature available will be the display of the losses and metrics plots. Arguments log_dir : the path of the directory where to save the log files to be parsed by TensorBoard. histogram_freq : frequency (in epochs) at which to compute activation and weight histograms for the layers of the model. If set to 0, histograms won't be computed. Validation data (or split) must be specified for histogram visualizations. 
write_graph : whether to visualize the graph in TensorBoard. The log file can become quite large when write_graph is set to True. write_grads : whether to visualize gradient histograms in TensorBoard. histogram_freq must be greater than 0. batch_size : size of batch of inputs to feed to the network for histograms computation. write_images : whether to write model weights to visualize as image in TensorBoard. embeddings_freq : frequency (in epochs) at which selected embedding layers will be saved. If set to 0, embeddings won't be computed. Data to be visualized in TensorBoard's Embedding tab must be passed as embeddings_data . embeddings_layer_names : a list of names of layers to keep an eye on. If None or an empty list, all embedding layers will be watched. embeddings_metadata : a dictionary which maps layer name to a file name in which metadata for this embedding layer is saved. See the details about the metadata file format. If the same metadata file is used for all embedding layers, a string can be passed. embeddings_data : data to be embedded at layers specified in embeddings_layer_names . Numpy array (if the model has a single input) or list of Numpy arrays (if the model has multiple inputs). Learn more about embeddings (https://www.tensorflow.org/programmers_guide/embedding). update_freq : 'batch' or 'epoch' or integer. When using 'batch' , writes the losses and metrics to TensorBoard after each batch. The same applies for 'epoch' . If using an integer, let's say 10000 , the callback will write the metrics and losses to TensorBoard every 10000 samples. Note that writing too frequently to TensorBoard can slow down your training. [source]\"), ('title', 'TensorBoard')]), OrderedDict([('location', 'callbacks.html#reducelronplateau'), ('text', \"keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0) Reduce learning rate when a metric has stopped improving. Models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates. This callback monitors a quantity and if no improvement is seen for a 'patience' number of epochs, the learning rate is reduced. Example reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.001) model.fit(X_train, Y_train, callbacks=[reduce_lr]) Arguments monitor : quantity to be monitored. factor : factor by which the learning rate will be reduced. new_lr = lr * factor patience : number of epochs with no improvement after which learning rate will be reduced. verbose : int. 0: quiet, 1: update messages. mode : one of {auto, min, max}. In min mode, lr will be reduced when the quantity monitored has stopped decreasing; in max mode it will be reduced when the quantity monitored has stopped increasing; in auto mode, the direction is automatically inferred from the name of the monitored quantity. min_delta : threshold for measuring the new optimum, to only focus on significant changes. cooldown : number of epochs to wait before resuming normal operation after lr has been reduced. min_lr : lower bound on the learning rate. [source]\"), ('title', 'ReduceLROnPlateau')]), OrderedDict([('location', 'callbacks.html#csvlogger'), ('text', \"keras.callbacks.CSVLogger(filename, separator=',', append=False) Callback that streams epoch results to a csv file. Supports all values that can be represented as a string, including 1D iterables such as np.ndarray. 
Example csv_logger = CSVLogger('training.log') model.fit(X_train, Y_train, callbacks=[csv_logger]) Arguments filename : filename of the csv file, e.g. 'run/log.csv'. separator : string used to separate elements in the csv file. append : True: append if file exists (useful for continuing training). False: overwrite existing file. [source]\"), ('title', 'CSVLogger')]), OrderedDict([('location', 'callbacks.html#lambdacallback'), ('text', \"keras.callbacks.LambdaCallback(on_epoch_begin=None, on_epoch_end=None, on_batch_begin=None, on_batch_end=None, on_train_begin=None, on_train_end=None) Callback for creating simple, custom callbacks on-the-fly. This callback is constructed with anonymous functions that will be called at the appropriate time. Note that the callbacks expect positional arguments, as follows: on_epoch_begin and on_epoch_end expect two positional arguments: epoch , logs on_batch_begin and on_batch_end expect two positional arguments: batch , logs on_train_begin and on_train_end expect one positional argument: logs Arguments on_epoch_begin : called at the beginning of every epoch. on_epoch_end : called at the end of every epoch. on_batch_begin : called at the beginning of every batch. on_batch_end : called at the end of every batch. on_train_begin : called at the beginning of model training. on_train_end : called at the end of model training. Example # Print the batch number at the beginning of every batch. batch_print_callback = LambdaCallback( on_batch_begin=lambda batch,logs: print(batch)) # Stream the epoch loss to a file in JSON format. The file content # is not well-formed JSON but rather has a JSON object per line. import json json_log = open('loss_log.json', mode='wt', buffering=1) json_logging_callback = LambdaCallback( on_epoch_end=lambda epoch, logs: json_log.write( json.dumps({'epoch': epoch, 'loss': logs['loss']}) + '\\\\n'), on_train_end=lambda logs: json_log.close() ) # Terminate some processes after having finished model training. processes = ... cleanup_callback = LambdaCallback( on_train_end=lambda logs: [ p.terminate() for p in processes if p.is_alive()]) model.fit(..., callbacks=[batch_print_callback, json_logging_callback, cleanup_callback])\"), ('title', 'LambdaCallback')]), OrderedDict([('location', 'callbacks.html#create-a-callback'), ('text', \"You can create a custom callback by extending the base class keras.callbacks.Callback . A callback has access to its associated model through the class property self.model . 
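Before the loss-history example below, a short hedged sketch of self.model in use; the PredictionLogger class and the x_probe array are hypothetical names invented for this illustration:

import keras

class PredictionLogger(keras.callbacks.Callback):
    def __init__(self, x_probe):
        super(PredictionLogger, self).__init__()
        self.x_probe = x_probe
        self.predictions = []

    def on_epoch_end(self, epoch, logs=None):
        # self.model is bound by Keras to the model being trained
        self.predictions.append(self.model.predict(self.x_probe))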
Here's a simple example saving a list of losses over each batch during training: class LossHistory(keras.callbacks.Callback): def on_train_begin(self, logs={}): self.losses = [] def on_batch_end(self, batch, logs={}): self.losses.append(logs.get('loss'))\"), ('title', 'Create a callback')]), OrderedDict([('location', 'callbacks.html#example-recording-loss-history'), ('text', \"class LossHistory(keras.callbacks.Callback): def on_train_begin(self, logs={}): self.losses = [] def on_batch_end(self, batch, logs={}): self.losses.append(logs.get('loss')) model = Sequential() model.add(Dense(10, input_dim=784, kernel_initializer='uniform')) model.add(Activation('softmax')) model.compile(loss='categorical_crossentropy', optimizer='rmsprop') history = LossHistory() model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0, callbacks=[history]) print(history.losses) # outputs ''' [0.66047596406559383, 0.3547245744908703, ..., 0.25953155204159617, 0.25901699725311789] '''\"), ('title', 'Example: recording loss history')]), OrderedDict([('location', 'callbacks.html#example-model-checkpoints'), ('text', \"from keras.callbacks import ModelCheckpoint model = Sequential() model.add(Dense(10, input_dim=784, kernel_initializer='uniform')) model.add(Activation('softmax')) model.compile(loss='categorical_crossentropy', optimizer='rmsprop') ''' saves the model weights after each epoch if the validation loss decreased ''' checkpointer = ModelCheckpoint(filepath='/tmp/weights.hdf5', verbose=1, save_best_only=True) model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0, validation_data=(X_test, Y_test), callbacks=[checkpointer])\"), ('title', 'Example: model checkpoints')]), OrderedDict([('location', 'constraints.html'), ('text', 'Usage of constraints Functions from the constraints module allow setting constraints (e.g. non-negativity) on network parameters during optimization. The constraints are applied on a per-layer basis. The exact API will depend on the layer, but the layers Dense , Conv1D , Conv2D and Conv3D have a unified API. These layers expose 2 keyword arguments: kernel_constraint for the main weights matrix bias_constraint for the bias. from keras.constraints import max_norm model.add(Dense(64, kernel_constraint=max_norm(2.))) Available constraints max_norm(max_value=2, axis=0) : maximum-norm constraint non_neg() : non-negativity constraint unit_norm(axis=0) : unit-norm constraint min_max_norm(min_value=0.0, max_value=1.0, rate=1.0, axis=0) : minimum/maximum-norm constraint'), ('title', 'Constraints')]), OrderedDict([('location', 'constraints.html#usage-of-constraints'), ('text', 'Functions from the constraints module allow setting constraints (e.g. non-negativity) on network parameters during optimization. The constraints are applied on a per-layer basis. The exact API will depend on the layer, but the layers Dense , Conv1D , Conv2D and Conv3D have a unified API. These layers expose 2 keyword arguments: kernel_constraint for the main weights matrix bias_constraint for the bias. 
from keras.constraints import max_norm model.add(Dense(64, kernel_constraint=max_norm(2.)))'), ('title', 'Usage of constraints')]), OrderedDict([('location', 'constraints.html#available-constraints'), ('text', 'max_norm(max_value=2, axis=0) : maximum-norm constraint non_neg() : non-negativity constraint unit_norm(axis=0) : unit-norm constraint min_max_norm(min_value=0.0, max_value=1.0, rate=1.0, axis=0) : minimum/maximum-norm constraint'), ('title', 'Available constraints')]), OrderedDict([('location', 'contributing.html'), ('text', 'On Github Issues and Pull Requests Found a bug? Have a new feature to suggest? Want to contribute changes to the codebase? Make sure to read this first. Bug reporting Your code doesn\\'t work, and you have determined that the issue lies with Keras? Follow these steps to report a bug. Your bug may already be fixed. Make sure to update to the current Keras master branch, as well as the latest Theano/TensorFlow/CNTK master branch. To easily update Theano: pip install git+git://github.com/Theano/Theano.git --upgrade Search for similar issues. Make sure to delete is:open on the issue search to find solved tickets as well. It\\'s possible somebody has encountered this bug already. Also remember to check out Keras\\' FAQ . Still having a problem? Open an issue on Github to let us know. Make sure you provide us with useful information about your configuration: what OS are you using? What Keras backend are you using? Are you running on GPU? If so, what is your version of Cuda, of cuDNN? What is your GPU? Provide us with a script to reproduce the issue. This script should be runnable as-is and should not require external data download (use randomly generated data if you need to run a model on some test data). We recommend that you use Github Gists to post your code. Any issue that cannot be reproduced is likely to be closed. If possible, take a stab at fixing the bug yourself --if you can! The more information you provide, the easier it is for us to validate that there is a bug and the faster we\\'ll be able to take action. If you want your issue to be resolved quickly, following the steps above is crucial. Requesting a Feature You can also use Github issues to request features you would like to see in Keras, or changes in the Keras API. Provide a clear and detailed explanation of the feature you want and why it\\'s important to add. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset. If you\\'re just targeting a minority of users, consider writing an add-on library for Keras. It is crucial for Keras to avoid bloating the API and codebase. Provide code snippets demonstrating the API you have in mind and illustrating the use cases of your feature. Of course, you don\\'t need to write any real code at this point! After discussing the feature you may choose to attempt a Pull Request. If you\\'re at all able, start writing some code. We always have more work to do than time to do it. If you can write some code then that will speed the process along. Requests for Contributions This is the board where we list current outstanding issues and features to be added. If you want to start contributing to Keras, this is the place to start. Pull Requests Where should I submit my pull request? Keras improvements and bugfixes go to the Keras master branch . Experimental new features such as layers and datasets go to keras-contrib . 
Unless it is a new feature listed in Requests for Contributions , in which case it belongs in core Keras. If you think your feature belongs in core Keras, you can submit a design doc to explain your feature and argue for it (see explanations below). Please note that PRs that are primarily about code style (as opposed to fixing bugs, improving docs, or adding new functionality) will likely be rejected. Here\\'s a quick guide to submitting your improvements: If your PR introduces a change in functionality, make sure you start by writing a design doc and sending it to the Keras mailing list to discuss whether the change should be made, and how to handle it. This will save you from having your PR closed down the road! Of course, if your PR is a simple bug fix, you don\\'t need to do that. The process for writing and submitting design docs is as follows: Start from this Google Doc template , and copy it to a new Google Doc. Fill in the content. Note that you will need to insert code examples. To insert code, use a Google Doc extension such as CodePretty (there are several such extensions available). Set sharing settings to \"everyone with the link is allowed to comment\". Send the document to keras-users@googlegroups.com with a subject that starts with [API DESIGN REVIEW] (all caps) so that we notice it. Wait for comments, and answer them as they come. Edit the proposal as necessary. The proposal will finally be approved or rejected. Once approved, you can send out Pull Requests or ask others to write Pull Requests. Write the code (or get others to write it). This is the hard part! Make sure any new function or class you introduce has proper docstrings. Make sure any code you touch still has up-to-date docstrings and documentation. Docstring style should be respected. In particular, docstrings should be formatted in MarkDown, and there should be sections for Arguments , Returns , Raises (if applicable). Look at other docstrings in the codebase for examples. Write tests. Your code should have full unit test coverage. If you want to see your PR merged promptly, this is crucial. Run our test suite locally. It\\'s easy: from the Keras folder, simply run: py.test tests/ . You will need to install the test requirements as well: pip install -e .[tests] . Make sure all tests are passing: with the Theano backend, on Python 2.7 and Python 3.6. Make sure you have the development version of Theano. with the TensorFlow backend, on Python 2.7 and Python 3.6. Make sure you have the development version of TensorFlow. with the CNTK backend, on Python 2.7 and Python 3.6. Make sure you have the development version of CNTK. We use PEP8 syntax conventions, but we aren\\'t dogmatic when it comes to line length. Make sure your lines stay reasonably sized, though. To make your life easier, we recommend running a PEP8 linter: Install PEP8 packages: pip install pep8 pytest-pep8 autopep8 Run a standalone PEP8 check: py.test --pep8 -m pep8 You can automatically fix some PEP8 errors by running: autopep8 -i --select <errors> <FILENAME> for example: autopep8 -i --select E128 tests/keras/backend/test_backends.py When committing, use appropriate, descriptive commit messages. Update the documentation. If introducing new functionality, make sure you include code snippets demonstrating the usage of your new feature. Submit your PR. If your changes have been approved in a previous discussion, and if you have complete (and passing) unit tests as well as proper docstrings/documentation, your PR is likely to be merged promptly. 
Adding new examples Even if you don\\'t contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of examples. Existing examples show idiomatic Keras code: make sure to keep your own script in the same spirit.'), ('title', 'Contributing')]), OrderedDict([('location', 'contributing.html#on-github-issues-and-pull-requests'), ('text', 'Found a bug? Have a new feature to suggest? Want to contribute changes to the codebase? Make sure to read this first.'), ('title', 'On Github Issues and Pull Requests')]), OrderedDict([('location', 'contributing.html#bug-reporting'), ('text', \"Your code doesn't work, and you have determined that the issue lies with Keras? Follow these steps to report a bug. Your bug may already be fixed. Make sure to update to the current Keras master branch, as well as the latest Theano/TensorFlow/CNTK master branch. To easily update Theano: pip install git+git://github.com/Theano/Theano.git --upgrade Search for similar issues. Make sure to delete is:open on the issue search to find solved tickets as well. It's possible somebody has encountered this bug already. Also remember to check out Keras' FAQ . Still having a problem? Open an issue on Github to let us know. Make sure you provide us with useful information about your configuration: what OS are you using? What Keras backend are you using? Are you running on GPU? If so, what is your version of Cuda, of cuDNN? What is your GPU? Provide us with a script to reproduce the issue. This script should be runnable as-is and should not require external data download (use randomly generated data if you need to run a model on some test data). We recommend that you use Github Gists to post your code. Any issue that cannot be reproduced is likely to be closed. If possible, take a stab at fixing the bug yourself --if you can! The more information you provide, the easier it is for us to validate that there is a bug and the faster we'll be able to take action. If you want your issue to be resolved quickly, following the steps above is crucial.\"), ('title', 'Bug reporting')]), OrderedDict([('location', 'contributing.html#requesting-a-feature'), ('text', \"You can also use Github issues to request features you would like to see in Keras, or changes in the Keras API. Provide a clear and detailed explanation of the feature you want and why it's important to add. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset. If you're just targeting a minority of users, consider writing an add-on library for Keras. It is crucial for Keras to avoid bloating the API and codebase. Provide code snippets demonstrating the API you have in mind and illustrating the use cases of your feature. Of course, you don't need to write any real code at this point! After discussing the feature you may choose to attempt a Pull Request. If you're at all able, start writing some code. We always have more work to do than time to do it. If you can write some code then that will speed the process along.\"), ('title', 'Requesting a Feature')]), OrderedDict([('location', 'contributing.html#requests-for-contributions'), ('text', 'This is the board where we list current outstanding issues and features to be added. 
If you want to start contributing to Keras, this is the place to start.'), ('title', 'Requests for Contributions')]), OrderedDict([('location', 'contributing.html#pull-requests'), ('text', 'Where should I submit my pull request? Keras improvements and bugfixes go to the Keras master branch . Experimental new features such as layers and datasets go to keras-contrib . Unless it is a new feature listed in Requests for Contributions , in which case it belongs in core Keras. If you think your feature belongs in core Keras, you can submit a design doc to explain your feature and argue for it (see explanations below). Please note that PRs that are primarily about code style (as opposed to fixing bugs, improving docs, or adding new functionality) will likely be rejected. Here\\'s a quick guide to submitting your improvements: If your PR introduces a change in functionality, make sure you start by writing a design doc and sending it to the Keras mailing list to discuss whether the change should be made, and how to handle it. This will save you from having your PR closed down the road! Of course, if your PR is a simple bug fix, you don\\'t need to do that. The process for writing and submitting design docs is as follows: Start from this Google Doc template , and copy it to a new Google Doc. Fill in the content. Note that you will need to insert code examples. To insert code, use a Google Doc extension such as CodePretty (there are several such extensions available). Set sharing settings to \"everyone with the link is allowed to comment\". Send the document to keras-users@googlegroups.com with a subject that starts with [API DESIGN REVIEW] (all caps) so that we notice it. Wait for comments, and answer them as they come. Edit the proposal as necessary. The proposal will finally be approved or rejected. Once approved, you can send out Pull Requests or ask others to write Pull Requests. Write the code (or get others to write it). This is the hard part! Make sure any new function or class you introduce has proper docstrings. Make sure any code you touch still has up-to-date docstrings and documentation. Docstring style should be respected. In particular, docstrings should be formatted in MarkDown, and there should be sections for Arguments , Returns , Raises (if applicable). Look at other docstrings in the codebase for examples. Write tests. Your code should have full unit test coverage. If you want to see your PR merged promptly, this is crucial. Run our test suite locally. It\\'s easy: from the Keras folder, simply run: py.test tests/ . You will need to install the test requirements as well: pip install -e .[tests] . Make sure all tests are passing: with the Theano backend, on Python 2.7 and Python 3.6. Make sure you have the development version of Theano. with the TensorFlow backend, on Python 2.7 and Python 3.6. Make sure you have the development version of TensorFlow. with the CNTK backend, on Python 2.7 and Python 3.6. Make sure you have the development version of CNTK. We use PEP8 syntax conventions, but we aren\\'t dogmatic when it comes to line length. Make sure your lines stay reasonably sized, though. To make your life easier, we recommend running a PEP8 linter: Install PEP8 packages: pip install pep8 pytest-pep8 autopep8 Run a standalone PEP8 check: py.test --pep8 -m pep8 You can automatically fix some PEP8 errors by running: autopep8 -i --select <errors> <FILENAME> for example: autopep8 -i --select E128 tests/keras/backend/test_backends.py When committing, use appropriate, descriptive commit messages. 
Update the documentation. If introducing new functionality, make sure you include code snippets demonstrating the usage of your new feature. Submit your PR. If your changes have been approved in a previous discussion, and if you have complete (and passing) unit tests as well as proper docstrings/documentation, your PR is likely to be merged promptly.'), ('title', 'Pull Requests')]), OrderedDict([('location', 'contributing.html#adding-new-examples'), ('text', \"Even if you don't contribute to the Keras source code, if you have an application of Keras that is concise and powerful, please consider adding it to our collection of examples. Existing examples show idiomatic Keras code: make sure to keep your own script in the same spirit.\"), ('title', 'Adding new examples')]), OrderedDict([('location', 'datasets.html'), ('text', 'Datasets CIFAR10 small image classification Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 10,000 test images. Usage: from keras.datasets import cifar10 (x_train, y_train), (x_test, y_test) = cifar10.load_data() Returns: 2 tuples: x_train, x_test : uint8 array of RGB image data with shape (num_samples, 3, 32, 32) or (num_samples, 32, 32, 3) based on the image_data_format backend setting of either channels_first or channels_last respectively. y_train, y_test : uint8 array of category labels (integers in range 0-9) with shape (num_samples,). CIFAR100 small image classification Dataset of 50,000 32x32 color training images, labeled over 100 categories, and 10,000 test images. Usage: from keras.datasets import cifar100 (x_train, y_train), (x_test, y_test) = cifar100.load_data(label_mode=\\'fine\\') Returns: 2 tuples: x_train, x_test : uint8 array of RGB image data with shape (num_samples, 3, 32, 32) or (num_samples, 32, 32, 3) based on the image_data_format backend setting of either channels_first or channels_last respectively. y_train, y_test : uint8 array of category labels with shape (num_samples,). Arguments: label_mode : \"fine\" or \"coarse\". IMDB Movie reviews sentiment classification Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer \"3\" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: \"only consider the top 10,000 most common words, but eliminate the top 20 most common words\". As a convention, \"0\" does not stand for a specific word, but instead is used to encode any unknown word. Usage: from keras.datasets import imdb (x_train, y_train), (x_test, y_test) = imdb.load_data(path=\"imdb.npz\", num_words=None, skip_top=0, maxlen=None, seed=113, start_char=1, oov_char=2, index_from=3) Returns: 2 tuples: x_train, x_test : list of sequences, which are lists of indexes (integers). If the num_words argument was specified, the maximum possible index value is num_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen. y_train, y_test : list of integer labels (1 or 0). Arguments: path : if you do not have the data locally (at \\'~/.keras/datasets/\\' + path ), it will be downloaded to this location. num_words : integer or None. Top most frequent words to consider. Any less frequent word will appear as oov_char value in the sequence data. skip_top : integer. 
Top most frequent words to ignore (they will appear as oov_char value in the sequence data). maxlen : int. Maximum sequence length. Any longer sequence will be truncated. seed : int. Seed for reproducible data shuffling. start_char : int. The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character. oov_char : int. Words that were cut out because of the num_words or skip_top limit will be replaced with this character. index_from : int. Index actual words with this index and higher. Reuters newswire topics classification Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with the IMDB dataset, each wire is encoded as a sequence of word indexes (same conventions). Usage: from keras.datasets import reuters (x_train, y_train), (x_test, y_test) = reuters.load_data(path=\"reuters.npz\", num_words=None, skip_top=0, maxlen=None, test_split=0.2, seed=113, start_char=1, oov_char=2, index_from=3) The specifications are the same as those of the IMDB dataset, with the addition of: test_split : float. Fraction of the dataset to be used as test data. This dataset also makes available the word index used for encoding the sequences: word_index = reuters.get_word_index(path=\"reuters_word_index.json\") Returns: A dictionary where keys are words (str) and values are indexes (integers). e.g. word_index[\"giraffe\"] might return 1234 . Arguments: path : if you do not have the index file locally (at \\'~/.keras/datasets/\\' + path ), it will be downloaded to this location. MNIST database of handwritten digits Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images. Usage: from keras.datasets import mnist (x_train, y_train), (x_test, y_test) = mnist.load_data() Returns: 2 tuples: x_train, x_test : uint8 array of grayscale image data with shape (num_samples, 28, 28). y_train, y_test : uint8 array of digit labels (integers in range 0-9) with shape (num_samples,). Arguments: path : if you do not have the index file locally (at \\'~/.keras/datasets/\\' + path ), it will be downloaded to this location. Fashion-MNIST database of fashion articles Dataset of 60,000 28x28 grayscale images of 10 fashion categories, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The class labels are: Label Description 0 T-shirt/top 1 Trouser 2 Pullover 3 Dress 4 Coat 5 Sandal 6 Shirt 7 Sneaker 8 Bag 9 Ankle boot Usage: from keras.datasets import fashion_mnist (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data() Returns: 2 tuples: x_train, x_test : uint8 array of grayscale image data with shape (num_samples, 28, 28). y_train, y_test : uint8 array of labels (integers in range 0-9) with shape (num_samples,). Boston housing price regression dataset Dataset taken from the StatLib library which is maintained at Carnegie Mellon University. Samples contain 13 attributes of houses at different locations around the Boston suburbs in the late 1970s. Targets are the median values of the houses at a location (in k$). Usage: from keras.datasets import boston_housing (x_train, y_train), (x_test, y_test) = boston_housing.load_data() Arguments: path : path where to cache the dataset locally (relative to ~/.keras/datasets). seed : Random seed for shuffling the data before computing the test split. test_split : fraction of the data to reserve as test set. 
Returns: Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test) .'), ('title', 'Datasets')]), OrderedDict([('location', 'datasets.html#datasets'), ('text', ''), ('title', 'Datasets')]), OrderedDict([('location', 'datasets.html#cifar10-small-image-classification'), ('text', 'Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 10,000 test images.'), ('title', 'CIFAR10 small image classification')]), OrderedDict([('location', 'datasets.html#usage'), ('text', 'from keras.datasets import cifar10 (x_train, y_train), (x_test, y_test) = cifar10.load_data() Returns: 2 tuples: x_train, x_test : uint8 array of RGB image data with shape (num_samples, 3, 32, 32) or (num_samples, 32, 32, 3) based on the image_data_format backend setting of either channels_first or channels_last respectively. y_train, y_test : uint8 array of category labels (integers in range 0-9) with shape (num_samples,).'), ('title', 'Usage:')]), OrderedDict([('location', 'datasets.html#cifar100-small-image-classification'), ('text', 'Dataset of 50,000 32x32 color training images, labeled over 100 categories, and 10,000 test images.'), ('title', 'CIFAR100 small image classification')]), OrderedDict([('location', 'datasets.html#usage_1'), ('text', 'from keras.datasets import cifar100 (x_train, y_train), (x_test, y_test) = cifar100.load_data(label_mode=\\'fine\\') Returns: 2 tuples: x_train, x_test : uint8 array of RGB image data with shape (num_samples, 3, 32, 32) or (num_samples, 32, 32, 3) based on the image_data_format backend setting of either channels_first or channels_last respectively. y_train, y_test : uint8 array of category labels with shape (num_samples,). Arguments: label_mode : \"fine\" or \"coarse\".'), ('title', 'Usage:')]), OrderedDict([('location', 'datasets.html#imdb-movie-reviews-sentiment-classification'), ('text', 'Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer \"3\" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: \"only consider the top 10,000 most common words, but eliminate the top 20 most common words\". As a convention, \"0\" does not stand for a specific word, but instead is used to encode any unknown word.'), ('title', 'IMDB Movie reviews sentiment classification')]), OrderedDict([('location', 'datasets.html#usage_2'), ('text', 'from keras.datasets import imdb (x_train, y_train), (x_test, y_test) = imdb.load_data(path=\"imdb.npz\", num_words=None, skip_top=0, maxlen=None, seed=113, start_char=1, oov_char=2, index_from=3) Returns: 2 tuples: x_train, x_test : list of sequences, which are lists of indexes (integers). If the num_words argument was specified, the maximum possible index value is num_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen. y_train, y_test : list of integer labels (1 or 0). Arguments: path : if you do not have the data locally (at \\'~/.keras/datasets/\\' + path ), it will be downloaded to this location. num_words : integer or None. Top most frequent words to consider. Any less frequent word will appear as oov_char value in the sequence data. skip_top : integer. Top most frequent words to ignore (they will appear as oov_char value in the sequence data). maxlen : int. Maximum sequence length. 
Any longer sequence will be truncated. seed : int. Seed for reproducible data shuffling. start_char : int. The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character. oov_char : int. Words that were cut out because of the num_words or skip_top limit will be replaced with this character. index_from : int. Index actual words with this index and higher.'), ('title', 'Usage:')]), OrderedDict([('location', 'datasets.html#reuters-newswire-topics-classification'), ('text', 'Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with the IMDB dataset, each wire is encoded as a sequence of word indexes (same conventions).'), ('title', 'Reuters newswire topics classification')]), OrderedDict([('location', 'datasets.html#usage_3'), ('text', 'from keras.datasets import reuters (x_train, y_train), (x_test, y_test) = reuters.load_data(path=\"reuters.npz\", num_words=None, skip_top=0, maxlen=None, test_split=0.2, seed=113, start_char=1, oov_char=2, index_from=3) The specifications are the same as those of the IMDB dataset, with the addition of: test_split : float. Fraction of the dataset to be used as test data. This dataset also makes available the word index used for encoding the sequences: word_index = reuters.get_word_index(path=\"reuters_word_index.json\") Returns: A dictionary where keys are words (str) and values are indexes (integers). e.g. word_index[\"giraffe\"] might return 1234 . Arguments: path : if you do not have the index file locally (at \\'~/.keras/datasets/\\' + path ), it will be downloaded to this location.'), ('title', 'Usage:')]), OrderedDict([('location', 'datasets.html#mnist-database-of-handwritten-digits'), ('text', 'Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.'), ('title', 'MNIST database of handwritten digits')]), OrderedDict([('location', 'datasets.html#usage_4'), ('text', \"from keras.datasets import mnist (x_train, y_train), (x_test, y_test) = mnist.load_data() Returns: 2 tuples: x_train, x_test : uint8 array of grayscale image data with shape (num_samples, 28, 28). y_train, y_test : uint8 array of digit labels (integers in range 0-9) with shape (num_samples,). Arguments: path : if you do not have the index file locally (at '~/.keras/datasets/' + path ), it will be downloaded to this location.\"), ('title', 'Usage:')]), OrderedDict([('location', 'datasets.html#fashion-mnist-database-of-fashion-articles'), ('text', 'Dataset of 60,000 28x28 grayscale images of 10 fashion categories, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The class labels are: Label Description 0 T-shirt/top 1 Trouser 2 Pullover 3 Dress 4 Coat 5 Sandal 6 Shirt 7 Sneaker 8 Bag 9 Ankle boot'), ('title', 'Fashion-MNIST database of fashion articles')]), OrderedDict([('location', 'datasets.html#usage_5'), ('text', 'from keras.datasets import fashion_mnist (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data() Returns: 2 tuples: x_train, x_test : uint8 array of grayscale image data with shape (num_samples, 28, 28). y_train, y_test : uint8 array of labels (integers in range 0-9) with shape (num_samples,).'), ('title', 'Usage:')]), OrderedDict([('location', 'datasets.html#boston-housing-price-regression-dataset'), ('text', 'Dataset taken from the StatLib library which is maintained at Carnegie Mellon University. Samples contain 13 attributes of houses at different locations around the Boston suburbs in the late 1970s. 
Targets are the median values of the houses at a location (in k$).'), ('title', 'Boston housing price regression dataset')]), OrderedDict([('location', 'datasets.html#usage_6'), ('text', 'from keras.datasets import boston_housing (x_train, y_train), (x_test, y_test) = boston_housing.load_data() Arguments: path : path where to cache the dataset locally (relative to ~/.keras/datasets). seed : Random seed for shuffling the data before computing the test split. test_split : fraction of the data to reserve as test set. Returns: Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test) .'), ('title', 'Usage:')]), OrderedDict([('location', 'initializers.html'), ('text', 'Usage of initializers Initializations define the way to set the initial random weights of Keras layers. The keyword arguments used for passing initializers to layers will depend on the layer. Usually it is simply kernel_initializer and bias_initializer : model.add(Dense(64, kernel_initializer=\\'random_uniform\\', bias_initializer=\\'zeros\\')) Available initializers The following built-in initializers are available as part of the keras.initializers module: [source] Ones keras.initializers.Ones() Initializer that generates tensors initialized to 1. [source] Constant keras.initializers.Constant(value=0) Initializer that generates tensors initialized to a constant value. Arguments value : float; the value of the generator tensors. [source] RandomNormal keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None) Initializer that generates tensors with a normal distribution. Arguments mean : a python scalar or a scalar tensor. Mean of the random values to generate. stddev : a python scalar or a scalar tensor. Standard deviation of the random values to generate. seed : A Python integer. Used to seed the random generator. [source] RandomUniform keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None) Initializer that generates tensors with a uniform distribution. Arguments minval : A python scalar or a scalar tensor. Lower bound of the range of random values to generate. maxval : A python scalar or a scalar tensor. Upper bound of the range of random values to generate. Defaults to 1 for float types. seed : A Python integer. Used to seed the random generator. [source] TruncatedNormal keras.initializers.TruncatedNormal(mean=0.0, stddev=0.05, seed=None) Initializer that generates a truncated normal distribution. These values are similar to values from a RandomNormal except that values more than two standard deviations from the mean are discarded and re-drawn. This is the recommended initializer for neural network weights and filters. Arguments mean : a python scalar or a scalar tensor. Mean of the random values to generate. stddev : a python scalar or a scalar tensor. Standard deviation of the random values to generate. seed : A Python integer. Used to seed the random generator. [source] VarianceScaling keras.initializers.VarianceScaling(scale=1.0, mode=\\'fan_in\\', distribution=\\'normal\\', seed=None) Initializer capable of adapting its scale to the shape of weights. 
With distribution="normal", samples are drawn from a truncated normal distribution centered on zero, with stddev = sqrt(scale / n), where n is:
- the number of input units in the weight tensor, if mode = "fan_in"
- the number of output units, if mode = "fan_out"
- the average of the numbers of input and output units, if mode = "fan_avg"
With distribution="uniform", samples are drawn from a uniform distribution within [-limit, limit], with limit = sqrt(3 * scale / n).
Arguments:
scale: Scaling factor (positive float).
mode: One of "fan_in", "fan_out", "fan_avg".
distribution: Random distribution to use. One of "normal", "uniform".
seed: A Python integer. Used to seed the random generator.
Raises:
ValueError: In case of an invalid value for the "scale", "mode" or "distribution" arguments.

Orthogonal
keras.initializers.Orthogonal(gain=1.0, seed=None)
Initializer that generates a random orthogonal matrix.
Arguments:
gain: Multiplicative factor to apply to the orthogonal matrix.
seed: A Python integer. Used to seed the random generator.
References: Saxe et al., http://arxiv.org/abs/1312.6120

Identity
keras.initializers.Identity(gain=1.0)
Initializer that generates the identity matrix. Only use for 2D matrices. If the long side of the matrix is a multiple of the short side, multiple identity matrices are concatenated along the long side.
Arguments:
gain: Multiplicative factor to apply to the identity matrix.

Initializer
keras.initializers.Initializer()
Initializer base class: all initializers inherit from this class.

Zeros
keras.initializers.Zeros()
Initializer that generates tensors initialized to 0.

glorot_normal
keras.initializers.glorot_normal(seed=None)
Glorot normal initializer, also called Xavier normal initializer. It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out)), where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
Arguments:
seed: A Python integer. Used to seed the random generator.
Returns: An initializer.
References: Glorot & Bengio, AISTATS 2010 - http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf

glorot_uniform
keras.initializers.glorot_uniform(seed=None)
Glorot uniform initializer, also called Xavier uniform initializer. It draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)), fan_in is the number of input units in the weight tensor, and fan_out is the number of output units in the weight tensor.
Arguments:
seed: A Python integer. Used to seed the random generator.
Returns: An initializer.
References: Glorot & Bengio, AISTATS 2010 - http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf

he_normal
keras.initializers.he_normal(seed=None)
He normal initializer. It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in), where fan_in is the number of input units in the weight tensor.
Arguments:
seed: A Python integer. Used to seed the random generator.
Returns: An initializer.
References: He et al., http://arxiv.org/abs/1502.01852

lecun_normal
keras.initializers.lecun_normal(seed=None)
LeCun normal initializer. It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in), where fan_in is the number of input units in the weight tensor.
Arguments:
seed: A Python integer. Used to seed the random generator.
Returns: An initializer.
References: Self-Normalizing Neural Networks; Efficient Backprop.

he_uniform
keras.initializers.he_uniform(seed=None)
He uniform variance scaling initializer. It draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / fan_in) and fan_in is the number of input units in the weight tensor.
Arguments:
seed: A Python integer. Used to seed the random generator.
Returns: An initializer.
References: He et al., http://arxiv.org/abs/1502.01852

lecun_uniform
keras.initializers.lecun_uniform(seed=None)
LeCun uniform initializer. It draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(3 / fan_in) and fan_in is the number of input units in the weight tensor.
Arguments:
seed: A Python integer. Used to seed the random generator.
Returns: An initializer.
References: LeCun 98, Efficient Backprop - http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf

An initializer may be passed as a string (must match one of the available initializers above), or as a callable:

from keras import initializers

model.add(Dense(64, kernel_initializer=initializers.random_normal(stddev=0.01)))

# also works; will use the default parameters.
model.add(Dense(64, kernel_initializer='random_normal'))

Using custom initializers

If passing a custom callable, it must take the arguments shape (shape of the variable to initialize) and dtype (dtype of the generated values):

from keras import backend as K

def my_init(shape, dtype=None):
    return K.random_normal(shape, dtype=dtype)

model.add(Dense(64, kernel_initializer=my_init))
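A custom initializer can also be written as a class so that it survives model serialization. A minimal sketch, assuming the documented Initializer base class; the name ScaledNormal and its scale parameter are illustrative, not part of the API:

from keras import backend as K
from keras.initializers import Initializer
from keras.layers import Dense
from keras.models import Sequential

class ScaledNormal(Initializer):
    """Illustrative initializer: normal samples scaled by a constant."""
    def __init__(self, scale=0.01):
        self.scale = scale

    def __call__(self, shape, dtype=None):
        return self.scale * K.random_normal(shape, dtype=dtype)

    def get_config(self):
        # Lets load_model reconstruct the initializer when it is
        # supplied via custom_objects.
        return {'scale': self.scale}

model = Sequential()
model.add(Dense(64, input_shape=(20,), kernel_initializer=ScaledNormal(0.05)))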
Losses

Usage of loss functions

A loss function (or objective function, or optimization score function) is one of the two parameters required to compile a model:

model.compile(loss='mean_squared_error', optimizer='sgd')

from keras import losses

model.compile(loss=losses.mean_squared_error, optimizer='sgd')

You can either pass the name of an existing loss function, or pass a TensorFlow/Theano symbolic function that returns a scalar for each data point and takes the following two arguments:
y_true: True labels. TensorFlow/Theano tensor.
y_pred: Predictions. TensorFlow/Theano tensor of the same shape as y_true.
The actual optimized objective is the mean of the output array across all data points. For a few examples of such functions, check out the losses source.

Available loss functions

mean_squared_error
keras.losses.mean_squared_error(y_true, y_pred)

mean_absolute_error
keras.losses.mean_absolute_error(y_true, y_pred)

mean_absolute_percentage_error
keras.losses.mean_absolute_percentage_error(y_true, y_pred)

mean_squared_logarithmic_error
keras.losses.mean_squared_logarithmic_error(y_true, y_pred)

squared_hinge
keras.losses.squared_hinge(y_true, y_pred)

hinge
keras.losses.hinge(y_true, y_pred)

categorical_hinge
keras.losses.categorical_hinge(y_true, y_pred)

logcosh
keras.losses.logcosh(y_true, y_pred)
Logarithm of the hyperbolic cosine of the prediction error. log(cosh(x)) is approximately equal to (x ** 2) / 2 for small x and to abs(x) - log(2) for large x. This means that 'logcosh' works mostly like the mean squared error, but will not be so strongly affected by the occasional wildly incorrect prediction.
Arguments:
y_true: tensor of true targets.
y_pred: tensor of predicted targets.
Returns: Tensor with one scalar loss entry per sample.

categorical_crossentropy
keras.losses.categorical_crossentropy(y_true, y_pred)

sparse_categorical_crossentropy
keras.losses.sparse_categorical_crossentropy(y_true, y_pred)

binary_crossentropy
keras.losses.binary_crossentropy(y_true, y_pred)

kullback_leibler_divergence
keras.losses.kullback_leibler_divergence(y_true, y_pred)

poisson
keras.losses.poisson(y_true, y_pred)

cosine_proximity
keras.losses.cosine_proximity(y_true, y_pred)

Note: when using the categorical_crossentropy loss, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all zeros except for a 1 at the index corresponding to the class of the sample).
In order to convert integer targets into categorical targets, you can use the Keras utility to_categorical:

from keras.utils.np_utils import to_categorical

categorical_labels = to_categorical(int_labels, num_classes=None)
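A custom loss follows the same contract as the built-ins: take (y_true, y_pred) and return one scalar per sample. A minimal sketch, reusing a model compiled as in the snippets above; the up-weighting of positive targets and the factor 2.0 are illustrative choices, not Keras defaults:

import keras.backend as K

def weighted_mse(y_true, y_pred):
    # Per-sample mean squared error, doubling the weight of entries
    # whose true target is positive.
    weights = 1.0 + K.cast(K.greater(y_true, 0), K.floatx())
    return K.mean(weights * K.square(y_pred - y_true), axis=-1)

model.compile(loss=weighted_mse, optimizer='sgd')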
Metrics

Usage of metrics

A metric is a function that is used to judge the performance of your model. Metric functions are to be supplied in the metrics parameter when a model is compiled:

model.compile(loss='mean_squared_error',
              optimizer='sgd',
              metrics=['mae', 'acc'])

from keras import metrics

model.compile(loss='mean_squared_error',
              optimizer='sgd',
              metrics=[metrics.mae, metrics.categorical_accuracy])

A metric function is similar to a loss function, except that the results from evaluating a metric are not used when training the model. You can either pass the name of an existing metric, or pass a Theano/TensorFlow symbolic function (see Custom metrics).
Arguments:
y_true: True labels. Theano/TensorFlow tensor.
y_pred: Predictions. Theano/TensorFlow tensor of the same shape as y_true.
Returns: Single tensor value representing the mean of the output array across all data points.

Available metrics

binary_accuracy
keras.metrics.binary_accuracy(y_true, y_pred)

categorical_accuracy
keras.metrics.categorical_accuracy(y_true, y_pred)

sparse_categorical_accuracy
keras.metrics.sparse_categorical_accuracy(y_true, y_pred)

top_k_categorical_accuracy
keras.metrics.top_k_categorical_accuracy(y_true, y_pred, k=5)

sparse_top_k_categorical_accuracy
keras.metrics.sparse_top_k_categorical_accuracy(y_true, y_pred, k=5)

Custom metrics

Custom metrics can be passed at the compilation step. The function needs to take (y_true, y_pred) as arguments and return a single tensor value.

import keras.backend as K

def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])
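To change a parameterized metric's default (for example k in top_k_categorical_accuracy), one option is to wrap it in a named function, since the function name is what appears in training logs. A small sketch under that assumption:

from keras.metrics import top_k_categorical_accuracy

def top3_acc(y_true, y_pred):
    # Reported in training logs as 'top3_acc'.
    return top_k_categorical_accuracy(y_true, y_pred, k=3)

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy', top3_acc])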
Optimizers

Usage of optimizers

An optimizer is one of the two arguments required for compiling a Keras model:

from keras import optimizers

model = Sequential()
model.add(Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(Activation('softmax'))

sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)

You can either instantiate an optimizer before passing it to model.compile(), as in the above example, or you can call it by its name. In the latter case, the default parameters for the optimizer will be used.

# pass optimizer by name: default parameters will be used
model.compile(loss='mean_squared_error', optimizer='sgd')

Parameters common to all Keras optimizers

The parameters clipnorm and clipvalue can be used with all optimizers to control gradient clipping:

from keras import optimizers

# All parameter gradients will be clipped to
# a maximum norm of 1.
sgd = optimizers.SGD(lr=0.01, clipnorm=1.)

from keras import optimizers

# All parameter gradients will be clipped to
# a maximum value of 0.5 and
# a minimum value of -0.5.
sgd = optimizers.SGD(lr=0.01, clipvalue=0.5)

SGD
keras.optimizers.SGD(lr=0.01, momentum=0.0, decay=0.0, nesterov=False)
Stochastic gradient descent optimizer. Includes support for momentum, learning rate decay, and Nesterov momentum.
Arguments:
lr: float >= 0. Learning rate.
momentum: float >= 0. Parameter that accelerates SGD in the relevant direction and dampens oscillations.
decay: float >= 0. Learning rate decay over each update.
nesterov: boolean. Whether to apply Nesterov momentum.

RMSprop
keras.optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)
RMSProp optimizer. It is recommended to leave the parameters of this optimizer at their default values (except the learning rate, which can be freely tuned). This optimizer is usually a good choice for recurrent neural networks.
Arguments:
lr: float >= 0. Learning rate.
rho: float >= 0.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
decay: float >= 0. Learning rate decay over each update.
References: rmsprop: Divide the gradient by a running average of its recent magnitude - http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

Adagrad
keras.optimizers.Adagrad(lr=0.01, epsilon=None, decay=0.0)
Adagrad optimizer. Adagrad is an optimizer with parameter-specific learning rates, which are adapted relative to how frequently a parameter gets updated during training. The more updates a parameter receives, the smaller the updates. It is recommended to leave the parameters of this optimizer at their default values.
Arguments:
lr: float >= 0. Initial learning rate.
epsilon: float >= 0. If None, defaults to K.epsilon().
decay: float >= 0. Learning rate decay over each update.
References: Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

Adadelta
keras.optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=None, decay=0.0)
Adadelta optimizer. Adadelta is a more robust extension of Adagrad that adapts learning rates based on a moving window of gradient updates, instead of accumulating all past gradients. This way, Adadelta continues learning even when many updates have been done. Compared to Adagrad, in the original version of Adadelta you don't have to set an initial learning rate. In this version, the initial learning rate and decay factor can be set, as in most other Keras optimizers. It is recommended to leave the parameters of this optimizer at their default values.
Arguments:
lr: float >= 0. Initial learning rate, defaults to 1. It is recommended to leave it at the default value.
rho: float >= 0. Adadelta decay factor, corresponding to the fraction of gradient to keep at each time step.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
decay: float >= 0. Initial learning rate decay.
References: Adadelta - an adaptive learning rate method - https://arxiv.org/abs/1212.5701

Adam
keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
Adam optimizer. Default parameters follow those provided in the original paper.
Arguments:
lr: float >= 0. Learning rate.
beta_1: float, 0 < beta < 1. Generally close to 1.
beta_2: float, 0 < beta < 1. Generally close to 1.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
decay: float >= 0. Learning rate decay over each update.
amsgrad: boolean. Whether to apply the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond".
References: Adam - A Method for Stochastic Optimization - https://arxiv.org/abs/1412.6980v8; On the Convergence of Adam and Beyond - https://openreview.net/forum?id=ryQu7f-RZ

Adamax
keras.optimizers.Adamax(lr=0.002, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0)
Adamax optimizer from Section 7 of the Adam paper. It is a variant of Adam based on the infinity norm. Default parameters follow those provided in the paper.
Arguments:
lr: float >= 0. Learning rate.
beta_1/beta_2: floats, 0 < beta < 1. Generally close to 1.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
decay: float >= 0. Learning rate decay over each update.
References: Adam - A Method for Stochastic Optimization - https://arxiv.org/abs/1412.6980v8

Nadam
keras.optimizers.Nadam(lr=0.002, beta_1=0.9, beta_2=0.999, epsilon=None, schedule_decay=0.004)
Nesterov Adam optimizer. Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum. Default parameters follow those provided in the paper. It is recommended to leave the parameters of this optimizer at their default values.
Arguments:
lr: float >= 0. Learning rate.
beta_1/beta_2: floats, 0 < beta < 1. Generally close to 1.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
References: Nadam report; On the importance of initialization and momentum in deep learning - http://www.cs.toronto.edu/~fritz/absps/momentum.pdf
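The decay argument shrinks the learning rate as training proceeds; in this Keras line the effective rate at update t follows lr / (1 + decay * t), an assumption worth verifying against the installed version. A small sketch of the resulting schedule:

# Sketch: effective learning rate under the assumed lr / (1 + decay * t)
# schedule, for SGD(lr=0.01, decay=1e-4).
lr, decay = 0.01, 1e-4
for t in (0, 1000, 10000, 100000):
    print(t, lr / (1.0 + decay * t))
# The rate halves once decay * t reaches 1 (here at t = 10000).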
Regularizers

Usage of regularizers

Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are incorporated in the loss function that the network optimizes.

The penalties are applied on a per-layer basis. The exact API will depend on the layer, but the layers Dense, Conv1D, Conv2D and Conv3D have a unified API. These layers expose 3 keyword arguments:
kernel_regularizer: instance of keras.regularizers.Regularizer
bias_regularizer: instance of keras.regularizers.Regularizer
activity_regularizer: instance of keras.regularizers.Regularizer

Example

from keras import regularizers
model.add(Dense(64, input_dim=64,
                kernel_regularizer=regularizers.l2(0.01),
                activity_regularizer=regularizers.l1(0.01)))

Available penalties

keras.regularizers.l1(0.)
keras.regularizers.l2(0.)
keras.regularizers.l1_l2(l1=0.01, l2=0.01)

Developing new regularizers

Any function that takes in a weight matrix and returns a loss contribution tensor can be used as a regularizer, e.g.:

from keras import backend as K

def l1_reg(weight_matrix):
    return 0.01 * K.sum(K.abs(weight_matrix))

model.add(Dense(64, input_dim=64,
                kernel_regularizer=l1_reg))

Alternatively, you can write your regularizers in an object-oriented way; see the keras/regularizers.py module for examples.
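A minimal sketch of such an object-oriented regularizer; the class name L1Sum and its rate parameter are illustrative, and keras/regularizers.py remains the reference:

from keras import backend as K
from keras.regularizers import Regularizer

class L1Sum(Regularizer):
    """Illustrative L1 penalty with a configurable rate."""
    def __init__(self, rate=0.01):
        self.rate = rate

    def __call__(self, weight_matrix):
        return self.rate * K.sum(K.abs(weight_matrix))

    def get_config(self):
        # Allows the regularizer to be serialized with the model.
        return {'rate': self.rate}

model.add(Dense(64, input_dim=64, kernel_regularizer=L1Sum(0.02)))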
Wrappers for the Scikit-Learn API

You can use Sequential Keras models (single-input only) as part of your Scikit-Learn workflow via the wrappers found at keras.wrappers.scikit_learn.py.

There are two wrappers available:
keras.wrappers.scikit_learn.KerasClassifier(build_fn=None, **sk_params), which implements the Scikit-Learn classifier interface;
keras.wrappers.scikit_learn.KerasRegressor(build_fn=None, **sk_params), which implements the Scikit-Learn regressor interface.

Arguments:
build_fn: callable function or class instance
sk_params: model parameters & fitting parameters

build_fn should construct, compile and return a Keras model, which will then be used to fit/predict. One of the following three values could be passed to build_fn:
1. A function.
2. An instance of a class that implements the __call__ method.
3. None. This means you implement a class that inherits from either KerasClassifier or KerasRegressor. The __call__ method of the present class will then be treated as the default build_fn.

sk_params takes both model parameters and fitting parameters. Legal model parameters are the arguments of build_fn. Note that, like all other estimators in scikit-learn, build_fn should provide default values for its arguments, so that you could create the estimator without passing any values to sk_params.

sk_params can also accept parameters for calling the fit, predict, predict_proba, and score methods (e.g. epochs, batch_size). Fitting (predicting) parameters are selected in the following order:
1. Values passed to the dictionary arguments of the fit, predict, predict_proba, and score methods.
2. Values passed to sk_params.
3. The default values of the keras.models.Sequential fit, predict, predict_proba and score methods.

When using scikit-learn's grid_search API, legal tunable parameters are those you could pass to sk_params, including fitting parameters.
In other words, you could use grid_search to search for the best batch_size or epochs as well as the model parameters.
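A minimal sketch of this workflow; the build function, its hidden_units parameter, and the toy data are illustrative:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def build_model(hidden_units=32):
    # A default value is required so the estimator can be created
    # without passing anything via sk_params.
    model = Sequential()
    model.add(Dense(hidden_units, activation='relu', input_dim=20))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

x = np.random.rand(100, 20)
y = np.random.randint(2, size=100)

clf = KerasClassifier(build_fn=build_model, epochs=5, batch_size=16, verbose=0)

# Both a model parameter (hidden_units) and a fitting parameter
# (batch_size) are tunable through the same grid.
grid = GridSearchCV(clf,
                    param_grid={'hidden_units': [16, 32],
                                'batch_size': [8, 16]},
                    cv=3)
grid.fit(x, y)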
Utils

CustomObjectScope
keras.utils.CustomObjectScope()
Provides a scope within which changes to _GLOBAL_CUSTOM_OBJECTS cannot escape. Code within a with statement will be able to access custom objects by name. Changes to global custom objects persist within the enclosing with statement; at the end of the with statement, global custom objects are reverted to their state at the beginning of the with statement.
Example: consider a custom object MyObject (e.g. a class):

with CustomObjectScope({'MyObject': MyObject}):
    layer = Dense(..., kernel_regularizer='MyObject')
    # save, load, etc. will recognize custom object by name

HDF5Matrix
keras.utils.HDF5Matrix(datapath, dataset, start=0, end=None, normalizer=None)
Representation of an HDF5 dataset to be used instead of a Numpy array.
Example:

x_data = HDF5Matrix('input/file.hdf5', 'data')
model.predict(x_data)

Providing start and end allows use of a slice of the dataset. Optionally, a normalizer function (or lambda) can be given. It will be called on every slice of data retrieved.
Arguments:
datapath: string, path to an HDF5 file
dataset: string, name of the HDF5 dataset in the file specified in datapath
start: int, start of the desired slice of the specified dataset
end: int, end of the desired slice of the specified dataset
normalizer: function to be called on data when retrieved
Returns: An array-like HDF5 dataset.

Sequence
keras.utils.Sequence()
Base object for fitting to a sequence of data, such as a dataset. Every Sequence must implement the __getitem__ and __len__ methods. If you want to modify your dataset between epochs, you may implement on_epoch_end. The method __getitem__ should return a complete batch.
Notes: Sequence is a safer way to do multiprocessing. This structure guarantees that the network will only train once on each sample per epoch, which is not the case with generators.
Example:

from skimage.io import imread
from skimage.transform import resize
import numpy as np

# Here, `x_set` is a list of paths to the images
# and `y_set` are the associated classes.

class CIFAR10Sequence(Sequence):

    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.array([
            resize(imread(file_name), (200, 200))
            for file_name in batch_x]), np.array(batch_y)

to_categorical
keras.utils.to_categorical(y, num_classes=None, dtype='float32')
Converts a class vector (integers) to a binary class matrix, e.g. for use with categorical_crossentropy.
Arguments:
y: class vector to be converted into a matrix (integers from 0 to num_classes).
num_classes: total number of classes.
dtype: The data type expected by the input, as a string (float32, float64, int32 ...).
Returns: A binary matrix representation of the input. The classes axis is placed last.

normalize
keras.utils.normalize(x, axis=-1, order=2)
Normalizes a Numpy array.
Arguments:
x: Numpy array to normalize.
axis: axis along which to normalize.
order: Normalization order (e.g. 2 for the L2 norm).
Returns: A normalized copy of the array.
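A quick check of to_categorical's behaviour on a toy label vector:

import numpy as np
from keras.utils import to_categorical

labels = np.array([0, 2, 1])
print(to_categorical(labels, num_classes=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]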
get_file
keras.utils.get_file(fname, origin, untar=False, md5_hash=None, file_hash=None, cache_subdir='datasets', hash_algorithm='auto', extract=False, archive_format='auto', cache_dir=None)
Downloads a file from a URL if it is not already in the cache. By default the file at the URL origin is downloaded to the cache_dir ~/.keras, placed in the cache_subdir datasets, and given the filename fname. The final location of a file example.txt would therefore be ~/.keras/datasets/example.txt. Files in tar, tar.gz, tar.bz, and zip formats can also be extracted. Passing a hash will verify the file after download. The command line programs shasum and sha256sum can compute the hash.
Arguments:
fname: Name of the file. If an absolute path /path/to/file.txt is specified, the file will be saved at that location.
origin: Original URL of the file.
untar: Deprecated in favor of extract. Boolean; whether the file should be decompressed.
md5_hash: Deprecated in favor of file_hash. MD5 hash of the file for verification.
file_hash: The expected hash string of the file after download. The sha256 and md5 hash algorithms are both supported.
cache_subdir: Subdirectory under the Keras cache dir where the file is saved. If an absolute path /path/to/folder is specified, the file will be saved at that location.
hash_algorithm: Select the hash algorithm to verify the file. Options are 'md5', 'sha256', and 'auto'. The default 'auto' detects the hash algorithm in use.
extract: True tries extracting the file as an archive, like tar or zip.
archive_format: Archive format to try for extracting the file. Options are 'auto', 'tar', 'zip', and None. 'tar' includes tar, tar.gz, and tar.bz files. The default 'auto' is ['tar', 'zip']. None or an empty list will return no matches found.
cache_dir: Location to store cached files; when None, it defaults to the Keras directory.
Returns: Path to the downloaded file.

print_summary
keras.utils.print_summary(model, line_length=None, positions=None, print_fn=None)
Prints a summary of a model.
Arguments:
model: Keras model instance.
line_length: Total length of printed lines (e.g. set this to adapt the display to different terminal window sizes).
positions: Relative or absolute positions of log elements in each line. If not provided, defaults to [.33, .55, .67, 1.].
print_fn: Print function to use. It will be called on each line of the summary. You can set it to a custom function in order to capture the string summary. It defaults to print (prints to stdout).

plot_model
keras.utils.plot_model(model, to_file='model.png', show_shapes=False, show_layer_names=True, rankdir='TB')
Converts a Keras model to dot format and saves it to a file.
Arguments:
model: A Keras model instance.
to_file: File name of the plot image.
show_shapes: whether to display shape information.
show_layer_names: whether to display layer names.
rankdir: rankdir argument passed to PyDot, a string specifying the format of the plot: 'TB' creates a vertical plot; 'LR' creates a horizontal plot.
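A sketch of fetching and verifying an archive with get_file; the file name and URL below are placeholders, not a real download location:

from keras.utils import get_file

# Downloads to ~/.keras/datasets/example-data.tar.gz on the first call
# and extracts the archive next to it; subsequent calls hit the cache.
path = get_file('example-data.tar.gz',
                origin='https://example.com/example-data.tar.gz',  # placeholder URL
                file_hash=None,  # supply a sha256 or md5 string to verify
                extract=True)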
multi_gpu_model
keras.utils.multi_gpu_model(model, gpus=None, cpu_merge=True, cpu_relocation=False)
Replicates a model on different GPUs. Specifically, this function implements single-machine multi-GPU data parallelism. It works in the following way:
1. Divide the model's input(s) into multiple sub-batches.
2. Apply a model copy on each sub-batch. Every model copy is executed on a dedicated GPU.
3. Concatenate the results (on CPU) into one big batch.
E.g. if your batch_size is 64 and you use gpus=2, then we will divide the input into 2 sub-batches of 32 samples, process each sub-batch on one GPU, then return the full batch of 64 processed samples. This induces quasi-linear speedup on up to 8 GPUs. This function is only available with the TensorFlow backend for the time being.
Arguments:
model: A Keras model instance. To avoid OOM errors, this model could have been built on CPU, for instance (see the usage example below).
gpus: Integer >= 2 or list of integers; number of GPUs or list of GPU IDs on which to create model replicas.
cpu_merge: A boolean value identifying whether to force merging model weights under the scope of the CPU or not.
cpu_relocation: A boolean value identifying whether to create the model's weights under the scope of the CPU. If the model is not defined under any preceding device scope, you can still rescue it by activating this option.
Returns: A Keras Model instance which can be used just like the initial model argument, but which distributes its workload on multiple GPUs.
Example 1 - Training models with weights merge on CPU; Example 2 - Training models with weights merge on CPU using cpu_relocation; Example 3 - Training models with weights merge on GPU (recommended for NV-link). (Example code elided.)
On model saving: to save the multi-gpu model, use .save(fname) or .save_weights(fname) with the template model (the argument you passed to multi_gpu_model), rather than the model returned by multi_gpu_model.
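A sketch of the Example-1 pattern (weights merged on CPU), since the original example code is not recoverable here; the layer sizes and random data are illustrative, and two GPUs are assumed to be available:

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

# Instantiate the template model on CPU so its weights live in host
# memory; the replicas created below handle the per-GPU compute.
with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Dense(256, activation='relu', input_dim=100))
    model.add(Dense(10, activation='softmax'))

parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

x = np.random.rand(640, 100)
y = np.random.rand(640, 10)
parallel_model.fit(x, y, epochs=1, batch_size=64)

# Save via the template model, as noted above.
model.save('my_model.h5')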
Returns Path to the downloaded file\"), ('title', 'get_file')]), OrderedDict([('location', 'utils.html#print_summary'), ('text', 'keras.utils.print_summary(model, line_length=None, positions=None, print_fn=None) Prints a summary of a model. Arguments model : Keras model instance. line_length : Total length of printed lines (e.g. set this to adapt the display to different terminal window sizes). positions : Relative or absolute positions of log elements in each line. If not provided, defaults to [.33, .55, .67, 1.] . print_fn : Print function to use. It will be called on each line of the summary. You can set it to a custom function in order to capture the string summary. It defaults to print (prints to stdout).'), ('title', 'print_summary')]), OrderedDict([('location', 'utils.html#plot_model'), ('text', \"keras.utils.plot_model(model, to_file='model.png', show_shapes=False, show_layer_names=True, rankdir='TB') Converts a Keras model to dot format and saves it to a file. Arguments model : A Keras model instance. to_file : File name of the plot image. show_shapes : whether to display shape information. show_layer_names : whether to display layer names. rankdir : rankdir argument passed to PyDot, a string specifying the format of the plot: 'TB' creates a vertical plot; 'LR' creates a horizontal plot.\"), ('title', 'plot_model')]), OrderedDict([('location', 'utils.html#multi_gpu_model'), ('text', \"keras.utils.multi_gpu_model(model, gpus=None, cpu_merge=True, cpu_relocation=False) Replicates a model on different GPUs. Specifically, this function implements single-machine multi-GPU data parallelism. It works in the following way: Divide the model's input(s) into multiple sub-batches. Apply a model copy on each sub-batch. Every model copy is executed on a dedicated GPU. Concatenate the results (on CPU) into one big batch. E.g. if your batch_size is 64 and you use gpus=2 , then we will divide the input into 2 sub-batches of 32 samples, process each sub-batch on one GPU, then return the full batch of 64 processed samples. This induces quasi-linear speedup on up to 8 GPUs. This function is only available with the TensorFlow backend for the time being. Arguments model : A Keras model instance. To avoid OOM errors, this model could have been built on CPU, for instance (see usage example below). gpus : Integer >= 2 or list of integers, number of GPUs or list of GPU IDs on which to create model replicas. cpu_merge : A boolean value to identify whether to force merging model weights under the scope of the CPU or not. cpu_relocation : A boolean value to identify whether to create the model's weights under the scope of the CPU. If the model is not defined under any preceding device scope, you can still rescue it by activating this option. Returns A Keras Model instance which can be used just like the initial model argument, but which distributes its workload on multiple GPUs. 
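The three examples named just below survive only as placeholder tokens in this index, so as a stand-in here is a minimal, hedged sketch of the cpu_relocation variant, assuming a machine with two GPUs and the TensorFlow backend (the layer sizes and random data are invented for illustration):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model, to_categorical

# Template model; cpu_relocation=True places its weights under the CPU
# scope, so no explicit tf.device block is needed here.
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

parallel_model = multi_gpu_model(model, gpus=2, cpu_relocation=True)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

x = np.random.random((1024, 784))
y = to_categorical(np.random.randint(10, size=(1024,)), num_classes=10)
parallel_model.fit(x, y, epochs=5, batch_size=256)  # each GPU sees 128 samples

# Per the saving note below: save via the template model,
# not the parallel wrapper returned by multi_gpu_model.
model.save('my_model.h5')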
Example 1 - Training models with weights merge on CPU; Example 2 - Training models with weights merge on CPU using cpu_relocation; Example 3 - Training models with weights merge on GPU (recommended for NV-link). (The code for these three examples survives only as placeholder tokens in this search index; the sketch above stands in for them.) On model saving To save the multi-gpu model, use .save(fname) or .save_weights(fname) with the template model (the argument you passed to multi_gpu_model ), rather than the model returned by multi_gpu_model .\"), ('title', 'multi_gpu_model')]), OrderedDict([('location', 'visualization.html'), ('text', \"Model visualization The keras.utils.vis_utils module provides utility functions to plot a Keras model (using graphviz ). This will plot a graph of the model and save it to a file: from keras.utils import plot_model plot_model(model, to_file='model.png') plot_model takes two optional arguments: show_shapes (defaults to False) controls whether output shapes are shown in the graph. show_layer_names (defaults to True) controls whether layer names are shown in the graph. You can also directly obtain the pydot.Graph object and render it yourself, for example to show it in an ipython notebook : from IPython.display import SVG from keras.utils.vis_utils import model_to_dot SVG(model_to_dot(model).create(prog='dot', format='svg')) Training history visualization The fit() method on a Keras Model returns a History object. The History.history attribute is a dictionary recording training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Here is a simple example using matplotlib to generate loss & accuracy plots for training & validation: import matplotlib.pyplot as plt history = model.fit(x, y, validation_split=0.25, epochs=50, batch_size=16, verbose=1) # Plot training & validation accuracy values plt.plot(history.history['acc']) plt.plot(history.history['val_acc']) plt.title('Model accuracy') plt.ylabel('Accuracy') plt.xlabel('Epoch') plt.legend(['Train', 'Test'], loc='upper left') plt.show() # Plot training & validation loss values plt.plot(history.history['loss']) plt.plot(history.history['val_loss']) plt.title('Model loss') plt.ylabel('Loss') plt.xlabel('Epoch') plt.legend(['Train', 'Test'], loc='upper left') plt.show()\"), ('title', 'Visualization')]), OrderedDict([('location', 'visualization.html#model-visualization'), ('text', \"The keras.utils.vis_utils module provides utility functions to plot a Keras model (using graphviz ). This will plot a graph of the model and save it to a file: from keras.utils import plot_model plot_model(model, to_file='model.png') plot_model takes two optional arguments: show_shapes (defaults to False) controls whether output shapes are shown in the graph. show_layer_names (defaults to True) controls whether layer names are shown in the graph. You can also directly obtain the pydot.Graph object and render it yourself, for example to show it in an ipython notebook : from IPython.display import SVG from keras.utils.vis_utils import model_to_dot SVG(model_to_dot(model).create(prog='dot', format='svg'))\"), ('title', 'Model visualization')]), OrderedDict([('location', 'visualization.html#training-history-visualization'), ('text', \"The fit() method on a Keras Model returns a History object. 
The History.history attribute is a dictionary recording training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Here is a simple example using matplotlib to generate loss & accuracy plots for training & validation: import matplotlib.pyplot as plt history = model.fit(x, y, validation_split=0.25, epochs=50, batch_size=16, verbose=1) # Plot training & validation accuracy values plt.plot(history.history['acc']) plt.plot(history.history['val_acc']) plt.title('Model accuracy') plt.ylabel('Accuracy') plt.xlabel('Epoch') plt.legend(['Train', 'Test'], loc='upper left') plt.show() # Plot training & validation loss values plt.plot(history.history['loss']) plt.plot(history.history['val_loss']) plt.title('Model loss') plt.ylabel('Loss') plt.xlabel('Epoch') plt.legend(['Train', 'Test'], loc='upper left') plt.show()\"), ('title', 'Training history visualization')]), OrderedDict([('location', 'why-use-keras.html'), ('text', \"Why use Keras? There are countless deep learning frameworks available today. Why use Keras rather than any other? Here are some of the areas in which Keras compares favorably to existing alternatives. Keras prioritizes developer experience Keras is an API designed for human beings, not machines. Keras follows best practices for reducing cognitive load : it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error. This makes Keras easy to learn and easy to use. As a Keras user, you are more productive, allowing you to try more ideas than your competition, faster -- which in turn helps you win machine learning competitions . This ease of use does not come at the cost of reduced flexibility: because Keras integrates with lower-level deep learning languages (in particular TensorFlow), it enables you to implement anything you could have built in the base language. In particular, as tf.keras , the Keras API integrates seamlessly with your TensorFlow workflows. Keras has broad adoption in the industry and the research community Deep learning frameworks ranking computed by Jeff Hale, based on 11 data sources across 7 categories With over 250,000 individual users as of mid-2018, Keras has stronger adoption in both the industry and the research community than any other deep learning framework except TensorFlow itself (and the Keras API is the official frontend of TensorFlow, via the tf.keras module). You are already constantly interacting with features built with Keras -- it is in use at Netflix, Uber, Yelp, Instacart, Zocdoc, Square, and many others. It is especially popular among startups that place deep learning at the core of their products. Keras is also a favorite among deep learning researchers, coming in #2 in terms of mentions in scientific papers uploaded to the preprint server arXiv.org . Keras has also been adopted by researchers at large scientific organizations, in particular CERN and NASA. Keras makes it easy to turn models into products Your Keras models can be easily deployed across a greater range of platforms than any other deep learning framework: On iOS, via Apple\u2019s CoreML (Keras support officially provided by Apple). Here's a tutorial . On Android, via the TensorFlow Android runtime. Example: Not Hotdog app . In the browser, via GPU-accelerated JavaScript runtimes such as Keras.js and WebDNN . On Google Cloud, via TensorFlow-Serving . 
In a Python webapp backend (such as a Flask app) . On the JVM, via DL4J model import provided by SkyMind . On Raspberry Pi. Keras supports multiple backend engines and does not lock you into one ecosystem Your Keras models can be developed with a range of different deep learning backends . Importantly, any Keras model that only leverages built-in layers will be portable across all these backends: you can train a model with one backend, and load it with another (e.g. for deployment). Available backends include: The TensorFlow backend (from Google) The CNTK backend (from Microsoft) The Theano backend Amazon is also currently working on developing a MXNet backend for Keras. As such, your Keras model can be trained on a number of different hardware platforms beyond CPUs: NVIDIA GPUs Google TPUs , via the TensorFlow backend and Google Cloud OpenCL-enabled GPUs, such as those from AMD, via the PlaidML Keras backend Keras has strong multi-GPU support and distributed training support Keras has built-in support for multi-GPU data parallelism Horovod , from Uber, has first-class support for Keras models Keras models can be turned into TensorFlow Estimators and trained on clusters of GPUs on Google Cloud Keras can be run on Spark via Dist-Keras (from CERN) and Elephas Keras development is backed by key companies in the deep learning ecosystem Keras development is backed primarily by Google, and the Keras API comes packaged in TensorFlow as tf.keras . Additionally, Microsoft maintains the CNTK Keras backend. Amazon AWS is developing MXNet support. Other contributing companies include NVIDIA, Uber, and Apple (with CoreML).\"), ('title', 'Why use Keras')]), OrderedDict([('location', 'why-use-keras.html#why-use-keras'), ('text', 'There are countless deep learning frameworks available today. Why use Keras rather than any other? Here are some of the areas in which Keras compares favorably to existing alternatives.'), ('title', 'Why use Keras?')]), OrderedDict([('location', 'why-use-keras.html#keras-prioritizes-developer-experience'), ('text', 'Keras is an API designed for human beings, not machines. Keras follows best practices for reducing cognitive load : it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error. This makes Keras easy to learn and easy to use. As a Keras user, you are more productive, allowing you to try more ideas than your competition, faster -- which in turn helps you win machine learning competitions . This ease of use does not come at the cost of reduced flexibility: because Keras integrates with lower-level deep learning languages (in particular TensorFlow), it enables you to implement anything you could have built in the base language. In particular, as tf.keras , the Keras API integrates seamlessly with your TensorFlow workflows.'), ('title', 'Keras prioritizes developer experience')]), OrderedDict([('location', 'why-use-keras.html#keras-has-broad-adoption-in-the-industry-and-the-research-community'), ('text', 'Deep learning frameworks ranking computed by Jeff Hale, based on 11 data sources across 7 categories With over 250,000 individual users as of mid-2018, Keras has stronger adoption in both the industry and the research community than any other deep learning framework except TensorFlow itself (and the Keras API is the official frontend of TensorFlow, via the tf.keras module). 
You are already constantly interacting with features built with Keras -- it is in use at Netflix, Uber, Yelp, Instacart, Zocdoc, Square, and many others. It is especially popular among startups that place deep learning at the core of their products. Keras is also a favorite among deep learning researchers, coming in #2 in terms of mentions in scientific papers uploaded to the preprint server arXiv.org . Keras has also been adopted by researchers at large scientific organizations, in particular CERN and NASA.'), ('title', 'Keras has broad adoption in the industry and the research community')]), OrderedDict([('location', 'why-use-keras.html#keras-makes-it-easy-to-turn-models-into-products'), ('text', \"Your Keras models can be easily deployed across a greater range of platforms than any other deep learning framework: On iOS, via Apple\u2019s CoreML (Keras support officially provided by Apple). Here's a tutorial . On Android, via the TensorFlow Android runtime. Example: Not Hotdog app . In the browser, via GPU-accelerated JavaScript runtimes such as Keras.js and WebDNN . On Google Cloud, via TensorFlow-Serving . In a Python webapp backend (such as a Flask app) . On the JVM, via DL4J model import provided by SkyMind . On Raspberry Pi.\"), ('title', 'Keras makes it easy to turn models into products')]), OrderedDict([('location', 'why-use-keras.html#keras-supports-multiple-backend-engines-and-does-not-lock-you-into-one-ecosystem'), ('text', 'Your Keras models can be developed with a range of different deep learning backends . Importantly, any Keras model that only leverages built-in layers will be portable across all these backends: you can train a model with one backend, and load it with another (e.g. for deployment). Available backends include: The TensorFlow backend (from Google) The CNTK backend (from Microsoft) The Theano backend Amazon is also currently working on developing a MXNet backend for Keras. As such, your Keras model can be trained on a number of different hardware platforms beyond CPUs: NVIDIA GPUs Google TPUs , via the TensorFlow backend and Google Cloud OpenCL-enabled GPUs, such as those from AMD, via the PlaidML Keras backend'), ('title', 'Keras supports multiple backend engines and does not lock you into one ecosystem')]), OrderedDict([('location', 'why-use-keras.html#keras-has-strong-multi-gpu-support-and-distributed-training-support'), ('text', 'Keras has built-in support for multi-GPU data parallelism Horovod , from Uber, has first-class support for Keras models Keras models can be turned into TensorFlow Estimators and trained on clusters of GPUs on Google Cloud Keras can be run on Spark via Dist-Keras (from CERN) and Elephas'), ('title', 'Keras has strong multi-GPU support and distributed training support')]), OrderedDict([('location', 'why-use-keras.html#keras-development-is-backed-by-key-companies-in-the-deep-learning-ecosystem'), ('text', 'Keras development is backed primarily by Google, and the Keras API comes packaged in TensorFlow as tf.keras . Additionally, Microsoft maintains the CNTK Keras backend. Amazon AWS is developing MXNet support. Other contributing companies include NVIDIA, Uber, and Apple (with CoreML).'), ('title', 'Keras development is backed by key companies in the deep learning ecosystem')]), OrderedDict([('location', 'getting-started/faq.html'), ('text', 'Keras FAQ: Frequently Asked Keras Questions How should I cite Keras? How can I run Keras on GPU? How can I run a Keras model on multiple GPUs? What does \"sample\", \"batch\", \"epoch\" mean? 
How can I save a Keras model? Why is the training loss much higher than the testing loss? How can I obtain the output of an intermediate layer? How can I use Keras with datasets that don\\'t fit in memory? How can I interrupt training when the validation loss isn\\'t decreasing anymore? How is the validation split computed? Is the data shuffled during training? How can I record the training / validation loss / accuracy at each epoch? How can I \"freeze\" layers? How can I use stateful RNNs? How can I remove a layer from a Sequential model? How can I use pre-trained models in Keras? How can I use HDF5 inputs with Keras? Where is the Keras configuration file stored? How can I obtain reproducible results using Keras during development? How can I install HDF5 or h5py to save my models in Keras? How should I cite Keras? Please cite Keras in your publications if it helps your research. Here is an example BibTeX entry: @misc{chollet2015keras, title={Keras}, author={Chollet, Fran\\\\c{c}ois and others}, year={2015}, howpublished={\\\\url{https://keras.io}}, } How can I run Keras on GPU? If you are running on the TensorFlow or CNTK backends, your code will automatically run on GPU if any available GPU is detected. If you are running on the Theano backend, you can use one of the following methods: Method 1 : use Theano flags. THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py The name \\'gpu\\' might have to be changed depending on your device\\'s identifier (e.g. gpu0 , gpu1 , etc). Method 2 : set up your .theanorc : Instructions Method 3 : manually set theano.config.device , theano.config.floatX at the beginning of your code: import theano theano.config.device = \\'gpu\\' theano.config.floatX = \\'float32\\' How can I run a Keras model on multiple GPUs? We recommend doing so using the TensorFlow backend. There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism . In most cases, what you need is most likely data parallelism. Data parallelism Data parallelism consists in replicating the target model once on each device, and using each replica to process a different fraction of the input data. Keras has a built-in utility, keras.utils.multi_gpu_model , which can produce a data-parallel version of any model, and achieves quasi-linear speedup on up to 8 GPUs. For more information, see the documentation for multi_gpu_model . Here is a quick example: from keras.utils import multi_gpu_model # Replicates `model` on 8 GPUs. # This assumes that your machine has 8 available GPUs. parallel_model = multi_gpu_model(model, gpus=8) parallel_model.compile(loss=\\'categorical_crossentropy\\', optimizer=\\'rmsprop\\') # This `fit` call will be distributed on 8 GPUs. # Since the batch size is 256, each GPU will process 32 samples. parallel_model.fit(x, y, epochs=20, batch_size=256) Device parallelism Device parallelism consists in running different parts of a same model on different devices. It works best for models that have a parallel architecture, e.g. a model with two branches. This can be achieved by using TensorFlow device scopes. 
Here is a quick example: # Model where a shared LSTM is used to encode two different sequences in parallel input_a = keras.Input(shape=(140, 256)) input_b = keras.Input(shape=(140, 256)) shared_lstm = keras.layers.LSTM(64) # Process the first sequence on one GPU with tf.device(\\'/gpu:0\\'): encoded_a = shared_lstm(input_a) # Process the next sequence on another GPU with tf.device(\\'/gpu:1\\'): encoded_b = shared_lstm(input_b) # Concatenate results on CPU with tf.device(\\'/cpu:0\\'): merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1) What does \"sample\", \"batch\", \"epoch\" mean? Below are some common definitions that are necessary to know and understand to correctly utilize Keras: Sample : one element of a dataset. Example: one image is a sample in a convolutional network Example: one audio file is a sample for a speech recognition model Batch : a set of N samples. The samples in a batch are processed independently, in parallel. If training, a batch results in only one update to the model. A batch generally approximates the distribution of the input data better than a single input. The larger the batch, the better the approximation; however, it is also true that the batch will take longer to process and will still result in only one update. For inference (evaluate/predict), it is recommended to pick a batch size that is as large as you can afford without going out of memory (since larger batches will usually result in faster evaluating/prediction). Epoch : an arbitrary cutoff, generally defined as \"one pass over the entire dataset\", used to separate training into distinct phases, which is useful for logging and periodic evaluation. When using validation_data or validation_split with the fit method of Keras models, validation will be run at the end of every epoch . Within Keras, there is the ability to add callbacks specifically designed to be run at the end of an epoch . Examples of these are learning rate changes and model checkpointing (saving). How can I save a Keras model? Saving/loading whole models (architecture + weights + optimizer state) It is not recommended to use pickle or cPickle to save a Keras model. You can use model.save(filepath) to save a Keras model into a single HDF5 file which will contain: the architecture of the model, allowing you to re-create the model the weights of the model the training configuration (loss, optimizer) the state of the optimizer, allowing you to resume training exactly where you left off. You can then use keras.models.load_model(filepath) to reinstantiate your model. load_model will also take care of compiling the model using the saved training configuration (unless the model was never compiled in the first place). Please also see How can I install HDF5 or h5py to save my models in Keras? for instructions on how to install h5py . Example: from keras.models import load_model model.save(\\'my_model.h5\\') # creates a HDF5 file \\'my_model.h5\\' del model # deletes the existing model # returns a compiled model # identical to the previous one model = load_model(\\'my_model.h5\\') Saving/loading only a model\\'s architecture If you only need to save the architecture of a model , and not its weights or its training configuration, you can do: # save as JSON json_string = model.to_json() # save as YAML yaml_string = model.to_yaml() The generated JSON / YAML files are human-readable and can be manually edited if needed. 
You can then build a fresh model from this data: # model reconstruction from JSON: from keras.models import model_from_json model = model_from_json(json_string) # model reconstruction from YAML from keras.models import model_from_yaml model = model_from_yaml(yaml_string) Saving/loading only a model\\'s weights If you need to save the weights of a model , you can do so in HDF5 with the code below. model.save_weights(\\'my_model_weights.h5\\') Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the same architecture: model.load_weights(\\'my_model_weights.h5\\') If you need to load weights into a different architecture (with some layers in common), for instance for fine-tuning or transfer-learning, you can load weights by layer name : model.load_weights(\\'my_model_weights.h5\\', by_name=True) Please also see How can I install HDF5 or h5py to save my models in Keras? for instructions on how to install h5py . For example: \"\"\" Assuming the original model looks like this: model = Sequential() model.add(Dense(2, input_dim=3, name=\\'dense_1\\')) model.add(Dense(3, name=\\'dense_2\\')) ... model.save_weights(fname) \"\"\" # new model model = Sequential() model.add(Dense(2, input_dim=3, name=\\'dense_1\\')) # will be loaded model.add(Dense(10, name=\\'new_dense\\')) # will not be loaded # load weights from first model; will only affect the first layer, dense_1. model.load_weights(fname, by_name=True) Handling custom layers (or other custom objects) in saved models If the model you want to load includes custom layers or other custom classes or functions, you can pass them to the loading mechanism via the custom_objects argument: from keras.models import load_model # Assuming your model includes instance of an \"AttentionLayer\" class model = load_model(\\'my_model.h5\\', custom_objects={\\'AttentionLayer\\': AttentionLayer}) Alternatively, you can use a custom object scope : from keras.utils import CustomObjectScope with CustomObjectScope({\\'AttentionLayer\\': AttentionLayer}): model = load_model(\\'my_model.h5\\') Custom objects handling works the same way for load_model , model_from_json , model_from_yaml : from keras.models import model_from_json model = model_from_json(json_string, custom_objects={\\'AttentionLayer\\': AttentionLayer}) Why is the training loss much higher than the testing loss? A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time. Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss. How can I obtain the output of an intermediate layer? One simple way is to create a new Model that will output the layers that you are interested in: from keras.models import Model model = ... 
# create the original model layer_name = \\'my_layer\\' intermediate_layer_model = Model(inputs=model.input, outputs=model.get_layer(layer_name).output) intermediate_output = intermediate_layer_model.predict(data) Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example: from keras import backend as K # with a Sequential model get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[3].output]) layer_output = get_3rd_layer_output([x])[0] Similarly, you could build a Theano or TensorFlow function directly. Note that if your model has a different behavior in the training and testing phases (e.g. if it uses Dropout , BatchNormalization , etc.), you will need to pass the learning phase flag to your function: get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()], [model.layers[3].output]) # output in test mode = 0 layer_output = get_3rd_layer_output([x, 0])[0] # output in train mode = 1 layer_output = get_3rd_layer_output([x, 1])[0] How can I use Keras with datasets that don\\'t fit in memory? You can do batch training using model.train_on_batch(x, y) and model.test_on_batch(x, y) . See the models documentation . Alternatively, you can write a generator that yields batches of training data and use the method model.fit_generator(data_generator, steps_per_epoch, epochs) . You can see batch training in action in our CIFAR10 example . How can I interrupt training when the validation loss isn\\'t decreasing anymore? You can use an EarlyStopping callback: from keras.callbacks import EarlyStopping early_stopping = EarlyStopping(monitor=\\'val_loss\\', patience=2) model.fit(x, y, validation_split=0.2, callbacks=[early_stopping]) Find out more in the callbacks documentation . How is the validation split computed? If you set the validation_split argument in model.fit to e.g. 0.1, then the validation data used will be the last 10% of the data. If you set it to 0.25, it will be the last 25% of the data, etc. Note that the data isn\\'t shuffled before extracting the validation split, so the validation is literally just the last x% of samples in the input you passed. The same validation set is used for all epochs (within the same call to fit ). Is the data shuffled during training? Yes, if the shuffle argument in model.fit is set to True (which is the default), the training data will be randomly shuffled at each epoch. Validation data is never shuffled. How can I record the training / validation loss / accuracy at each epoch? The model.fit method returns a History callback, which has a history attribute containing the lists of successive losses and other metrics. hist = model.fit(x, y, validation_split=0.2) print(hist.history) How can I \"freeze\" Keras layers? To \"freeze\" a layer means to exclude it from training, i.e. its weights will never be updated. This is useful in the context of fine-tuning a model, or using fixed embeddings for a text input. You can pass a trainable argument (boolean) to a layer constructor to set a layer to be non-trainable: frozen_layer = Dense(32, trainable=False) Additionally, you can set the trainable property of a layer to True or False after instantiation. For this to take effect, you will need to call compile() on your model after modifying the trainable property. 
Here\\'s an example: x = Input(shape=(32,)) layer = Dense(32) layer.trainable = False y = layer(x) frozen_model = Model(x, y) # in the model below, the weights of `layer` will not be updated during training frozen_model.compile(optimizer=\\'rmsprop\\', loss=\\'mse\\') layer.trainable = True trainable_model = Model(x, y) # with this model the weights of the layer will be updated during training # (which will also affect the above model since it uses the same layer instance) trainable_model.compile(optimizer=\\'rmsprop\\', loss=\\'mse\\') frozen_model.fit(data, labels) # this does NOT update the weights of `layer` trainable_model.fit(data, labels) # this updates the weights of `layer` How can I use stateful RNNs? Making a RNN stateful means that the states for the samples of each batch will be reused as initial states for the samples in the next batch. When using stateful RNNs, it is therefore assumed that: all batches have the same number of samples If x1 and x2 are successive batches of samples, then x2[i] is the follow-up sequence to x1[i] , for every i . To use statefulness in RNNs, you need to: explicitly specify the batch size you are using, by passing a batch_size argument to the first layer in your model. E.g. batch_size=32 for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep. set stateful=True in your RNN layer(s). specify shuffle=False when calling fit(). To reset the states accumulated: use model.reset_states() to reset the states of all layers in the model use layer.reset_states() to reset the states of a specific stateful RNN layer Example: x # this is our input data, of shape (32, 21, 16) # we will feed it to our model in sequences of length 10 model = Sequential() model.add(LSTM(32, input_shape=(10, 16), batch_size=32, stateful=True)) model.add(Dense(16, activation=\\'softmax\\')) model.compile(optimizer=\\'rmsprop\\', loss=\\'categorical_crossentropy\\') # we train the network to predict the 11th timestep given the first 10: model.train_on_batch(x[:, :10, :], np.reshape(x[:, 10, :], (32, 16))) # the state of the network has changed. We can feed the follow-up sequences: model.train_on_batch(x[:, 10:20, :], np.reshape(x[:, 20, :], (32, 16))) # let\\'s reset the states of the LSTM layer: model.reset_states() # another way to do it in this case: model.layers[0].reset_states() Note that the methods predict , fit , train_on_batch , predict_classes , etc. will all update the states of the stateful layers in a model. This allows you to do not only stateful training, but also stateful prediction. How can I remove a layer from a Sequential model? You can remove the last added layer in a Sequential model by calling .pop() : model = Sequential() model.add(Dense(32, activation=\\'relu\\', input_dim=784)) model.add(Dense(32, activation=\\'relu\\')) print(len(model.layers)) # \"2\" model.pop() print(len(model.layers)) # \"1\" How can I use pre-trained models in Keras? 
Code and pre-trained weights are available for the following image classification models: Xception VGG16 VGG19 ResNet50 Inception v3 Inception-ResNet v2 MobileNet v1 They can be imported from the module keras.applications : from keras.applications.xception import Xception from keras.applications.vgg16 import VGG16 from keras.applications.vgg19 import VGG19 from keras.applications.resnet50 import ResNet50 from keras.applications.inception_v3 import InceptionV3 from keras.applications.inception_resnet_v2 import InceptionResNetV2 from keras.applications.mobilenet import MobileNet model = VGG16(weights=\\'imagenet\\', include_top=True) For a few simple usage examples, see the documentation for the Applications module . For a detailed example of how to use such a pre-trained model for feature extraction or for fine-tuning, see this blog post . The VGG16 model is also the basis for several Keras example scripts: Style transfer Feature visualization Deep dream How can I use HDF5 inputs with Keras? You can use the HDF5Matrix class from keras.utils.io_utils . See the HDF5Matrix documentation for details. You can also directly use a HDF5 dataset: import h5py with h5py.File(\\'input/file.hdf5\\', \\'r\\') as f: x_data = f[\\'x_data\\'] model.predict(x_data) Please also see How can I install HDF5 or h5py to save my models in Keras? for instructions on how to install h5py . Where is the Keras configuration file stored? The default directory where all Keras data is stored is: $HOME/.keras/ Note that Windows users should replace $HOME with %USERPROFILE% . In case Keras cannot create the above directory (e.g. due to permission issues), /tmp/.keras/ is used as a backup. The Keras configuration file is a JSON file stored at $HOME/.keras/keras.json . The default configuration file looks like this: { \"image_data_format\": \"channels_last\", \"epsilon\": 1e-07, \"floatx\": \"float32\", \"backend\": \"tensorflow\" } It contains the following fields: The image data format to be used as default by image processing layers and utilities (either channels_last or channels_first ). The epsilon numerical fuzz factor to be used to prevent division by zero in some operations. The default float data type. The default backend. See the backend documentation . Likewise, cached dataset files, such as those downloaded with get_file() , are stored by default in $HOME/.keras/datasets/ . How can I obtain reproducible results using Keras during development? During development of a model, sometimes it is useful to be able to obtain reproducible results from run to run in order to determine if a change in performance is due to an actual model or data modification, or merely a result of a new random sample. First, you need to set the PYTHONHASHSEED environment variable to 0 before the program starts (not within the program itself). This is necessary in Python 3.2.3 onwards to have reproducible behavior for certain hash-based operations (e.g., the item order in a set or a dict, see Python\\'s documentation or issue #2280 for further details). 
One way to set the environment variable is when starting python like this: $ cat test_hash.py print(hash(\"keras\")) $ python3 test_hash.py # non-reproducible hash (Python 3.2.3+) -8127205062320133199 $ python3 test_hash.py # non-reproducible hash (Python 3.2.3+) 3204480642156461591 $ PYTHONHASHSEED=0 python3 test_hash.py # reproducible hash 4883664951434749476 $ PYTHONHASHSEED=0 python3 test_hash.py # reproducible hash 4883664951434749476 Moreover, when using the TensorFlow backend and running on a GPU, some operations have non-deterministic outputs, in particular tf.reduce_sum() . This is due to the fact that GPUs run many operations in parallel, so the order of execution is not always guaranteed. Due to the limited precision of floats, even adding several numbers together may give slightly different results depending on the order in which you add them. You can try to avoid the non-deterministic operations, but some may be created automatically by TensorFlow to compute the gradients, so it is much simpler to just run the code on the CPU. For this, you can set the CUDA_VISIBLE_DEVICES environment variable to an empty string, for example: $ CUDA_VISIBLE_DEVICES=\"\" PYTHONHASHSEED=0 python your_program.py The below snippet of code provides an example of how to obtain reproducible results - this is geared towards a TensorFlow backend for a Python 3 environment: import numpy as np import tensorflow as tf import random as rn # The below is necessary for starting Numpy generated random numbers # in a well-defined initial state. np.random.seed(42) # The below is necessary for starting core Python generated random numbers # in a well-defined state. rn.seed(12345) # Force TensorFlow to use single thread. # Multiple threads are a potential source of non-reproducible results. # For further details, see: https://stackoverflow.com/questions/42022950/ session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1) from keras import backend as K # The below tf.set_random_seed() will make random number generation # in the TensorFlow backend have a well-defined initial state. # For further details, see: # https://www.tensorflow.org/api_docs/python/tf/set_random_seed tf.set_random_seed(1234) sess = tf.Session(graph=tf.get_default_graph(), config=session_conf) K.set_session(sess) # Rest of code follows ... How can I install HDF5 or h5py to save my models in Keras? In order to save your Keras models as HDF5 files, e.g. via keras.callbacks.ModelCheckpoint , Keras uses the h5py Python package. It is a dependency of Keras and should be installed by default. On Debian-based distributions, you will have to additionally install libhdf5 : sudo apt-get install libhdf5-serial-dev If you are unsure whether h5py is installed, you can open a Python shell and load the module via import h5py If it imports without error, it is installed; otherwise, you can find detailed installation instructions here: http://docs.h5py.org/en/latest/build.html'), ('title', 'FAQ')]), OrderedDict([('location', 'getting-started/faq.html#keras-faq-frequently-asked-keras-questions'), ('text', 'How should I cite Keras? How can I run Keras on GPU? How can I run a Keras model on multiple GPUs? What does \"sample\", \"batch\", \"epoch\" mean? How can I save a Keras model? Why is the training loss much higher than the testing loss? How can I obtain the output of an intermediate layer? How can I use Keras with datasets that don\\'t fit in memory? How can I interrupt training when the validation loss isn\\'t decreasing anymore? 
How is the validation split computed? Is the data shuffled during training? How can I record the training / validation loss / accuracy at each epoch? How can I \"freeze\" layers? How can I use stateful RNNs? How can I remove a layer from a Sequential model? How can I use pre-trained models in Keras? How can I use HDF5 inputs with Keras? Where is the Keras configuration file stored? How can I obtain reproducible results using Keras during development? How can I install HDF5 or h5py to save my models in Keras?'), ('title', 'Keras FAQ: Frequently Asked Keras Questions')]), OrderedDict([('location', 'getting-started/faq.html#how-should-i-cite-keras'), ('text', 'Please cite Keras in your publications if it helps your research. Here is an example BibTeX entry: @misc{chollet2015keras, title={Keras}, author={Chollet, Fran\\\\c{c}ois and others}, year={2015}, howpublished={\\\\url{https://keras.io}}, }'), ('title', 'How should I cite Keras?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-run-keras-on-gpu'), ('text', \"If you are running on the TensorFlow or CNTK backends, your code will automatically run on GPU if any available GPU is detected. If you are running on the Theano backend, you can use one of the following methods: Method 1 : use Theano flags. THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py The name 'gpu' might have to be changed depending on your device's identifier (e.g. gpu0 , gpu1 , etc). Method 2 : set up your .theanorc : Instructions Method 3 : manually set theano.config.device , theano.config.floatX at the beginning of your code: import theano theano.config.device = 'gpu' theano.config.floatX = 'float32'\"), ('title', 'How can I run Keras on GPU?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-run-a-keras-model-on-multiple-gpus'), ('text', 'We recommend doing so using the TensorFlow backend. There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism . In most cases, what you need is most likely data parallelism.'), ('title', 'How can I run a Keras model on multiple GPUs?')]), OrderedDict([('location', 'getting-started/faq.html#data-parallelism'), ('text', \"Data parallelism consists in replicating the target model once on each device, and using each replica to process a different fraction of the input data. Keras has a built-in utility, keras.utils.multi_gpu_model , which can produce a data-parallel version of any model, and achieves quasi-linear speedup on up to 8 GPUs. For more information, see the documentation for multi_gpu_model . Here is a quick example: from keras.utils import multi_gpu_model # Replicates `model` on 8 GPUs. # This assumes that your machine has 8 available GPUs. parallel_model = multi_gpu_model(model, gpus=8) parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop') # This `fit` call will be distributed on 8 GPUs. # Since the batch size is 256, each GPU will process 32 samples. parallel_model.fit(x, y, epochs=20, batch_size=256)\"), ('title', 'Data parallelism')]), OrderedDict([('location', 'getting-started/faq.html#device-parallelism'), ('text', \"Device parallelism consists in running different parts of a same model on different devices. It works best for models that have a parallel architecture, e.g. a model with two branches. This can be achieved by using TensorFlow device scopes. 
Here is a quick example: # Model where a shared LSTM is used to encode two different sequences in parallel input_a = keras.Input(shape=(140, 256)) input_b = keras.Input(shape=(140, 256)) shared_lstm = keras.layers.LSTM(64) # Process the first sequence on one GPU with tf.device('/gpu:0'): encoded_a = shared_lstm(input_a) # Process the next sequence on another GPU with tf.device('/gpu:1'): encoded_b = shared_lstm(input_b) # Concatenate results on CPU with tf.device('/cpu:0'): merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)\"), ('title', 'Device parallelism')]), OrderedDict([('location', 'getting-started/faq.html#what-does-sample-batch-epoch-mean'), ('text', 'Below are some common definitions that are necessary to know and understand to correctly utilize Keras: Sample : one element of a dataset. Example: one image is a sample in a convolutional network Example: one audio file is a sample for a speech recognition model Batch : a set of N samples. The samples in a batch are processed independently, in parallel. If training, a batch results in only one update to the model. A batch generally approximates the distribution of the input data better than a single input. The larger the batch, the better the approximation; however, it is also true that the batch will take longer to process and will still result in only one update. For inference (evaluate/predict), it is recommended to pick a batch size that is as large as you can afford without going out of memory (since larger batches will usually result in faster evaluating/prediction). Epoch : an arbitrary cutoff, generally defined as \"one pass over the entire dataset\", used to separate training into distinct phases, which is useful for logging and periodic evaluation. When using validation_data or validation_split with the fit method of Keras models, validation will be run at the end of every epoch . Within Keras, there is the ability to add callbacks specifically designed to be run at the end of an epoch . Examples of these are learning rate changes and model checkpointing (saving).'), ('title', 'What does \"sample\", \"batch\", \"epoch\" mean?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-save-a-keras-model'), ('text', ''), ('title', 'How can I save a Keras model?')]), OrderedDict([('location', 'getting-started/faq.html#savingloading-whole-models-architecture-weights-optimizer-state'), ('text', \"It is not recommended to use pickle or cPickle to save a Keras model. You can use model.save(filepath) to save a Keras model into a single HDF5 file which will contain: the architecture of the model, allowing you to re-create the model the weights of the model the training configuration (loss, optimizer) the state of the optimizer, allowing you to resume training exactly where you left off. You can then use keras.models.load_model(filepath) to reinstantiate your model. load_model will also take care of compiling the model using the saved training configuration (unless the model was never compiled in the first place). Please also see How can I install HDF5 or h5py to save my models in Keras? for instructions on how to install h5py . 
Example: from keras.models import load_model model.save('my_model.h5') # creates a HDF5 file 'my_model.h5' del model # deletes the existing model # returns a compiled model # identical to the previous one model = load_model('my_model.h5')\"), ('title', 'Saving/loading whole models (architecture + weights + optimizer state)')]), OrderedDict([('location', 'getting-started/faq.html#savingloading-only-a-models-architecture'), ('text', 'If you only need to save the architecture of a model , and not its weights or its training configuration, you can do: # save as JSON json_string = model.to_json() # save as YAML yaml_string = model.to_yaml() The generated JSON / YAML files are human-readable and can be manually edited if needed. You can then build a fresh model from this data: # model reconstruction from JSON: from keras.models import model_from_json model = model_from_json(json_string) # model reconstruction from YAML from keras.models import model_from_yaml model = model_from_yaml(yaml_string)'), ('title', \"Saving/loading only a model's architecture\")]), OrderedDict([('location', 'getting-started/faq.html#savingloading-only-a-models-weights'), ('text', 'If you need to save the weights of a model , you can do so in HDF5 with the code below. model.save_weights(\\'my_model_weights.h5\\') Assuming you have code for instantiating your model, you can then load the weights you saved into a model with the same architecture: model.load_weights(\\'my_model_weights.h5\\') If you need to load weights into a different architecture (with some layers in common), for instance for fine-tuning or transfer-learning, you can load weights by layer name : model.load_weights(\\'my_model_weights.h5\\', by_name=True) Please also see How can I install HDF5 or h5py to save my models in Keras? for instructions on how to install h5py . For example: \"\"\" Assuming the original model looks like this: model = Sequential() model.add(Dense(2, input_dim=3, name=\\'dense_1\\')) model.add(Dense(3, name=\\'dense_2\\')) ... model.save_weights(fname) \"\"\" # new model model = Sequential() model.add(Dense(2, input_dim=3, name=\\'dense_1\\')) # will be loaded model.add(Dense(10, name=\\'new_dense\\')) # will not be loaded # load weights from first model; will only affect the first layer, dense_1. 
model.load_weights(fname, by_name=True)'), ('title', \"Saving/loading only a model's weights\")]), OrderedDict([('location', 'getting-started/faq.html#handling-custom-layers-or-other-custom-objects-in-saved-models'), ('text', 'If the model you want to load includes custom layers or other custom classes or functions, you can pass them to the loading mechanism via the custom_objects argument: from keras.models import load_model # Assuming your model includes instance of an \"AttentionLayer\" class model = load_model(\\'my_model.h5\\', custom_objects={\\'AttentionLayer\\': AttentionLayer}) Alternatively, you can use a custom object scope : from keras.utils import CustomObjectScope with CustomObjectScope({\\'AttentionLayer\\': AttentionLayer}): model = load_model(\\'my_model.h5\\') Custom objects handling works the same way for load_model , model_from_json , model_from_yaml : from keras.models import model_from_json model = model_from_json(json_string, custom_objects={\\'AttentionLayer\\': AttentionLayer})'), ('title', 'Handling custom layers (or other custom objects) in saved models')]), OrderedDict([('location', 'getting-started/faq.html#why-is-the-training-loss-much-higher-than-the-testing-loss'), ('text', 'A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time. Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.'), ('title', 'Why is the training loss much higher than the testing loss?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-obtain-the-output-of-an-intermediate-layer'), ('text', \"One simple way is to create a new Model that will output the layers that you are interested in: from keras.models import Model model = ... # create the original model layer_name = 'my_layer' intermediate_layer_model = Model(inputs=model.input, outputs=model.get_layer(layer_name).output) intermediate_output = intermediate_layer_model.predict(data) Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example: from keras import backend as K # with a Sequential model get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[3].output]) layer_output = get_3rd_layer_output([x])[0] Similarly, you could build a Theano or TensorFlow function directly. Note that if your model has a different behavior in the training and testing phases (e.g. if it uses Dropout , BatchNormalization , etc.), you will need to pass the learning phase flag to your function: get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()], [model.layers[3].output]) # output in test mode = 0 layer_output = get_3rd_layer_output([x, 0])[0] # output in train mode = 1 layer_output = get_3rd_layer_output([x, 1])[0]\"), ('title', 'How can I obtain the output of an intermediate layer?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory'), ('text', 'You can do batch training using model.train_on_batch(x, y) and model.test_on_batch(x, y) . See the models documentation . 
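As a concrete, hedged illustration of the train_on_batch route just mentioned, here is a minimal sketch; it assumes a compiled model is already in scope, and the .npy file layout, batch count, and load_batch helper are hypothetical stand-ins for your own on-disk I/O:

import numpy as np

def load_batch(idx):
    # Hypothetical loader: swap in your own disk or database reads.
    x = np.load('batches/x_%03d.npy' % idx)
    y = np.load('batches/y_%03d.npy' % idx)
    return x, y

num_batches = 100  # hypothetical: however many batches live on disk
for epoch in range(10):
    for idx in range(num_batches):
        x_batch, y_batch = load_batch(idx)
        loss = model.train_on_batch(x_batch, y_batch)  # one gradient update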
Alternatively, you can write a generator that yields batches of training data and use the method model.fit_generator(data_generator, steps_per_epoch, epochs) . You can see batch training in action in our CIFAR10 example .'), ('title', \"How can I use Keras with datasets that don't fit in memory?\")]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-interrupt-training-when-the-validation-loss-isnt-decreasing-anymore'), ('text', \"You can use an EarlyStopping callback: from keras.callbacks import EarlyStopping early_stopping = EarlyStopping(monitor='val_loss', patience=2) model.fit(x, y, validation_split=0.2, callbacks=[early_stopping]) Find out more in the callbacks documentation .\"), ('title', \"How can I interrupt training when the validation loss isn't decreasing anymore?\")]), OrderedDict([('location', 'getting-started/faq.html#how-is-the-validation-split-computed'), ('text', \"If you set the validation_split argument in model.fit to e.g. 0.1, then the validation data used will be the last 10% of the data. If you set it to 0.25, it will be the last 25% of the data, etc. Note that the data isn't shuffled before extracting the validation split, so the validation is literally just the last x% of samples in the input you passed. The same validation set is used for all epochs (within the same call to fit ).\"), ('title', 'How is the validation split computed?')]), OrderedDict([('location', 'getting-started/faq.html#is-the-data-shuffled-during-training'), ('text', 'Yes, if the shuffle argument in model.fit is set to True (which is the default), the training data will be randomly shuffled at each epoch. Validation data is never shuffled.'), ('title', 'Is the data shuffled during training?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-record-the-training-validation-loss-accuracy-at-each-epoch'), ('text', 'The model.fit method returns a History callback, which has a history attribute containing the lists of successive losses and other metrics. hist = model.fit(x, y, validation_split=0.2) print(hist.history)'), ('title', 'How can I record the training / validation loss / accuracy at each epoch?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-freeze-keras-layers'), ('text', 'To \"freeze\" a layer means to exclude it from training, i.e. its weights will never be updated. This is useful in the context of fine-tuning a model, or using fixed embeddings for a text input. You can pass a trainable argument (boolean) to a layer constructor to set a layer to be non-trainable: frozen_layer = Dense(32, trainable=False) Additionally, you can set the trainable property of a layer to True or False after instantiation. For this to take effect, you will need to call compile() on your model after modifying the trainable property. 
Here\\'s an example: x = Input(shape=(32,)) layer = Dense(32) layer.trainable = False y = layer(x) frozen_model = Model(x, y) # in the model below, the weights of `layer` will not be updated during training frozen_model.compile(optimizer=\\'rmsprop\\', loss=\\'mse\\') layer.trainable = True trainable_model = Model(x, y) # with this model the weights of the layer will be updated during training # (which will also affect the above model since it uses the same layer instance) trainable_model.compile(optimizer=\\'rmsprop\\', loss=\\'mse\\') frozen_model.fit(data, labels) # this does NOT update the weights of `layer` trainable_model.fit(data, labels) # this updates the weights of `layer`'), ('title', 'How can I \"freeze\" Keras layers?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-use-stateful-rnns'), ('text', \"Making a RNN stateful means that the states for the samples of each batch will be reused as initial states for the samples in the next batch. When using stateful RNNs, it is therefore assumed that: all batches have the same number of samples If x1 and x2 are successive batches of samples, then x2[i] is the follow-up sequence to x1[i] , for every i . To use statefulness in RNNs, you need to: explicitly specify the batch size you are using, by passing a batch_size argument to the first layer in your model. E.g. batch_size=32 for a 32-samples batch of sequences of 10 timesteps with 16 features per timestep. set stateful=True in your RNN layer(s). specify shuffle=False when calling fit(). To reset the states accumulated: use model.reset_states() to reset the states of all layers in the model use layer.reset_states() to reset the states of a specific stateful RNN layer Example: x # this is our input data, of shape (32, 21, 16) # we will feed it to our model in sequences of length 10 model = Sequential() model.add(LSTM(32, input_shape=(10, 16), batch_size=32, stateful=True)) model.add(Dense(16, activation='softmax')) model.compile(optimizer='rmsprop', loss='categorical_crossentropy') # we train the network to predict the 11th timestep given the first 10: model.train_on_batch(x[:, :10, :], np.reshape(x[:, 10, :], (32, 16))) # the state of the network has changed. We can feed the follow-up sequences: model.train_on_batch(x[:, 10:20, :], np.reshape(x[:, 20, :], (32, 16))) # let's reset the states of the LSTM layer: model.reset_states() # another way to do it in this case: model.layers[0].reset_states() Note that the methods predict , fit , train_on_batch , predict_classes , etc. will all update the states of the stateful layers in a model. 
This allows you to do not only stateful training, but also stateful prediction.\"), ('title', 'How can I use stateful RNNs?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-remove-a-layer-from-a-sequential-model'), ('text', 'You can remove the last added layer in a Sequential model by calling .pop() : model = Sequential() model.add(Dense(32, activation=\\'relu\\', input_dim=784)) model.add(Dense(32, activation=\\'relu\\')) print(len(model.layers)) # \"2\" model.pop() print(len(model.layers)) # \"1\"'), ('title', 'How can I remove a layer from a Sequential model?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-use-pre-trained-models-in-keras'), ('text', \"Code and pre-trained weights are available for the following image classification models: Xception VGG16 VGG19 ResNet50 Inception v3 Inception-ResNet v2 MobileNet v1 They can be imported from the module keras.applications : from keras.applications.xception import Xception from keras.applications.vgg16 import VGG16 from keras.applications.vgg19 import VGG19 from keras.applications.resnet50 import ResNet50 from keras.applications.inception_v3 import InceptionV3 from keras.applications.inception_resnet_v2 import InceptionResNetV2 from keras.applications.mobilenet import MobileNet model = VGG16(weights='imagenet', include_top=True) For a few simple usage examples, see the documentation for the Applications module . For a detailed example of how to use such a pre-trained model for feature extraction or for fine-tuning, see this blog post . The VGG16 model is also the basis for several Keras example scripts: Style transfer Feature visualization Deep dream\"), ('title', 'How can I use pre-trained models in Keras?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-use-hdf5-inputs-with-keras'), ('text', \"You can use the HDF5Matrix class from keras.utils.io_utils . See the HDF5Matrix documentation for details. You can also directly use an HDF5 dataset: import h5py with h5py.File('input/file.hdf5', 'r') as f: x_data = f['x_data'] model.predict(x_data) Please also see How can I install HDF5 or h5py to save my models in Keras? for instructions on how to install h5py .\"), ('title', 'How can I use HDF5 inputs with Keras?')]), OrderedDict([('location', 'getting-started/faq.html#where-is-the-keras-configuration-file-stored'), ('text', 'The default directory where all Keras data is stored is: $HOME/.keras/ Note that Windows users should replace $HOME with %USERPROFILE% . In case Keras cannot create the above directory (e.g. due to permission issues), /tmp/.keras/ is used as a backup. The Keras configuration file is a JSON file stored at $HOME/.keras/keras.json . The default configuration file looks like this: { \"image_data_format\": \"channels_last\", \"epsilon\": 1e-07, \"floatx\": \"float32\", \"backend\": \"tensorflow\" } It contains the following fields: The image data format to be used as default by image processing layers and utilities (either channels_last or channels_first ). The epsilon numerical fuzz factor to be used to prevent division by zero in some operations. The default float data type. The default backend. See the backend documentation .
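Since the file name, location, and fields are all spelled out above, the configuration is easy to inspect programmatically. An illustrative sketch using only the standard library (the fallback path mirrors the /tmp/.keras/ note above; error handling is omitted):

import json
import os

config_path = os.path.join(os.path.expanduser('~'), '.keras', 'keras.json')
if not os.path.exists(config_path):
    config_path = '/tmp/.keras/keras.json'  # backup directory mentioned above
with open(config_path) as f:
    config = json.load(f)
print(config['backend'], config['image_data_format'], config['floatx'], config['epsilon'])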
Likewise, cached dataset files, such as those downloaded with get_file() , are stored by default in $HOME/.keras/datasets/ .'), ('title', 'Where is the Keras configuration file stored?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-obtain-reproducible-results-using-keras-during-development'), ('text', 'During development of a model, sometimes it is useful to be able to obtain reproducible results from run to run in order to determine if a change in performance is due to an actual model or data modification, or merely a result of a new random sample. First, you need to set the PYTHONHASHSEED environment variable to 0 before the program starts (not within the program itself). This is necessary in Python 3.2.3 onwards to have reproducible behavior for certain hash-based operations (e.g., the item order in a set or a dict, see Python\\'s documentation or issue #2280 for further details). One way to set the environment variable is when starting python like this: $ cat test_hash.py print(hash(\"keras\")) $ python3 test_hash.py # non-reproducible hash (Python 3.2.3+) -8127205062320133199 $ python3 test_hash.py # non-reproducible hash (Python 3.2.3+) 3204480642156461591 $ PYTHONHASHSEED=0 python3 test_hash.py # reproducible hash 4883664951434749476 $ PYTHONHASHSEED=0 python3 test_hash.py # reproducible hash 4883664951434749476 Moreover, when using the TensorFlow backend and running on a GPU, some operations have non-deterministic outputs, in particular tf.reduce_sum() . This is due to the fact that GPUs run many operations in parallel, so the order of execution is not always guaranteed. Due to the limited precision of floats, even adding several numbers together may give slightly different results depending on the order in which you add them. You can try to avoid the non-deterministic operations, but some may be created automatically by TensorFlow to compute the gradients, so it is much simpler to just run the code on the CPU. For this, you can set the CUDA_VISIBLE_DEVICES environment variable to an empty string, for example: $ CUDA_VISIBLE_DEVICES=\"\" PYTHONHASHSEED=0 python your_program.py The below snippet of code provides an example of how to obtain reproducible results - this is geared towards a TensorFlow backend for a Python 3 environment: import numpy as np import tensorflow as tf import random as rn # The below is necessary for starting Numpy generated random numbers # in a well-defined initial state. np.random.seed(42) # The below is necessary for starting core Python generated random numbers # in a well-defined state. rn.seed(12345) # Force TensorFlow to use single thread. # Multiple threads are a potential source of non-reproducible results. # For further details, see: https://stackoverflow.com/questions/42022950/ session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1) from keras import backend as K # The below tf.set_random_seed() will make random number generation # in the TensorFlow backend have a well-defined initial state. # For further details, see: # https://www.tensorflow.org/api_docs/python/tf/set_random_seed tf.set_random_seed(1234) sess = tf.Session(graph=tf.get_default_graph(), config=session_conf) K.set_session(sess) # Rest of code follows ...'), ('title', 'How can I obtain reproducible results using Keras during development?')]), OrderedDict([('location', 'getting-started/faq.html#how-can-i-install-hdf5-or-h5py-to-save-my-models-in-keras'), ('text', 'In order to save your Keras models as HDF5 files, e.g. 
via keras.callbacks.ModelCheckpoint , Keras uses the h5py Python package. It is a dependency of Keras and should be installed by default. On Debian-based distributions, you will have to additionally install libhdf5 : sudo apt-get install libhdf5-serial-dev If you are unsure whether h5py is installed, you can open a Python shell and load the module via import h5py . If it imports without error, it is installed; otherwise, you can find detailed installation instructions here: http://docs.h5py.org/en/latest/build.html'), ('title', 'How can I install HDF5 or h5py to save my models in Keras?')]), OrderedDict([('location', 'getting-started/functional-api-guide.html'), ('text', 'Getting started with the Keras functional API The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers. This guide assumes that you are already familiar with the Sequential model. Let\\'s start with something simple. First example: a densely-connected network The Sequential model is probably a better choice to implement such a network, but it helps to start with something really simple. A layer instance is callable (on a tensor), and it returns a tensor Input tensor(s) and output tensor(s) can then be used to define a Model Such a model can be trained just like Keras Sequential models. from keras.layers import Input, Dense from keras.models import Model # This returns a tensor inputs = Input(shape=(784,)) # a layer instance is callable on a tensor, and returns a tensor x = Dense(64, activation=\\'relu\\')(inputs) x = Dense(64, activation=\\'relu\\')(x) predictions = Dense(10, activation=\\'softmax\\')(x) # This creates a model that includes # the Input layer and three Dense layers model = Model(inputs=inputs, outputs=predictions) model.compile(optimizer=\\'rmsprop\\', loss=\\'categorical_crossentropy\\', metrics=[\\'accuracy\\']) model.fit(data, labels) # starts training All models are callable, just like layers With the functional API, it is easy to reuse trained models: you can treat any model as if it were a layer, by calling it on a tensor. Note that by calling a model you aren\\'t just reusing the architecture of the model, you are also reusing its weights. x = Input(shape=(784,)) # This works, and returns the 10-way softmax we defined above. y = model(x) This can allow, for instance, to quickly create models that can process sequences of inputs. You could turn an image classification model into a video classification model, in just one line. from keras.layers import TimeDistributed # Input tensor for sequences of 20 timesteps, # each containing a 784-dimensional vector input_sequences = Input(shape=(20, 784)) # This applies our previous model to every timestep in the input sequences. # the output of the previous model was a 10-way softmax, # so the output of the layer below will be a sequence of 20 vectors of size 10. processed_sequences = TimeDistributed(model)(input_sequences) Multi-input and multi-output models Here\\'s a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams. Let\\'s consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter.
The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc. The model will also be supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models. Here\\'s what our model looks like: Let\\'s implement it with the functional API. The main input will receive the headline, as a sequence of integers (each integer encodes a word). The integers will be between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences will be 100 words long. from keras.layers import Input, Embedding, LSTM, Dense from keras.models import Model # Headline input: meant to receive sequences of 100 integers, between 1 and 10000. # Note that we can name any layer by passing it a \"name\" argument. main_input = Input(shape=(100,), dtype=\\'int32\\', name=\\'main_input\\') # This embedding layer will encode the input sequence # into a sequence of dense 512-dimensional vectors. x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input) # An LSTM will transform the vector sequence into a single vector, # containing information about the entire sequence lstm_out = LSTM(32)(x) Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model. auxiliary_output = Dense(1, activation=\\'sigmoid\\', name=\\'aux_output\\')(lstm_out) At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output: auxiliary_input = Input(shape=(5,), name=\\'aux_input\\') x = keras.layers.concatenate([lstm_out, auxiliary_input]) # We stack a deep densely-connected network on top x = Dense(64, activation=\\'relu\\')(x) x = Dense(64, activation=\\'relu\\')(x) x = Dense(64, activation=\\'relu\\')(x) # And finally we add the main logistic regression layer main_output = Dense(1, activation=\\'sigmoid\\', name=\\'main_output\\')(x) This defines a model with two inputs and two outputs: model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output]) We compile the model and assign a weight of 0.2 to the auxiliary loss. To specify different loss_weights or loss for each different output, you can use a list or a dictionary. Here we pass a single loss as the loss argument, so the same loss will be used on all outputs. model.compile(optimizer=\\'rmsprop\\', loss=\\'binary_crossentropy\\', loss_weights=[1., 0.2]) We can train the model by passing it lists of input arrays and target arrays: model.fit([headline_data, additional_data], [labels, labels], epochs=50, batch_size=32) Since our inputs and outputs are named (we passed them a \"name\" argument), we could also have compiled the model via: model.compile(optimizer=\\'rmsprop\\', loss={\\'main_output\\': \\'binary_crossentropy\\', \\'aux_output\\': \\'binary_crossentropy\\'}, loss_weights={\\'main_output\\': 1., \\'aux_output\\': 0.2}) # And trained it via: model.fit({\\'main_input\\': headline_data, \\'aux_input\\': additional_data}, {\\'main_output\\': labels, \\'aux_output\\': labels}, epochs=50, batch_size=32) Shared layers Another good use for the functional API is models that use shared layers. Let\\'s take a look at shared layers. Let\\'s consider a dataset of tweets.
We want to build a model that can tell whether two tweets are from the same person or not (this can allow us to compare users by the similarity of their tweets, for instance). One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors and then adds a logistic regression; this outputs a probability that the two tweets share the same author. The model would then be trained on positive tweet pairs and negative tweet pairs. Because the problem is symmetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets. Let\\'s build this with the functional API. We will take as input for a tweet a binary matrix of shape (280, 256) , i.e. a sequence of 280 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence/absence of a character (out of an alphabet of 256 frequent characters). import keras from keras.layers import Input, LSTM, Dense from keras.models import Model tweet_a = Input(shape=(280, 256)) tweet_b = Input(shape=(280, 256)) To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want: # This layer can take as input a matrix # and will return a vector of size 64 shared_lstm = LSTM(64) # When we reuse the same layer instance # multiple times, the weights of the layer # are also being reused # (it is effectively *the same* layer) encoded_a = shared_lstm(tweet_a) encoded_b = shared_lstm(tweet_b) # We can then concatenate the two vectors: merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1) # And add a logistic regression on top predictions = Dense(1, activation=\\'sigmoid\\')(merged_vector) # We define a trainable model linking the # tweet inputs to the predictions model = Model(inputs=[tweet_a, tweet_b], outputs=predictions) model.compile(optimizer=\\'rmsprop\\', loss=\\'binary_crossentropy\\', metrics=[\\'accuracy\\']) model.fit([data_a, data_b], labels, epochs=10) Let\\'s pause to take a look at how to read the shared layer\\'s output or output shape. The concept of layer \"node\" Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a \"node\" to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 0, 1, 2... In previous versions of Keras, you could obtain the output tensor of a layer instance via layer.get_output() , or its output shape via layer.output_shape . You still can (except get_output() has been replaced by the property output ). But what if a layer is connected to multiple inputs? As long as a layer is only connected to one input, there is no confusion, and .output will return the one output of the layer: a = Input(shape=(280, 256)) lstm = LSTM(32) encoded_a = lstm(a) assert lstm.output == encoded_a Not so if the layer has multiple inputs: a = Input(shape=(280, 256)) b = Input(shape=(280, 256)) lstm = LSTM(32) encoded_a = lstm(a) encoded_b = lstm(b) lstm.output >> AttributeError: Layer lstm_1 has multiple inbound nodes, hence the notion of \"layer output\" is ill-defined. Use `get_output_at(node_index)` instead. Okay then. The following works: assert lstm.get_output_at(0) == encoded_a assert lstm.get_output_at(1) == encoded_b Simple enough, right? 
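One way to see that the two calls really do reuse a single set of parameters is to list the weights of a model built from both nodes; an illustrative check (variable names will vary with the backend and layer numbering):

from keras.layers import Input, LSTM
from keras.models import Model

a = Input(shape=(280, 256))
b = Input(shape=(280, 256))
lstm = LSTM(32)                            # one layer instance...
model = Model([a, b], [lstm(a), lstm(b)])  # ...called twice: node 0 and node 1
# Exactly one kernel, one recurrent kernel and one bias exist in the whole model:
print([w.name for w in model.weights])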
The same is true for the properties input_shape and output_shape : as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of \"layer output/input shape\" is well defined, and that one shape will be returned by layer.output_shape / layer.input_shape . But if, for instance, you apply the same Conv2D layer to an input of shape (32, 32, 3) , and then to an input of shape (64, 64, 3) , the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to: a = Input(shape=(32, 32, 3)) b = Input(shape=(64, 64, 3)) conv = Conv2D(16, (3, 3), padding=\\'same\\') conved_a = conv(a) # Only one input so far, the following will work: assert conv.input_shape == (None, 32, 32, 3) conved_b = conv(b) # now the `.input_shape` property wouldn\\'t work, but this does: assert conv.get_input_shape_at(0) == (None, 32, 32, 3) assert conv.get_input_shape_at(1) == (None, 64, 64, 3) More examples Code examples are still the best way to get started, so here are a few more. Inception module For more information about the Inception architecture, see Going Deeper with Convolutions . from keras.layers import Conv2D, MaxPooling2D, Input input_img = Input(shape=(256, 256, 3)) tower_1 = Conv2D(64, (1, 1), padding=\\'same\\', activation=\\'relu\\')(input_img) tower_1 = Conv2D(64, (3, 3), padding=\\'same\\', activation=\\'relu\\')(tower_1) tower_2 = Conv2D(64, (1, 1), padding=\\'same\\', activation=\\'relu\\')(input_img) tower_2 = Conv2D(64, (5, 5), padding=\\'same\\', activation=\\'relu\\')(tower_2) tower_3 = MaxPooling2D((3, 3), strides=(1, 1), padding=\\'same\\')(input_img) tower_3 = Conv2D(64, (1, 1), padding=\\'same\\', activation=\\'relu\\')(tower_3) output = keras.layers.concatenate([tower_1, tower_2, tower_3], axis=1) Residual connection on a convolution layer For more information about residual networks, see Deep Residual Learning for Image Recognition . from keras.layers import Conv2D, Input # input tensor for a 3-channel 256x256 image x = Input(shape=(256, 256, 3)) # 3x3 conv with 3 output channels (same as input channels) y = Conv2D(3, (3, 3), padding=\\'same\\')(x) # this returns x + y. z = keras.layers.add([x, y]) Shared vision model This model reuses the same image-processing module on two inputs, to classify whether two MNIST digits are the same digit or different digits. from keras.layers import Conv2D, MaxPooling2D, Input, Dense, Flatten from keras.models import Model # First, define the vision modules digit_input = Input(shape=(27, 27, 1)) x = Conv2D(64, (3, 3))(digit_input) x = Conv2D(64, (3, 3))(x) x = MaxPooling2D((2, 2))(x) out = Flatten()(x) vision_model = Model(digit_input, out) # Then define the tell-digits-apart model digit_a = Input(shape=(27, 27, 1)) digit_b = Input(shape=(27, 27, 1)) # The vision model will be shared, weights and all out_a = vision_model(digit_a) out_b = vision_model(digit_b) concatenated = keras.layers.concatenate([out_a, out_b]) out = Dense(1, activation=\\'sigmoid\\')(concatenated) classification_model = Model([digit_a, digit_b], out) Visual question answering model This model can select the correct one-word answer when asked a natural-language question about a picture. It works by encoding the question into a vector, encoding the image into a vector, concatenating the two, and training on top a logistic regression over some vocabulary of potential answers. 
from keras.layers import Conv2D, MaxPooling2D, Flatten from keras.layers import Input, LSTM, Embedding, Dense from keras.models import Model, Sequential # First, let\\'s define a vision model using a Sequential model. # This model will encode an image into a vector. vision_model = Sequential() vision_model.add(Conv2D(64, (3, 3), activation=\\'relu\\', padding=\\'same\\', input_shape=(224, 224, 3))) vision_model.add(Conv2D(64, (3, 3), activation=\\'relu\\')) vision_model.add(MaxPooling2D((2, 2))) vision_model.add(Conv2D(128, (3, 3), activation=\\'relu\\', padding=\\'same\\')) vision_model.add(Conv2D(128, (3, 3), activation=\\'relu\\')) vision_model.add(MaxPooling2D((2, 2))) vision_model.add(Conv2D(256, (3, 3), activation=\\'relu\\', padding=\\'same\\')) vision_model.add(Conv2D(256, (3, 3), activation=\\'relu\\')) vision_model.add(Conv2D(256, (3, 3), activation=\\'relu\\')) vision_model.add(MaxPooling2D((2, 2))) vision_model.add(Flatten()) # Now let\\'s get a tensor with the output of our vision model: image_input = Input(shape=(224, 224, 3)) encoded_image = vision_model(image_input) # Next, let\\'s define a language model to encode the question into a vector. # Each question will be at most 100 words long, # and we will index words as integers from 1 to 9999. question_input = Input(shape=(100,), dtype=\\'int32\\') embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input) encoded_question = LSTM(256)(embedded_question) # Let\\'s concatenate the question vector and the image vector: merged = keras.layers.concatenate([encoded_question, encoded_image]) # And let\\'s train a logistic regression over 1000 words on top: output = Dense(1000, activation=\\'softmax\\')(merged) # This is our final model: vqa_model = Model(inputs=[image_input, question_input], outputs=output) # The next stage would be training this model on actual data. Video question answering model Now that we have trained our image QA model, we can quickly turn it into a video QA model. With appropriate training, you will be able to show it a short video (e.g. 100-frame human action) and ask a natural language question about the video (e.g. \"what sport is the boy playing?\" -> \"football\"). from keras.layers import TimeDistributed video_input = Input(shape=(100, 224, 224, 3)) # This is our video encoded via the previously trained vision_model (weights are reused) encoded_frame_sequence = TimeDistributed(vision_model)(video_input) # the output will be a sequence of vectors encoded_video = LSTM(256)(encoded_frame_sequence) # the output will be a vector # This is a model-level representation of the question encoder, reusing the same weights as before: question_encoder = Model(inputs=question_input, outputs=encoded_question) # Let\\'s use it to encode the question: video_question_input = Input(shape=(100,), dtype=\\'int32\\') encoded_video_question = question_encoder(video_question_input) # And this is our video question answering model: merged = keras.layers.concatenate([encoded_video, encoded_video_question]) output = Dense(1000, activation=\\'softmax\\')(merged) video_qa_model = Model(inputs=[video_input, video_question_input], outputs=output)'), ('title', 'Guide to the Functional API')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#getting-started-with-the-keras-functional-api'), ('text', \"The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.
This guide assumes that you are already familiar with the Sequential model. Let's start with something simple.\"), ('title', 'Getting started with the Keras functional API')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#first-example-a-densely-connected-network'), ('text', \"The Sequential model is probably a better choice to implement such a network, but it helps to start with something really simple. A layer instance is callable (on a tensor), and it returns a tensor Input tensor(s) and output tensor(s) can then be used to define a Model Such a model can be trained just like Keras Sequential models. from keras.layers import Input, Dense from keras.models import Model # This returns a tensor inputs = Input(shape=(784,)) # a layer instance is callable on a tensor, and returns a tensor x = Dense(64, activation='relu')(inputs) x = Dense(64, activation='relu')(x) predictions = Dense(10, activation='softmax')(x) # This creates a model that includes # the Input layer and three Dense layers model = Model(inputs=inputs, outputs=predictions) model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy']) model.fit(data, labels) # starts training\"), ('title', 'First example: a densely-connected network')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#all-models-are-callable-just-like-layers'), ('text', \"With the functional API, it is easy to reuse trained models: you can treat any model as if it were a layer, by calling it on a tensor. Note that by calling a model you aren't just reusing the architecture of the model, you are also reusing its weights. x = Input(shape=(784,)) # This works, and returns the 10-way softmax we defined above. y = model(x) This can allow, for instance, to quickly create models that can process sequences of inputs. You could turn an image classification model into a video classification model, in just one line. from keras.layers import TimeDistributed # Input tensor for sequences of 20 timesteps, # each containing a 784-dimensional vector input_sequences = Input(shape=(20, 784)) # This applies our previous model to every timestep in the input sequences. # the output of the previous model was a 10-way softmax, # so the output of the layer below will be a sequence of 20 vectors of size 10. processed_sequences = TimeDistributed(model)(input_sequences)\"), ('title', 'All models are callable, just like layers')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#multi-input-and-multi-output-models'), ('text', 'Here\\'s a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams. Let\\'s consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc. The model will also be supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models. Here\\'s what our model looks like: Let\\'s implement it with the functional API. The main input will receive the headline, as a sequence of integers (each integer encodes a word). The integers will be between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences will be 100 words long. 
from keras.layers import Input, Embedding, LSTM, Dense from keras.models import Model # Headline input: meant to receive sequences of 100 integers, between 1 and 10000. # Note that we can name any layer by passing it a \"name\" argument. main_input = Input(shape=(100,), dtype=\\'int32\\', name=\\'main_input\\') # This embedding layer will encode the input sequence # into a sequence of dense 512-dimensional vectors. x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input) # An LSTM will transform the vector sequence into a single vector, # containing information about the entire sequence lstm_out = LSTM(32)(x) Here we insert the auxiliary loss, allowing the LSTM and Embedding layer to be trained smoothly even though the main loss will be much higher in the model. auxiliary_output = Dense(1, activation=\\'sigmoid\\', name=\\'aux_output\\')(lstm_out) At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output: auxiliary_input = Input(shape=(5,), name=\\'aux_input\\') x = keras.layers.concatenate([lstm_out, auxiliary_input]) # We stack a deep densely-connected network on top x = Dense(64, activation=\\'relu\\')(x) x = Dense(64, activation=\\'relu\\')(x) x = Dense(64, activation=\\'relu\\')(x) # And finally we add the main logistic regression layer main_output = Dense(1, activation=\\'sigmoid\\', name=\\'main_output\\')(x) This defines a model with two inputs and two outputs: model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output]) We compile the model and assign a weight of 0.2 to the auxiliary loss. To specify different loss_weights or loss for each different output, you can use a list or a dictionary. Here we pass a single loss as the loss argument, so the same loss will be used on all outputs. model.compile(optimizer=\\'rmsprop\\', loss=\\'binary_crossentropy\\', loss_weights=[1., 0.2]) We can train the model by passing it lists of input arrays and target arrays: model.fit([headline_data, additional_data], [labels, labels], epochs=50, batch_size=32) Since our inputs and outputs are named (we passed them a \"name\" argument), we could also have compiled the model via: model.compile(optimizer=\\'rmsprop\\', loss={\\'main_output\\': \\'binary_crossentropy\\', \\'aux_output\\': \\'binary_crossentropy\\'}, loss_weights={\\'main_output\\': 1., \\'aux_output\\': 0.2}) # And trained it via: model.fit({\\'main_input\\': headline_data, \\'aux_input\\': additional_data}, {\\'main_output\\': labels, \\'aux_output\\': labels}, epochs=50, batch_size=32)'), ('title', 'Multi-input and multi-output models')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#shared-layers'), ('text', \"Another good use for the functional API is models that use shared layers. Let's take a look at shared layers. Let's consider a dataset of tweets. We want to build a model that can tell whether two tweets are from the same person or not (this can allow us to compare users by the similarity of their tweets, for instance). One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors and then adds a logistic regression; this outputs a probability that the two tweets share the same author. The model would then be trained on positive tweet pairs and negative tweet pairs. Because the problem is symmetric, the mechanism that encodes the first tweet should be reused (weights and all) to encode the second tweet. Here we use a shared LSTM layer to encode the tweets.
Let's build this with the functional API. We will take as input for a tweet a binary matrix of shape (280, 256) , i.e. a sequence of 280 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence/absence of a character (out of an alphabet of 256 frequent characters). import keras from keras.layers import Input, LSTM, Dense from keras.models import Model tweet_a = Input(shape=(280, 256)) tweet_b = Input(shape=(280, 256)) To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want: # This layer can take as input a matrix # and will return a vector of size 64 shared_lstm = LSTM(64) # When we reuse the same layer instance # multiple times, the weights of the layer # are also being reused # (it is effectively *the same* layer) encoded_a = shared_lstm(tweet_a) encoded_b = shared_lstm(tweet_b) # We can then concatenate the two vectors: merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1) # And add a logistic regression on top predictions = Dense(1, activation='sigmoid')(merged_vector) # We define a trainable model linking the # tweet inputs to the predictions model = Model(inputs=[tweet_a, tweet_b], outputs=predictions) model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy']) model.fit([data_a, data_b], labels, epochs=10) Let's pause to take a look at how to read the shared layer's output or output shape.\"), ('title', 'Shared layers')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#the-concept-of-layer-node'), ('text', 'Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a \"node\" to the layer, linking the input tensor to the output tensor. When you are calling the same layer multiple times, that layer owns multiple nodes indexed as 0, 1, 2... In previous versions of Keras, you could obtain the output tensor of a layer instance via layer.get_output() , or its output shape via layer.output_shape . You still can (except get_output() has been replaced by the property output ). But what if a layer is connected to multiple inputs? As long as a layer is only connected to one input, there is no confusion, and .output will return the one output of the layer: a = Input(shape=(280, 256)) lstm = LSTM(32) encoded_a = lstm(a) assert lstm.output == encoded_a Not so if the layer has multiple inputs: a = Input(shape=(280, 256)) b = Input(shape=(280, 256)) lstm = LSTM(32) encoded_a = lstm(a) encoded_b = lstm(b) lstm.output >> AttributeError: Layer lstm_1 has multiple inbound nodes, hence the notion of \"layer output\" is ill-defined. Use `get_output_at(node_index)` instead. Okay then. The following works: assert lstm.get_output_at(0) == encoded_a assert lstm.get_output_at(1) == encoded_b Simple enough, right? The same is true for the properties input_shape and output_shape : as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of \"layer output/input shape\" is well defined, and that one shape will be returned by layer.output_shape / layer.input_shape . 
But if, for instance, you apply the same Conv2D layer to an input of shape (32, 32, 3) , and then to an input of shape (64, 64, 3) , the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to: a = Input(shape=(32, 32, 3)) b = Input(shape=(64, 64, 3)) conv = Conv2D(16, (3, 3), padding=\\'same\\') conved_a = conv(a) # Only one input so far, the following will work: assert conv.input_shape == (None, 32, 32, 3) conved_b = conv(b) # now the `.input_shape` property wouldn\\'t work, but this does: assert conv.get_input_shape_at(0) == (None, 32, 32, 3) assert conv.get_input_shape_at(1) == (None, 64, 64, 3)'), ('title', 'The concept of layer \"node\"')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#more-examples'), ('text', 'Code examples are still the best way to get started, so here are a few more.'), ('title', 'More examples')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#inception-module'), ('text', \"For more information about the Inception architecture, see Going Deeper with Convolutions . from keras.layers import Conv2D, MaxPooling2D, Input input_img = Input(shape=(256, 256, 3)) tower_1 = Conv2D(64, (1, 1), padding='same', activation='relu')(input_img) tower_1 = Conv2D(64, (3, 3), padding='same', activation='relu')(tower_1) tower_2 = Conv2D(64, (1, 1), padding='same', activation='relu')(input_img) tower_2 = Conv2D(64, (5, 5), padding='same', activation='relu')(tower_2) tower_3 = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(input_img) tower_3 = Conv2D(64, (1, 1), padding='same', activation='relu')(tower_3) output = keras.layers.concatenate([tower_1, tower_2, tower_3], axis=1)\"), ('title', 'Inception module')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#residual-connection-on-a-convolution-layer'), ('text', \"For more information about residual networks, see Deep Residual Learning for Image Recognition . from keras.layers import Conv2D, Input # input tensor for a 3-channel 256x256 image x = Input(shape=(256, 256, 3)) # 3x3 conv with 3 output channels (same as input channels) y = Conv2D(3, (3, 3), padding='same')(x) # this returns x + y. z = keras.layers.add([x, y])\"), ('title', 'Residual connection on a convolution layer')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#shared-vision-model'), ('text', \"This model reuses the same image-processing module on two inputs, to classify whether two MNIST digits are the same digit or different digits. from keras.layers import Conv2D, MaxPooling2D, Input, Dense, Flatten from keras.models import Model # First, define the vision modules digit_input = Input(shape=(27, 27, 1)) x = Conv2D(64, (3, 3))(digit_input) x = Conv2D(64, (3, 3))(x) x = MaxPooling2D((2, 2))(x) out = Flatten()(x) vision_model = Model(digit_input, out) # Then define the tell-digits-apart model digit_a = Input(shape=(27, 27, 1)) digit_b = Input(shape=(27, 27, 1)) # The vision model will be shared, weights and all out_a = vision_model(digit_a) out_b = vision_model(digit_b) concatenated = keras.layers.concatenate([out_a, out_b]) out = Dense(1, activation='sigmoid')(concatenated) classification_model = Model([digit_a, digit_b], out)\"), ('title', 'Shared vision model')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#visual-question-answering-model'), ('text', \"This model can select the correct one-word answer when asked a natural-language question about a picture. 
It works by encoding the question into a vector, encoding the image into a vector, concatenating the two, and training on top a logistic regression over some vocabulary of potential answers. from keras.layers import Conv2D, MaxPooling2D, Flatten from keras.layers import Input, LSTM, Embedding, Dense from keras.models import Model, Sequential # First, let's define a vision model using a Sequential model. # This model will encode an image into a vector. vision_model = Sequential() vision_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3))) vision_model.add(Conv2D(64, (3, 3), activation='relu')) vision_model.add(MaxPooling2D((2, 2))) vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same')) vision_model.add(Conv2D(128, (3, 3), activation='relu')) vision_model.add(MaxPooling2D((2, 2))) vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same')) vision_model.add(Conv2D(256, (3, 3), activation='relu')) vision_model.add(Conv2D(256, (3, 3), activation='relu')) vision_model.add(MaxPooling2D((2, 2))) vision_model.add(Flatten()) # Now let's get a tensor with the output of our vision model: image_input = Input(shape=(224, 224, 3)) encoded_image = vision_model(image_input) # Next, let's define a language model to encode the question into a vector. # Each question will be at most 100 words long, # and we will index words as integers from 1 to 9999. question_input = Input(shape=(100,), dtype='int32') embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input) encoded_question = LSTM(256)(embedded_question) # Let's concatenate the question vector and the image vector: merged = keras.layers.concatenate([encoded_question, encoded_image]) # And let's train a logistic regression over 1000 words on top: output = Dense(1000, activation='softmax')(merged) # This is our final model: vqa_model = Model(inputs=[image_input, question_input], outputs=output) # The next stage would be training this model on actual data.\"), ('title', 'Visual question answering model')]), OrderedDict([('location', 'getting-started/functional-api-guide.html#video-question-answering-model'), ('text', 'Now that we have trained our image QA model, we can quickly turn it into a video QA model. With appropriate training, you will be able to show it a short video (e.g. 100-frame human action) and ask a natural language question about the video (e.g. \"what sport is the boy playing?\" -> \"football\").
from keras.layers import TimeDistributed video_input = Input(shape=(100, 224, 224, 3)) # This is our video encoded via the previously trained vision_model (weights are reused) encoded_frame_sequence = TimeDistributed(vision_model)(video_input) # the output will be a sequence of vectors encoded_video = LSTM(256)(encoded_frame_sequence) # the output will be a vector # This is a model-level representation of the question encoder, reusing the same weights as before: question_encoder = Model(inputs=question_input, outputs=encoded_question) # Let\\'s use it to encode the question: video_question_input = Input(shape=(100,), dtype=\\'int32\\') encoded_video_question = question_encoder(video_question_input) # And this is our video question answering model: merged = keras.layers.concatenate([encoded_video, encoded_video_question]) output = Dense(1000, activation=\\'softmax\\')(merged) video_qa_model = Model(inputs=[video_input, video_question_input], outputs=output)'), ('title', 'Video question answering model')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html'), ('text', 'Getting started with the Keras Sequential model The Sequential model is a linear stack of layers. You can create a Sequential model by passing a list of layer instances to the constructor: from keras.models import Sequential from keras.layers import Dense, Activation model = Sequential([ Dense(32, input_shape=(784,)), Activation(\\'relu\\'), Dense(10), Activation(\\'softmax\\'), ]) You can also simply add layers via the .add() method: model = Sequential() model.add(Dense(32, input_dim=784)) model.add(Activation(\\'relu\\')) Specifying the input shape The model needs to know what input shape it should expect. For this reason, the first layer in a Sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this: Pass an input_shape argument to the first layer. This is a shape tuple (a tuple of integers or None entries, where None indicates that any positive integer may be expected). In input_shape , the batch dimension is not included. Some 2D layers, such as Dense , support the specification of their input shape via the argument input_dim , and some 3D temporal layers support the arguments input_dim and input_length . If you ever need to specify a fixed batch size for your inputs (this is useful for stateful recurrent networks), you can pass a batch_size argument to a layer. If you pass both batch_size=32 and input_shape=(6, 8) to a layer, it will then expect every batch of inputs to have the batch shape (32, 6, 8) . As such, the following snippets are strictly equivalent: model = Sequential() model.add(Dense(32, input_shape=(784,))) model = Sequential() model.add(Dense(32, input_dim=784)) Compilation Before training a model, you need to configure the learning process, which is done via the compile method. It receives three arguments: An optimizer. This could be the string identifier of an existing optimizer (such as rmsprop or adagrad ), or an instance of the Optimizer class. See: optimizers . A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as categorical_crossentropy or mse ), or it can be an objective function. See: losses . A list of metrics. For any classification problem you will want to set this to metrics=[\\'accuracy\\'] . 
A metric could be the string identifier of an existing metric or a custom metric function. # For a multi-class classification problem model.compile(optimizer=\\'rmsprop\\', loss=\\'categorical_crossentropy\\', metrics=[\\'accuracy\\']) # For a binary classification problem model.compile(optimizer=\\'rmsprop\\', loss=\\'binary_crossentropy\\', metrics=[\\'accuracy\\']) # For a mean squared error regression problem model.compile(optimizer=\\'rmsprop\\', loss=\\'mse\\') # For custom metrics import keras.backend as K def mean_pred(y_true, y_pred): return K.mean(y_pred) model.compile(optimizer=\\'rmsprop\\', loss=\\'binary_crossentropy\\', metrics=[\\'accuracy\\', mean_pred]) Training Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the fit function. Read its documentation here . # For a single-input model with 2 classes (binary classification): model = Sequential() model.add(Dense(32, activation=\\'relu\\', input_dim=100)) model.add(Dense(1, activation=\\'sigmoid\\')) model.compile(optimizer=\\'rmsprop\\', loss=\\'binary_crossentropy\\', metrics=[\\'accuracy\\']) # Generate dummy data import numpy as np data = np.random.random((1000, 100)) labels = np.random.randint(2, size=(1000, 1)) # Train the model, iterating on the data in batches of 32 samples model.fit(data, labels, epochs=10, batch_size=32) # For a single-input model with 10 classes (categorical classification): model = Sequential() model.add(Dense(32, activation=\\'relu\\', input_dim=100)) model.add(Dense(10, activation=\\'softmax\\')) model.compile(optimizer=\\'rmsprop\\', loss=\\'categorical_crossentropy\\', metrics=[\\'accuracy\\']) # Generate dummy data import numpy as np data = np.random.random((1000, 100)) labels = np.random.randint(10, size=(1000, 1)) # Convert labels to categorical one-hot encoding one_hot_labels = keras.utils.to_categorical(labels, num_classes=10) # Train the model, iterating on the data in batches of 32 samples model.fit(data, one_hot_labels, epochs=10, batch_size=32) Examples Here are a few examples to get you started! In the examples folder , you will also find example models for real datasets: CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation IMDB movie review sentiment classification: LSTM over sequences of words Reuters newswires topic classification: Multilayer Perceptron (MLP) MNIST handwritten digits classification: MLP & CNN Character-level text generation with LSTM ...and more. Multilayer Perceptron (MLP) for multi-class softmax classification: import keras from keras.models import Sequential from keras.layers import Dense, Dropout, Activation from keras.optimizers import SGD # Generate dummy data import numpy as np x_train = np.random.random((1000, 20)) y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10) x_test = np.random.random((100, 20)) y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10) model = Sequential() # Dense(64) is a fully-connected layer with 64 hidden units. # in the first layer, you must specify the expected input data shape: # here, 20-dimensional vectors. 
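# The Dropout(0.5) calls below randomly zero half of the previous layer\\'s
# activations at each training step (a regularizer); they are no-ops at test time.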
model.add(Dense(64, activation=\\'relu\\', input_dim=20)) model.add(Dropout(0.5)) model.add(Dense(64, activation=\\'relu\\')) model.add(Dropout(0.5)) model.add(Dense(10, activation=\\'softmax\\')) sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True) model.compile(loss=\\'categorical_crossentropy\\', optimizer=sgd, metrics=[\\'accuracy\\']) model.fit(x_train, y_train, epochs=20, batch_size=128) score = model.evaluate(x_test, y_test, batch_size=128) MLP for binary classification: import numpy as np from keras.models import Sequential from keras.layers import Dense, Dropout # Generate dummy data x_train = np.random.random((1000, 20)) y_train = np.random.randint(2, size=(1000, 1)) x_test = np.random.random((100, 20)) y_test = np.random.randint(2, size=(100, 1)) model = Sequential() model.add(Dense(64, input_dim=20, activation=\\'relu\\')) model.add(Dropout(0.5)) model.add(Dense(64, activation=\\'relu\\')) model.add(Dropout(0.5)) model.add(Dense(1, activation=\\'sigmoid\\')) model.compile(loss=\\'binary_crossentropy\\', optimizer=\\'rmsprop\\', metrics=[\\'accuracy\\']) model.fit(x_train, y_train, epochs=20, batch_size=128) score = model.evaluate(x_test, y_test, batch_size=128) VGG-like convnet: import numpy as np import keras from keras.models import Sequential from keras.layers import Dense, Dropout, Flatten from keras.layers import Conv2D, MaxPooling2D from keras.optimizers import SGD # Generate dummy data x_train = np.random.random((100, 100, 100, 3)) y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10) x_test = np.random.random((20, 100, 100, 3)) y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10) model = Sequential() # input: 100x100 images with 3 channels -> (100, 100, 3) tensors. # this applies 32 convolution filters of size 3x3 each. 
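# The Conv2D calls below leave padding at its default (\\'valid\\'), so each 3x3
# convolution trims one pixel from every border: 100x100 -> 98x98 -> 96x96 before pooling.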
model.add(Conv2D(32, (3, 3), activation=\\'relu\\', input_shape=(100, 100, 3))) model.add(Conv2D(32, (3, 3), activation=\\'relu\\')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Conv2D(64, (3, 3), activation=\\'relu\\')) model.add(Conv2D(64, (3, 3), activation=\\'relu\\')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Flatten()) model.add(Dense(256, activation=\\'relu\\')) model.add(Dropout(0.5)) model.add(Dense(10, activation=\\'softmax\\')) sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True) model.compile(loss=\\'categorical_crossentropy\\', optimizer=sgd) model.fit(x_train, y_train, batch_size=32, epochs=10) score = model.evaluate(x_test, y_test, batch_size=32) Sequence classification with LSTM: from keras.models import Sequential from keras.layers import Dense, Dropout from keras.layers import Embedding from keras.layers import LSTM max_features = 1024 model = Sequential() model.add(Embedding(max_features, output_dim=256)) model.add(LSTM(128)) model.add(Dropout(0.5)) model.add(Dense(1, activation=\\'sigmoid\\')) model.compile(loss=\\'binary_crossentropy\\', optimizer=\\'rmsprop\\', metrics=[\\'accuracy\\']) model.fit(x_train, y_train, batch_size=16, epochs=10) score = model.evaluate(x_test, y_test, batch_size=16) Sequence classification with 1D convolutions: from keras.models import Sequential from keras.layers import Dense, Dropout from keras.layers import Embedding from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D seq_length = 64 model = Sequential() model.add(Conv1D(64, 3, activation=\\'relu\\', input_shape=(seq_length, 100))) model.add(Conv1D(64, 3, activation=\\'relu\\')) model.add(MaxPooling1D(3)) model.add(Conv1D(128, 3, activation=\\'relu\\')) model.add(Conv1D(128, 3, activation=\\'relu\\')) model.add(GlobalAveragePooling1D()) model.add(Dropout(0.5)) model.add(Dense(1, activation=\\'sigmoid\\')) model.compile(loss=\\'binary_crossentropy\\', optimizer=\\'rmsprop\\', metrics=[\\'accuracy\\']) model.fit(x_train, y_train, batch_size=16, epochs=10) score = model.evaluate(x_test, y_test, batch_size=16) Stacked LSTM for sequence classification In this model, we stack 3 LSTM layers on top of each other, making the model capable of learning higher-level temporal representations. The first two LSTMs return their full output sequences, but the last one only returns the last step in its output sequence, thus dropping the temporal dimension (i.e. converting the input sequence into a single vector). 
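# Note: return_sequences=True makes an LSTM emit one 32-d vector per timestep
# (shape (batch, timesteps, 32)) so the next LSTM can consume a sequence; the
# final LSTM omits it and returns a single 32-d vector per sample.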
from keras.models import Sequential from keras.layers import LSTM, Dense import numpy as np data_dim = 16 timesteps = 8 num_classes = 10 # expected input data shape: (batch_size, timesteps, data_dim) model = Sequential() model.add(LSTM(32, return_sequences=True, input_shape=(timesteps, data_dim))) # returns a sequence of vectors of dimension 32 model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32 model.add(LSTM(32)) # returns a single vector of dimension 32 model.add(Dense(10, activation=\\'softmax\\')) model.compile(loss=\\'categorical_crossentropy\\', optimizer=\\'rmsprop\\', metrics=[\\'accuracy\\']) # Generate dummy training data x_train = np.random.random((1000, timesteps, data_dim)) y_train = np.random.random((1000, num_classes)) # Generate dummy validation data x_val = np.random.random((100, timesteps, data_dim)) y_val = np.random.random((100, num_classes)) model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_val, y_val)) Same stacked LSTM model, rendered \"stateful\" A stateful recurrent model is one for which the internal states (memories) obtained after processing a batch of samples are reused as initial states for the samples of the next batch. This allows you to process longer sequences while keeping computational complexity manageable. You can read more about stateful RNNs in the FAQ. from keras.models import Sequential from keras.layers import LSTM, Dense import numpy as np data_dim = 16 timesteps = 8 num_classes = 10 batch_size = 32 # Expected input batch shape: (batch_size, timesteps, data_dim) # Note that we have to provide the full batch_input_shape since the network is stateful. # the sample of index i in batch k is the follow-up for the sample i in batch k-1. model = Sequential() model.add(LSTM(32, return_sequences=True, stateful=True, batch_input_shape=(batch_size, timesteps, data_dim))) model.add(LSTM(32, return_sequences=True, stateful=True)) model.add(LSTM(32, stateful=True)) model.add(Dense(10, activation=\\'softmax\\')) model.compile(loss=\\'categorical_crossentropy\\', optimizer=\\'rmsprop\\', metrics=[\\'accuracy\\']) # Generate dummy training data x_train = np.random.random((batch_size * 10, timesteps, data_dim)) y_train = np.random.random((batch_size * 10, num_classes)) # Generate dummy validation data x_val = np.random.random((batch_size * 3, timesteps, data_dim)) y_val = np.random.random((batch_size * 3, num_classes)) model.fit(x_train, y_train, batch_size=batch_size, epochs=5, shuffle=False, validation_data=(x_val, y_val))'), ('title', 'Guide to the Sequential model')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#getting-started-with-the-keras-sequential-model'), ('text', \"The Sequential model is a linear stack of layers. You can create a Sequential model by passing a list of layer instances to the constructor: from keras.models import Sequential from keras.layers import Dense, Activation model = Sequential([ Dense(32, input_shape=(784,)), Activation('relu'), Dense(10), Activation('softmax'), ]) You can also simply add layers via the .add() method: model = Sequential() model.add(Dense(32, input_dim=784)) model.add(Activation('relu'))\"), ('title', 'Getting started with the Keras Sequential model')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#specifying-the-input-shape'), ('text', 'The model needs to know what input shape it should expect.
For this reason, the first layer in a Sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape. There are several possible ways to do this: Pass an input_shape argument to the first layer. This is a shape tuple (a tuple of integers or None entries, where None indicates that any positive integer may be expected). In input_shape , the batch dimension is not included. Some 2D layers, such as Dense , support the specification of their input shape via the argument input_dim , and some 3D temporal layers support the arguments input_dim and input_length . If you ever need to specify a fixed batch size for your inputs (this is useful for stateful recurrent networks), you can pass a batch_size argument to a layer. If you pass both batch_size=32 and input_shape=(6, 8) to a layer, it will then expect every batch of inputs to have the batch shape (32, 6, 8) . As such, the following snippets are strictly equivalent: model = Sequential() model.add(Dense(32, input_shape=(784,))) model = Sequential() model.add(Dense(32, input_dim=784))'), ('title', 'Specifying the input shape')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#compilation'), ('text', \"Before training a model, you need to configure the learning process, which is done via the compile method. It receives three arguments: An optimizer. This could be the string identifier of an existing optimizer (such as rmsprop or adagrad ), or an instance of the Optimizer class. See: optimizers . A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as categorical_crossentropy or mse ), or it can be an objective function. See: losses . A list of metrics. For any classification problem you will want to set this to metrics=['accuracy'] . A metric could be the string identifier of an existing metric or a custom metric function. # For a multi-class classification problem model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy']) # For a binary classification problem model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy']) # For a mean squared error regression problem model.compile(optimizer='rmsprop', loss='mse') # For custom metrics import keras.backend as K def mean_pred(y_true, y_pred): return K.mean(y_pred) model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy', mean_pred])\"), ('title', 'Compilation')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#training'), ('text', \"Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the fit function. Read its documentation here . 
OrderedDict([('location', 'getting-started/sequential-model-guide.html#training'), ('text', "Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the fit function. Read its documentation here . # For a single-input model with 2 classes (binary classification): model = Sequential() model.add(Dense(32, activation='relu', input_dim=100)) model.add(Dense(1, activation='sigmoid')) model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy']) # Generate dummy data import numpy as np data = np.random.random((1000, 100)) labels = np.random.randint(2, size=(1000, 1)) # Train the model, iterating on the data in batches of 32 samples model.fit(data, labels, epochs=10, batch_size=32) # For a single-input model with 10 classes (categorical classification): model = Sequential() model.add(Dense(32, activation='relu', input_dim=100)) model.add(Dense(10, activation='softmax')) model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy']) # Generate dummy data import keras import numpy as np data = np.random.random((1000, 100)) labels = np.random.randint(10, size=(1000, 1)) # Convert labels to categorical one-hot encoding one_hot_labels = keras.utils.to_categorical(labels, num_classes=10) # Train the model, iterating on the data in batches of 32 samples model.fit(data, one_hot_labels, epochs=10, batch_size=32)"), ('title', 'Training')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#examples'), ('text', 'Here are a few examples to get you started! In the examples folder , you will also find example models for real datasets: CIFAR10 small images classification: Convolutional Neural Network (CNN) with realtime data augmentation IMDB movie review sentiment classification: LSTM over sequences of words Reuters newswires topic classification: Multilayer Perceptron (MLP) MNIST handwritten digits classification: MLP & CNN Character-level text generation with LSTM ...and more.'), ('title', 'Examples')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#multilayer-perceptron-mlp-for-multi-class-softmax-classification'), ('text', "import keras from keras.models import Sequential from keras.layers import Dense, Dropout, Activation from keras.optimizers import SGD # Generate dummy data import numpy as np x_train = np.random.random((1000, 20)) y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10) x_test = np.random.random((100, 20)) y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10) model = Sequential() # Dense(64) is a fully-connected layer with 64 hidden units. # in the first layer, you must specify the expected input data shape: # here, 20-dimensional vectors.
model.add(Dense(64, activation='relu', input_dim=20)) model.add(Dropout(0.5)) model.add(Dense(64, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(10, activation='softmax')) sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True) model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy']) model.fit(x_train, y_train, epochs=20, batch_size=128) score = model.evaluate(x_test, y_test, batch_size=128)\"), ('title', 'Multilayer Perceptron (MLP) for multi-class softmax classification:')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#mlp-for-binary-classification'), ('text', \"import numpy as np from keras.models import Sequential from keras.layers import Dense, Dropout # Generate dummy data x_train = np.random.random((1000, 20)) y_train = np.random.randint(2, size=(1000, 1)) x_test = np.random.random((100, 20)) y_test = np.random.randint(2, size=(100, 1)) model = Sequential() model.add(Dense(64, input_dim=20, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(64, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(1, activation='sigmoid')) model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy']) model.fit(x_train, y_train, epochs=20, batch_size=128) score = model.evaluate(x_test, y_test, batch_size=128)\"), ('title', 'MLP for binary classification:')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#vgg-like-convnet'), ('text', \"import numpy as np import keras from keras.models import Sequential from keras.layers import Dense, Dropout, Flatten from keras.layers import Conv2D, MaxPooling2D from keras.optimizers import SGD # Generate dummy data x_train = np.random.random((100, 100, 100, 3)) y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10) x_test = np.random.random((20, 100, 100, 3)) y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10) model = Sequential() # input: 100x100 images with 3 channels -> (100, 100, 3) tensors. # this applies 32 convolution filters of size 3x3 each. 
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3))) model.add(Conv2D(32, (3, 3), activation='relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Conv2D(64, (3, 3), activation='relu')) model.add(Conv2D(64, (3, 3), activation='relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Flatten()) model.add(Dense(256, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(10, activation='softmax')) sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True) model.compile(loss='categorical_crossentropy', optimizer=sgd) model.fit(x_train, y_train, batch_size=32, epochs=10) score = model.evaluate(x_test, y_test, batch_size=32)\"), ('title', 'VGG-like convnet:')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#sequence-classification-with-lstm'), ('text', \"from keras.models import Sequential from keras.layers import Dense, Dropout from keras.layers import Embedding from keras.layers import LSTM max_features = 1024 model = Sequential() model.add(Embedding(max_features, output_dim=256)) model.add(LSTM(128)) model.add(Dropout(0.5)) model.add(Dense(1, activation='sigmoid')) model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy']) model.fit(x_train, y_train, batch_size=16, epochs=10) score = model.evaluate(x_test, y_test, batch_size=16)\"), ('title', 'Sequence classification with LSTM:')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#sequence-classification-with-1d-convolutions'), ('text', \"from keras.models import Sequential from keras.layers import Dense, Dropout from keras.layers import Embedding from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D seq_length = 64 model = Sequential() model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100))) model.add(Conv1D(64, 3, activation='relu')) model.add(MaxPooling1D(3)) model.add(Conv1D(128, 3, activation='relu')) model.add(Conv1D(128, 3, activation='relu')) model.add(GlobalAveragePooling1D()) model.add(Dropout(0.5)) model.add(Dense(1, activation='sigmoid')) model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy']) model.fit(x_train, y_train, batch_size=16, epochs=10) score = model.evaluate(x_test, y_test, batch_size=16)\"), ('title', 'Sequence classification with 1D convolutions:')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#stacked-lstm-for-sequence-classification'), ('text', \"In this model, we stack 3 LSTM layers on top of each other, making the model capable of learning higher-level temporal representations. The first two LSTMs return their full output sequences, but the last one only returns the last step in its output sequence, thus dropping the temporal dimension (i.e. converting the input sequence into a single vector). 
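Before the entry's full example, a minimal sketch (editor's addition, assuming Keras 2.2) of the shape effect just described: return_sequences=True keeps one output vector per timestep, while the default keeps only the last one.

from keras.models import Sequential
from keras.layers import LSTM

m = Sequential()
m.add(LSTM(32, return_sequences=True, input_shape=(8, 16)))
print(m.output_shape)  # (None, 8, 32): a 32-d vector per timestep
m.add(LSTM(32))        # return_sequences defaults to False
print(m.output_shape)  # (None, 32): only the last timestep remains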
from keras.models import Sequential from keras.layers import LSTM, Dense import numpy as np data_dim = 16 timesteps = 8 num_classes = 10 # expected input data shape: (batch_size, timesteps, data_dim) model = Sequential() model.add(LSTM(32, return_sequences=True, input_shape=(timesteps, data_dim))) # returns a sequence of vectors of dimension 32 model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32 model.add(LSTM(32)) # returns a single vector of dimension 32 model.add(Dense(10, activation='softmax')) model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy']) # Generate dummy training data x_train = np.random.random((1000, timesteps, data_dim)) y_train = np.random.random((1000, num_classes)) # Generate dummy validation data x_val = np.random.random((100, timesteps, data_dim)) y_val = np.random.random((100, num_classes)) model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_val, y_val))"), ('title', 'Stacked LSTM for sequence classification')]), OrderedDict([('location', 'getting-started/sequential-model-guide.html#same-stacked-lstm-model-rendered-stateful'), ('text', "A stateful recurrent model is one for which the internal states (memories) obtained after processing a batch of samples are reused as initial states for the samples of the next batch. This allows the network to process longer sequences while keeping computational complexity manageable. You can read more about stateful RNNs in the FAQ. from keras.models import Sequential from keras.layers import LSTM, Dense import numpy as np data_dim = 16 timesteps = 8 num_classes = 10 batch_size = 32 # Expected input batch shape: (batch_size, timesteps, data_dim) # Note that we have to provide the full batch_input_shape since the network is stateful. # The sample of index i in batch k is the follow-up for the sample i in batch k-1. model = Sequential() model.add(LSTM(32, return_sequences=True, stateful=True, batch_input_shape=(batch_size, timesteps, data_dim))) model.add(LSTM(32, return_sequences=True, stateful=True)) model.add(LSTM(32, stateful=True)) model.add(Dense(10, activation='softmax')) model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy']) # Generate dummy training data x_train = np.random.random((batch_size * 10, timesteps, data_dim)) y_train = np.random.random((batch_size * 10, num_classes)) # Generate dummy validation data x_val = np.random.random((batch_size * 3, timesteps, data_dim)) y_val = np.random.random((batch_size * 3, num_classes)) model.fit(x_train, y_train, batch_size=batch_size, epochs=5, shuffle=False, validation_data=(x_val, y_val))"), ('title', 'Same stacked LSTM model, rendered "stateful"')]), OrderedDict([('location', 'layers/about-keras-layers.html'), ('text', "About Keras layers All Keras layers have a number of methods in common: layer.get_weights() : returns the weights of the layer as a list of Numpy arrays. layer.set_weights(weights) : sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of get_weights ). layer.get_config() : returns a dictionary containing the configuration of the layer. The layer can be reinstantiated from its config via: layer = Dense(32) config = layer.get_config() reconstructed_layer = Dense.from_config(config) Or: from keras import layers config = layer.get_config() layer = layers.deserialize({'class_name': layer.__class__.__name__, 'config': config}) If a layer has a single node (i.e.
if it isn't a shared layer), you can get its input tensor, output tensor, input shape and output shape via: layer.input layer.output layer.input_shape layer.output_shape If the layer has multiple nodes (see: the concept of layer node and shared layers ), you can use the following methods: layer.get_input_at(node_index) layer.get_output_at(node_index) layer.get_input_shape_at(node_index) layer.get_output_shape_at(node_index)\"), ('title', 'About Keras layers')]), OrderedDict([('location', 'layers/about-keras-layers.html#about-keras-layers'), ('text', \"All Keras layers have a number of methods in common: layer.get_weights() : returns the weights of the layer as a list of Numpy arrays. layer.set_weights(weights) : sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of get_weights ). layer.get_config() : returns a dictionary containing the configuration of the layer. The layer can be reinstantiated from its config via: layer = Dense(32) config = layer.get_config() reconstructed_layer = Dense.from_config(config) Or: from keras import layers config = layer.get_config() layer = layers.deserialize({'class_name': layer.__class__.__name__, 'config': config}) If a layer has a single node (i.e. if it isn't a shared layer), you can get its input tensor, output tensor, input shape and output shape via: layer.input layer.output layer.input_shape layer.output_shape If the layer has multiple nodes (see: the concept of layer node and shared layers ), you can use the following methods: layer.get_input_at(node_index) layer.get_output_at(node_index) layer.get_input_shape_at(node_index) layer.get_output_shape_at(node_index)\"), ('title', 'About Keras layers')]), OrderedDict([('location', 'layers/advanced-activations.html'), ('text', \"[source] LeakyReLU keras.layers.LeakyReLU(alpha=0.3) Leaky version of a Rectified Linear Unit. It allows a small gradient when the unit is not active: f(x) = alpha * x for x < 0 , f(x) = x for x >= 0 . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments alpha : float >= 0. Negative slope coefficient. References [Rectifier Nonlinearities Improve Neural Network Acoustic Models] (https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf) [source] PReLU keras.layers.PReLU(alpha_initializer='zeros', alpha_regularizer=None, alpha_constraint=None, shared_axes=None) Parametric Rectified Linear Unit. It follows: f(x) = alpha * x for x < 0 , f(x) = x for x >= 0 , where alpha is a learned array with the same shape as x. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments alpha_initializer : initializer function for the weights. alpha_regularizer : regularizer for the weights. alpha_constraint : constraint for the weights. shared_axes : the axes along which to share learnable parameters for the activation function. For example, if the incoming feature maps are from a 2D convolution with output shape (batch, height, width, channels) , and you wish to share parameters across space so that each filter only has one set of parameters, set shared_axes=[1, 2] . 
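A short sketch (editor's addition) of the shared_axes argument just described: sharing the alphas across the two spatial axes of a convolutional feature map shrinks the parameter count from one alpha per output element to one per channel.

from keras.models import Sequential
from keras.layers import Conv2D, PReLU

m = Sequential()
m.add(Conv2D(16, (3, 3), input_shape=(32, 32, 3)))
m.add(PReLU())                    # alphas of shape (30, 30, 16): 14400 parameters
m.add(Conv2D(16, (3, 3)))
m.add(PReLU(shared_axes=[1, 2]))  # alphas shared over space: 16 parameters
for layer in m.layers:
    print(layer.name, layer.count_params())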
References Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification [source] ELU keras.layers.ELU(alpha=1.0) Exponential Linear Unit. It follows: f(x) = alpha * (exp(x) - 1.) for x < 0 , f(x) = x for x >= 0 . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments alpha : scale for the negative factor. References Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) [source] ThresholdedReLU keras.layers.ThresholdedReLU(theta=1.0) Thresholded Rectified Linear Unit. It follows: f(x) = x for x > theta , f(x) = 0 otherwise . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments theta : float >= 0. Threshold location of activation. References [Zero-Bias Autoencoders and the Benefits of Co-Adapting Features] (https://arxiv.org/abs/1402.3337) [source] Softmax keras.layers.Softmax(axis=-1) Softmax activation function. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments axis : Integer, axis along which the softmax normalization is applied. [source] ReLU keras.layers.ReLU(max_value=None, negative_slope=0.0, threshold=0.0) Rectified Linear Unit activation function. With default values, it returns element-wise max(x, 0) . Otherwise, it follows: f(x) = max_value for x >= max_value , f(x) = x for threshold <= x < max_value , f(x) = negative_slope * (x - threshold) otherwise. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments max_value : float >= 0. Maximum activation value. negative_slope : float >= 0. Negative slope coefficient. threshold : float. Threshold value for thresholded activation.\"), ('title', 'Advanced Activations Layers')]), OrderedDict([('location', 'layers/advanced-activations.html#leakyrelu'), ('text', 'keras.layers.LeakyReLU(alpha=0.3) Leaky version of a Rectified Linear Unit. It allows a small gradient when the unit is not active: f(x) = alpha * x for x < 0 , f(x) = x for x >= 0 . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments alpha : float >= 0. Negative slope coefficient. References [Rectifier Nonlinearities Improve Neural Network Acoustic Models] (https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf) [source]'), ('title', 'LeakyReLU')]), OrderedDict([('location', 'layers/advanced-activations.html#prelu'), ('text', \"keras.layers.PReLU(alpha_initializer='zeros', alpha_regularizer=None, alpha_constraint=None, shared_axes=None) Parametric Rectified Linear Unit. It follows: f(x) = alpha * x for x < 0 , f(x) = x for x >= 0 , where alpha is a learned array with the same shape as x. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. 
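The piecewise ReLU definition quoted above (max_value, negative_slope, threshold) can be checked with plain NumPy; relu_reference is a hypothetical helper written for this sketch, not part of the Keras API.

import numpy as np

def relu_reference(x, max_value=None, negative_slope=0.0, threshold=0.0):
    x = np.asarray(x, dtype=float)
    # negative_slope * (x - threshold) below the threshold, identity above it
    out = np.where(x >= threshold, x, negative_slope * (x - threshold))
    if max_value is not None:
        out = np.minimum(out, max_value)  # saturate at max_value
    return out

print(relu_reference([-2.0, 0.5, 3.0], max_value=2.0, negative_slope=0.1))
# [-0.2  0.5  2. ]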
Arguments alpha_initializer : initializer function for the weights. alpha_regularizer : regularizer for the weights. alpha_constraint : constraint for the weights. shared_axes : the axes along which to share learnable parameters for the activation function. For example, if the incoming feature maps are from a 2D convolution with output shape (batch, height, width, channels) , and you wish to share parameters across space so that each filter only has one set of parameters, set shared_axes=[1, 2] . References Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification [source]\"), ('title', 'PReLU')]), OrderedDict([('location', 'layers/advanced-activations.html#elu'), ('text', 'keras.layers.ELU(alpha=1.0) Exponential Linear Unit. It follows: f(x) = alpha * (exp(x) - 1.) for x < 0 , f(x) = x for x >= 0 . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments alpha : scale for the negative factor. References Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) [source]'), ('title', 'ELU')]), OrderedDict([('location', 'layers/advanced-activations.html#thresholdedrelu'), ('text', 'keras.layers.ThresholdedReLU(theta=1.0) Thresholded Rectified Linear Unit. It follows: f(x) = x for x > theta , f(x) = 0 otherwise . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments theta : float >= 0. Threshold location of activation. References [Zero-Bias Autoencoders and the Benefits of Co-Adapting Features] (https://arxiv.org/abs/1402.3337) [source]'), ('title', 'ThresholdedReLU')]), OrderedDict([('location', 'layers/advanced-activations.html#softmax'), ('text', 'keras.layers.Softmax(axis=-1) Softmax activation function. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments axis : Integer, axis along which the softmax normalization is applied. [source]'), ('title', 'Softmax')]), OrderedDict([('location', 'layers/advanced-activations.html#relu'), ('text', 'keras.layers.ReLU(max_value=None, negative_slope=0.0, threshold=0.0) Rectified Linear Unit activation function. With default values, it returns element-wise max(x, 0) . Otherwise, it follows: f(x) = max_value for x >= max_value , f(x) = x for threshold <= x < max_value , f(x) = negative_slope * (x - threshold) otherwise. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as the input. Arguments max_value : float >= 0. Maximum activation value. negative_slope : float >= 0. Negative slope coefficient. threshold : float. 
Threshold value for thresholded activation.'), ('title', 'ReLU')]), OrderedDict([('location', 'layers/convolutional.html'), ('text', '[source] Conv1D keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid', data_format='channels_last', dilation_rate=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 1D convolution layer (e.g. temporal convolution). This layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None , it is applied to the outputs as well. When using this layer as the first layer in a model, provide an input_shape argument (tuple of integers or None , e.g. (10, 128) for sequences of 10 128-dimensional vectors, or (None, 128) for variable-length sequences of 128-dimensional vectors). Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of a single integer, specifying the length of the 1D convolution window. strides : An integer or tuple/list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : One of "valid" , "causal" or "same" (case-insensitive). "valid" means "no padding". "same" results in padding the input such that the output has the same length as the original input. "causal" results in causal (dilated) convolutions, e.g. output[t] does not depend on input[t + 1:] . A zero padding is used such that the output has the same length as the original input. Useful when modeling temporal data where the model should not violate the temporal order. See WaveNet: A Generative Model for Raw Audio, section 2.1 (https://arxiv.org/abs/1609.03499). data_format : A string, one of "channels_last" (default) or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, steps, channels) (default format for temporal data in Keras) while "channels_first" corresponds to inputs with shape (batch, channels, steps) . dilation_rate : an integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. activation : Activation function to use (see activations ). If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its "activation") (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ).
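A minimal sketch (editor's addition) of the "causal" padding described above, stacking dilated causal convolutions WaveNet-style: the steps dimension is preserved and output[t] never sees input[t + 1:].

from keras.models import Sequential
from keras.layers import Conv1D

m = Sequential()
m.add(Conv1D(16, 3, padding='causal', dilation_rate=1, input_shape=(None, 8)))
m.add(Conv1D(16, 3, padding='causal', dilation_rate=2))
m.add(Conv1D(16, 3, padding='causal', dilation_rate=4))
print(m.output_shape)  # (None, None, 16): variable-length sequences, length preserved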
Input shape 3D tensor with shape: (batch, steps, channels) Output shape 3D tensor with shape: (batch, new_steps, filters) steps value might have changed due to padding or strides. [source] Conv2D keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 2D convolution layer (e.g. spatial convolution over images). This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None , it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). Note that \"same\" is slightly inconsistent across backends with strides != 1, as described here data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). 
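A small sketch (editor's addition) of how padding and strides change the new_rows / new_cols values mentioned in the Output shape description that follows:

from keras.models import Sequential
from keras.layers import Conv2D

m = Sequential()
m.add(Conv2D(8, (3, 3), padding='valid', input_shape=(28, 28, 1)))
print(m.output_shape)  # (None, 26, 26, 8): "valid" trims kernel_size - 1
m.add(Conv2D(8, (3, 3), padding='same', strides=(2, 2)))
print(m.output_shape)  # (None, 13, 13, 8): "same" with stride 2 halves rows and cols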
Input shape 4D tensor with shape: (batch, channels, rows, cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is \"channels_last\" . Output shape 4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, new_rows, new_cols, filters) if data_format is \"channels_last\" . rows and cols values might have changed due to padding. [source] SeparableConv1D keras.layers.SeparableConv1D(filters, kernel_size, strides=1, padding=\\'valid\\', data_format=\\'channels_last\\', dilation_rate=1, depth_multiplier=1, activation=None, use_bias=True, depthwise_initializer=\\'glorot_uniform\\', pointwise_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None, depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None) Depthwise separable 1D convolution. Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step. Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block. Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of single integer, specifying the length of the 1D convolution window. strides : An integer or tuple/list of single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, steps, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, steps) . dilation_rate : An integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. depth_multiplier : The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to filters_in * depth_multiplier . activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. depthwise_initializer : Initializer for the depthwise kernel matrix (see initializers ). pointwise_initializer : Initializer for the pointwise kernel matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). depthwise_regularizer : Regularizer function applied to the depthwise kernel matrix (see regularizer ). pointwise_regularizer : Regularizer function applied to the pointwise kernel matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). 
activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). depthwise_constraint : Constraint function applied to the depthwise kernel matrix (see constraints ). pointwise_constraint : Constraint function applied to the pointwise kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 3D tensor with shape: (batch, channels, steps) if data_format is \"channels_first\" or 3D tensor with shape: (batch, steps, channels) if data_format is \"channels_last\" . Output shape 3D tensor with shape: (batch, filters, new_steps) if data_format is \"channels_first\" or 3D tensor with shape: (batch, new_steps, filters) if data_format is \"channels_last\" . new_steps values might have changed due to padding or strides. [source] SeparableConv2D keras.layers.SeparableConv2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1), depth_multiplier=1, activation=None, use_bias=True, depthwise_initializer=\\'glorot_uniform\\', pointwise_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None, depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None) Depthwise separable 2D convolution. Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step. Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block. Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. depth_multiplier : The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to filters_in * depth_multiplier . activation : Activation function to use (see activations ). 
If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. depthwise_initializer : Initializer for the depthwise kernel matrix (see initializers ). pointwise_initializer : Initializer for the pointwise kernel matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). depthwise_regularizer : Regularizer function applied to the depthwise kernel matrix (see regularizer ). pointwise_regularizer : Regularizer function applied to the pointwise kernel matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). depthwise_constraint : Constraint function applied to the depthwise kernel matrix (see constraints ). pointwise_constraint : Constraint function applied to the pointwise kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (batch, channels, rows, cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is \"channels_last\" . Output shape 4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, new_rows, new_cols, filters) if data_format is \"channels_last\" . rows and cols values might have changed due to padding. [source] Conv2DTranspose keras.layers.Conv2DTranspose(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', output_padding=None, data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Transposed convolution layer (sometimes called Deconvolution). The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). output_padding : An integer or tuple/list of 2 integers, specifying the amount of padding along the height and width of the output tensor. Can be a single integer to specify the same value for all spatial dimensions. 
The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to None (default), the output shape is inferred. data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (batch, channels, rows, cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is \"channels_last\" . Output shape 4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, new_rows, new_cols, filters) if data_format is \"channels_last\" . rows and cols values might have changed due to padding. If output_padding is specified: new_rows = ((rows - 1) * strides[0] + kernel_size[0] - 2 * padding[0] + output_padding[0]) new_cols = ((cols - 1) * strides[1] + kernel_size[1] - 2 * padding[1] + output_padding[1]) References [A guide to convolution arithmetic for deep learning] (https://arxiv.org/abs/1603.07285v1) [Deconvolutional Networks] (http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf) [source] Conv3D keras.layers.Conv3D(filters, kernel_size, strides=(1, 1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1, 1), activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 3D convolution layer (e.g. spatial convolution over volumes). This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None , it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. 
input_shape=(128, 128, 128, 1) for 128x128x128 volumes with a single channel, in data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 3 integers, specifying the depth, height and width of the 3D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 3 integers, specifying the strides of the convolution along each spatial dimension. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 3 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 5D tensor with shape: (batch, channels, conv_dim1, conv_dim2, conv_dim3) if data_format is \"channels_first\" or 5D tensor with shape: (batch, conv_dim1, conv_dim2, conv_dim3, channels) if data_format is \"channels_last\" . Output shape 5D tensor with shape: (batch, filters, new_conv_dim1, new_conv_dim2, new_conv_dim3) if data_format is \"channels_first\" or 5D tensor with shape: (batch, new_conv_dim1, new_conv_dim2, new_conv_dim3, filters) if data_format is \"channels_last\" . new_conv_dim1 , new_conv_dim2 and new_conv_dim3 values might have changed due to padding. [source] Conv3DTranspose keras.layers.Conv3DTranspose(filters, kernel_size, strides=(1, 1, 1), padding=\\'valid\\', output_padding=None, data_format=None, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Transposed convolution layer (sometimes called Deconvolution). 
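A minimal sketch (editor's addition, using the 2D variant for brevity) of the transposed-convolution output-shape rule quoted earlier: with padding='valid' the padding term is zero, so new_rows = (rows - 1) * stride + kernel_size.

from keras.models import Sequential
from keras.layers import Conv2DTranspose

m = Sequential()
m.add(Conv2DTranspose(8, (3, 3), strides=(2, 2), padding='valid',
                      input_shape=(7, 7, 16)))
print(m.output_shape)  # (None, 15, 15, 8): (7 - 1) * 2 + 3 = 15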
The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 128, 3) for a 128x128x128 volume with 3 channels if data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 3 integers, specifying the depth, height and width of the 3D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 3 integers, specifying the strides of the convolution along the depth, height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). output_padding : An integer or tuple/list of 3 integers, specifying the amount of padding along the depth, height, and width. Can be a single integer to specify the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to None (default), the output shape is inferred. data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, depth, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, depth, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 3 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 5D tensor with shape: (batch, channels, depth, rows, cols) if data_format is \"channels_first\" or 5D tensor with shape: (batch, depth, rows, cols, channels) if data_format is \"channels_last\" . 
Output shape 5D tensor with shape: (batch, filters, new_depth, new_rows, new_cols) if data_format is "channels_first" or 5D tensor with shape: (batch, new_depth, new_rows, new_cols, filters) if data_format is "channels_last" . depth , rows and cols values might have changed due to padding. If output_padding is specified: new_depth = ((depth - 1) * strides[0] + kernel_size[0] - 2 * padding[0] + output_padding[0]) new_rows = ((rows - 1) * strides[1] + kernel_size[1] - 2 * padding[1] + output_padding[1]) new_cols = ((cols - 1) * strides[2] + kernel_size[2] - 2 * padding[2] + output_padding[2]) References A guide to convolution arithmetic for deep learning (https://arxiv.org/abs/1603.07285v1) Deconvolutional Networks (http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf) [source] Cropping1D keras.layers.Cropping1D(cropping=(1, 1)) Cropping layer for 1D input (e.g. temporal sequence). It crops along the time dimension (axis 1). Arguments cropping : int or tuple of int (length 2). How many units should be trimmed off at the beginning and end of the cropping dimension (axis 1). If a single int is provided, the same value will be used for both. Input shape 3D tensor with shape (batch, axis_to_crop, features) Output shape 3D tensor with shape (batch, cropped_axis, features) [source] Cropping2D keras.layers.Cropping2D(cropping=((0, 0), (0, 0)), data_format=None) Cropping layer for 2D input (e.g. picture). It crops along spatial dimensions, i.e. height and width. Arguments cropping : int, or tuple of 2 ints, or tuple of 2 tuples of 2 ints. If int: the same symmetric cropping is applied to height and width. If tuple of 2 ints: interpreted as two different symmetric cropping values for height and width: (symmetric_height_crop, symmetric_width_crop) . If tuple of 2 tuples of 2 ints: interpreted as ((top_crop, bottom_crop), (left_crop, right_crop)) data_format : A string, one of "channels_last" or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, height, width, channels) while "channels_first" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be "channels_last". Input shape 4D tensor with shape: - If data_format is "channels_last" : (batch, rows, cols, channels) - If data_format is "channels_first" : (batch, channels, rows, cols) Output shape 4D tensor with shape: - If data_format is "channels_last" : (batch, cropped_rows, cropped_cols, channels) - If data_format is "channels_first" : (batch, channels, cropped_rows, cropped_cols) Examples # Crop the input 2D images or feature maps model = Sequential() model.add(Cropping2D(cropping=((2, 2), (4, 4)), input_shape=(28, 28, 3))) # now model.output_shape == (None, 24, 20, 3) model.add(Conv2D(64, (3, 3), padding='same')) model.add(Cropping2D(cropping=((2, 2), (2, 2)))) # now model.output_shape == (None, 20, 16, 64) [source] Cropping3D keras.layers.Cropping3D(cropping=((1, 1), (1, 1), (1, 1)), data_format=None) Cropping layer for 3D data (e.g. spatial or spatio-temporal). Arguments cropping : int, or tuple of 3 ints, or tuple of 3 tuples of 2 ints. If int: the same symmetric cropping is applied to depth, height, and width. If tuple of 3 ints: interpreted as three different symmetric cropping values for depth, height, and width: (symmetric_dim1_crop, symmetric_dim2_crop, symmetric_dim3_crop) .
If tuple of 3 tuples of 2 ints: interpreted as ((left_dim1_crop, right_dim1_crop), (left_dim2_crop, right_dim2_crop), (left_dim3_crop, right_dim3_crop)) data_format : A string, one of "channels_last" or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while "channels_first" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be "channels_last". Input shape 5D tensor with shape: - If data_format is "channels_last" : (batch, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop, depth) - If data_format is "channels_first" : (batch, depth, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop) Output shape 5D tensor with shape: - If data_format is "channels_last" : (batch, first_cropped_axis, second_cropped_axis, third_cropped_axis, depth) - If data_format is "channels_first" : (batch, depth, first_cropped_axis, second_cropped_axis, third_cropped_axis) [source] UpSampling1D keras.layers.UpSampling1D(size=2) Upsampling layer for 1D inputs. Repeats each temporal step size times along the time axis. Arguments size : integer. Upsampling factor. Input shape 3D tensor with shape: (batch, steps, features) . Output shape 3D tensor with shape: (batch, upsampled_steps, features) . [source] UpSampling2D keras.layers.UpSampling2D(size=(2, 2), data_format=None, interpolation='nearest') Upsampling layer for 2D inputs. Repeats the rows and columns of the data by size[0] and size[1] respectively. Arguments size : int, or tuple of 2 integers. The upsampling factors for rows and columns. data_format : A string, one of "channels_last" or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, height, width, channels) while "channels_first" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be "channels_last". interpolation : A string, one of nearest or bilinear . Note that CNTK does not yet support bilinear upscaling, and that with Theano only size=(2, 2) is possible. Input shape 4D tensor with shape: - If data_format is "channels_last" : (batch, rows, cols, channels) - If data_format is "channels_first" : (batch, channels, rows, cols) Output shape 4D tensor with shape: - If data_format is "channels_last" : (batch, upsampled_rows, upsampled_cols, channels) - If data_format is "channels_first" : (batch, channels, upsampled_rows, upsampled_cols) [source] UpSampling3D keras.layers.UpSampling3D(size=(2, 2, 2), data_format=None) Upsampling layer for 3D inputs. Repeats the 1st, 2nd and 3rd dimensions of the data by size[0], size[1] and size[2] respectively. Arguments size : int, or tuple of 3 integers. The upsampling factors for dim1, dim2 and dim3. data_format : A string, one of "channels_last" or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while "channels_first" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) .
It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be "channels_last". Input shape 5D tensor with shape: - If data_format is "channels_last" : (batch, dim1, dim2, dim3, channels) - If data_format is "channels_first" : (batch, channels, dim1, dim2, dim3) Output shape 5D tensor with shape: - If data_format is "channels_last" : (batch, upsampled_dim1, upsampled_dim2, upsampled_dim3, channels) - If data_format is "channels_first" : (batch, channels, upsampled_dim1, upsampled_dim2, upsampled_dim3) [source] ZeroPadding1D keras.layers.ZeroPadding1D(padding=1) Zero-padding layer for 1D input (e.g. temporal sequence). Arguments padding : int, or tuple of int (length 2), or dictionary. If int: How many zeros to add at the beginning and end of the padding dimension (axis 1). If tuple of int (length 2): How many zeros to add at the beginning and at the end of the padding dimension ( (left_pad, right_pad) ). Input shape 3D tensor with shape (batch, axis_to_pad, features) Output shape 3D tensor with shape (batch, padded_axis, features) [source] ZeroPadding2D keras.layers.ZeroPadding2D(padding=(1, 1), data_format=None) Zero-padding layer for 2D input (e.g. picture). This layer can add rows and columns of zeros at the top, bottom, left and right side of an image tensor. Arguments padding : int, or tuple of 2 ints, or tuple of 2 tuples of 2 ints. If int: the same symmetric padding is applied to height and width. If tuple of 2 ints: interpreted as two different symmetric padding values for height and width: (symmetric_height_pad, symmetric_width_pad) . If tuple of 2 tuples of 2 ints: interpreted as ((top_pad, bottom_pad), (left_pad, right_pad)) data_format : A string, one of "channels_last" or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, height, width, channels) while "channels_first" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be "channels_last". Input shape 4D tensor with shape: - If data_format is "channels_last" : (batch, rows, cols, channels) - If data_format is "channels_first" : (batch, channels, rows, cols) Output shape 4D tensor with shape: - If data_format is "channels_last" : (batch, padded_rows, padded_cols, channels) - If data_format is "channels_first" : (batch, channels, padded_rows, padded_cols) [source] ZeroPadding3D keras.layers.ZeroPadding3D(padding=(1, 1, 1), data_format=None) Zero-padding layer for 3D data (spatial or spatio-temporal). Arguments padding : int, or tuple of 3 ints, or tuple of 3 tuples of 2 ints. If int: the same symmetric padding is applied to all three spatial dimensions. If tuple of 3 ints: interpreted as three different symmetric padding values, one per spatial dimension: (symmetric_dim1_pad, symmetric_dim2_pad, symmetric_dim3_pad) . If tuple of 3 tuples of 2 ints: interpreted as ((left_dim1_pad, right_dim1_pad), (left_dim2_pad, right_dim2_pad), (left_dim3_pad, right_dim3_pad)) data_format : A string, one of "channels_last" or "channels_first" . The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while "channels_first" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) .
It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad, depth) - If data_format is \"channels_first\" : (batch, depth, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad) Output shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, first_padded_axis, second_padded_axis, third_padded_axis, depth) - If data_format is \"channels_first\" : (batch, depth, first_padded_axis, second_padded_axis, third_padded_axis)'), ('title', 'Convolutional Layers')]), OrderedDict([('location', 'layers/convolutional.html#conv1d'), ('text', 'keras.layers.Conv1D(filters, kernel_size, strides=1, padding=\\'valid\\', data_format=\\'channels_last\\', dilation_rate=1, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 1D convolution layer (e.g. temporal convolution). This layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None , it is applied to the outputs as well. When using this layer as the first layer in a model, provide an input_shape argument (tuple of integers or None , e.g. (10, 128) for sequences of 10 128-dimensional vectors, or (None, 128) for variable-length sequences of 128-dimensional vectors). Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of a single integer, specifying the length of the 1D convolution window. strides : An integer or tuple/list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : One of \"valid\" , \"causal\" or \"same\" (case-insensitive). \"valid\" means \"no padding\". \"same\" results in padding the input such that the output has the same length as the original input. \"causal\" results in causal (dilated) convolutions, e.g. output[t] does not depend on input[t + 1:] . A zero padding is used such that the output has the same length as the original input. Useful when modeling temporal data where the model should not violate the temporal order. See [WaveNet: A Generative Model for Raw Audio, section 2.1] (https://arxiv.org/abs/1609.03499). data_format : A string, one of \"channels_last\" (default) or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, steps, channels) (default format for temporal data in Keras) while \"channels_first\" corresponds to inputs with shape (batch, channels, steps) . dilation_rate : an integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. 
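To make the \"causal\" padding option above concrete, a minimal sketch (filter count, kernel size and dilation rate are illustrative choices, not values from the original text):

from keras.models import Sequential
from keras.layers import Conv1D

# Dilated causal convolution over variable-length sequences of
# 128-dimensional vectors; output[t] does not depend on input[t + 1:].
model = Sequential()
model.add(Conv1D(32, 3, padding='causal', dilation_rate=2,
                 input_shape=(None, 128)))
# 'causal' zero-pads on the left, so the number of steps is preserved.
print(model.output_shape)  # (None, None, 32)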
kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 3D tensor with shape: (batch, steps, channels) Output shape 3D tensor with shape: (batch, new_steps, filters) steps value might have changed due to padding or strides. [source]'), ('title', 'Conv1D')]), OrderedDict([('location', 'layers/convolutional.html#conv2d'), ('text', 'keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 2D convolution layer (e.g. spatial convolution over images). This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None , it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). Note that \"same\" is slightly inconsistent across backends with strides != 1, as described here data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). 
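As a hedged illustration of the Conv2D input_shape guidance above (the filter counts are arbitrary):

from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
# 128x128 RGB pictures, data_format='channels_last'; 'same' keeps rows/cols.
model.add(Conv2D(64, (3, 3), padding='same', input_shape=(128, 128, 3)))
print(model.output_shape)  # (None, 128, 128, 64)
# 'valid' shrinks each spatial dimension by kernel_size - 1.
model.add(Conv2D(64, (3, 3), padding='valid'))
print(model.output_shape)  # (None, 126, 126, 64)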
use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (batch, channels, rows, cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is \"channels_last\" . Output shape 4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, new_rows, new_cols, filters) if data_format is \"channels_last\" . rows and cols values might have changed due to padding. [source]'), ('title', 'Conv2D')]), OrderedDict([('location', 'layers/convolutional.html#separableconv1d'), ('text', 'keras.layers.SeparableConv1D(filters, kernel_size, strides=1, padding=\\'valid\\', data_format=\\'channels_last\\', dilation_rate=1, depth_multiplier=1, activation=None, use_bias=True, depthwise_initializer=\\'glorot_uniform\\', pointwise_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None, depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None) Depthwise separable 1D convolution. Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step. Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block. Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of single integer, specifying the length of the 1D convolution window. strides : An integer or tuple/list of single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, steps, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, steps) . dilation_rate : An integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. depth_multiplier : The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to filters_in * depth_multiplier . 
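A small sketch of the depthwise/pointwise factorization and the depth_multiplier arithmetic just described (all sizes are hypothetical):

from keras.models import Sequential
from keras.layers import SeparableConv1D

model = Sequential()
# 16 input channels with depth_multiplier=4: the depthwise step produces
# 16 * 4 = 64 intermediate channels, which the pointwise step then mixes
# down to the requested 32 filters.
model.add(SeparableConv1D(32, 3, depth_multiplier=4, padding='same',
                          input_shape=(100, 16)))
print(model.output_shape)  # (None, 100, 32)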
activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. depthwise_initializer : Initializer for the depthwise kernel matrix (see initializers ). pointwise_initializer : Initializer for the pointwise kernel matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). depthwise_regularizer : Regularizer function applied to the depthwise kernel matrix (see regularizer ). pointwise_regularizer : Regularizer function applied to the pointwise kernel matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). depthwise_constraint : Constraint function applied to the depthwise kernel matrix (see constraints ). pointwise_constraint : Constraint function applied to the pointwise kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 3D tensor with shape: (batch, channels, steps) if data_format is \"channels_first\" or 3D tensor with shape: (batch, steps, channels) if data_format is \"channels_last\" . Output shape 3D tensor with shape: (batch, filters, new_steps) if data_format is \"channels_first\" or 3D tensor with shape: (batch, new_steps, filters) if data_format is \"channels_last\" . new_steps values might have changed due to padding or strides. [source]'), ('title', 'SeparableConv1D')]), OrderedDict([('location', 'layers/convolutional.html#separableconv2d'), ('text', 'keras.layers.SeparableConv2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1), depth_multiplier=1, activation=None, use_bias=True, depthwise_initializer=\\'glorot_uniform\\', pointwise_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', depthwise_regularizer=None, pointwise_regularizer=None, bias_regularizer=None, activity_regularizer=None, depthwise_constraint=None, pointwise_constraint=None, bias_constraint=None) Depthwise separable 2D convolution. Separable convolutions consist in first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes together the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step. Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block. Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. 
\"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. depth_multiplier : The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to filters_in * depth_multiplier . activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. depthwise_initializer : Initializer for the depthwise kernel matrix (see initializers ). pointwise_initializer : Initializer for the pointwise kernel matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). depthwise_regularizer : Regularizer function applied to the depthwise kernel matrix (see regularizer ). pointwise_regularizer : Regularizer function applied to the pointwise kernel matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). depthwise_constraint : Constraint function applied to the depthwise kernel matrix (see constraints ). pointwise_constraint : Constraint function applied to the pointwise kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (batch, channels, rows, cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is \"channels_last\" . Output shape 4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, new_rows, new_cols, filters) if data_format is \"channels_last\" . rows and cols values might have changed due to padding. [source]'), ('title', 'SeparableConv2D')]), OrderedDict([('location', 'layers/convolutional.html#conv2dtranspose'), ('text', 'keras.layers.Conv2DTranspose(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', output_padding=None, data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Transposed convolution layer (sometimes called Deconvolution). The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format=\"channels_last\" . 
Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). output_padding : An integer or tuple/list of 2 integers, specifying the amount of padding along the height and width of the output tensor. Can be a single integer to specify the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to None (default), the output shape is inferred. data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (batch, channels, rows, cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is \"channels_last\" . Output shape 4D tensor with shape: (batch, filters, new_rows, new_cols) if data_format is \"channels_first\" or 4D tensor with shape: (batch, new_rows, new_cols, filters) if data_format is \"channels_last\" . rows and cols values might have changed due to padding. 
If output_padding is specified: new_rows = ((rows - 1) * strides[0] + kernel_size[0] - 2 * padding[0] + output_padding[0]) new_cols = ((cols - 1) * strides[1] + kernel_size[1] - 2 * padding[1] + output_padding[1]) References [A guide to convolution arithmetic for deep learning] (https://arxiv.org/abs/1603.07285v1) [Deconvolutional Networks] (http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf) [source]'), ('title', 'Conv2DTranspose')]), OrderedDict([('location', 'layers/convolutional.html#conv3d'), ('text', 'keras.layers.Conv3D(filters, kernel_size, strides=(1, 1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1, 1), activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 3D convolution layer (e.g. spatial convolution over volumes). This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None , it is applied to the outputs as well. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 128, 1) for 128x128x128 volumes with a single channel, in data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 3 integers, specifying the depth, height and width of the 3D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 3 integers, specifying the strides of the convolution along each spatial dimension. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". dilation_rate : an integer or tuple/list of 3 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). 
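Reusing the input_shape example given for Conv3D above, a minimal sketch of the resulting shapes (the filter count is arbitrary):

from keras.models import Sequential
from keras.layers import Conv3D

model = Sequential()
# Single-channel 128x128x128 volumes; 'valid' padding shrinks each
# spatial dimension by kernel_size - 1.
model.add(Conv3D(16, (3, 3, 3), padding='valid',
                 input_shape=(128, 128, 128, 1)))
print(model.output_shape)  # (None, 126, 126, 126, 16)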
activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 5D tensor with shape: (batch, channels, conv_dim1, conv_dim2, conv_dim3) if data_format is \"channels_first\" or 5D tensor with shape: (batch, conv_dim1, conv_dim2, conv_dim3, channels) if data_format is \"channels_last\" . Output shape 5D tensor with shape: (batch, filters, new_conv_dim1, new_conv_dim2, new_conv_dim3) if data_format is \"channels_first\" or 5D tensor with shape: (batch, new_conv_dim1, new_conv_dim2, new_conv_dim3, filters) if data_format is \"channels_last\" . new_conv_dim1 , new_conv_dim2 and new_conv_dim3 values might have changed due to padding. [source]'), ('title', 'Conv3D')]), OrderedDict([('location', 'layers/convolutional.html#conv3dtranspose'), ('text', 'keras.layers.Conv3DTranspose(filters, kernel_size, strides=(1, 1, 1), padding=\\'valid\\', output_padding=None, data_format=None, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Transposed convolution layer (sometimes called Deconvolution). The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution. When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(128, 128, 128, 3) for a 128x128x128 volume with 3 channels if data_format=\"channels_last\" . Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 3 integers, specifying the depth, height and width of the 3D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 3 integers, specifying the strides of the convolution along the depth, height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : one of \"valid\" or \"same\" (case-insensitive). output_padding : An integer or tuple/list of 3 integers, specifying the amount of padding along the depth, height, and width. Can be a single integer to specify the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to None (default), the output shape is inferred. data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, depth, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, depth, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". 
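To illustrate the output_padding argument just described (the size formulas appear below), a hypothetical sketch; all numbers are made up:

from keras.models import Sequential
from keras.layers import Conv3DTranspose

model = Sequential()
# With padding='valid' (implicit padding 0), each dimension becomes
# (16 - 1) * 2 + 3 - 2 * 0 + 1 = 34; output_padding must stay below
# the stride along the same dimension.
model.add(Conv3DTranspose(8, (3, 3, 3), strides=(2, 2, 2),
                          padding='valid', output_padding=1,
                          input_shape=(16, 16, 16, 4)))
print(model.output_shape)  # (None, 34, 34, 34, 8)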
dilation_rate : an integer or tuple/list of 3 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 5D tensor with shape: (batch, channels, depth, rows, cols) if data_format is \"channels_first\" or 5D tensor with shape: (batch, depth, rows, cols, channels) if data_format is \"channels_last\" . Output shape 5D tensor with shape: (batch, filters, new_depth, new_rows, new_cols) if data_format is \"channels_first\" or 5D tensor with shape: (batch, new_depth, new_rows, new_cols, filters) if data_format is \"channels_last\" . depth, rows and cols values might have changed due to padding. If output_padding is specified: new_depth = ((depth - 1) * strides[0] + kernel_size[0] - 2 * padding[0] + output_padding[0]) new_rows = ((rows - 1) * strides[1] + kernel_size[1] - 2 * padding[1] + output_padding[1]) new_cols = ((cols - 1) * strides[2] + kernel_size[2] - 2 * padding[2] + output_padding[2]) References [A guide to convolution arithmetic for deep learning] (https://arxiv.org/abs/1603.07285v1) [Deconvolutional Networks] (http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf) [source]'), ('title', 'Conv3DTranspose')]), OrderedDict([('location', 'layers/convolutional.html#cropping1d'), ('text', 'keras.layers.Cropping1D(cropping=(1, 1)) Cropping layer for 1D input (e.g. temporal sequence). It crops along the time dimension (axis 1). Arguments cropping : int or tuple of int (length 2) How many units should be trimmed off at the beginning and end of the cropping dimension (axis 1). If a single int is provided, the same value will be used for both. Input shape 3D tensor with shape (batch, axis_to_crop, features) Output shape 3D tensor with shape (batch, cropped_axis, features) [source]'), ('title', 'Cropping1D')]), OrderedDict([('location', 'layers/convolutional.html#cropping2d'), ('text', 'keras.layers.Cropping2D(cropping=((0, 0), (0, 0)), data_format=None) Cropping layer for 2D input (e.g. picture). It crops along spatial dimensions, i.e. height and width. Arguments cropping : int, or tuple of 2 ints, or tuple of 2 tuples of 2 ints. If int: the same symmetric cropping is applied to height and width. If tuple of 2 ints: interpreted as two different symmetric cropping values for height and width: (symmetric_height_crop, symmetric_width_crop) . If tuple of 2 tuples of 2 ints: interpreted as ((top_crop, bottom_crop), (left_crop, right_crop)) data_format : A string, one of \"channels_last\" or \"channels_first\" . 
The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 4D tensor with shape: - If data_format is \"channels_last\" : (batch, rows, cols, channels) - If data_format is \"channels_first\" : (batch, channels, rows, cols) Output shape 4D tensor with shape: - If data_format is \"channels_last\" : (batch, cropped_rows, cropped_cols, channels) - If data_format is \"channels_first\" : (batch, channels, cropped_rows, cropped_cols) Examples # Crop the input 2D images or feature maps model = Sequential() model.add(Cropping2D(cropping=((2, 2), (4, 4)), input_shape=(28, 28, 3))) # now model.output_shape == (None, 24, 20, 3) model.add(Conv2D(64, (3, 3), padding=\\'same\\')) model.add(Cropping2D(cropping=((2, 2), (2, 2)))) # now model.output_shape == (None, 20, 16, 64) [source]'), ('title', 'Cropping2D')]), OrderedDict([('location', 'layers/convolutional.html#cropping3d'), ('text', 'keras.layers.Cropping3D(cropping=((1, 1), (1, 1), (1, 1)), data_format=None) Cropping layer for 3D data (e.g. spatial or spatio-temporal). Arguments cropping : int, or tuple of 3 ints, or tuple of 3 tuples of 2 ints. If int: the same symmetric cropping is applied to depth, height, and width. If tuple of 3 ints: interpreted as three different symmetric cropping values for depth, height, and width: (symmetric_dim1_crop, symmetric_dim2_crop, symmetric_dim3_crop) . If tuple of 3 tuples of 2 ints: interpreted as ((left_dim1_crop, right_dim1_crop), (left_dim2_crop, right_dim2_crop), (left_dim3_crop, right_dim3_crop)) data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop, depth) - If data_format is \"channels_first\" : (batch, depth, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop) Output shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, first_cropped_axis, second_cropped_axis, third_cropped_axis, depth) - If data_format is \"channels_first\" : (batch, depth, first_cropped_axis, second_cropped_axis, third_cropped_axis) [source]'), ('title', 'Cropping3D')]), OrderedDict([('location', 'layers/convolutional.html#upsampling1d'), ('text', 'keras.layers.UpSampling1D(size=2) Upsampling layer for 1D inputs. Repeats each temporal step size times along the time axis. Arguments size : integer. Upsampling factor. Input shape 3D tensor with shape: (batch, steps, features) . Output shape 3D tensor with shape: (batch, upsampled_steps, features) . [source]'), ('title', 'UpSampling1D')]), OrderedDict([('location', 'layers/convolutional.html#upsampling2d'), ('text', 'keras.layers.UpSampling2D(size=(2, 2), data_format=None, interpolation=\\'nearest\\') Upsampling layer for 2D inputs. 
Repeats the rows and columns of the data by size[0] and size[1] respectively. Arguments size : int, or tuple of 2 integers. The upsampling factors for rows and columns. data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". interpolation : A string, one of nearest or bilinear . Note that CNTK does not yet support bilinear upscaling, and that with Theano only size=(2, 2) is possible. Input shape 4D tensor with shape: - If data_format is \"channels_last\" : (batch, rows, cols, channels) - If data_format is \"channels_first\" : (batch, channels, rows, cols) Output shape 4D tensor with shape: - If data_format is \"channels_last\" : (batch, upsampled_rows, upsampled_cols, channels) - If data_format is \"channels_first\" : (batch, channels, upsampled_rows, upsampled_cols) [source]'), ('title', 'UpSampling2D')]), OrderedDict([('location', 'layers/convolutional.html#upsampling3d'), ('text', 'keras.layers.UpSampling3D(size=(2, 2, 2), data_format=None) Upsampling layer for 3D inputs. Repeats the 1st, 2nd and 3rd dimensions of the data by size[0], size[1] and size[2] respectively. Arguments size : int, or tuple of 3 integers. The upsampling factors for dim1, dim2 and dim3. data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, dim1, dim2, dim3, channels) - If data_format is \"channels_first\" : (batch, channels, dim1, dim2, dim3) Output shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, upsampled_dim1, upsampled_dim2, upsampled_dim3, channels) - If data_format is \"channels_first\" : (batch, channels, upsampled_dim1, upsampled_dim2, upsampled_dim3) [source]'), ('title', 'UpSampling3D')]), OrderedDict([('location', 'layers/convolutional.html#zeropadding1d'), ('text', 'keras.layers.ZeroPadding1D(padding=1) Zero-padding layer for 1D input (e.g. temporal sequence). Arguments padding : int, or tuple of int (length 2), or dictionary. If int: How many zeros to add at the beginning and end of the padding dimension (axis 1). If tuple of int (length 2): How many zeros to add at the beginning and at the end of the padding dimension ( (left_pad, right_pad) ). Input shape 3D tensor with shape (batch, axis_to_pad, features) Output shape 3D tensor with shape (batch, padded_axis, features) [source]'), ('title', 'ZeroPadding1D')]), OrderedDict([('location', 'layers/convolutional.html#zeropadding2d'), ('text', 'keras.layers.ZeroPadding2D(padding=(1, 1), data_format=None) Zero-padding layer for 2D input (e.g. picture). This layer can add rows and columns of zeros at the top, bottom, left and right side of an image tensor. 
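As a hedged illustration of asymmetric zero-padding on a hypothetical 28x28 RGB input:

from keras.models import Sequential
from keras.layers import ZeroPadding2D

model = Sequential()
# ((top, bottom), (left, right)): 1 row on top, 2 at the bottom,
# 3 columns on the left, 4 on the right.
model.add(ZeroPadding2D(padding=((1, 2), (3, 4)), input_shape=(28, 28, 3)))
print(model.output_shape)  # (None, 31, 35, 3), i.e. (28+1+2, 28+3+4)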
Arguments padding : int, or tuple of 2 ints, or tuple of 2 tuples of 2 ints. If int: the same symmetric padding is applied to height and width. If tuple of 2 ints: interpreted as two different symmetric padding values for height and width: (symmetric_height_pad, symmetric_width_pad) . If tuple of 2 tuples of 2 ints: interpreted as ((top_pad, bottom_pad), (left_pad, right_pad)) data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, height, width, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 4D tensor with shape: - If data_format is \"channels_last\" : (batch, rows, cols, channels) - If data_format is \"channels_first\" : (batch, channels, rows, cols) Output shape 4D tensor with shape: - If data_format is \"channels_last\" : (batch, padded_rows, padded_cols, channels) - If data_format is \"channels_first\" : (batch, channels, padded_rows, padded_cols) [source]'), ('title', 'ZeroPadding2D')]), OrderedDict([('location', 'layers/convolutional.html#zeropadding3d'), ('text', 'keras.layers.ZeroPadding3D(padding=(1, 1, 1), data_format=None) Zero-padding layer for 3D data (spatial or spatio-temporal). Arguments padding : int, or tuple of 3 ints, or tuple of 3 tuples of 2 ints. If int: the same symmetric padding is applied to all three spatial dimensions. If tuple of 3 ints: interpreted as three different symmetric padding values for the three spatial dimensions: (symmetric_dim1_pad, symmetric_dim2_pad, symmetric_dim3_pad) . If tuple of 3 tuples of 2 ints: interpreted as ((left_dim1_pad, right_dim1_pad), (left_dim2_pad, right_dim2_pad), (left_dim3_pad, right_dim3_pad)) data_format : A string, one of \"channels_last\" or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while \"channels_first\" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad, depth) - If data_format is \"channels_first\" : (batch, depth, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad) Output shape 5D tensor with shape: - If data_format is \"channels_last\" : (batch, first_padded_axis, second_padded_axis, third_padded_axis, depth) - If data_format is \"channels_first\" : (batch, depth, first_padded_axis, second_padded_axis, third_padded_axis)'), ('title', 'ZeroPadding3D')]), OrderedDict([('location', 'layers/core.html'), ('text', '[source] Dense keras.layers.Dense(units, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Just your regular densely-connected NN layer. 
Dense implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True ). Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel . Example # as first layer in a sequential model: model = Sequential() model.add(Dense(32, input_shape=(16,))) # now the model will take as input arrays of shape (*, 16) # and output arrays of shape (*, 32) # after the first layer, you don\\'t need to specify # the size of the input anymore: model.add(Dense(32)) Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape nD tensor with shape: (batch_size, ..., input_dim) . The most common situation would be a 2D input with shape (batch_size, input_dim) . Output shape nD tensor with shape: (batch_size, ..., units) . For instance, for a 2D input with shape (batch_size, input_dim) , the output would have shape (batch_size, units) . [source] Activation keras.layers.Activation(activation) Applies an activation function to an output. Arguments activation : name of activation function to use (see: activations ), or alternatively, a Theano or TensorFlow operation. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. [source] Dropout keras.layers.Dropout(rate, noise_shape=None, seed=None) Applies Dropout to the input. Dropout consists in randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting. Arguments rate : float between 0 and 1. Fraction of the input units to drop. noise_shape : 1D integer tensor representing the shape of the binary dropout mask that will be multiplied with the input. For instance, if your inputs have shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features) . seed : A Python integer to use as random seed. References [Dropout: A Simple Way to Prevent Neural Networks from Overfitting] (http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf) [source] Flatten keras.layers.Flatten(data_format=None) Flattens the input. Does not affect the batch size. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. 
The purpose of this argument is to preserve weight ordering when switching a model from one data format to another. channels_last corresponds to inputs with shape (batch, ..., channels) while channels_first corresponds to inputs with shape (batch, channels, ...) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Example model = Sequential() model.add(Conv2D(64, (3, 3), input_shape=(3, 32, 32), padding=\\'same\\')) # now: model.output_shape == (None, 64, 32, 32) model.add(Flatten()) # now: model.output_shape == (None, 65536) [source] Input keras.engine.input_layer.Input() Input() is used to instantiate a Keras tensor. A Keras tensor is a tensor object from the underlying backend (Theano, TensorFlow or CNTK), which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model. For instance, if a, b and c are Keras tensors, it becomes possible to do: model = Model(input=[a, b], output=c) The added Keras attributes are: _keras_shape : Integer shape tuple propagated via Keras-side shape inference. _keras_history : Last layer applied to the tensor. The entire layer graph is retrievable from that layer, recursively. Arguments shape : A shape tuple (integer), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. batch_shape : A shape tuple (integer), including the batch size. For instance, batch_shape=(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_shape=(None, 32) indicates batches of an arbitrary number of 32-dimensional vectors. name : An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn\\'t provided. dtype : The data type expected by the input, as a string ( float32 , float64 , int32 ...) sparse : A boolean specifying whether the placeholder to be created is sparse. tensor : Optional existing tensor to wrap into the Input layer. If set, the layer will not create a placeholder tensor. Returns A tensor. Example # this is a logistic regression in Keras x = Input(shape=(32,)) y = Dense(16, activation=\\'softmax\\')(x) model = Model(x, y) [source] Reshape keras.layers.Reshape(target_shape) Reshapes an output to a certain shape. Arguments target_shape : target shape. Tuple of integers. Does not include the batch axis. Input shape Arbitrary, although all dimensions in the input shape must be fixed. Use the keyword argument input_shape (tuple of integers, does not include the batch axis) when using this layer as the first layer in a model. Output shape (batch_size,) + target_shape Example # as first layer in a Sequential model model = Sequential() model.add(Reshape((3, 4), input_shape=(12,))) # now: model.output_shape == (None, 3, 4) # note: `None` is the batch dimension # as intermediate layer in a Sequential model model.add(Reshape((6, 2))) # now: model.output_shape == (None, 6, 2) # also supports shape inference using `-1` as dimension model.add(Reshape((-1, 2, 2))) # now: model.output_shape == (None, 3, 2, 2) [source] Permute keras.layers.Permute(dims) Permutes the dimensions of the input according to a given pattern. Useful for e.g. connecting RNNs and convnets together. 
Example model = Sequential() model.add(Permute((2, 1), input_shape=(10, 64))) # now: model.output_shape == (None, 64, 10) # note: `None` is the batch dimension Arguments dims : Tuple of integers. Permutation pattern, does not include the samples dimension. Indexing starts at 1. For instance, (2, 1) permutes the first and second dimension of the input. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same as the input shape, but with the dimensions re-ordered according to the specified pattern. [source] RepeatVector keras.layers.RepeatVector(n) Repeats the input n times. Example model = Sequential() model.add(Dense(32, input_dim=32)) # now: model.output_shape == (None, 32) # note: `None` is the batch dimension model.add(RepeatVector(3)) # now: model.output_shape == (None, 3, 32) Arguments n : integer, repetition factor. Input shape 2D tensor of shape (num_samples, features) . Output shape 3D tensor of shape (num_samples, n, features) . [source] Lambda keras.layers.Lambda(function, output_shape=None, mask=None, arguments=None) Wraps arbitrary expression as a Layer object. Examples # add a x -> x^2 layer model.add(Lambda(lambda x: x ** 2)) # add a layer that returns the concatenation # of the positive part of the input and # the opposite of the negative part def antirectifier(x): x -= K.mean(x, axis=1, keepdims=True) x = K.l2_normalize(x, axis=1) pos = K.relu(x) neg = K.relu(-x) return K.concatenate([pos, neg], axis=1) def antirectifier_output_shape(input_shape): shape = list(input_shape) assert len(shape) == 2 # only valid for 2D tensors shape[-1] *= 2 return tuple(shape) model.add(Lambda(antirectifier, output_shape=antirectifier_output_shape)) Arguments function : The function to be evaluated. Takes input tensor as first argument. output_shape : Expected output shape from function. Only relevant when using Theano. Can be a tuple or function. If a tuple, it only specifies the first dimension onward; sample dimension is assumed either the same as the input: output_shape = (input_shape[0], ) + output_shape or, the input is None and the sample dimension is also None : output_shape = (None, ) + output_shape If a function, it specifies the entire shape as a function of the input shape: output_shape = f(input_shape) arguments : optional dictionary of keyword arguments to be passed to the function. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Specified by output_shape argument (or auto-inferred when using TensorFlow or CNTK). [source] ActivityRegularization keras.layers.ActivityRegularization(l1=0.0, l2=0.0) Layer that applies an update to the cost function based on input activity. Arguments l1 : L1 regularization factor (positive float). l2 : L2 regularization factor (positive float). Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. [source] Masking keras.layers.Masking(mask_value=0.0) Masks a sequence by using a mask value to skip timesteps. For each timestep in the input tensor (dimension #1 in the tensor), if all values in the input tensor at that timestep are equal to mask_value , then the timestep will be masked (skipped) in all downstream layers (as long as they support masking). 
If any downstream layer does not support masking yet receives such an input mask, an exception will be raised. Example Consider a Numpy data array x of shape (samples, timesteps, features) , to be fed to an LSTM layer. You want to mask timesteps #3 and #5 because you lack data for these timesteps. You can: set x[:, 3, :] = 0. and x[:, 5, :] = 0. insert a Masking layer with mask_value=0. before the LSTM layer: model = Sequential() model.add(Masking(mask_value=0., input_shape=(timesteps, features))) model.add(LSTM(32)) [source] SpatialDropout1D keras.layers.SpatialDropout1D(rate) Spatial 1D version of Dropout. This version performs the same function as Dropout; however, it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout1D will help promote independence between feature maps and should be used instead. Arguments rate : float between 0 and 1. Fraction of the input units to drop. Input shape 3D tensor with shape: (samples, timesteps, channels) Output shape Same as input References [Efficient Object Localization Using Convolutional Networks] (https://arxiv.org/abs/1411.4280) [source] SpatialDropout2D keras.layers.SpatialDropout2D(rate, data_format=None) Spatial 2D version of Dropout. This version performs the same function as Dropout; however, it drops entire 2D feature maps instead of individual elements. If adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout2D will help promote independence between feature maps and should be used instead. Arguments rate : float between 0 and 1. Fraction of the input units to drop. data_format : \\'channels_first\\' or \\'channels_last\\'. In \\'channels_first\\' mode, the channels dimension (the depth) is at index 1, in \\'channels_last\\' mode it is at index 3. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 4D tensor with shape: (samples, channels, rows, cols) if data_format=\\'channels_first\\' or 4D tensor with shape: (samples, rows, cols, channels) if data_format=\\'channels_last\\'. Output shape Same as input References [Efficient Object Localization Using Convolutional Networks] (https://arxiv.org/abs/1411.4280) [source] SpatialDropout3D keras.layers.SpatialDropout3D(rate, data_format=None) Spatial 3D version of Dropout. This version performs the same function as Dropout; however, it drops entire 3D feature maps instead of individual elements. If adjacent voxels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout3D will help promote independence between feature maps and should be used instead. Arguments rate : float between 0 and 1. Fraction of the input units to drop. data_format : \\'channels_first\\' or \\'channels_last\\'. 
In \\'channels_first\\' mode, the channels dimension (the depth) is at index 1, in \\'channels_last\\' mode it is at index 4. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 5D tensor with shape: (samples, channels, dim1, dim2, dim3) if data_format=\\'channels_first\\' or 5D tensor with shape: (samples, dim1, dim2, dim3, channels) if data_format=\\'channels_last\\'. Output shape Same as input References [Efficient Object Localization Using Convolutional Networks] (https://arxiv.org/abs/1411.4280)'), ('title', 'Core Layers')]), OrderedDict([('location', 'layers/core.html#dense'), ('text', 'keras.layers.Dense(units, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Just your regular densely-connected NN layer. Dense implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True ). Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel . Example # as first layer in a sequential model: model = Sequential() model.add(Dense(32, input_shape=(16,))) # now the model will take as input arrays of shape (*, 16) # and output arrays of shape (*, 32) # after the first layer, you don\\'t need to specify # the size of the input anymore: model.add(Dense(32)) Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape nD tensor with shape: (batch_size, ..., input_dim) . The most common situation would be a 2D input with shape (batch_size, input_dim) . Output shape nD tensor with shape: (batch_size, ..., units) . For instance, for a 2D input with shape (batch_size, input_dim) , the output would have shape (batch_size, units) . [source]'), ('title', 'Dense')]), OrderedDict([('location', 'layers/core.html#activation'), ('text', 'keras.layers.Activation(activation) Applies an activation function to an output. Arguments activation : name of activation function to use (see: activations ), or alternatively, a Theano or TensorFlow operation. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. 
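A minimal usage sketch (layer sizes are hypothetical); a separate Activation layer is equivalent to passing activation='relu' to the preceding Dense layer:

from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(64, input_shape=(16,)))
model.add(Activation('relu'))  # same result as Dense(64, activation='relu')
print(model.output_shape)  # (None, 64)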
[source]'), ('title', 'Activation')]), OrderedDict([('location', 'layers/core.html#dropout'), ('text', 'keras.layers.Dropout(rate, noise_shape=None, seed=None) Applies Dropout to the input. Dropout consists in randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting. Arguments rate : float between 0 and 1. Fraction of the input units to drop. noise_shape : 1D integer tensor representing the shape of the binary dropout mask that will be multiplied with the input. For instance, if your inputs have shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features) . seed : A Python integer to use as random seed. References [Dropout: A Simple Way to Prevent Neural Networks from Overfitting] (http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf) [source]'), ('title', 'Dropout')]), OrderedDict([('location', 'layers/core.html#flatten'), ('text', 'keras.layers.Flatten(data_format=None) Flattens the input. Does not affect the batch size. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. The purpose of this argument is to preserve weight ordering when switching a model from one data format to another. channels_last corresponds to inputs with shape (batch, ..., channels) while channels_first corresponds to inputs with shape (batch, channels, ...) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Example model = Sequential() model.add(Conv2D(64, (3, 3), input_shape=(3, 32, 32), padding=\\'same\\',)) # now: model.output_shape == (None, 64, 32, 32) model.add(Flatten()) # now: model.output_shape == (None, 65536) [source]'), ('title', 'Flatten')]), OrderedDict([('location', 'layers/core.html#input'), ('text', \"keras.engine.input_layer.Input() Input() is used to instantiate a Keras tensor. A Keras tensor is a tensor object from the underlying backend (Theano, TensorFlow or CNTK), which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model. For instance, if a, b and c are Keras tensors, it becomes possible to do: model = Model(input=[a, b], output=c) The added Keras attributes are: _keras_shape : Integer shape tuple propagated via Keras-side shape inference. _keras_history : Last layer applied to the tensor. the entire layer graph is retrievable from that layer, recursively. Arguments shape : A shape tuple (integer), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. batch_shape : A shape tuple (integer), including the batch size. For instance, batch_shape=(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_shape=(None, 32) indicates batches of an arbitrary number of 32-dimensional vectors. name : An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided. dtype : The data type expected by the input, as a string ( float32 , float64 , int32 ...) sparse : A boolean specifying whether the placeholder to be created is sparse. tensor : Optional existing tensor to wrap into the Input layer. If set, the layer will not create a placeholder tensor. Returns A tensor. 
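The noise_shape behaviour of Dropout described above is easiest to see in a sketch; the shapes below are assumptions chosen for illustration. A None entry keeps that axis of the mask tied to the input, while a 1 broadcasts the same mask along that axis:

from keras.models import Sequential
from keras.layers import Dropout

timesteps, features = 10, 64
model = Sequential()
# one binary mask per (sample, feature), reused at every timestep:
model.add(Dropout(0.5, noise_shape=(None, 1, features),
                  input_shape=(timesteps, features)))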
Example # this is a logistic regression in Keras x = Input(shape=(32,)) y = Dense(16, activation='softmax')(x) model = Model(x, y) [source]\"), ('title', 'Input')]), OrderedDict([('location', 'layers/core.html#reshape'), ('text', 'keras.layers.Reshape(target_shape) Reshapes an output to a certain shape. Arguments target_shape : target shape. Tuple of integers. Does not include the batch axis. Input shape Arbitrary, although all dimensions in the input shape must be fixed. Use the keyword argument input_shape (tuple of integers, does not include the batch axis) when using this layer as the first layer in a model. Output shape (batch_size,) + target_shape Example # as first layer in a Sequential model model = Sequential() model.add(Reshape((3, 4), input_shape=(12,))) # now: model.output_shape == (None, 3, 4) # note: `None` is the batch dimension # as intermediate layer in a Sequential model model.add(Reshape((6, 2))) # now: model.output_shape == (None, 6, 2) # also supports shape inference using `-1` as dimension model.add(Reshape((-1, 2, 2))) # now: model.output_shape == (None, 3, 2, 2) [source]'), ('title', 'Reshape')]), OrderedDict([('location', 'layers/core.html#permute'), ('text', 'keras.layers.Permute(dims) Permutes the dimensions of the input according to a given pattern. Useful for e.g. connecting RNNs and convnets together. Example model = Sequential() model.add(Permute((2, 1), input_shape=(10, 64))) # now: model.output_shape == (None, 64, 10) # note: `None` is the batch dimension Arguments dims : Tuple of integers. Permutation pattern, does not include the samples dimension. Indexing starts at 1. For instance, (2, 1) permutes the first and second dimensions of the input. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same as the input shape, but with the dimensions re-ordered according to the specified pattern. [source]'), ('title', 'Permute')]), OrderedDict([('location', 'layers/core.html#repeatvector'), ('text', 'keras.layers.RepeatVector(n) Repeats the input n times. Example model = Sequential() model.add(Dense(32, input_dim=32)) # now: model.output_shape == (None, 32) # note: `None` is the batch dimension model.add(RepeatVector(3)) # now: model.output_shape == (None, 3, 32) Arguments n : integer, repetition factor. Input shape 2D tensor of shape (num_samples, features) . Output shape 3D tensor of shape (num_samples, n, features) . [source]'), ('title', 'RepeatVector')]), OrderedDict([('location', 'layers/core.html#lambda'), ('text', 'keras.layers.Lambda(function, output_shape=None, mask=None, arguments=None) Wraps an arbitrary expression as a Layer object. Examples # add an x -> x^2 layer model.add(Lambda(lambda x: x ** 2)) # add a layer that returns the concatenation # of the positive part of the input and # the opposite of the negative part def antirectifier(x): x -= K.mean(x, axis=1, keepdims=True) x = K.l2_normalize(x, axis=1) pos = K.relu(x) neg = K.relu(-x) return K.concatenate([pos, neg], axis=1) def antirectifier_output_shape(input_shape): shape = list(input_shape) assert len(shape) == 2 # only valid for 2D tensors shape[-1] *= 2 return tuple(shape) model.add(Lambda(antirectifier, output_shape=antirectifier_output_shape)) Arguments function : The function to be evaluated. Takes input tensor as first argument. output_shape : Expected output shape from function. Only relevant when using Theano. Can be a tuple or function.
If a tuple, it only specifies the first dimension onward; sample dimension is assumed either the same as the input: output_shape = (input_shape[0], ) + output_shape or, the input is None and the sample dimension is also None : output_shape = (None, ) + output_shape If a function, it specifies the entire shape as a function of the input shape: output_shape = f(input_shape) arguments : optional dictionary of keyword arguments to be passed to the function. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Specified by output_shape argument (or auto-inferred when using TensorFlow or CNTK). [source]'), ('title', 'Lambda')]), OrderedDict([('location', 'layers/core.html#activityregularization'), ('text', 'keras.layers.ActivityRegularization(l1=0.0, l2=0.0) Layer that applies an update to the cost function based on input activity. Arguments l1 : L1 regularization factor (positive float). l2 : L2 regularization factor (positive float). Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. [source]'), ('title', 'ActivityRegularization')]), OrderedDict([('location', 'layers/core.html#masking'), ('text', 'keras.layers.Masking(mask_value=0.0) Masks a sequence by using a mask value to skip timesteps. For each timestep in the input tensor (dimension #1 in the tensor), if all values in the input tensor at that timestep are equal to mask_value , then the timestep will be masked (skipped) in all downstream layers (as long as they support masking). If any downstream layer does not support masking yet receives such an input mask, an exception will be raised. Example Consider a Numpy data array x of shape (samples, timesteps, features) , to be fed to an LSTM layer. You want to mask timesteps #3 and #5 because you lack data for these timesteps. You can: set x[:, 3, :] = 0. and x[:, 5, :] = 0. insert a Masking layer with mask_value=0. before the LSTM layer: model = Sequential() model.add(Masking(mask_value=0., input_shape=(timesteps, features))) model.add(LSTM(32)) [source]'), ('title', 'Masking')]), OrderedDict([('location', 'layers/core.html#spatialdropout1d'), ('text', 'keras.layers.SpatialDropout1D(rate) Spatial 1D version of Dropout. This version performs the same function as Dropout; however, it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is normally the case in early convolution layers), then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout1D will help promote independence between feature maps and should be used instead. Arguments rate : float between 0 and 1. Fraction of the input units to drop. Input shape 3D tensor with shape: (samples, timesteps, channels) Output shape Same as input References [Efficient Object Localization Using Convolutional Networks] (https://arxiv.org/abs/1411.4280) [source]'), ('title', 'SpatialDropout1D')]), OrderedDict([('location', 'layers/core.html#spatialdropout2d'), ('text', 'keras.layers.SpatialDropout2D(rate, data_format=None) Spatial 2D version of Dropout. This version performs the same function as Dropout; however, it drops entire 2D feature maps instead of individual elements.
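The Masking example quoted above is easy to run end to end; a minimal sketch, with array sizes chosen arbitrarily for illustration:

import numpy as np
from keras.models import Sequential
from keras.layers import Masking, LSTM

samples, timesteps, features = 2, 8, 4
x = np.random.random((samples, timesteps, features))
x[:, 3, :] = 0.  # timesteps to be skipped downstream
x[:, 5, :] = 0.

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(timesteps, features)))
model.add(LSTM(32))  # LSTM supports masking, so no exception is raised
print(model.predict(x).shape)  # (2, 32)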
If adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers), then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout2D will help promote independence between feature maps and should be used instead. Arguments rate : float between 0 and 1. Fraction of the input units to drop. data_format : \\'channels_first\\' or \\'channels_last\\'. In \\'channels_first\\' mode, the channels dimension (the depth) is at index 1; in \\'channels_last\\' mode it is at index 3. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 4D tensor with shape: (samples, channels, rows, cols) if data_format=\\'channels_first\\' or 4D tensor with shape: (samples, rows, cols, channels) if data_format=\\'channels_last\\'. Output shape Same as input References [Efficient Object Localization Using Convolutional Networks] (https://arxiv.org/abs/1411.4280) [source]'), ('title', 'SpatialDropout2D')]), OrderedDict([('location', 'layers/core.html#spatialdropout3d'), ('text', 'keras.layers.SpatialDropout3D(rate, data_format=None) Spatial 3D version of Dropout. This version performs the same function as Dropout; however, it drops entire 3D feature maps instead of individual elements. If adjacent voxels within feature maps are strongly correlated (as is normally the case in early convolution layers), then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout3D will help promote independence between feature maps and should be used instead. Arguments rate : float between 0 and 1. Fraction of the input units to drop. data_format : \\'channels_first\\' or \\'channels_last\\'. In \\'channels_first\\' mode, the channels dimension (the depth) is at index 1; in \\'channels_last\\' mode it is at index 4. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape 5D tensor with shape: (samples, channels, dim1, dim2, dim3) if data_format=\\'channels_first\\' or 5D tensor with shape: (samples, dim1, dim2, dim3, channels) if data_format=\\'channels_last\\'. Output shape Same as input References [Efficient Object Localization Using Convolutional Networks] (https://arxiv.org/abs/1411.4280)'), ('title', 'SpatialDropout3D')]), OrderedDict([('location', 'layers/embeddings.html'), ('text', '[source] Embedding keras.layers.Embedding(input_dim, output_dim, embeddings_initializer=\\'uniform\\', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None) Turns positive integers (indexes) into dense vectors of fixed size. e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]] This layer can only be used as the first layer in a model. Example model = Sequential() model.add(Embedding(1000, 64, input_length=10)) # the model will take as input an integer matrix of size (batch, input_length). # the largest integer (i.e. word index) in the input should be # no larger than 999 (vocabulary size). # now model.output_shape == (None, 10, 64), where None is the batch dimension.
input_array = np.random.randint(1000, size=(32, 10)) model.compile(\\'rmsprop\\', \\'mse\\') output_array = model.predict(input_array) assert output_array.shape == (32, 10, 64) Arguments input_dim : int > 0. Size of the vocabulary, i.e. maximum integer index + 1. output_dim : int >= 0. Dimension of the dense embedding. embeddings_initializer : Initializer for the embeddings matrix (see initializers ). embeddings_regularizer : Regularizer function applied to the embeddings matrix (see regularizer ). embeddings_constraint : Constraint function applied to the embeddings matrix (see constraints ). mask_zero : Whether or not the input value 0 is a special \"padding\" value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1). input_length : Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed). Input shape 2D tensor with shape: (batch_size, sequence_length) . Output shape 3D tensor with shape: (batch_size, sequence_length, output_dim) . References A Theoretically Grounded Application of Dropout in Recurrent Neural Networks'), ('title', 'Embedding Layers')]), OrderedDict([('location', 'layers/embeddings.html#embedding'), ('text', 'keras.layers.Embedding(input_dim, output_dim, embeddings_initializer=\\'uniform\\', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None) Turns positive integers (indexes) into dense vectors of fixed size. eg. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]] This layer can only be used as the first layer in a model. Example model = Sequential() model.add(Embedding(1000, 64, input_length=10)) # the model will take as input an integer matrix of size (batch, input_length). # the largest integer (i.e. word index) in the input should be # no larger than 999 (vocabulary size). # now model.output_shape == (None, 10, 64), where None is the batch dimension. input_array = np.random.randint(1000, size=(32, 10)) model.compile(\\'rmsprop\\', \\'mse\\') output_array = model.predict(input_array) assert output_array.shape == (32, 10, 64) Arguments input_dim : int > 0. Size of the vocabulary, i.e. maximum integer index + 1. output_dim : int >= 0. Dimension of the dense embedding. embeddings_initializer : Initializer for the embeddings matrix (see initializers ). embeddings_regularizer : Regularizer function applied to the embeddings matrix (see regularizer ). embeddings_constraint : Constraint function applied to the embeddings matrix (see constraints ). mask_zero : Whether or not the input value 0 is a special \"padding\" value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1). input_length : Length of input sequences, when it is constant. 
This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed). Input shape 2D tensor with shape: (batch_size, sequence_length) . Output shape 3D tensor with shape: (batch_size, sequence_length, output_dim) . References A Theoretically Grounded Application of Dropout in Recurrent Neural Networks'), ('title', 'Embedding')]), OrderedDict([('location', 'layers/local.html'), ('text', '[source] LocallyConnected1D keras.layers.LocallyConnected1D(filters, kernel_size, strides=1, padding=\\'valid\\', data_format=None, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Locally-connected layer for 1D inputs. The LocallyConnected1D layer works similarly to the Conv1D layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input. Example # apply a unshared weight convolution 1d of length 3 to a sequence with # 10 timesteps, with 64 output filters model = Sequential() model.add(LocallyConnected1D(64, 3, input_shape=(10, 32))) # now model.output_shape == (None, 8, 64) # add a new conv1d on top model.add(LocallyConnected1D(32, 3)) # now model.output_shape == (None, 6, 32) Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of a single integer, specifying the length of the 1D convolution window. strides : An integer or tuple/list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : Currently only supports \"valid\" (case-insensitive). \"same\" may be supported in the future. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 3D tensor with shape: (batch_size, steps, input_dim) Output shape 3D tensor with shape: (batch_size, new_steps, filters) steps value might have changed due to padding or strides. [source] LocallyConnected2D keras.layers.LocallyConnected2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Locally-connected layer for 2D inputs. 
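The mask_zero rule described above (index 0 is reserved for padding, so input_dim must be vocabulary size + 1) is worth a short sketch before the LocallyConnected2D details below; the vocabulary size and sequence length here are assumptions:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM

vocab_size = 999  # real tokens use indices 1..999; index 0 means padding
model = Sequential()
model.add(Embedding(input_dim=vocab_size + 1, output_dim=64,
                    mask_zero=True, input_length=10))
model.add(LSTM(32))  # downstream layers must support masking

padded = np.array([[5, 8, 2, 0, 0, 0, 0, 0, 0, 0]])  # zero-padded sequence
print(model.predict(padded).shape)  # (1, 32)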
The LocallyConnected2D layer works similarly to the Conv2D layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input. Examples # apply a 3x3 unshared weights convolution with 64 output filters # on a 32x32 image with `data_format=\"channels_last\"`: model = Sequential() model.add(LocallyConnected2D(64, (3, 3), input_shape=(32, 32, 3))) # now model.output_shape == (None, 30, 30, 64) # notice that this layer will consume (30*30)*(3*3*3*64) # + (30*30)*64 parameters # add a 3x3 unshared weights convolution on top, with 32 output filters: model.add(LocallyConnected2D(32, (3, 3))) # now model.output_shape == (None, 28, 28, 32) Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the width and height of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the width and height. Can be a single integer to specify the same value for all spatial dimensions. padding : Currently only support \"valid\" (case-insensitive). \"same\" will be supported in future. data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (samples, channels, rows, cols) if data_format=\\'channels_first\\' or 4D tensor with shape: (samples, rows, cols, channels) if data_format=\\'channels_last\\'. Output shape 4D tensor with shape: (samples, filters, new_rows, new_cols) if data_format=\\'channels_first\\' or 4D tensor with shape: (samples, new_rows, new_cols, filters) if data_format=\\'channels_last\\'. 
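The parameter count quoted in the LocallyConnected2D example above can be checked directly, since every output location carries its own 3x3x3 kernel per filter plus a bias; a minimal sketch:

from keras.models import Sequential
from keras.layers import LocallyConnected2D

model = Sequential()
model.add(LocallyConnected2D(64, (3, 3), input_shape=(32, 32, 3)))
# one unshared 3x3x3 kernel per filter and output location, plus one bias each:
expected = (30 * 30) * (3 * 3 * 3 * 64) + (30 * 30) * 64
print(model.count_params() == expected)  # True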
rows and cols values might have changed due to padding.'), ('title', 'Locally-connected Layers')]), OrderedDict([('location', 'layers/local.html#locallyconnected1d'), ('text', 'keras.layers.LocallyConnected1D(filters, kernel_size, strides=1, padding=\\'valid\\', data_format=None, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Locally-connected layer for 1D inputs. The LocallyConnected1D layer works similarly to the Conv1D layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input. Example # apply a unshared weight convolution 1d of length 3 to a sequence with # 10 timesteps, with 64 output filters model = Sequential() model.add(LocallyConnected1D(64, 3, input_shape=(10, 32))) # now model.output_shape == (None, 8, 64) # add a new conv1d on top model.add(LocallyConnected1D(32, 3)) # now model.output_shape == (None, 6, 32) Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of a single integer, specifying the length of the 1D convolution window. strides : An integer or tuple/list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : Currently only supports \"valid\" (case-insensitive). \"same\" may be supported in the future. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 3D tensor with shape: (batch_size, steps, input_dim) Output shape 3D tensor with shape: (batch_size, new_steps, filters) steps value might have changed due to padding or strides. [source]'), ('title', 'LocallyConnected1D')]), OrderedDict([('location', 'layers/local.html#locallyconnected2d'), ('text', 'keras.layers.LocallyConnected2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, activation=None, use_bias=True, kernel_initializer=\\'glorot_uniform\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) Locally-connected layer for 2D inputs. The LocallyConnected2D layer works similarly to the Conv2D layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input. 
Examples # apply a 3x3 unshared weights convolution with 64 output filters # on a 32x32 image with `data_format=\"channels_last\"`: model = Sequential() model.add(LocallyConnected2D(64, (3, 3), input_shape=(32, 32, 3))) # now model.output_shape == (None, 30, 30, 64) # notice that this layer will consume (30*30)*(3*3*3*64) # + (30*30)*64 parameters # add a 3x3 unshared weights convolution on top, with 32 output filters: model.add(LocallyConnected2D(32, (3, 3))) # now model.output_shape == (None, 28, 28, 32) Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of 2 integers, specifying the width and height of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides : An integer or tuple/list of 2 integers, specifying the strides of the convolution along the width and height. Can be a single integer to specify the same value for all spatial dimensions. padding : Currently only support \"valid\" (case-insensitive). \"same\" will be supported in future. data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). Input shape 4D tensor with shape: (samples, channels, rows, cols) if data_format=\\'channels_first\\' or 4D tensor with shape: (samples, rows, cols, channels) if data_format=\\'channels_last\\'. Output shape 4D tensor with shape: (samples, filters, new_rows, new_cols) if data_format=\\'channels_first\\' or 4D tensor with shape: (samples, new_rows, new_cols, filters) if data_format=\\'channels_last\\'. rows and cols values might have changed due to padding.'), ('title', 'LocallyConnected2D')]), OrderedDict([('location', 'layers/merge.html'), ('text', \"[source] Add keras.layers.Add() Layer that adds a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). 
Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) # equivalent to added = keras.layers.add([x1, x2]) added = keras.layers.Add()([x1, x2]) out = keras.layers.Dense(4)(added) model = keras.models.Model(inputs=[input1, input2], outputs=out) [source] Subtract keras.layers.Subtract() Layer that subtracts two inputs. It takes as input a list of tensors of size 2, both of the same shape, and returns a single tensor, (inputs[0] - inputs[1]), also of the same shape. Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) # Equivalent to subtracted = keras.layers.subtract([x1, x2]) subtracted = keras.layers.Subtract()([x1, x2]) out = keras.layers.Dense(4)(subtracted) model = keras.models.Model(inputs=[input1, input2], outputs=out) [source] Multiply keras.layers.Multiply() Layer that multiplies (element-wise) a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). [source] Average keras.layers.Average() Layer that averages a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). [source] Maximum keras.layers.Maximum() Layer that computes the maximum (element-wise) a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). [source] Concatenate keras.layers.Concatenate(axis=-1) Layer that concatenates a list of inputs. It takes as input a list of tensors, all of the same shape except for the concatenation axis, and returns a single tensor, the concatenation of all inputs. Arguments axis : Axis along which to concatenate. **kwargs : standard layer keyword arguments. [source] Dot keras.layers.Dot(axes, normalize=False) Layer that computes a dot product between samples in two tensors. E.g. if applied to a list of two tensors a and b of shape (batch_size, n) , the output will be a tensor of shape (batch_size, 1) where each entry i will be the dot product between a[i] and b[i] . Arguments axes : Integer or tuple of integers, axis or axes along which to take the dot product. normalize : Whether to L2-normalize samples along the dot product axis before taking the dot product. If set to True, then the output of the dot product is the cosine proximity between the two samples. **kwargs : Standard layer keyword arguments. add keras.layers.add(inputs) Functional interface to the Add layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the sum of the inputs. Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) added = keras.layers.add([x1, x2]) out = keras.layers.Dense(4)(added) model = keras.models.Model(inputs=[input1, input2], outputs=out) subtract keras.layers.subtract(inputs) Functional interface to the Subtract layer. Arguments inputs : A list of input tensors (exactly 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the difference of the inputs. 
Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) subtracted = keras.layers.subtract([x1, x2]) out = keras.layers.Dense(4)(subtracted) model = keras.models.Model(inputs=[input1, input2], outputs=out) multiply keras.layers.multiply(inputs) Functional interface to the Multiply layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the element-wise product of the inputs. average keras.layers.average(inputs) Functional interface to the Average layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the average of the inputs. maximum keras.layers.maximum(inputs) Functional interface to the Maximum layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the element-wise maximum of the inputs. concatenate keras.layers.concatenate(inputs, axis=-1) Functional interface to the Concatenate layer. Arguments inputs : A list of input tensors (at least 2). axis : Concatenation axis. **kwargs : Standard layer keyword arguments. Returns A tensor, the concatenation of the inputs alongside axis axis . dot keras.layers.dot(inputs, axes, normalize=False) Functional interface to the Dot layer. Arguments inputs : A list of input tensors (at least 2). axes : Integer or tuple of integers, axis or axes along which to take the dot product. normalize : Whether to L2-normalize samples along the dot product axis before taking the dot product. If set to True, then the output of the dot product is the cosine proximity between the two samples. **kwargs : Standard layer keyword arguments. Returns A tensor, the dot product of the samples from the inputs.\"), ('title', 'Merge Layers')]), OrderedDict([('location', 'layers/merge.html#add'), ('text', \"keras.layers.Add() Layer that adds a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) # equivalent to added = keras.layers.add([x1, x2]) added = keras.layers.Add()([x1, x2]) out = keras.layers.Dense(4)(added) model = keras.models.Model(inputs=[input1, input2], outputs=out) [source]\"), ('title', 'Add')]), OrderedDict([('location', 'layers/merge.html#subtract'), ('text', \"keras.layers.Subtract() Layer that subtracts two inputs. It takes as input a list of tensors of size 2, both of the same shape, and returns a single tensor, (inputs[0] - inputs[1]), also of the same shape. Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) # Equivalent to subtracted = keras.layers.subtract([x1, x2]) subtracted = keras.layers.Subtract()([x1, x2]) out = keras.layers.Dense(4)(subtracted) model = keras.models.Model(inputs=[input1, input2], outputs=out) [source]\"), ('title', 'Subtract')]), OrderedDict([('location', 'layers/merge.html#multiply'), ('text', 'keras.layers.Multiply() Layer that multiplies (element-wise) a list of inputs. 
It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). [source]'), ('title', 'Multiply')]), OrderedDict([('location', 'layers/merge.html#average'), ('text', 'keras.layers.Average() Layer that averages a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). [source]'), ('title', 'Average')]), OrderedDict([('location', 'layers/merge.html#maximum'), ('text', 'keras.layers.Maximum() Layer that computes the maximum (element-wise) a list of inputs. It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape). [source]'), ('title', 'Maximum')]), OrderedDict([('location', 'layers/merge.html#concatenate'), ('text', 'keras.layers.Concatenate(axis=-1) Layer that concatenates a list of inputs. It takes as input a list of tensors, all of the same shape except for the concatenation axis, and returns a single tensor, the concatenation of all inputs. Arguments axis : Axis along which to concatenate. **kwargs : standard layer keyword arguments. [source]'), ('title', 'Concatenate')]), OrderedDict([('location', 'layers/merge.html#dot'), ('text', 'keras.layers.Dot(axes, normalize=False) Layer that computes a dot product between samples in two tensors. E.g. if applied to a list of two tensors a and b of shape (batch_size, n) , the output will be a tensor of shape (batch_size, 1) where each entry i will be the dot product between a[i] and b[i] . Arguments axes : Integer or tuple of integers, axis or axes along which to take the dot product. normalize : Whether to L2-normalize samples along the dot product axis before taking the dot product. If set to True, then the output of the dot product is the cosine proximity between the two samples. **kwargs : Standard layer keyword arguments.'), ('title', 'Dot')]), OrderedDict([('location', 'layers/merge.html#add_1'), ('text', \"keras.layers.add(inputs) Functional interface to the Add layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the sum of the inputs. Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) added = keras.layers.add([x1, x2]) out = keras.layers.Dense(4)(added) model = keras.models.Model(inputs=[input1, input2], outputs=out)\"), ('title', 'add')]), OrderedDict([('location', 'layers/merge.html#subtract_1'), ('text', \"keras.layers.subtract(inputs) Functional interface to the Subtract layer. Arguments inputs : A list of input tensors (exactly 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the difference of the inputs. Examples import keras input1 = keras.layers.Input(shape=(16,)) x1 = keras.layers.Dense(8, activation='relu')(input1) input2 = keras.layers.Input(shape=(32,)) x2 = keras.layers.Dense(8, activation='relu')(input2) subtracted = keras.layers.subtract([x1, x2]) out = keras.layers.Dense(4)(subtracted) model = keras.models.Model(inputs=[input1, input2], outputs=out)\"), ('title', 'subtract')]), OrderedDict([('location', 'layers/merge.html#multiply_1'), ('text', 'keras.layers.multiply(inputs) Functional interface to the Multiply layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. 
Returns A tensor, the element-wise product of the inputs.'), ('title', 'multiply')]), OrderedDict([('location', 'layers/merge.html#average_1'), ('text', 'keras.layers.average(inputs) Functional interface to the Average layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the average of the inputs.'), ('title', 'average')]), OrderedDict([('location', 'layers/merge.html#maximum_1'), ('text', 'keras.layers.maximum(inputs) Functional interface to the Maximum layer. Arguments inputs : A list of input tensors (at least 2). **kwargs : Standard layer keyword arguments. Returns A tensor, the element-wise maximum of the inputs.'), ('title', 'maximum')]), OrderedDict([('location', 'layers/merge.html#concatenate_1'), ('text', 'keras.layers.concatenate(inputs, axis=-1) Functional interface to the Concatenate layer. Arguments inputs : A list of input tensors (at least 2). axis : Concatenation axis. **kwargs : Standard layer keyword arguments. Returns A tensor, the concatenation of the inputs alongside axis axis .'), ('title', 'concatenate')]), OrderedDict([('location', 'layers/merge.html#dot_1'), ('text', 'keras.layers.dot(inputs, axes, normalize=False) Functional interface to the Dot layer. Arguments inputs : A list of input tensors (at least 2). axes : Integer or tuple of integers, axis or axes along which to take the dot product. normalize : Whether to L2-normalize samples along the dot product axis before taking the dot product. If set to True, then the output of the dot product is the cosine proximity between the two samples. **kwargs : Standard layer keyword arguments. Returns A tensor, the dot product of the samples from the inputs.'), ('title', 'dot')]), OrderedDict([('location', 'layers/noise.html'), ('text', '[source] GaussianNoise keras.layers.GaussianNoise(stddev) Apply additive zero-centered Gaussian noise. This is useful to mitigate overfitting (you could see it as a form of random data augmentation). Gaussian Noise (GS) is a natural choice as corruption process for real valued inputs. As it is a regularization layer, it is only active at training time. Arguments stddev : float, standard deviation of the noise distribution. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. [source] GaussianDropout keras.layers.GaussianDropout(rate) Apply multiplicative 1-centered Gaussian noise. As it is a regularization layer, it is only active at training time. Arguments rate : float, drop probability (as with Dropout ). The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)) . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. References [Dropout: A Simple Way to Prevent Neural Networks from Overfitting] (http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf) [source] AlphaDropout keras.layers.AlphaDropout(rate, noise_shape=None, seed=None) Applies Alpha Dropout to the input. Alpha Dropout is a Dropout that keeps mean and variance of inputs to their original values, in order to ensure the self-normalizing property even after this dropout. Alpha Dropout fits well to Scaled Exponential Linear Units by randomly setting activations to the negative saturation value. 
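The pairing of AlphaDropout with Scaled Exponential Linear Units described above can be sketched as follows; the layer sizes and rate are assumptions, and the lecun_normal initializer is the one conventionally used with SELU:

from keras.models import Sequential
from keras.layers import Dense, AlphaDropout

model = Sequential()
model.add(Dense(64, activation='selu',
                kernel_initializer='lecun_normal', input_shape=(32,)))
model.add(AlphaDropout(0.1))  # keeps mean/variance, preserving self-normalization
model.add(Dense(10, activation='softmax'))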
Arguments rate : float, drop probability (as with Dropout ). The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)) . seed : A Python integer to use as random seed. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. References Self-Normalizing Neural Networks'), ('title', 'Noise layers')]), OrderedDict([('location', 'layers/noise.html#gaussiannoise'), ('text', 'keras.layers.GaussianNoise(stddev) Apply additive zero-centered Gaussian noise. This is useful to mitigate overfitting (you could see it as a form of random data augmentation). Gaussian Noise (GS) is a natural choice as corruption process for real valued inputs. As it is a regularization layer, it is only active at training time. Arguments stddev : float, standard deviation of the noise distribution. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. [source]'), ('title', 'GaussianNoise')]), OrderedDict([('location', 'layers/noise.html#gaussiandropout'), ('text', 'keras.layers.GaussianDropout(rate) Apply multiplicative 1-centered Gaussian noise. As it is a regularization layer, it is only active at training time. Arguments rate : float, drop probability (as with Dropout ). The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)) . Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. References [Dropout: A Simple Way to Prevent Neural Networks from Overfitting] (http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf) [source]'), ('title', 'GaussianDropout')]), OrderedDict([('location', 'layers/noise.html#alphadropout'), ('text', 'keras.layers.AlphaDropout(rate, noise_shape=None, seed=None) Applies Alpha Dropout to the input. Alpha Dropout is a Dropout that keeps mean and variance of inputs to their original values, in order to ensure the self-normalizing property even after this dropout. Alpha Dropout fits well to Scaled Exponential Linear Units by randomly setting activations to the negative saturation value. Arguments rate : float, drop probability (as with Dropout ). The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)) . seed : A Python integer to use as random seed. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. References Self-Normalizing Neural Networks'), ('title', 'AlphaDropout')]), OrderedDict([('location', 'layers/normalization.html'), ('text', '[source] BatchNormalization keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer=\\'zeros\\', gamma_initializer=\\'ones\\', moving_mean_initializer=\\'zeros\\', moving_variance_initializer=\\'ones\\', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None) Batch normalization layer (Ioffe and Szegedy, 2014). Normalize the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1. 
Arguments axis : Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=\"channels_first\" , set axis=1 in BatchNormalization . momentum : Momentum for the moving mean and the moving variance. epsilon : Small float added to variance to avoid dividing by zero. center : If True, add offset of beta to normalized tensor. If False, beta is ignored. scale : If True, multiply by gamma . If False, gamma is not used. When the next layer is linear (also e.g. nn.relu ), this can be disabled since the scaling will be done by the next layer. beta_initializer : Initializer for the beta weight. gamma_initializer : Initializer for the gamma weight. moving_mean_initializer : Initializer for the moving mean. moving_variance_initializer : Initializer for the moving variance. beta_regularizer : Optional regularizer for the beta weight. gamma_regularizer : Optional regularizer for the gamma weight. beta_constraint : Optional constraint for the beta weight. gamma_constraint : Optional constraint for the gamma weight. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. References Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift'), ('title', 'Normalization Layers')]), OrderedDict([('location', 'layers/normalization.html#batchnormalization'), ('text', 'keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer=\\'zeros\\', gamma_initializer=\\'ones\\', moving_mean_initializer=\\'zeros\\', moving_variance_initializer=\\'ones\\', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None) Batch normalization layer (Ioffe and Szegedy, 2014). Normalize the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1. Arguments axis : Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=\"channels_first\" , set axis=1 in BatchNormalization . momentum : Momentum for the moving mean and the moving variance. epsilon : Small float added to variance to avoid dividing by zero. center : If True, add offset of beta to normalized tensor. If False, beta is ignored. scale : If True, multiply by gamma . If False, gamma is not used. When the next layer is linear (also e.g. nn.relu ), this can be disabled since the scaling will be done by the next layer. beta_initializer : Initializer for the beta weight. gamma_initializer : Initializer for the gamma weight. moving_mean_initializer : Initializer for the moving mean. moving_variance_initializer : Initializer for the moving variance. beta_regularizer : Optional regularizer for the beta weight. gamma_regularizer : Optional regularizer for the gamma weight. beta_constraint : Optional constraint for the beta weight. gamma_constraint : Optional constraint for the gamma weight. Input shape Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model. Output shape Same shape as input. 
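The axis argument above is the one most often mis-set; a minimal sketch for channels_first data, with filter counts and image size assumed for illustration:

from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization

model = Sequential()
model.add(Conv2D(16, (3, 3), data_format='channels_first',
                 input_shape=(3, 32, 32)))
model.add(BatchNormalization(axis=1))  # normalize over the channels axis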
References Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift'), ('title', 'BatchNormalization')]), OrderedDict([('location', 'layers/pooling.html'), ('text', '[source] MaxPooling1D keras.layers.MaxPooling1D(pool_size=2, strides=None, padding=\\'valid\\', data_format=\\'channels_last\\') Max pooling operation for temporal data. Arguments pool_size : Integer, size of the max pooling windows. strides : Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, steps) Output shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, downsampled_steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, downsampled_steps) [source] MaxPooling2D keras.layers.MaxPooling2D(pool_size=(2, 2), strides=None, padding=\\'valid\\', data_format=None) Max pooling operation for spatial data. Arguments pool_size : integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions. strides : Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols) [source] MaxPooling3D keras.layers.MaxPooling3D(pool_size=(2, 2, 2), strides=None, padding=\\'valid\\', data_format=None) Max pooling operation for 3D data (spatial or spatio-temporal). Arguments pool_size : tuple of 3 integers, factors by which to downscale (dim1, dim2, dim3). (2, 2, 2) will halve the size of the 3D input in each dimension. strides : tuple of 3 integers, or None. Strides values. padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. 
channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3) [source] AveragePooling1D keras.layers.AveragePooling1D(pool_size=2, strides=None, padding=\\'valid\\', data_format=\\'channels_last\\') Average pooling for temporal data. Arguments pool_size : Integer, size of the average pooling windows. strides : Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, steps) Output shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, downsampled_steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, downsampled_steps) [source] AveragePooling2D keras.layers.AveragePooling2D(pool_size=(2, 2), strides=None, padding=\\'valid\\', data_format=None) Average pooling operation for spatial data. Arguments pool_size : integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions. strides : Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". 
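The pool_size and strides defaults described above interact simply: strides falls back to pool_size, so the spatial dimensions are downscaled by the pool size. A short sketch (input size assumed):

from keras.models import Sequential
from keras.layers import MaxPooling2D

model = Sequential()
model.add(MaxPooling2D(pool_size=(2, 2), input_shape=(32, 32, 3)))
# strides defaults to pool_size, so both spatial dimensions are halved:
print(model.output_shape)  # (None, 16, 16, 3)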
Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols) [source] AveragePooling3D keras.layers.AveragePooling3D(pool_size=(2, 2, 2), strides=None, padding=\\'valid\\', data_format=None) Average pooling operation for 3D data (spatial or spatio-temporal). Arguments pool_size : tuple of 3 integers, factors by which to downscale (dim1, dim2, dim3). (2, 2, 2) will halve the size of the 3D input in each dimension. strides : tuple of 3 integers, or None. Strides values. padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3) [source] GlobalMaxPooling1D keras.layers.GlobalMaxPooling1D(data_format=\\'channels_last\\') Global max pooling operation for temporal data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, steps) Output shape 2D tensor with shape: (batch_size, features) [source] GlobalAveragePooling1D keras.layers.GlobalAveragePooling1D(data_format=\\'channels_last\\') Global average pooling operation for temporal data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, steps) Output shape 2D tensor with shape: (batch_size, features) [source] GlobalMaxPooling2D keras.layers.GlobalMaxPooling2D(data_format=None) Global max pooling operation for spatial data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. 
channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape 2D tensor with shape: (batch_size, channels) [source] GlobalAveragePooling2D keras.layers.GlobalAveragePooling2D(data_format=None) Global average pooling operation for spatial data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape 2D tensor with shape: (batch_size, channels) [source] GlobalMaxPooling3D keras.layers.GlobalMaxPooling3D(data_format=None) Global Max pooling operation for 3D data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape 2D tensor with shape: (batch_size, channels) [source] GlobalAveragePooling3D keras.layers.GlobalAveragePooling3D(data_format=None) Global Average pooling operation for 3D data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". 
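[Editorial sketch, illustrative only; the 7x7x512 feature map is an assumed example. The global pooling layers above collapse the spatial axes entirely, a common alternative to Flatten before a classifier head.]

from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D

model = Sequential()
# (batch, 7, 7, 512) -> (batch, 512): spatial dims are averaged away
model.add(GlobalAveragePooling2D(input_shape=(7, 7, 512)))
model.summary()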
Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape 2D tensor with shape: (batch_size, channels)'), ('title', 'Pooling Layers')]), OrderedDict([('location', 'layers/pooling.html#maxpooling1d'), ('text', 'keras.layers.MaxPooling1D(pool_size=2, strides=None, padding=\\'valid\\', data_format=\\'channels_last\\') Max pooling operation for temporal data. Arguments pool_size : Integer, size of the max pooling windows. strides : Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, steps) Output shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, downsampled_steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, downsampled_steps) [source]'), ('title', 'MaxPooling1D')]), OrderedDict([('location', 'layers/pooling.html#maxpooling2d'), ('text', 'keras.layers.MaxPooling2D(pool_size=(2, 2), strides=None, padding=\\'valid\\', data_format=None) Max pooling operation for spatial data. Arguments pool_size : integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions. strides : Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols) [source]'), ('title', 'MaxPooling2D')]), OrderedDict([('location', 'layers/pooling.html#maxpooling3d'), ('text', 'keras.layers.MaxPooling3D(pool_size=(2, 2, 2), strides=None, padding=\\'valid\\', data_format=None) Max pooling operation for 3D data (spatial or spatio-temporal). Arguments pool_size : tuple of 3 integers, factors by which to downscale (dim1, dim2, dim3). (2, 2, 2) will halve the size of the 3D input in each dimension. strides : tuple of 3 integers, or None. Strides values. 
padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3) [source]'), ('title', 'MaxPooling3D')]), OrderedDict([('location', 'layers/pooling.html#averagepooling1d'), ('text', 'keras.layers.AveragePooling1D(pool_size=2, strides=None, padding=\\'valid\\', data_format=\\'channels_last\\') Average pooling for temporal data. Arguments pool_size : Integer, size of the average pooling windows. strides : Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, steps) Output shape If data_format=\\'channels_last\\' : 3D tensor with shape: (batch_size, downsampled_steps, features) If data_format=\\'channels_first\\' : 3D tensor with shape: (batch_size, features, downsampled_steps) [source]'), ('title', 'AveragePooling1D')]), OrderedDict([('location', 'layers/pooling.html#averagepooling2d'), ('text', 'keras.layers.AveragePooling2D(pool_size=(2, 2), strides=None, padding=\\'valid\\', data_format=None) Average pooling operation for spatial data. Arguments pool_size : integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions. strides : Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size . padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". 
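[Editorial sketch for the temporal case, with assumed input sizes: AveragePooling1D with pool_size=2 halves the steps axis.]

from keras.models import Sequential
from keras.layers import AveragePooling1D

model = Sequential()
# (batch, steps, features) = (None, 100, 16) -> (None, 50, 16)
model.add(AveragePooling1D(pool_size=2, input_shape=(100, 16)))
model.summary()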
Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols) [source]'), ('title', 'AveragePooling2D')]), OrderedDict([('location', 'layers/pooling.html#averagepooling3d'), ('text', 'keras.layers.AveragePooling3D(pool_size=(2, 2, 2), strides=None, padding=\\'valid\\', data_format=None) Average pooling operation for 3D data (spatial or spatio-temporal). Arguments pool_size : tuple of 3 integers, factors by which to downscale (dim1, dim2, dim3). (2, 2, 2) will halve the size of the 3D input in each dimension. strides : tuple of 3 integers, or None. Strides values. padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3) [source]'), ('title', 'AveragePooling3D')]), OrderedDict([('location', 'layers/pooling.html#globalmaxpooling1d'), ('text', \"keras.layers.GlobalMaxPooling1D(data_format='channels_last') Global max pooling operation for temporal data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . Input shape If data_format='channels_last' : 3D tensor with shape: (batch_size, steps, features) If data_format='channels_first' : 3D tensor with shape: (batch_size, features, steps) Output shape 2D tensor with shape: (batch_size, features) [source]\"), ('title', 'GlobalMaxPooling1D')]), OrderedDict([('location', 'layers/pooling.html#globalaveragepooling1d'), ('text', \"keras.layers.GlobalAveragePooling1D(data_format='channels_last') Global average pooling operation for temporal data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps) . 
Input shape If data_format='channels_last' : 3D tensor with shape: (batch_size, steps, features) If data_format='channels_first' : 3D tensor with shape: (batch_size, features, steps) Output shape 2D tensor with shape: (batch_size, features) [source]\"), ('title', 'GlobalAveragePooling1D')]), OrderedDict([('location', 'layers/pooling.html#globalmaxpooling2d'), ('text', 'keras.layers.GlobalMaxPooling2D(data_format=None) Global max pooling operation for spatial data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape 2D tensor with shape: (batch_size, channels) [source]'), ('title', 'GlobalMaxPooling2D')]), OrderedDict([('location', 'layers/pooling.html#globalaveragepooling2d'), ('text', 'keras.layers.GlobalAveragePooling2D(data_format=None) Global average pooling operation for spatial data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=\\'channels_first\\' : 4D tensor with shape: (batch_size, channels, rows, cols) Output shape 2D tensor with shape: (batch_size, channels) [source]'), ('title', 'GlobalAveragePooling2D')]), OrderedDict([('location', 'layers/pooling.html#globalmaxpooling3d'), ('text', 'keras.layers.GlobalMaxPooling3D(data_format=None) Global Max pooling operation for 3D data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape 2D tensor with shape: (batch_size, channels) [source]'), ('title', 'GlobalMaxPooling3D')]), OrderedDict([('location', 'layers/pooling.html#globalaveragepooling3d'), ('text', 'keras.layers.GlobalAveragePooling3D(data_format=None) Global Average pooling operation for 3D data. Arguments data_format : A string, one of channels_last (default) or channels_first . The ordering of the dimensions in the inputs. 
channels_last corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels) while channels_first corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". Input shape If data_format=\\'channels_last\\' : 5D tensor with shape: (batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels) If data_format=\\'channels_first\\' : 5D tensor with shape: (batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3) Output shape 2D tensor with shape: (batch_size, channels)'), ('title', 'GlobalAveragePooling3D')]), OrderedDict([('location', 'layers/recurrent.html'), ('text', '[source] RNN keras.layers.RNN(cell, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) Base class for recurrent layers. Arguments cell : A RNN cell instance. A RNN cell is a class that has: a call(input_at_t, states_at_t) method, returning (output_at_t, states_at_t_plus_1) . The call method of the cell can also take the optional argument constants , see section \"Note on passing external constants\" below. a state_size attribute. This can be a single integer (single state) in which case it is the size of the recurrent state (which should be the same as the size of the cell output). This can also be a list/tuple of integers (one size per state). a output_size attribute. This can be a single integer or a TensorShape, which represent the shape of the output. For backward compatible reason, if this attribute is not available for the cell, the value will be inferred by the first element of the state_size . It is also possible for cell to be a list of RNN cell instances, in which cases the cells get stacked on after the other in the RNN, implementing an efficient stacked RNN. return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. go_backwards : Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll : Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. input_dim : dimensionality of the input (integer). This argument (or alternatively, the keyword argument input_shape ) is required when using this layer as the first layer in a model. input_length : Length of input sequences, to be specified when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed). Note that if the recurrent layer is not the first layer in your model, you would need to specify the input length at the level of the first layer (e.g. via the input_shape argument) Input shape 3D tensor with shape (batch_size, timesteps, input_dim) . Output shape if return_state : a list of tensors. The first tensor is the output. The remaining tensors are the last states, each with shape (batch_size, units) . 
if return_sequences : 3D tensor with shape (batch_size, timesteps, units) . else, 2D tensor with shape (batch_size, units) . Masking This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True . Note on using statefulness in RNNs You can set RNN layers to be \\'stateful\\', which means that the states computed for the samples in one batch will be reused as initial states for the samples in the next batch. This assumes a one-to-one mapping between samples in different successive batches. To enable statefulness: - specify stateful=True in the layer constructor. - specify a fixed batch size for your model, by passing if sequential model: batch_input_shape=(...) to the first layer in your model. else for functional model with 1 or more Input layers: batch_shape=(...) to all the first layers in your model. This is the expected shape of your inputs including the batch size . It should be a tuple of integers, e.g. (32, 10, 100) . - specify shuffle=False when calling fit(). To reset the states of your model, call .reset_states() on either a specific layer, or on your entire model. Note on specifying the initial state of RNNs You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state . The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer. You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states . The value of states should be a numpy array or list of numpy arrays representing the initial state of the RNN layer. Note on passing external constants to RNNs You can pass \"external\" constants to the cell using the constants keyword argument of RNN.__call__ (as well as RNN.call ) method. This requires that the cell.call method accepts the same keyword argument constants . Such constants can be used to condition the cell transformation on additional static inputs (not changing over time), a.k.a. an attention mechanism. Examples # First, let\\'s define a RNN Cell, as a layer subclass. 
class MinimalRNNCell(keras.layers.Layer): def __init__(self, units, **kwargs): self.units = units self.state_size = units super(MinimalRNNCell, self).__init__(**kwargs) def build(self, input_shape): self.kernel = self.add_weight(shape=(input_shape[-1], self.units), initializer=\\'uniform\\', name=\\'kernel\\') self.recurrent_kernel = self.add_weight( shape=(self.units, self.units), initializer=\\'uniform\\', name=\\'recurrent_kernel\\') self.built = True def call(self, inputs, states): prev_output = states[0] h = K.dot(inputs, self.kernel) output = h + K.dot(prev_output, self.recurrent_kernel) return output, [output] # Let\\'s use this cell in a RNN layer: cell = MinimalRNNCell(32) x = keras.Input((None, 5)) layer = RNN(cell) y = layer(x) # Here\\'s how to use the cell to build a stacked RNN: cells = [MinimalRNNCell(32), MinimalRNNCell(64)] x = keras.Input((None, 5)) layer = RNN(cells) y = layer(x) [source] SimpleRNN keras.layers.SimpleRNN(units, activation=\\'tanh\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) Fully-connected RNN where the output is to be fed back to input. Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). Default: hyperbolic tangent ( tanh ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. go_backwards : Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful : Boolean (default False). 
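[Editorial sketch of the statefulness recipe above; the batch size of 32 and the other dimensions are assumed examples.]

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

model = Sequential()
# fixed batch size via batch_input_shape, as required for stateful layers
model.add(SimpleRNN(16, stateful=True, batch_input_shape=(32, 10, 8)))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse')
# train with model.fit(..., shuffle=False), then clear the carried state:
model.reset_states()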
If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll : Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. [source] GRU keras.layers.GRU(units, activation=\\'tanh\\', recurrent_activation=\\'hard_sigmoid\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False, reset_after=False) Gated Recurrent Unit - Cho et al. 2014. There are two variants. The default one is based on 1406.1078v3 and has reset gate applied to hidden state before matrix multiplication. The other one is based on original 1406.1078v1 and has the order reversed. The second variant is compatible with CuDNNGRU (GPU-only) and allows inference on CPU. Thus it has separate biases for kernel and recurrent_kernel . Use \\'reset_after\\'=True and recurrent_activation=\\'sigmoid\\' . Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). Default: hyperbolic tangent ( tanh ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). recurrent_activation : Activation function to use for the recurrent step (see activations ). Default: hard sigmoid ( hard_sigmoid ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation : Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. 
These modes will have different performance profiles on different hardware and for different applications. return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. go_backwards : Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll : Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. reset_after : GRU convention (whether to apply reset gate after or before matrix multiplication). False = \"before\" (default), True = \"after\" (CuDNN compatible). References Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation On the Properties of Neural Machine Translation: Encoder-Decoder Approaches Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling A Theoretically Grounded Application of Dropout in Recurrent Neural Networks [source] LSTM keras.layers.LSTM(units, activation=\\'tanh\\', recurrent_activation=\\'hard_sigmoid\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) Long Short-Term Memory layer - Hochreiter 1997. Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). Default: hyperbolic tangent ( tanh ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). recurrent_activation : Activation function to use for the recurrent step (see activations ). Default: hard sigmoid ( hard_sigmoid ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs. (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). unit_forget_bias : Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to true will also force bias_initializer=\"zeros\" . This is recommended in [Jozefowicz et al.] (http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). 
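[Editorial sketch of the CuDNNGRU-compatible GRU variant described above; the unit count is an arbitrary example.]

from keras.layers import GRU

# reset gate applied after the matrix multiplication, sigmoid recurrent activation
gru = GRU(64, reset_after=True, recurrent_activation='sigmoid')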
kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation : Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications. return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. go_backwards : Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll : Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. References [Long short-term memory] (http://www.bioinf.jku.at/publications/older/2604.pdf) [Learning to forget: Continual prediction with LSTM] (http://www.mitpressjournals.org/doi/pdf/10.1162/089976600300015015) [Supervised sequence labeling with recurrent neural networks] (http://www.cs.toronto.edu/~graves/preprint.pdf) A Theoretically Grounded Application of Dropout in Recurrent Neural Networks [source] ConvLSTM2D keras.layers.ConvLSTM2D(filters, kernel_size, strides=(1, 1), padding=\\'valid\\', data_format=None, dilation_rate=(1, 1), activation=\\'tanh\\', recurrent_activation=\\'hard_sigmoid\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, go_backwards=False, stateful=False, dropout=0.0, recurrent_dropout=0.0) Convolutional LSTM. It is similar to an LSTM layer, but the input transformations and recurrent transformations are both convolutional. Arguments filters : Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size : An integer or tuple/list of n integers, specifying the dimensions of the convolution window. strides : An integer or tuple/list of n integers, specifying the strides of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1. padding : One of \"valid\" or \"same\" (case-insensitive). data_format : A string, one of \"channels_last\" (default) or \"channels_first\" . The ordering of the dimensions in the inputs. \"channels_last\" corresponds to inputs with shape (batch, time, ..., channels) while \"channels_first\" corresponds to inputs with shape (batch, time, channels, ...) . 
It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\" . dilation_rate : An integer or tuple/list of n integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). recurrent_activation : Activation function to use for the recurrent step (see activations ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs. (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). unit_forget_bias : Boolean. If True, add 1 to the bias of the forget gate at initialization. Use in combination with bias_initializer=\"zeros\" . This is recommended in [Jozefowicz et al.] (http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. go_backwards : Boolean (default False). If True, process the input sequence backwards. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Input shape if data_format=\\'channels_first\\' 5D tensor with shape: (samples, time, channels, rows, cols) if data_format=\\'channels_last\\' 5D tensor with shape: (samples, time, rows, cols, channels) Output shape if return_sequences if data_format=\\'channels_first\\' 5D tensor with shape: (samples, time, filters, output_row, output_col) if data_format=\\'channels_last\\' 5D tensor with shape: (samples, time, output_row, output_col, filters) else if data_format=\\'channels_first\\' 4D tensor with shape: (samples, filters, output_row, output_col) if data_format=\\'channels_last\\' 4D tensor with shape: (samples, output_row, output_col, filters) where output_row and output_col depend on the shape of the filter and the padding Raises ValueError : in case of invalid constructor arguments. 
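[Editorial sketch matching the 5D ConvLSTM2D shapes above; the filter count and frame size are assumed examples.]

from keras.models import Sequential
from keras.layers import ConvLSTM2D

model = Sequential()
# input: (samples, time, rows, cols, channels), channels_last
model.add(ConvLSTM2D(filters=16, kernel_size=(3, 3), padding='same',
                     return_sequences=True, input_shape=(None, 40, 40, 1)))
model.summary()  # (None, None, 40, 40, 16)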
References [Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting](http://arxiv.org/abs/1506.04214v1) The current implementation does not include the feedback loop on the cells output [source] SimpleRNNCell keras.layers.SimpleRNNCell(units, activation=\\'tanh\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0) Cell class for SimpleRNN. Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). Default: hyperbolic tangent ( tanh ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. [source] GRUCell keras.layers.GRUCell(units, activation=\\'tanh\\', recurrent_activation=\\'hard_sigmoid\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1, reset_after=False) Cell class for the GRU layer. Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). Default: hyperbolic tangent ( tanh ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). recurrent_activation : Activation function to use for the recurrent step (see activations ). Default: hard sigmoid ( hard_sigmoid ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). 
kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation : Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications. reset_after : GRU convention (whether to apply reset gate after or before matrix multiplication). False = \"before\" (default), True = \"after\" (CuDNN compatible). [source] LSTMCell keras.layers.LSTMCell(units, activation=\\'tanh\\', recurrent_activation=\\'hard_sigmoid\\', use_bias=True, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1) Cell class for the LSTM layer. Arguments units : Positive integer, dimensionality of the output space. activation : Activation function to use (see activations ). Default: hyperbolic tangent ( tanh ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). recurrent_activation : Activation function to use for the recurrent step (see activations ). Default: hard sigmoid ( hard_sigmoid ). If you pass None , no activation is applied (ie. \"linear\" activation: a(x) = x ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). unit_forget_bias : Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to true will also force bias_initializer=\"zeros\" . This is recommended in [Jozefowicz et al.] (http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). 
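[Editorial sketch: as noted under the RNN base class, a list of cells is stacked into one layer; this uses the LSTMCell documented here, with arbitrary unit counts.]

from keras.layers import Input, RNN, LSTMCell

x = Input(shape=(None, 8))
cells = [LSTMCell(32), LSTMCell(64)]
y = RNN(cells)(x)  # (None, 64): output size of the last cell in the stack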
dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation : Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications. [source] CuDNNGRU keras.layers.CuDNNGRU(units, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, return_state=False, stateful=False) Fast GRU implementation backed by CuDNN . Can only be run on GPU, with the TensorFlow backend. Arguments units : Positive integer, dimensionality of the output space. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs. (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. [source] CuDNNLSTM keras.layers.CuDNNLSTM(units, kernel_initializer=\\'glorot_uniform\\', recurrent_initializer=\\'orthogonal\\', bias_initializer=\\'zeros\\', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, return_state=False, stateful=False) Fast LSTM implementation with CuDNN . Can only be run on GPU, with the TensorFlow backend. Arguments units : Positive integer, dimensionality of the output space. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs. (see initializers ). unit_forget_bias : Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to true will also force bias_initializer=\"zeros\" . This is recommended in [Jozefowicz et al.] (http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). 
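[Editorial sketch of the CuDNN-backed layers above; runnable only on a GPU with the TensorFlow backend, per the text, and the sizes are assumed examples.]

from keras.models import Sequential
from keras.layers import CuDNNLSTM

model = Sequential()
# drop-in, GPU-only counterpart of LSTM with fixed tanh/sigmoid activations
model.add(CuDNNLSTM(128, input_shape=(50, 32)))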
recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.'), ('title', 'Recurrent Layers')]), OrderedDict([('location', 'layers/recurrent.html#rnn'), ('text', 'keras.layers.RNN(cell, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) Base class for recurrent layers. Arguments cell : A RNN cell instance. A RNN cell is a class that has: a call(input_at_t, states_at_t) method, returning (output_at_t, states_at_t_plus_1) . The call method of the cell can also take the optional argument constants , see section \"Note on passing external constants\" below. a state_size attribute. This can be a single integer (single state) in which case it is the size of the recurrent state (which should be the same as the size of the cell output). This can also be a list/tuple of integers (one size per state). an output_size attribute. This can be a single integer or a TensorShape, which represents the shape of the output. For backward compatibility, if this attribute is not available for the cell, the value will be inferred from the first element of the state_size . It is also possible for cell to be a list of RNN cell instances, in which case the cells get stacked one after the other in the RNN, implementing an efficient stacked RNN. return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state : Boolean. Whether to return the last state in addition to the output. go_backwards : Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll : Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. input_dim : dimensionality of the input (integer). This argument (or alternatively, the keyword argument input_shape ) is required when using this layer as the first layer in a model. input_length : Length of input sequences, to be specified when it is constant. 
This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed). Note that if the recurrent layer is not the first layer in your model, you would need to specify the input length at the level of the first layer (e.g. via the input_shape argument). Input shape 3D tensor with shape (batch_size, timesteps, input_dim). Output shape if return_state: a list of tensors. The first tensor is the output. The remaining tensors are the last states, each with shape (batch_size, units). if return_sequences: 3D tensor with shape (batch_size, timesteps, units). else, 2D tensor with shape (batch_size, units). Masking This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True. Note on using statefulness in RNNs You can set RNN layers to be 'stateful', which means that the states computed for the samples in one batch will be reused as initial states for the samples in the next batch. This assumes a one-to-one mapping between samples in different successive batches. To enable statefulness: specify stateful=True in the layer constructor; specify a fixed batch size for your model, by passing batch_input_shape=(...) to the first layer of a Sequential model, or batch_shape=(...) to all Input layers of a functional model (this is the expected shape of your inputs, including the batch size; it should be a tuple of integers, e.g. (32, 10, 100)); and specify shuffle=False when calling fit(). To reset the states of your model, call .reset_states() on either a specific layer, or on your entire model. Note on specifying the initial state of RNNs You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer. You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be a numpy array or list of numpy arrays representing the initial state of the RNN layer. Note on passing external constants to RNNs You can pass "external" constants to the cell using the constants keyword argument of the RNN.__call__ (as well as RNN.call) method. This requires that the cell.call method accepts the same keyword argument constants. Such constants can be used to condition the cell transformation on additional static inputs (not changing over time), a.k.a. an attention mechanism.
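To make the statefulness recipe above concrete, here is a minimal, hedged sketch (batch size, timesteps, feature count and data are illustrative assumptions):

from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np

# Fixed batch size of 32, sequences of 10 timesteps with 100 features.
model = Sequential()
model.add(LSTM(16, stateful=True, batch_input_shape=(32, 10, 100)))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse')

x = np.random.random((32, 10, 100))
y = np.random.random((32, 1))
model.fit(x, y, batch_size=32, epochs=2, shuffle=False)  # shuffle=False, per the note above
model.reset_states()  # clear the carried-over states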
Examples

# First, let's define an RNN Cell, as a layer subclass.
class MinimalRNNCell(keras.layers.Layer):

    def __init__(self, units, **kwargs):
        self.units = units
        self.state_size = units
        super(MinimalRNNCell, self).__init__(**kwargs)

    def build(self, input_shape):
        self.kernel = self.add_weight(shape=(input_shape[-1], self.units),
                                      initializer='uniform',
                                      name='kernel')
        self.recurrent_kernel = self.add_weight(
            shape=(self.units, self.units),
            initializer='uniform',
            name='recurrent_kernel')
        self.built = True

    def call(self, inputs, states):
        prev_output = states[0]
        h = K.dot(inputs, self.kernel)
        output = h + K.dot(prev_output, self.recurrent_kernel)
        return output, [output]

# Let's use this cell in an RNN layer:
cell = MinimalRNNCell(32)
x = keras.Input((None, 5))
layer = RNN(cell)
y = layer(x)

# Here's how to use the cell to build a stacked RNN:
cells = [MinimalRNNCell(32), MinimalRNNCell(64)]
x = keras.Input((None, 5))
layer = RNN(cells)
y = layer(x)

[source]'), ('title', 'RNN')]), OrderedDict([('location', 'layers/recurrent.html#simplernn'), ('text', 'keras.layers.SimpleRNN(units, activation='tanh', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) Fully-connected RNN where the output is to be fed back to input. Arguments units: Positive integer, dimensionality of the output space. activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). use_bias: Boolean, whether the layer uses a bias vector. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state: Boolean. Whether to return the last state in addition to the output. go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. [source]'), ('title', 'SimpleRNN')]),
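A hedged sketch of the return_sequences / return_state combinations described above (input shapes are illustrative assumptions):

from keras.layers import Input, SimpleRNN

x = Input((10, 8))                                  # 10 timesteps of 8 features
last = SimpleRNN(16)(x)                             # last output only: (batch, 16)
seq = SimpleRNN(16, return_sequences=True)(x)       # full sequence: (batch, 10, 16)
last, state = SimpleRNN(16, return_state=True)(x)   # last output plus final state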
OrderedDict([('location', 'layers/recurrent.html#gru'), ('text', 'keras.layers.GRU(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False, reset_after=False) Gated Recurrent Unit - Cho et al. 2014. There are two variants. The default one is based on 1406.1078v3 and has the reset gate applied to the hidden state before matrix multiplication. The other one is based on the original 1406.1078v1 and has the order reversed. The second variant is compatible with CuDNNGRU (GPU-only) and allows inference on CPU; it therefore has separate biases for kernel and recurrent_kernel. To use it, set reset_after=True and recurrent_activation='sigmoid'. Arguments units: Positive integer, dimensionality of the output space. activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). recurrent_activation: Activation function to use for the recurrent step (see activations). Default: hard sigmoid (hard_sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). use_bias: Boolean, whether the layer uses a bias vector. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations.
These modes will have different performance profiles on different hardware and for different applications. return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state: Boolean. Whether to return the last state in addition to the output. go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. reset_after: GRU convention (whether to apply the reset gate after or before matrix multiplication). False = "before" (default), True = "after" (CuDNN compatible). References Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation; On the Properties of Neural Machine Translation: Encoder-Decoder Approaches; Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling; A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. [source]'), ('title', 'GRU')]),
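The CuDNN-compatible GRU variant described above, as a short hedged sketch:

from keras.layers import GRU

# reset_after=True plus a sigmoid recurrent activation makes the layer
# weight-compatible with CuDNNGRU, per the note in the GRU docs above.
gru = GRU(64, reset_after=True, recurrent_activation='sigmoid',
          return_sequences=True)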
OrderedDict([('location', 'layers/recurrent.html#lstm'), ('text', 'keras.layers.LSTM(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) Long Short-Term Memory layer - Hochreiter 1997. Arguments units: Positive integer, dimensionality of the output space. activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). recurrent_activation: Activation function to use for the recurrent step (see activations). Default: hard sigmoid (hard_sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). use_bias: Boolean, whether the layer uses a bias vector. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications. return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state: Boolean. Whether to return the last state in addition to the output. go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence. stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. References [Long short-term memory](http://www.bioinf.jku.at/publications/older/2604.pdf); [Learning to forget: Continual prediction with LSTM](http://www.mitpressjournals.org/doi/pdf/10.1162/089976600300015015); [Supervised sequence labeling with recurrent neural networks](http://www.cs.toronto.edu/~graves/preprint.pdf); A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. [source]'), ('title', 'LSTM')]),
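A hedged usage sketch of the LSTM layer with the dropout arguments described above (all shapes are illustrative assumptions):

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(10, 8),
               dropout=0.2, recurrent_dropout=0.2))  # (batch, 10, 64)
model.add(LSTM(32))                                  # last output only: (batch, 32)
model.add(Dense(1))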
\"channels_last\" corresponds to inputs with shape (batch, time, ..., channels) while \"channels_first\" corresponds to inputs with shape (batch, time, channels, ...) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\" . dilation_rate : An integer or tuple/list of n integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1. activation : Activation function to use (see activations ). If you don\\'t specify anything, no activation is applied (ie. \"linear\" activation: a(x) = x ). recurrent_activation : Activation function to use for the recurrent step (see activations ). use_bias : Boolean, whether the layer uses a bias vector. kernel_initializer : Initializer for the kernel weights matrix, used for the linear transformation of the inputs. (see initializers ). recurrent_initializer : Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. (see initializers ). bias_initializer : Initializer for the bias vector (see initializers ). unit_forget_bias : Boolean. If True, add 1 to the bias of the forget gate at initialization. Use in combination with bias_initializer=\"zeros\" . This is recommended in [Jozefowicz et al.] (http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). kernel_regularizer : Regularizer function applied to the kernel weights matrix (see regularizer ). recurrent_regularizer : Regularizer function applied to the recurrent_kernel weights matrix (see regularizer ). bias_regularizer : Regularizer function applied to the bias vector (see regularizer ). activity_regularizer : Regularizer function applied to the output of the layer (its \"activation\"). (see regularizer ). kernel_constraint : Constraint function applied to the kernel weights matrix (see constraints ). recurrent_constraint : Constraint function applied to the recurrent_kernel weights matrix (see constraints ). bias_constraint : Constraint function applied to the bias vector (see constraints ). return_sequences : Boolean. Whether to return the last output in the output sequence, or the full sequence. go_backwards : Boolean (default False). If True, process the input sequence backwards. stateful : Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout : Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. 
Input shape: if data_format='channels_first', 5D tensor with shape (samples, time, channels, rows, cols); if data_format='channels_last', 5D tensor with shape (samples, time, rows, cols, channels). Output shape: if return_sequences, 5D tensor with shape (samples, time, filters, output_row, output_col) for 'channels_first' or (samples, time, output_row, output_col, filters) for 'channels_last'; else 4D tensor with shape (samples, filters, output_row, output_col) for 'channels_first' or (samples, output_row, output_col, filters) for 'channels_last', where output_row and output_col depend on the shape of the filter and the padding. Raises ValueError: in case of invalid constructor arguments. References [Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting](http://arxiv.org/abs/1506.04214v1). The current implementation does not include the feedback loop on the cells' output. [source]'), ('title', 'ConvLSTM2D')]), OrderedDict([('location', 'layers/recurrent.html#simplernncell'), ('text', 'keras.layers.SimpleRNNCell(units, activation='tanh', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0) Cell class for SimpleRNN. Arguments units: Positive integer, dimensionality of the output space. activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). use_bias: Boolean, whether the layer uses a bias vector. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. [source]'), ('title', 'SimpleRNNCell')]),
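The cell classes are meant to be composed with the RNN base layer described earlier. A minimal, hedged sketch (shapes illustrative; the behavioral equivalence to SimpleRNN is my reading of the docs, not stated by them):

from keras.layers import Input, RNN, SimpleRNNCell

x = Input((None, 8))
y = RNN(SimpleRNNCell(32))(x)  # roughly what SimpleRNN(32) does under the hood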
OrderedDict([('location', 'layers/recurrent.html#grucell'), ('text', 'keras.layers.GRUCell(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1, reset_after=False) Cell class for the GRU layer. Arguments units: Positive integer, dimensionality of the output space. activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). recurrent_activation: Activation function to use for the recurrent step (see activations). Default: hard sigmoid (hard_sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). use_bias: Boolean, whether the layer uses a bias vector. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications. reset_after: GRU convention (whether to apply reset gate after or before matrix multiplication). False = "before" (default), True = "after" (CuDNN compatible). [source]'), ('title', 'GRUCell')]), OrderedDict([('location', 'layers/recurrent.html#lstmcell'), ('text', 'keras.layers.LSTMCell(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, implementation=1) Cell class for the LSTM layer. Arguments units: Positive integer, dimensionality of the output space. activation: Activation function to use (see activations). Default: hyperbolic tangent (tanh).
If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). recurrent_activation: Activation function to use for the recurrent step (see activations). Default: hard sigmoid (hard_sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x). use_bias: Boolean, whether the layer uses a bias vector. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. implementation: Implementation mode, either 1 or 2. Mode 1 will structure its operations as a larger number of smaller dot products and additions, whereas mode 2 will batch them into fewer, larger operations. These modes will have different performance profiles on different hardware and for different applications. [source]'), ('title', 'LSTMCell')]),
OrderedDict([('location', 'layers/recurrent.html#cudnngru'), ('text', 'keras.layers.CuDNNGRU(units, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, return_state=False, stateful=False) Fast GRU implementation backed by CuDNN. Can only be run on GPU, with the TensorFlow backend. Arguments units: Positive integer, dimensionality of the output space. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state: Boolean. Whether to return the last state in addition to the output. stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch. [source]'), ('title', 'CuDNNGRU')]), OrderedDict([('location', 'layers/recurrent.html#cudnnlstm'), ('text', 'keras.layers.CuDNNLSTM(units, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, return_sequences=False, return_state=False, stateful=False) Fast LSTM implementation with CuDNN. Can only be run on GPU, with the TensorFlow backend. Arguments units: Positive integer, dimensionality of the output space. kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs (see initializers). unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in [Jozefowicz et al.](http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf). recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state (see initializers). bias_initializer: Initializer for the bias vector (see initializers). kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer). recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix (see regularizer). bias_regularizer: Regularizer function applied to the bias vector (see regularizer). activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer). kernel_constraint: Constraint function applied to the kernel weights matrix (see constraints). recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix (see constraints). bias_constraint: Constraint function applied to the bias vector (see constraints). return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. return_state: Boolean. Whether to return the last state in addition to the output. stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.'), ('title', 'CuDNNLSTM')]), OrderedDict([('location', 'layers/wrappers.html'), ('text', "[source] TimeDistributed keras.layers.TimeDistributed(layer) This wrapper applies a layer to every temporal slice of an input. The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension. Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions.
The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16). You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:

# as the first layer in a model
model = Sequential()
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
# now model.output_shape == (None, 10, 8)

The output will then have shape (32, 10, 8). In subsequent layers, there is no need for the input_shape:

model.add(TimeDistributed(Dense(32)))
# now model.output_shape == (None, 10, 32)

The output will then have shape (32, 10, 32). TimeDistributed can be used with arbitrary layers, not just Dense, for instance with a Conv2D layer:

model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(10, 299, 299, 3)))

Arguments layer: a layer instance. [source] Bidirectional keras.layers.Bidirectional(layer, merge_mode='concat', weights=None) Bidirectional wrapper for RNNs. Arguments layer: Recurrent instance. merge_mode: Mode by which outputs of the forward and backward RNNs will be combined. One of {'sum', 'mul', 'concat', 'ave', None}. If None, the outputs will not be combined; they will be returned as a list. Raises ValueError: In case of invalid merge_mode argument. Examples

model = Sequential()
model.add(Bidirectional(LSTM(10, return_sequences=True), input_shape=(5, 10)))
model.add(Bidirectional(LSTM(10)))
model.add(Dense(5))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

"), ('title', 'Layer wrappers')]), OrderedDict([('location', 'layers/wrappers.html#timedistributed'), ('text', 'keras.layers.TimeDistributed(layer) This wrapper applies a layer to every temporal slice of an input. The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension. Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16). You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:

# as the first layer in a model
model = Sequential()
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
# now model.output_shape == (None, 10, 8)

The output will then have shape (32, 10, 8). In subsequent layers, there is no need for the input_shape:

model.add(TimeDistributed(Dense(32)))
# now model.output_shape == (None, 10, 32)

The output will then have shape (32, 10, 32). TimeDistributed can be used with arbitrary layers, not just Dense, for instance with a Conv2D layer:

model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(10, 299, 299, 3)))

Arguments layer: a layer instance. [source]'), ('title', 'TimeDistributed')]), OrderedDict([('location', 'layers/wrappers.html#bidirectional'), ('text', "keras.layers.Bidirectional(layer, merge_mode='concat', weights=None) Bidirectional wrapper for RNNs. Arguments layer: Recurrent instance. merge_mode: Mode by which outputs of the forward and backward RNNs will be combined. One of {'sum', 'mul', 'concat', 'ave', None}. If None, the outputs will not be combined; they will be returned as a list. Raises ValueError: In case of invalid merge_mode argument. Examples

model = Sequential()
model.add(Bidirectional(LSTM(10, return_sequences=True), input_shape=(5, 10)))
model.add(Bidirectional(LSTM(10)))
model.add(Dense(5))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

"), ('title', 'Bidirectional')]),
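One behavior from the merge_mode description above worth illustrating: with merge_mode=None the wrapper returns the two directions separately. A hedged sketch (shapes illustrative):

from keras.layers import Input, LSTM, Bidirectional

x = Input((5, 10))
# merge_mode=None returns a list [forward, backward] instead of one merged tensor.
forward, backward = Bidirectional(LSTM(10, return_sequences=True),
                                  merge_mode=None)(x)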
OrderedDict([('location', 'layers/writing-your-own-keras-layers.html'), ('text', "Writing your own Keras layers For simple, stateless custom operations, you are probably better off using layers.core.Lambda layers. But for any custom operation that has trainable weights, you should implement your own layer. Here is the skeleton of a Keras layer, as of Keras 2.0 (if you have an older version, please upgrade). There are only three methods you need to implement: build(input_shape): this is where you will define your weights. This method must set self.built = True at the end, which can be done by calling super([Layer], self).build(). call(x): this is where the layer's logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to call: the input tensor. compute_output_shape(input_shape): in case your layer modifies the shape of its input, you should specify here the shape transformation logic. This allows Keras to do automatic shape inference.

from keras import backend as K
from keras.engine.topology import Layer

class MyLayer(Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(MyLayer, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        return K.dot(x, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)

It is also possible to define Keras layers which have multiple input tensors and multiple output tensors. To do this, you should assume that the inputs and outputs of the methods build(input_shape), call(x) and compute_output_shape(input_shape) are lists. Here is an example, similar to the one above:

from keras import backend as K
from keras.engine.topology import Layer

class MyLayer(Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        assert isinstance(input_shape, list)
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[0][1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(MyLayer, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        assert isinstance(x, list)
        a, b = x
        return [K.dot(a, self.kernel) + b, K.mean(b, axis=-1)]

    def compute_output_shape(self, input_shape):
        assert isinstance(input_shape, list)
        shape_a, shape_b = input_shape
        return [(shape_a[0], self.output_dim), shape_b[:-1]]

The existing Keras layers provide examples of how to implement almost anything. Never hesitate to read the source code!"), ('title', 'Writing your own Keras layers')]), OrderedDict([('location', 'layers/writing-your-own-keras-layers.html#writing-your-own-keras-layers'), ('text', "For simple, stateless custom operations, you are probably better off using layers.core.Lambda layers. But for any custom operation that has trainable weights, you should implement your own layer.
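The Lambda suggestion above, as a minimal hedged sketch of a stateless op that needs no custom layer (shapes and the squaring op are illustrative assumptions):

from keras.models import Sequential
from keras.layers import Dense, Lambda
from keras import backend as K

model = Sequential()
model.add(Dense(8, input_shape=(16,)))
model.add(Lambda(lambda t: K.square(t)))  # no trainable weights involved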
Here is the skeleton of a Keras layer, as of Keras 2.0 (if you have an older version, please upgrade). There are only three methods you need to implement: build(input_shape): this is where you will define your weights. This method must set self.built = True at the end, which can be done by calling super([Layer], self).build(). call(x): this is where the layer's logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to call: the input tensor. compute_output_shape(input_shape): in case your layer modifies the shape of its input, you should specify here the shape transformation logic. This allows Keras to do automatic shape inference.

from keras import backend as K
from keras.engine.topology import Layer

class MyLayer(Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(MyLayer, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        return K.dot(x, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)

It is also possible to define Keras layers which have multiple input tensors and multiple output tensors. To do this, you should assume that the inputs and outputs of the methods build(input_shape), call(x) and compute_output_shape(input_shape) are lists. Here is an example, similar to the one above:

from keras import backend as K
from keras.engine.topology import Layer

class MyLayer(Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        assert isinstance(input_shape, list)
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[0][1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(MyLayer, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        assert isinstance(x, list)
        a, b = x
        return [K.dot(a, self.kernel) + b, K.mean(b, axis=-1)]

    def compute_output_shape(self, input_shape):
        assert isinstance(input_shape, list)
        shape_a, shape_b = input_shape
        return [(shape_a[0], self.output_dim), shape_b[:-1]]

The existing Keras layers provide examples of how to implement almost anything. Never hesitate to read the source code!"), ('title', 'Writing your own Keras layers')]),
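A short, hedged usage sketch of the single-input skeleton above (the input width 16 and output_dim 10 are illustrative assumptions):

from keras.layers import Input
from keras.models import Model

x = Input(shape=(16,))
y = MyLayer(10)(x)  # MyLayer as defined in the skeleton above
model = Model(inputs=x, outputs=y)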
OrderedDict([('location', 'models/about-keras-models.html'), ('text', "About Keras models There are two main types of models available in Keras: the Sequential model, and the Model class used with the functional API. These models have a number of methods and attributes in common: model.layers is a flattened list of the layers comprising the model. model.inputs is the list of input tensors of the model. model.outputs is the list of output tensors of the model. model.summary() prints a summary representation of your model. Shortcut for utils.print_summary. model.get_config() returns a dictionary containing the configuration of the model. The model can be reinstantiated from its config via:

config = model.get_config()
model = Model.from_config(config)
# or, for Sequential:
model = Sequential.from_config(config)

model.get_weights() returns a list of all weight tensors in the model, as Numpy arrays. model.set_weights(weights) sets the values of the weights of the model, from a list of Numpy arrays. The arrays in the list should have the same shape as those returned by get_weights(). model.to_json() returns a representation of the model as a JSON string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the JSON string via:

from keras.models import model_from_json
json_string = model.to_json()
model = model_from_json(json_string)

model.to_yaml() returns a representation of the model as a YAML string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the YAML string via:

from keras.models import model_from_yaml
yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)

model.save_weights(filepath) saves the weights of the model as an HDF5 file. model.load_weights(filepath, by_name=False) loads the weights of the model from an HDF5 file (created by save_weights). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use by_name=True to load only those layers with the same name. Note: Please also see How can I install HDF5 or h5py to save my models in Keras? in the FAQ for instructions on how to install h5py. Model subclassing In addition to these two types of models, you may create your own fully-customizable models by subclassing the Model class and implementing your own forward pass in the call method (the Model subclassing API was introduced in Keras 2.2.0). Here's an example of a simple multi-layer perceptron model written as a Model subclass:

import keras

class SimpleMLP(keras.Model):

    def __init__(self, use_bn=False, use_dp=False, num_classes=10):
        super(SimpleMLP, self).__init__(name='mlp')
        self.use_bn = use_bn
        self.use_dp = use_dp
        self.num_classes = num_classes
        self.dense1 = keras.layers.Dense(32, activation='relu')
        self.dense2 = keras.layers.Dense(num_classes, activation='softmax')
        if self.use_dp:
            self.dp = keras.layers.Dropout(0.5)
        if self.use_bn:
            self.bn = keras.layers.BatchNormalization(axis=-1)

    def call(self, inputs):
        x = self.dense1(inputs)
        if self.use_dp:
            x = self.dp(x)
        if self.use_bn:
            x = self.bn(x)
        return self.dense2(x)

model = SimpleMLP()
model.compile(...)
model.fit(...)

Layers are defined in __init__(self, ...), and the forward pass is specified in call(self, inputs). In call, you may specify custom losses by calling self.add_loss(loss_tensor) (like you would in a custom layer). In subclassed models, the model's topology is defined as Python code (rather than as a static graph of layers). That means the model's topology cannot be inspected or serialized. As a result, the following methods and attributes are not available for subclassed models: model.inputs and model.outputs; model.to_yaml() and model.to_json(); model.get_config() and model.save(). Key point: use the right API for the job. The Model subclassing API can provide you with greater flexibility for implementing complex models, but it comes at a cost (in addition to these missing features): it is more verbose, more complex, and has more opportunities for user errors. If possible, prefer using the functional API, which is more user-friendly."), ('title', 'About Keras models')]),
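Tying the serialization methods above together, a hedged round-trip sketch (the file name is an illustrative assumption, and h5py must be installed, per the note above):

from keras.models import model_from_json

json_string = model.to_json()          # architecture only, no weights
model.save_weights('weights.h5')       # weights, as HDF5

clone = model_from_json(json_string)   # same architecture, reinitialized weights
clone.load_weights('weights.h5')       # restore the saved weights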
OrderedDict([('location', 'models/about-keras-models.html#about-keras-models'), ('text', 'There are two main types of models available in Keras: the Sequential model, and the Model class used with the functional API. These models have a number of methods and attributes in common: model.layers is a flattened list of the layers comprising the model. model.inputs is the list of input tensors of the model. model.outputs is the list of output tensors of the model. model.summary() prints a summary representation of your model. Shortcut for utils.print_summary. model.get_config() returns a dictionary containing the configuration of the model. The model can be reinstantiated from its config via:

config = model.get_config()
model = Model.from_config(config)
# or, for Sequential:
model = Sequential.from_config(config)

model.get_weights() returns a list of all weight tensors in the model, as Numpy arrays. model.set_weights(weights) sets the values of the weights of the model, from a list of Numpy arrays. The arrays in the list should have the same shape as those returned by get_weights(). model.to_json() returns a representation of the model as a JSON string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the JSON string via:

from keras.models import model_from_json
json_string = model.to_json()
model = model_from_json(json_string)

model.to_yaml() returns a representation of the model as a YAML string. Note that the representation does not include the weights, only the architecture. You can reinstantiate the same model (with reinitialized weights) from the YAML string via:

from keras.models import model_from_yaml
yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)

model.save_weights(filepath) saves the weights of the model as an HDF5 file. model.load_weights(filepath, by_name=False) loads the weights of the model from an HDF5 file (created by save_weights). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use by_name=True to load only those layers with the same name. Note: Please also see How can I install HDF5 or h5py to save my models in Keras? in the FAQ for instructions on how to install h5py.'), ('title', 'About Keras models')]), OrderedDict([('location', 'models/about-keras-models.html#model-subclassing'), ('text', "In addition to these two types of models, you may create your own fully-customizable models by subclassing the Model class and implementing your own forward pass in the call method (the Model subclassing API was introduced in Keras 2.2.0). Here's an example of a simple multi-layer perceptron model written as a Model subclass:

import keras

class SimpleMLP(keras.Model):

    def __init__(self, use_bn=False, use_dp=False, num_classes=10):
        super(SimpleMLP, self).__init__(name='mlp')
        self.use_bn = use_bn
        self.use_dp = use_dp
        self.num_classes = num_classes
        self.dense1 = keras.layers.Dense(32, activation='relu')
        self.dense2 = keras.layers.Dense(num_classes, activation='softmax')
        if self.use_dp:
            self.dp = keras.layers.Dropout(0.5)
        if self.use_bn:
            self.bn = keras.layers.BatchNormalization(axis=-1)

    def call(self, inputs):
        x = self.dense1(inputs)
        if self.use_dp:
            x = self.dp(x)
        if self.use_bn:
            x = self.bn(x)
        return self.dense2(x)

model = SimpleMLP()
model.compile(...)
model.fit(...)
Layers are defined in __init__(self, ...), and the forward pass is specified in call(self, inputs). In call, you may specify custom losses by calling self.add_loss(loss_tensor) (like you would in a custom layer). In subclassed models, the model's topology is defined as Python code (rather than as a static graph of layers). That means the model's topology cannot be inspected or serialized. As a result, the following methods and attributes are not available for subclassed models: model.inputs and model.outputs; model.to_yaml() and model.to_json(); model.get_config() and model.save(). Key point: use the right API for the job. The Model subclassing API can provide you with greater flexibility for implementing complex models, but it comes at a cost (in addition to these missing features): it is more verbose, more complex, and has more opportunities for user errors. If possible, prefer using the functional API, which is more user-friendly."), ('title', 'Model subclassing')]), OrderedDict([('location', 'models/model.html'), ('text', 'Model class API In the functional API, given some input tensor(s) and output tensor(s), you can instantiate a Model via:

from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(inputs=a, outputs=b)

This model will include all layers required in the computation of b given a. In the case of multi-input or multi-output models, you can use lists as well: model = Model(inputs=[a1, a2], outputs=[b1, b2, b3]) For a detailed introduction of what Model can do, read this guide to the Keras functional API. Methods compile compile(optimizer, loss=None, metrics=None, loss_weights=None, sample_weight_mode=None, weighted_metrics=None, target_tensors=None) Configures the model for training. Arguments optimizer: String (name of optimizer) or optimizer instance. See optimizers. loss: String (name of objective function) or objective function. See losses. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses. metrics: List of metrics to be evaluated by the model during training and testing. Typically you will use metrics=['accuracy']. To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={'output_a': 'accuracy'}. loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model's outputs. If a dict, it is expected to map output names (strings) to scalar coefficients. sample_weight_mode: If you need to do timestep-wise sample weighting (2D weights), set this to "temporal". None defaults to sample-wise weights (1D). If the model has multiple outputs, you can use a different sample_weight_mode on each output by passing a dictionary or a list of modes. weighted_metrics: List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing. target_tensors: By default, Keras will create placeholders for the model's target, which will be fed with the target data during training.
If instead you would like to use your own target tensors (in turn, Keras will not expect external Numpy data for these targets at training time), you can specify them via the target_tensors argument. It can be a single tensor (for a single-output model), a list of tensors, or a dict mapping output names to target tensors. **kwargs : When using the Theano/CNTK backends, these arguments are passed into K.function . When using the TensorFlow backend, these arguments are passed into tf.Session.run . Raises ValueError : In case of invalid arguments for optimizer , loss , metrics or sample_weight_mode . fit fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None) Trains the model for a given number of epochs (iterations on a dataset). Arguments x : Numpy array of training data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per gradient update. If unspecified, batch_size will default to 32. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided. Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_split : Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. validation_data : tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split . shuffle : Boolean (whether to shuffle the training data before each epoch) or str (for \\'batch\\'). \\'batch\\' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None . class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. 
fit

```python
fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None)
```

Trains the model for a given number of epochs (iterations on a dataset).

Arguments

- `x`: Numpy array of training data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. `x` can be `None` (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors).
- `y`: Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. `y` can be `None` (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors).
- `batch_size`: Integer or `None`. Number of samples per gradient update. If unspecified, `batch_size` will default to 32.
- `epochs`: Integer. Number of epochs to train the model. An epoch is an iteration over the entire `x` and `y` data provided. Note that in conjunction with `initial_epoch`, `epochs` is to be understood as "final epoch": the model is not trained for a number of iterations given by `epochs`, but merely until the epoch of index `epochs` is reached.
- `verbose`: Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch.
- `callbacks`: List of `keras.callbacks.Callback` instances to apply during training. See callbacks.
- `validation_split`: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the `x` and `y` data provided, before shuffling.
- `validation_data`: Tuple `(x_val, y_val)` or tuple `(x_val, y_val, val_sample_weights)` on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. `validation_data` will override `validation_split`.
- `shuffle`: Boolean (whether to shuffle the training data before each epoch) or str (for `'batch'`). `'batch'` is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when `steps_per_epoch` is not `None`.
- `class_weight`: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
- `sample_weight`: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or, in the case of temporal data, a 2D array with shape `(samples, sequence_length)` to apply a different weight to every timestep of every sample. In this case you should make sure to specify `sample_weight_mode="temporal"` in `compile()`.
- `initial_epoch`: Integer. Epoch at which to start training (useful for resuming a previous training run).
- `steps_per_epoch`: Integer or `None`. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default `None` is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined.
- `validation_steps`: Only relevant if `steps_per_epoch` is specified. Total number of steps (batches of samples) to validate before stopping.

Returns

A `History` object. Its `History.history` attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).

Raises

- `RuntimeError`: If the model was never compiled.
- `ValueError`: In case of mismatch between the provided input data and what the model expects.
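A minimal hedged sketch of `fit` using `validation_split` and `class_weight` as described above; the single-output model and the random stand-in data are invented for illustration:

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense

# Stand-in data: 1000 samples of 32 features, binary labels.
x_train = np.random.random((1000, 32))
y_train = np.random.randint(2, size=(1000, 1))

inp = Input(shape=(32,))
out = Dense(1, activation='sigmoid')(Dense(64, activation='relu')(inp))
model = Model(inputs=inp, outputs=out)
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=5,
                    validation_split=0.2,           # last 20% of the data, pre-shuffle
                    class_weight={0: 1.0, 1: 3.0},  # "pay more attention" to class 1
                    verbose=2)
print(history.history['val_loss'])  # per-epoch validation loss from the History object
```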
evaluate

```python
evaluate(x=None, y=None, batch_size=None, verbose=1, sample_weight=None, steps=None)
```

Returns the loss value & metrics values for the model in test mode. Computation is done in batches.

Arguments

- `x`: Numpy array of test data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. `x` can be `None` (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors).
- `y`: Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. `y` can be `None` (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors).
- `batch_size`: Integer or `None`. Number of samples per evaluation step. If unspecified, `batch_size` will default to 32.
- `verbose`: 0 or 1. Verbosity mode. 0 = silent, 1 = progress bar.
- `sample_weight`: Optional Numpy array of weights for the test samples, used for weighting the loss function. You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or, in the case of temporal data, a 2D array with shape `(samples, sequence_length)` to apply a different weight to every timestep of every sample. In this case you should make sure to specify `sample_weight_mode="temporal"` in `compile()`.
- `steps`: Integer or `None`. Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of `None`.

Returns

Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute `model.metrics_names` will give you the display labels for the scalar outputs.

predict

```python
predict(x, batch_size=None, verbose=0, steps=None)
```

Generates output predictions for the input samples. Computation is done in batches.

Arguments

- `x`: The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs).
- `batch_size`: Integer. If unspecified, it will default to 32.
- `verbose`: Verbosity mode, 0 or 1.
- `steps`: Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of `None`.

Returns

Numpy array(s) of predictions.

Raises

- `ValueError`: In case of mismatch between the provided input data and the model's expectations, or in case a stateful model receives a number of samples that is not a multiple of the batch size.

train_on_batch

```python
train_on_batch(x, y, sample_weight=None, class_weight=None)
```

Runs a single gradient update on a single batch of data.

Arguments

- `x`: Numpy array of training data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays.
- `y`: Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays.
- `sample_weight`: Optional array of the same length as `x`, containing weights to apply to the model's loss for each sample. In the case of temporal data, you can pass a 2D array with shape `(samples, sequence_length)` to apply a different weight to every timestep of every sample. In this case you should make sure to specify `sample_weight_mode="temporal"` in `compile()`.
- `class_weight`: Optional dictionary mapping class indices (integers) to a weight (float) to apply to the model's loss for the samples from this class during training. This can be useful to tell the model to "pay more attention" to samples from an under-represented class.

Returns

Scalar training loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute `model.metrics_names` will give you the display labels for the scalar outputs.

test_on_batch

```python
test_on_batch(x, y, sample_weight=None)
```

Tests the model on a single batch of samples.

Arguments

- `x`: Numpy array of test data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays.
- `y`: Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays.
- `sample_weight`: Optional array of the same length as `x`, containing weights to apply to the model's loss for each sample. In the case of temporal data, you can pass a 2D array with shape `(samples, sequence_length)` to apply a different weight to every timestep of every sample. In this case you should make sure to specify `sample_weight_mode="temporal"` in `compile()`.

Returns

Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute `model.metrics_names` will give you the display labels for the scalar outputs.

predict_on_batch

```python
predict_on_batch(x)
```

Returns predictions for a single batch of samples.

Arguments

- `x`: Input samples, as a Numpy array.

Returns

Numpy array(s) of predictions.
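A hedged sketch contrasting the three batch-level methods above; the small model and random batch are invented for illustration:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(8, activation='relu', input_shape=(16,)),
                    Dense(1, activation='sigmoid')])
model.compile(optimizer='sgd', loss='binary_crossentropy',
              metrics=['accuracy'])

x_batch = np.random.random((32, 16))
y_batch = np.random.randint(2, size=(32, 1))

train_out = model.train_on_batch(x_batch, y_batch)  # one gradient update
print(list(zip(model.metrics_names, train_out)))    # display labels for the scalars
test_out = model.test_on_batch(x_batch, y_batch)    # forward pass only, no update
preds = model.predict_on_batch(x_batch)             # Numpy array, shape (32, 1)
```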
fit_generator

```python
fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0)
```

Trains the model on data generated batch-by-batch by a Python generator (or an instance of `Sequence`). The generator is run in parallel to the model, for efficiency. For instance, this allows you to do real-time data augmentation on images on CPU in parallel to training your model on GPU. The use of `keras.utils.Sequence` guarantees the ordering and guarantees the single use of every input per epoch when using `use_multiprocessing=True`.

Arguments

- `generator`: A generator, or an instance of `Sequence` (`keras.utils.Sequence`) in order to avoid duplicate data when using multiprocessing. The output of the generator must be either a tuple `(inputs, targets)` or a tuple `(inputs, targets, sample_weights)`. This tuple (a single output of the generator) makes a single batch. Therefore, all arrays in this tuple must have the same length (equal to the size of this batch). Different batches may have different sizes: for example, the last batch of the epoch is commonly smaller than the others if the size of the dataset is not divisible by the batch size. The generator is expected to loop over its data indefinitely. An epoch finishes when `steps_per_epoch` batches have been seen by the model.
- `steps_per_epoch`: Integer. Total number of steps (batches of samples) to yield from `generator` before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of samples of your dataset divided by the batch size. Optional for `Sequence`: if unspecified, `len(generator)` will be used as the number of steps.
- `epochs`: Integer. Number of epochs to train the model. An epoch is an iteration over the entire data provided, as defined by `steps_per_epoch`. Note that in conjunction with `initial_epoch`, `epochs` is to be understood as "final epoch": the model is not trained for a number of iterations given by `epochs`, but merely until the epoch of index `epochs` is reached.
- `verbose`: Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch.
- `callbacks`: List of `keras.callbacks.Callback` instances to apply during training. See callbacks.
- `validation_data`: This can be either a generator or a `Sequence` object for the validation data, a tuple `(x_val, y_val)`, or a tuple `(x_val, y_val, val_sample_weights)` on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data.
- `validation_steps`: Only relevant if `validation_data` is a generator. Total number of steps (batches of samples) to yield from the `validation_data` generator before stopping at the end of every epoch. It should typically be equal to the number of samples of your validation dataset divided by the batch size. Optional for `Sequence`: if unspecified, `len(validation_data)` will be used as the number of steps.
- `class_weight`: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.
- `max_queue_size`: Integer. Maximum size for the generator queue. If unspecified, `max_queue_size` will default to 10.
- `workers`: Integer. Maximum number of processes to spin up when using process-based threading. If unspecified, `workers` will default to 1. If 0, will execute the generator on the main thread.
- `use_multiprocessing`: Boolean. If `True`, use process-based threading. If unspecified, `use_multiprocessing` will default to `False`. Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator, as they can't be passed easily to child processes.
- `shuffle`: Boolean. Whether to shuffle the order of the batches at the beginning of each epoch. Only used with instances of `Sequence` (`keras.utils.Sequence`). Has no effect when `steps_per_epoch` is not `None`.
- `initial_epoch`: Integer. Epoch at which to start training (useful for resuming a previous training run).

Returns

A `History` object. Its `History.history` attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).

Raises

- `ValueError`: In case the generator yields data in an invalid format.

Example

```python
def generate_arrays_from_file(path):
    while True:
        with open(path) as f:
            for line in f:
                # create numpy arrays of input data
                # and labels, from each line in the file
                x1, x2, y = process_line(line)
                yield ({'input_1': x1, 'input_2': x2}, {'output': y})

model.fit_generator(generate_arrays_from_file('/my_file.txt'),
                    steps_per_epoch=10000, epochs=10)
```
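A hedged sketch of the `keras.utils.Sequence` alternative mentioned above, which makes `steps_per_epoch` optional and is safe with `use_multiprocessing=True`; the in-memory arrays and the class name are invented stand-ins:

```python
import numpy as np
from keras.utils import Sequence

class ArrayBatchSequence(Sequence):
    """Serves (inputs, targets) batches from in-memory Numpy arrays."""

    def __init__(self, x, y, batch_size=32):
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Batches per epoch; used when steps_per_epoch is unspecified.
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.x[sl], self.y[sl]

seq = ArrayBatchSequence(np.random.random((1000, 32)),
                         np.random.randint(2, size=(1000, 1)))
# model.fit_generator(seq, epochs=5, workers=4, use_multiprocessing=True)
```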
evaluate_generator

```python
evaluate_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0)
```

Evaluates the model on a data generator. The generator should return the same kind of data as accepted by `test_on_batch`.

Arguments

- `generator`: Generator yielding tuples `(inputs, targets)` or `(inputs, targets, sample_weights)`, or an instance of `Sequence` (`keras.utils.Sequence`) in order to avoid duplicate data when using multiprocessing.
- `steps`: Total number of steps (batches of samples) to yield from `generator` before stopping. Optional for `Sequence`: if unspecified, `len(generator)` will be used as the number of steps.
- `max_queue_size`: Maximum size for the generator queue.
- `workers`: Integer. Maximum number of processes to spin up when using process-based threading. If unspecified, `workers` will default to 1. If 0, will execute the generator on the main thread.
- `use_multiprocessing`: If `True`, use process-based threading. Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator, as they can't be passed easily to child processes.
- `verbose`: Verbosity mode, 0 or 1.

Returns

Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute `model.metrics_names` will give you the display labels for the scalar outputs.

Raises

- `ValueError`: In case the generator yields data in an invalid format.

predict_generator

```python
predict_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0)
```

Generates predictions for the input samples from a data generator. The generator should return the same kind of data as accepted by `predict_on_batch`.

Arguments

- `generator`: Generator yielding batches of input samples, or an instance of `Sequence` (`keras.utils.Sequence`) in order to avoid duplicate data when using multiprocessing.
- `steps`: Total number of steps (batches of samples) to yield from `generator` before stopping. Optional for `Sequence`: if unspecified, `len(generator)` will be used as the number of steps.
- `max_queue_size`: Maximum size for the generator queue.
- `workers`: Integer. Maximum number of processes to spin up when using process-based threading. If unspecified, `workers` will default to 1. If 0, will execute the generator on the main thread.
- `use_multiprocessing`: If `True`, use process-based threading. Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator, as they can't be passed easily to child processes.
- `verbose`: Verbosity mode, 0 or 1.

Returns

Numpy array(s) of predictions.

Raises

- `ValueError`: In case the generator yields data in an invalid format.

get_layer

```python
get_layer(name=None, index=None)
```

Retrieves a layer based on either its name (unique) or index. If `name` and `index` are both provided, `index` will take precedence. Indices are based on order of horizontal graph traversal (bottom-up).

Arguments

- `name`: String, name of layer.
- `index`: Integer, index of layer.

Returns

A layer instance.

Raises

- `ValueError`: In case of invalid layer name or index.
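A short hedged usage sketch of the two generator methods and `get_layer`; it assumes the compiled `model` and the `seq` Sequence from the sketches above, and the layer name `'dense_1'` is a hypothetical auto-generated name:

```python
# Assumes `model` (a compiled Keras model) and `seq` (a keras.utils.Sequence)
# from the earlier sketches.
scores = model.evaluate_generator(seq)         # steps defaults to len(seq)
print(list(zip(model.metrics_names, scores)))  # display labels for the scalars
preds = model.predict_generator(seq, verbose=1)

# Retrieve layers by index, or by name (index wins if both are given).
first_layer = model.get_layer(index=0)
# named_layer = model.get_layer(name='dense_1')  # hypothetical layer name
```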
weighted_metrics : List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing. target_tensors : By default, Keras will create placeholders for the model\\'s target, which will be fed with the target data during training. If instead you would like to use your own target tensors (in turn, Keras will not expect external Numpy data for these targets at training time), you can specify them via the target_tensors argument. It can be a single tensor (for a single-output model), a list of tensors, or a dict mapping output names to target tensors. **kwargs : When using the Theano/CNTK backends, these arguments are passed into K.function . When using the TensorFlow backend, these arguments are passed into tf.Session.run . Raises ValueError : In case of invalid arguments for optimizer , loss , metrics or sample_weight_mode .'), ('title', 'compile')]), OrderedDict([('location', 'models/model.html#fit'), ('text', 'fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None) Trains the model for a given number of epochs (iterations on a dataset). Arguments x : Numpy array of training data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per gradient update. If unspecified, batch_size will default to 32. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided. Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_split : Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. validation_data : tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split . shuffle : Boolean (whether to shuffle the training data before each epoch) or str (for \\'batch\\'). \\'batch\\' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None . 
class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. sample_weight : Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length) , to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile() . initial_epoch : Integer. Epoch at which to start training (useful for resuming a previous training run). steps_per_epoch : Integer or None . Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. validation_steps : Only relevant if steps_per_epoch is specified. Total number of steps (batches of samples) to validate before stopping. Returns A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Raises RuntimeError : If the model was never compiled. ValueError : In case of mismatch between the provided input data and what the model expects.'), ('title', 'fit')]), OrderedDict([('location', 'models/model.html#evaluate'), ('text', 'evaluate(x=None, y=None, batch_size=None, verbose=1, sample_weight=None, steps=None) Returns the loss value & metrics values for the model in test mode. Computation is done in batches. Arguments x : Numpy array of test data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per evaluation step. If unspecified, batch_size will default to 32. verbose : 0 or 1. Verbosity mode. 0 = silent, 1 = progress bar. sample_weight : Optional Numpy array of weights for the test samples, used for weighting the loss function. You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length) , to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile() . steps : Integer or None . Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of None . 
Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.'), ('title', 'evaluate')]), OrderedDict([('location', 'models/model.html#predict'), ('text', \"predict(x, batch_size=None, verbose=0, steps=None) Generates output predictions for the input samples. Computation is done in batches. Arguments x : The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs). batch_size : Integer. If unspecified, it will default to 32. verbose : Verbosity mode, 0 or 1. steps : Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None . Returns Numpy array(s) of predictions. Raises ValueError : In case of mismatch between the provided input data and the model's expectations, or in case a stateful model receives a number of samples that is not a multiple of the batch size.\"), ('title', 'predict')]), OrderedDict([('location', 'models/model.html#train_on_batch'), ('text', 'train_on_batch(x, y, sample_weight=None, class_weight=None) Runs a single gradient update on a single batch of data. Arguments x : Numpy array of training data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. y : Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. sample_weight : Optional array of the same length as x, containing weights to apply to the model\\'s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile(). class_weight : Optional dictionary mapping class indices (integers) to a weight (float) to apply to the model\\'s loss for the samples from this class during training. This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. Returns Scalar training loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.'), ('title', 'train_on_batch')]), OrderedDict([('location', 'models/model.html#test_on_batch'), ('text', 'test_on_batch(x, y, sample_weight=None) Test the model on a single batch of samples. Arguments x : Numpy array of test data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. y : Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. sample_weight : Optional array of the same length as x, containing weights to apply to the model\\'s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile(). 
Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.'), ('title', 'test_on_batch')]), OrderedDict([('location', 'models/model.html#predict_on_batch'), ('text', 'predict_on_batch(x) Returns predictions for a single batch of samples. Arguments x : Input samples, as a Numpy array. Returns Numpy array(s) of predictions.'), ('title', 'predict_on_batch')]), OrderedDict([('location', 'models/model.html#fit_generator'), ('text', 'fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0) Trains the model on data generated batch-by-batch by a Python generator (or an instance of Sequence ). The generator is run in parallel to the model, for efficiency. For instance, this allows you to do real-time data augmentation on images on CPU in parallel to training your model on GPU. The use of keras.utils.Sequence guarantees the ordering and guarantees the single use of every input per epoch when using use_multiprocessing=True . Arguments generator : A generator or an instance of Sequence ( keras.utils.Sequence ) object in order to avoid duplicate data when using multiprocessing. The output of the generator must be either a tuple (inputs, targets) a tuple (inputs, targets, sample_weights) . This tuple (a single output of the generator) makes a single batch. Therefore, all arrays in this tuple must have the same length (equal to the size of this batch). Different batches may have different sizes. For example, the last batch of the epoch is commonly smaller than the others, if the size of the dataset is not divisible by the batch size. The generator is expected to loop over its data indefinitely. An epoch finishes when steps_per_epoch batches have been seen by the model. steps_per_epoch : Integer. Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of samples of your dataset divided by the batch size. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire data provided, as defined by steps_per_epoch . Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_data : This can be either a generator or a Sequence object for the validation data tuple (x_val, y_val) tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_steps : Only relevant if validation_data is a generator. Total number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch. It should typically be equal to the number of samples of your validation dataset divided by the batch size. 
Optional for Sequence : if unspecified, will use the len(validation_data) as a number of steps. class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. max_queue_size : Integer. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10. workers : Integer. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : Boolean. If True , use process-based threading. If unspecified, use_multiprocessing will default to False . Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator as they can\\'t be passed easily to children processes. shuffle : Boolean. Whether to shuffle the order of the batches at the beginning of each epoch. Only used with instances of Sequence ( keras.utils.Sequence ). Has no effect when steps_per_epoch is not None . initial_epoch : Integer. Epoch at which to start training (useful for resuming a previous training run). Returns A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Raises ValueError : In case the generator yields data in an invalid format. Example def generate_arrays_from_file(path): while True: with open(path) as f: for line in f: # create numpy arrays of input data # and labels, from each line in the file x1, x2, y = process_line(line) yield ({\\'input_1\\': x1, \\'input_2\\': x2}, {\\'output\\': y}) model.fit_generator(generate_arrays_from_file(\\'/my_file.txt\\'), steps_per_epoch=10000, epochs=10)'), ('title', 'fit_generator')]), OrderedDict([('location', 'models/model.html#evaluate_generator'), ('text', \"evaluate_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0) Evaluates the model on a data generator. The generator should return the same kind of data as accepted by test_on_batch . Arguments generator : Generator yielding tuples (inputs, targets) or (inputs, targets, sample_weights) or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. steps : Total number of steps (batches of samples) to yield from generator before stopping. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. max_queue_size : maximum size for the generator queue workers : Integer. Maximum number of processes to spin up when using process based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : if True, use process based threading. Note that because this implementation relies on multiprocessing, you should not pass non picklable arguments to the generator as they can't be passed easily to children processes. verbose : verbosity mode, 0 or 1. Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs. 
Raises ValueError : In case the generator yields data in an invalid format.\"), ('title', 'evaluate_generator')]), OrderedDict([('location', 'models/model.html#predict_generator'), ('text', \"predict_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0) Generates predictions for the input samples from a data generator. The generator should return the same kind of data as accepted by predict_on_batch . Arguments generator : Generator yielding batches of input samples or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. steps : Total number of steps (batches of samples) to yield from generator before stopping. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. max_queue_size : Maximum size for the generator queue. workers : Integer. Maximum number of processes to spin up when using process based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : If True , use process based threading. Note that because this implementation relies on multiprocessing, you should not pass non picklable arguments to the generator as they can't be passed easily to children processes. verbose : verbosity mode, 0 or 1. Returns Numpy array(s) of predictions. Raises ValueError : In case the generator yields data in an invalid format.\"), ('title', 'predict_generator')]), OrderedDict([('location', 'models/model.html#get_layer'), ('text', 'get_layer(name=None, index=None) Retrieves a layer based on either its name (unique) or index. If name and index are both provided, index will take precedence. Indices are based on order of horizontal graph traversal (bottom-up). Arguments name : String, name of layer. index : Integer, index of layer. Returns A layer instance. Raises ValueError : In case of invalid layer name or index.'), ('title', 'get_layer')]), OrderedDict([('location', 'models/sequential.html'), ('text', 'The Sequential model API To get started, read this guide to the Keras Sequential model . Sequential model methods compile compile(optimizer, loss=None, metrics=None, loss_weights=None, sample_weight_mode=None, weighted_metrics=None, target_tensors=None) Configures the model for training. Arguments optimizer : String (name of optimizer) or optimizer instance. See optimizers . loss : String (name of objective function) or objective function. See losses . If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses. metrics : List of metrics to be evaluated by the model during training and testing. Typically you will use metrics=[\\'accuracy\\'] . To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={\\'output_a\\': \\'accuracy\\'} . loss_weights : Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model\\'s outputs. If a tensor, it is expected to map output names (strings) to scalar coefficients. 
sample_weight_mode : If you need to do timestep-wise sample weighting (2D weights), set this to \"temporal\" . None defaults to sample-wise weights (1D). If the model has multiple outputs, you can use a different sample_weight_mode on each output by passing a dictionary or a list of modes. weighted_metrics : List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing. target_tensors : By default, Keras will create placeholders for the model\\'s target, which will be fed with the target data during training. If instead you would like to use your own target tensors (in turn, Keras will not expect external Numpy data for these targets at training time), you can specify them via the target_tensors argument. It can be a single tensor (for a single-output model), a list of tensors, or a dict mapping output names to target tensors. **kwargs : When using the Theano/CNTK backends, these arguments are passed into K.function . When using the TensorFlow backend, these arguments are passed into tf.Session.run . Raises ValueError : In case of invalid arguments for optimizer , loss , metrics or sample_weight_mode . fit fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None) Trains the model for a given number of epochs (iterations on a dataset). Arguments x : Numpy array of training data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per gradient update. If unspecified, batch_size will default to 32. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided. Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_split : Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. validation_data : tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split . 
shuffle : Boolean (whether to shuffle the training data before each epoch) or str (for \\'batch\\'). \\'batch\\' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None . class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. sample_weight : Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length) , to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile() . initial_epoch : Integer. Epoch at which to start training (useful for resuming a previous training run). steps_per_epoch : Integer or None . Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. validation_steps : Only relevant if steps_per_epoch is specified. Total number of steps (batches of samples) to validate before stopping. Returns A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Raises RuntimeError : If the model was never compiled. ValueError : In case of mismatch between the provided input data and what the model expects. evaluate evaluate(x=None, y=None, batch_size=None, verbose=1, sample_weight=None, steps=None) Returns the loss value & metrics values for the model in test mode. Computation is done in batches. Arguments x : Numpy array of test data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per evaluation step. If unspecified, batch_size will default to 32. verbose : 0 or 1. Verbosity mode. 0 = silent, 1 = progress bar. sample_weight : Optional Numpy array of weights for the test samples, used for weighting the loss function. You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length) , to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile() . 
steps : Integer or None . Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of None . Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs. predict predict(x, batch_size=None, verbose=0, steps=None) Generates output predictions for the input samples. Computation is done in batches. Arguments x : The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs). batch_size : Integer. If unspecified, it will default to 32. verbose : Verbosity mode, 0 or 1. steps : Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None . Returns Numpy array(s) of predictions. Raises ValueError : In case of mismatch between the provided input data and the model\\'s expectations, or in case a stateful model receives a number of samples that is not a multiple of the batch size. train_on_batch train_on_batch(x, y, sample_weight=None, class_weight=None) Runs a single gradient update on a single batch of data. Arguments x : Numpy array of training data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. y : Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. sample_weight : Optional array of the same length as x, containing weights to apply to the model\\'s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile(). class_weight : Optional dictionary mapping class indices (integers) to a weight (float) to apply to the model\\'s loss for the samples from this class during training. This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. Returns Scalar training loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs. test_on_batch test_on_batch(x, y, sample_weight=None) Test the model on a single batch of samples. Arguments x : Numpy array of test data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. y : Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. sample_weight : Optional array of the same length as x, containing weights to apply to the model\\'s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile(). 
Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs. predict_on_batch predict_on_batch(x) Returns predictions for a single batch of samples. Arguments x : Input samples, as a Numpy array. Returns Numpy array(s) of predictions. fit_generator fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0) Trains the model on data generated batch-by-batch by a Python generator (or an instance of Sequence ). The generator is run in parallel to the model, for efficiency. For instance, this allows you to do real-time data augmentation on images on CPU in parallel to training your model on GPU. The use of keras.utils.Sequence guarantees the ordering and guarantees the single use of every input per epoch when using use_multiprocessing=True . Arguments generator : A generator or an instance of Sequence ( keras.utils.Sequence ) object in order to avoid duplicate data when using multiprocessing. The output of the generator must be either a tuple (inputs, targets) a tuple (inputs, targets, sample_weights) . This tuple (a single output of the generator) makes a single batch. Therefore, all arrays in this tuple must have the same length (equal to the size of this batch). Different batches may have different sizes. For example, the last batch of the epoch is commonly smaller than the others, if the size of the dataset is not divisible by the batch size. The generator is expected to loop over its data indefinitely. An epoch finishes when steps_per_epoch batches have been seen by the model. steps_per_epoch : Integer. Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of samples of your dataset divided by the batch size. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire data provided, as defined by steps_per_epoch . Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_data : This can be either a generator or a Sequence object for the validation data tuple (x_val, y_val) tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_steps : Only relevant if validation_data is a generator. Total number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch. It should typically be equal to the number of samples of your validation dataset divided by the batch size. Optional for Sequence : if unspecified, will use the len(validation_data) as a number of steps. 
class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. max_queue_size : Integer. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10. workers : Integer. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : Boolean. If True , use process-based threading. If unspecified, use_multiprocessing will default to False . Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator as they can\\'t be passed easily to children processes. shuffle : Boolean. Whether to shuffle the order of the batches at the beginning of each epoch. Only used with instances of Sequence ( keras.utils.Sequence ). Has no effect when steps_per_epoch is not None . initial_epoch : Integer. Epoch at which to start training (useful for resuming a previous training run). Returns A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Raises ValueError : In case the generator yields data in an invalid format. Example def generate_arrays_from_file(path): while True: with open(path) as f: for line in f: # create numpy arrays of input data # and labels, from each line in the file x1, x2, y = process_line(line) yield ({\\'input_1\\': x1, \\'input_2\\': x2}, {\\'output\\': y}) model.fit_generator(generate_arrays_from_file(\\'/my_file.txt\\'), steps_per_epoch=10000, epochs=10) evaluate_generator evaluate_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0) Evaluates the model on a data generator. The generator should return the same kind of data as accepted by test_on_batch . Arguments generator : Generator yielding tuples (inputs, targets) or (inputs, targets, sample_weights) or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. steps : Total number of steps (batches of samples) to yield from generator before stopping. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. max_queue_size : maximum size for the generator queue workers : Integer. Maximum number of processes to spin up when using process based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : if True, use process based threading. Note that because this implementation relies on multiprocessing, you should not pass non picklable arguments to the generator as they can\\'t be passed easily to children processes. verbose : verbosity mode, 0 or 1. Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs. Raises ValueError : In case the generator yields data in an invalid format. predict_generator predict_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0) Generates predictions for the input samples from a data generator. 
The generator should return the same kind of data as accepted by predict_on_batch . Arguments generator : Generator yielding batches of input samples or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. steps : Total number of steps (batches of samples) to yield from generator before stopping. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. max_queue_size : Maximum size for the generator queue. workers : Integer. Maximum number of processes to spin up when using process based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : If True , use process based threading. Note that because this implementation relies on multiprocessing, you should not pass non picklable arguments to the generator as they can\\'t be passed easily to children processes. verbose : verbosity mode, 0 or 1. Returns Numpy array(s) of predictions. Raises ValueError : In case the generator yields data in an invalid format. get_layer get_layer(name=None, index=None) Retrieves a layer based on either its name (unique) or index. If name and index are both provided, index will take precedence. Indices are based on order of horizontal graph traversal (bottom-up). Arguments name : String, name of layer. index : Integer, index of layer. Returns A layer instance. Raises ValueError : In case of invalid layer name or index.'), ('title', 'Sequential')]), OrderedDict([('location', 'models/sequential.html#the-sequential-model-api'), ('text', 'To get started, read this guide to the Keras Sequential model .'), ('title', 'The Sequential model API')]), OrderedDict([('location', 'models/sequential.html#sequential-model-methods'), ('text', ''), ('title', 'Sequential model methods')]), OrderedDict([('location', 'models/sequential.html#compile'), ('text', 'compile(optimizer, loss=None, metrics=None, loss_weights=None, sample_weight_mode=None, weighted_metrics=None, target_tensors=None) Configures the model for training. Arguments optimizer : String (name of optimizer) or optimizer instance. See optimizers . loss : String (name of objective function) or objective function. See losses . If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses. metrics : List of metrics to be evaluated by the model during training and testing. Typically you will use metrics=[\\'accuracy\\'] . To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={\\'output_a\\': \\'accuracy\\'} . loss_weights : Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model\\'s outputs. If a tensor, it is expected to map output names (strings) to scalar coefficients. sample_weight_mode : If you need to do timestep-wise sample weighting (2D weights), set this to \"temporal\" . None defaults to sample-wise weights (1D). If the model has multiple outputs, you can use a different sample_weight_mode on each output by passing a dictionary or a list of modes. 
weighted_metrics : List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing. target_tensors : By default, Keras will create placeholders for the model\\'s target, which will be fed with the target data during training. If instead you would like to use your own target tensors (in turn, Keras will not expect external Numpy data for these targets at training time), you can specify them via the target_tensors argument. It can be a single tensor (for a single-output model), a list of tensors, or a dict mapping output names to target tensors. **kwargs : When using the Theano/CNTK backends, these arguments are passed into K.function . When using the TensorFlow backend, these arguments are passed into tf.Session.run . Raises ValueError : In case of invalid arguments for optimizer , loss , metrics or sample_weight_mode .'), ('title', 'compile')]), OrderedDict([('location', 'models/sequential.html#fit'), ('text', 'fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None) Trains the model for a given number of epochs (iterations on a dataset). Arguments x : Numpy array of training data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per gradient update. If unspecified, batch_size will default to 32. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided. Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_split : Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. validation_data : tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split . shuffle : Boolean (whether to shuffle the training data before each epoch) or str (for \\'batch\\'). \\'batch\\' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None . 
class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. sample_weight : Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length) , to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile() . initial_epoch : Integer. Epoch at which to start training (useful for resuming a previous training run). steps_per_epoch : Integer or None . Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. validation_steps : Only relevant if steps_per_epoch is specified. Total number of steps (batches of samples) to validate before stopping. Returns A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Raises RuntimeError : If the model was never compiled. ValueError : In case of mismatch between the provided input data and what the model expects.'), ('title', 'fit')]), OrderedDict([('location', 'models/sequential.html#evaluate'), ('text', 'evaluate(x=None, y=None, batch_size=None, verbose=1, sample_weight=None, steps=None) Returns the loss value & metrics values for the model in test mode. Computation is done in batches. Arguments x : Numpy array of test data (if the model has a single input), or list of Numpy arrays (if the model has multiple inputs). If input layers in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. x can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). y : Numpy array of target (label) data (if the model has a single output), or list of Numpy arrays (if the model has multiple outputs). If output layers in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. y can be None (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors). batch_size : Integer or None . Number of samples per evaluation step. If unspecified, batch_size will default to 32. verbose : 0 or 1. Verbosity mode. 0 = silent, 1 = progress bar. sample_weight : Optional Numpy array of weights for the test samples, used for weighting the loss function. You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length) , to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile() . steps : Integer or None . Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of None . 
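As a brief sketch tying fit and evaluate together, continuing with the model compiled in the sketch above (the random Numpy data and shapes are assumptions for illustration only):

import numpy as np
from keras.utils import to_categorical

# Toy data matching the sketched model: 100 samples, 20 features, 10 classes.
x = np.random.random((100, 20))
y = to_categorical(np.random.randint(10, size=(100,)), num_classes=10)

history = model.fit(x, y, batch_size=32, epochs=5, validation_split=0.1)
print(history.history['loss'])                # per-epoch training loss
loss_and_metrics = model.evaluate(x, y, batch_size=32)
print(model.metrics_names, loss_and_metrics)  # display labels for the scalars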
Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.'), ('title', 'evaluate')]), OrderedDict([('location', 'models/sequential.html#predict'), ('text', \"predict(x, batch_size=None, verbose=0, steps=None) Generates output predictions for the input samples. Computation is done in batches. Arguments x : The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs). batch_size : Integer. If unspecified, it will default to 32. verbose : Verbosity mode, 0 or 1. steps : Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None . Returns Numpy array(s) of predictions. Raises ValueError : In case of mismatch between the provided input data and the model's expectations, or in case a stateful model receives a number of samples that is not a multiple of the batch size.\"), ('title', 'predict')]), OrderedDict([('location', 'models/sequential.html#train_on_batch'), ('text', 'train_on_batch(x, y, sample_weight=None, class_weight=None) Runs a single gradient update on a single batch of data. Arguments x : Numpy array of training data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. y : Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. sample_weight : Optional array of the same length as x, containing weights to apply to the model\\'s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile(). class_weight : Optional dictionary mapping class indices (integers) to a weight (float) to apply to the model\\'s loss for the samples from this class during training. This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. Returns Scalar training loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.'), ('title', 'train_on_batch')]), OrderedDict([('location', 'models/sequential.html#test_on_batch'), ('text', 'test_on_batch(x, y, sample_weight=None) Test the model on a single batch of samples. Arguments x : Numpy array of test data, or list of Numpy arrays if the model has multiple inputs. If all inputs in the model are named, you can also pass a dictionary mapping input names to Numpy arrays. y : Numpy array of target data, or list of Numpy arrays if the model has multiple outputs. If all outputs in the model are named, you can also pass a dictionary mapping output names to Numpy arrays. sample_weight : Optional array of the same length as x, containing weights to apply to the model\\'s loss for each sample. In the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode=\"temporal\" in compile(). 
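Continuing with the same toy model and arrays, a short sketch of the single-batch methods (the batch slicing is an illustrative assumption):

# One gradient update on a single batch, then a test-mode pass on the same batch.
batch_x, batch_y = x[:32], y[:32]
train_loss = model.train_on_batch(batch_x, batch_y)
test_loss = model.test_on_batch(batch_x, batch_y)
preds = model.predict_on_batch(batch_x)  # Numpy array of shape (32, 10)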
Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.'), ('title', 'test_on_batch')]), OrderedDict([('location', 'models/sequential.html#predict_on_batch'), ('text', 'predict_on_batch(x) Returns predictions for a single batch of samples. Arguments x : Input samples, as a Numpy array. Returns Numpy array(s) of predictions.'), ('title', 'predict_on_batch')]), OrderedDict([('location', 'models/sequential.html#fit_generator'), ('text', 'fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0) Trains the model on data generated batch-by-batch by a Python generator (or an instance of Sequence ). The generator is run in parallel to the model, for efficiency. For instance, this allows you to do real-time data augmentation on images on CPU in parallel to training your model on GPU. The use of keras.utils.Sequence guarantees the ordering and guarantees the single use of every input per epoch when using use_multiprocessing=True . Arguments generator : A generator or an instance of Sequence ( keras.utils.Sequence ) object in order to avoid duplicate data when using multiprocessing. The output of the generator must be either a tuple (inputs, targets) or a tuple (inputs, targets, sample_weights) . This tuple (a single output of the generator) makes a single batch. Therefore, all arrays in this tuple must have the same length (equal to the size of this batch). Different batches may have different sizes. For example, the last batch of the epoch is commonly smaller than the others, if the size of the dataset is not divisible by the batch size. The generator is expected to loop over its data indefinitely. An epoch finishes when steps_per_epoch batches have been seen by the model. steps_per_epoch : Integer. Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of samples of your dataset divided by the batch size. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. epochs : Integer. Number of epochs to train the model. An epoch is an iteration over the entire data provided, as defined by steps_per_epoch . Note that in conjunction with initial_epoch , epochs is to be understood as \"final epoch\". The model is not trained for a number of iterations given by epochs , but merely until the epoch of index epochs is reached. verbose : Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. callbacks : List of keras.callbacks.Callback instances. List of callbacks to apply during training. See callbacks . validation_data : This can be either a generator or a Sequence object for the validation data, a tuple (x_val, y_val) , or a tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_steps : Only relevant if validation_data is a generator. Total number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch. It should typically be equal to the number of samples of your validation dataset divided by the batch size.
Optional for Sequence : if unspecified, will use the len(validation_data) as a number of steps. class_weight : Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to \"pay more attention\" to samples from an under-represented class. max_queue_size : Integer. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10. workers : Integer. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : Boolean. If True , use process-based threading. If unspecified, use_multiprocessing will default to False . Note that because this implementation relies on multiprocessing, you should not pass non-picklable arguments to the generator as they can\\'t be passed easily to children processes. shuffle : Boolean. Whether to shuffle the order of the batches at the beginning of each epoch. Only used with instances of Sequence ( keras.utils.Sequence ). Has no effect when steps_per_epoch is not None . initial_epoch : Integer. Epoch at which to start training (useful for resuming a previous training run). Returns A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). Raises ValueError : In case the generator yields data in an invalid format. Example def generate_arrays_from_file(path): while True: with open(path) as f: for line in f: # create numpy arrays of input data # and labels, from each line in the file x1, x2, y = process_line(line) yield ({\\'input_1\\': x1, \\'input_2\\': x2}, {\\'output\\': y}) model.fit_generator(generate_arrays_from_file(\\'/my_file.txt\\'), steps_per_epoch=10000, epochs=10)'), ('title', 'fit_generator')]), OrderedDict([('location', 'models/sequential.html#evaluate_generator'), ('text', \"evaluate_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0) Evaluates the model on a data generator. The generator should return the same kind of data as accepted by test_on_batch . Arguments generator : Generator yielding tuples (inputs, targets) or (inputs, targets, sample_weights) or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. steps : Total number of steps (batches of samples) to yield from generator before stopping. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. max_queue_size : maximum size for the generator queue workers : Integer. Maximum number of processes to spin up when using process based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : if True, use process based threading. Note that because this implementation relies on multiprocessing, you should not pass non picklable arguments to the generator as they can't be passed easily to children processes. verbose : verbosity mode, 0 or 1. Returns Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs. 
Raises ValueError : In case the generator yields data in an invalid format.\"), ('title', 'evaluate_generator')]), OrderedDict([('location', 'models/sequential.html#predict_generator'), ('text', \"predict_generator(generator, steps=None, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0) Generates predictions for the input samples from a data generator. The generator should return the same kind of data as accepted by predict_on_batch . Arguments generator : Generator yielding batches of input samples or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. steps : Total number of steps (batches of samples) to yield from generator before stopping. Optional for Sequence : if unspecified, will use the len(generator) as a number of steps. max_queue_size : Maximum size for the generator queue. workers : Integer. Maximum number of processes to spin up when using process based threading. If unspecified, workers will default to 1. If 0, will execute the generator on the main thread. use_multiprocessing : If True , use process based threading. Note that because this implementation relies on multiprocessing, you should not pass non picklable arguments to the generator as they can't be passed easily to children processes. verbose : verbosity mode, 0 or 1. Returns Numpy array(s) of predictions. Raises ValueError : In case the generator yields data in an invalid format.\"), ('title', 'predict_generator')]), OrderedDict([('location', 'models/sequential.html#get_layer'), ('text', 'get_layer(name=None, index=None) Retrieves a layer based on either its name (unique) or index. If name and index are both provided, index will take precedence. Indices are based on order of horizontal graph traversal (bottom-up). Arguments name : String, name of layer. index : Integer, index of layer. Returns A layer instance. Raises ValueError : In case of invalid layer name or index.'), ('title', 'get_layer')]), OrderedDict([('location', 'preprocessing/image.html'), ('text', 'Image Preprocessing [source] ImageDataGenerator class keras.preprocessing.image.ImageDataGenerator(featurewise_center=False, samplewise_center=False, featurewise_std_normalization=False, samplewise_std_normalization=False, zca_whitening=False, zca_epsilon=1e-06, rotation_range=0, width_shift_range=0.0, height_shift_range=0.0, brightness_range=None, shear_range=0.0, zoom_range=0.0, channel_shift_range=0.0, fill_mode=\\'nearest\\', cval=0.0, horizontal_flip=False, vertical_flip=False, rescale=None, preprocessing_function=None, data_format=None, validation_split=0.0, dtype=None) Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches). Arguments featurewise_center : Boolean. Set input mean to 0 over the dataset, feature-wise. samplewise_center : Boolean. Set each sample mean to 0. featurewise_std_normalization : Boolean. Divide inputs by std of the dataset, feature-wise. samplewise_std_normalization : Boolean. Divide each input by its std. zca_epsilon : epsilon for ZCA whitening. Default is 1e-6. zca_whitening : Boolean. Apply ZCA whitening. rotation_range : Int. Degree range for random rotations. width_shift_range : Float, 1-D array-like or int float: fraction of total width, if < 1, or pixels if >= 1. 1-D array-like: random elements from the array. 
int: integer number of pixels from interval (-width_shift_range, +width_shift_range) With width_shift_range=2 possible values are integers [-1, 0, +1] , same as with width_shift_range=[-1, 0, +1] , while with width_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0). height_shift_range : Float, 1-D array-like or int float: fraction of total height, if < 1, or pixels if >= 1. 1-D array-like: random elements from the array. int: integer number of pixels from interval (-height_shift_range, +height_shift_range) With height_shift_range=2 possible values are integers [-1, 0, +1] , same as with height_shift_range=[-1, 0, +1] , while with height_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0). brightness_range : Tuple or list of two floats. Range for picking a brightness shift value from. shear_range : Float. Shear intensity (shear angle in counter-clockwise direction in degrees). zoom_range : Float or [lower, upper]. Range for random zoom. If a float, [lower, upper] = [1-zoom_range, 1+zoom_range] . channel_shift_range : Float. Range for random channel shifts. fill_mode : One of {\"constant\", \"nearest\", \"reflect\" or \"wrap\"}. Default is \\'nearest\\'. Points outside the boundaries of the input are filled according to the given mode: \\'constant\\': kkkkkkkk|abcd|kkkkkkkk (cval=k) \\'nearest\\': aaaaaaaa|abcd|dddddddd \\'reflect\\': abcddcba|abcd|dcbaabcd \\'wrap\\': abcdabcd|abcd|abcdabcd cval : Float or Int. Value used for points outside the boundaries when fill_mode = \"constant\" . horizontal_flip : Boolean. Randomly flip inputs horizontally. vertical_flip : Boolean. Randomly flip inputs vertically. rescale : rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (after applying all other transformations). preprocessing_function : function that will be applied on each input. The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape. data_format : Image data format, either \"channels_first\" or \"channels_last\". \"channels_last\" mode means that the images should have shape (samples, height, width, channels) , \"channels_first\" mode means that the images should have shape (samples, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". validation_split : Float. Fraction of images reserved for validation (strictly between 0 and 1). dtype : Dtype to use for the generated arrays.
Examples Example of using .flow(x, y) : (x_train, y_train), (x_test, y_test) = cifar10.load_data() y_train = np_utils.to_categorical(y_train, num_classes) y_test = np_utils.to_categorical(y_test, num_classes) datagen = ImageDataGenerator( featurewise_center=True, featurewise_std_normalization=True, rotation_range=20, width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True) # compute quantities required for featurewise normalization # (std, mean, and principal components if ZCA whitening is applied) datagen.fit(x_train) # fits the model on batches with real-time data augmentation: model.fit_generator(datagen.flow(x_train, y_train, batch_size=32), steps_per_epoch=len(x_train) / 32, epochs=epochs) # here\\'s a more \"manual\" example for e in range(epochs): print(\\'Epoch\\', e) batches = 0 for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32): model.fit(x_batch, y_batch) batches += 1 if batches >= len(x_train) / 32: # we need to break the loop by hand because # the generator loops indefinitely break Example of using .flow_from_directory(directory) : train_datagen = ImageDataGenerator( rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True) test_datagen = ImageDataGenerator(rescale=1./255) train_generator = train_datagen.flow_from_directory( \\'data/train\\', target_size=(150, 150), batch_size=32, class_mode=\\'binary\\') validation_generator = test_datagen.flow_from_directory( \\'data/validation\\', target_size=(150, 150), batch_size=32, class_mode=\\'binary\\') model.fit_generator( train_generator, steps_per_epoch=2000, epochs=50, validation_data=validation_generator, validation_steps=800) Example of transforming images and masks together. # we create two instances with the same arguments data_gen_args = dict(featurewise_center=True, featurewise_std_normalization=True, rotation_range=90, width_shift_range=0.1, height_shift_range=0.1, zoom_range=0.2) image_datagen = ImageDataGenerator(**data_gen_args) mask_datagen = ImageDataGenerator(**data_gen_args) # Provide the same seed and keyword arguments to the fit and flow methods seed = 1 image_datagen.fit(images, augment=True, seed=seed) mask_datagen.fit(masks, augment=True, seed=seed) image_generator = image_datagen.flow_from_directory( \\'data/images\\', class_mode=None, seed=seed) mask_generator = mask_datagen.flow_from_directory( \\'data/masks\\', class_mode=None, seed=seed) # combine generators into one which yields image and masks train_generator = zip(image_generator, mask_generator) model.fit_generator( train_generator, steps_per_epoch=2000, epochs=50) ImageDataGenerator methods apply_transform apply_transform(x, transform_parameters) Applies a transformation to an image according to given parameters. Arguments x : 3D tensor, single image. transform_parameters : Dictionary with string - parameter pairs describing the transformation. Currently, the following parameters from the dictionary are used: \\'theta\\' : Float. Rotation angle in degrees. \\'tx\\' : Float. Shift in the x direction. \\'ty\\' : Float. Shift in the y direction. \\'shear\\' : Float. Shear angle in degrees. \\'zx\\' : Float. Zoom in the x direction. \\'zy\\' : Float. Zoom in the y direction. \\'flip_horizontal\\' : Boolean. Horizontal flip. \\'flip_vertical\\' : Boolean. Vertical flip. \\'channel_shift_intensity\\' : Float. Channel shift intensity. \\'brightness\\' : Float. Brightness shift intensity. Returns A transformed version of the input (same shape). A brief usage sketch is shown below.
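A minimal sketch of pairing apply_transform with get_random_transform (documented further below); the stand-in image array and parameter choices here are assumptions for illustration:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=20, horizontal_flip=True)

# Stand-in for a single HxWxC image (rank-3 tensor).
img = np.random.random((150, 150, 3))

# Draw random parameters for this image shape, then apply them.
params = datagen.get_random_transform(img.shape, seed=1)
transformed = datagen.apply_transform(img, params)
assert transformed.shape == img.shape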
fit fit(x, augment=False, rounds=1, seed=None) Fits the data generator to some sample data. This computes the internal data stats related to the data-dependent transformations, based on an array of sample data. Only required if featurewise_center or featurewise_std_normalization or zca_whitening are set to True. Arguments x : Sample data. Should have rank 4. In case of grayscale data, the channels axis should have value 1, in case of RGB data, it should have value 3, and in case of RGBA data, it should have value 4. augment : Boolean (default: False). Whether to fit on randomly augmented samples. rounds : Int (default: 1). If using data augmentation ( augment=True ), this is how many augmentation passes over the data to use. seed : Int (default: None). Random seed. flow flow(x, y=None, batch_size=32, shuffle=True, sample_weight=None, seed=None, save_to_dir=None, save_prefix=\\'\\', save_format=\\'png\\', subset=None) Takes data & label arrays, generates batches of augmented data. Arguments x : Input data. Numpy array of rank 4 or a tuple. If tuple, the first element should contain the images and the second element another numpy array or a list of numpy arrays that gets passed to the output without any modifications. Can be used to feed the model miscellaneous data along with the images. In case of grayscale data, the channels axis of the image array should have value 1, in case of RGB data, it should have value 3, and in case of RGBA data, it should have value 4. y : Labels. batch_size : Int (default: 32). shuffle : Boolean (default: True). sample_weight : Sample weights. seed : Int (default: None). save_to_dir : None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing). save_prefix : Str (default: \\'\\' ). Prefix to use for filenames of saved pictures (only relevant if save_to_dir is set). save_format : one of \"png\", \"jpeg\" (only relevant if save_to_dir is set). Default: \"png\". subset : Subset of data ( \"training\" or \"validation\" ) if validation_split is set in ImageDataGenerator . Returns An Iterator yielding tuples of (x, y) where x is a numpy array of image data (in the case of a single image input) or a list of numpy arrays (in the case with additional inputs) and y is a numpy array of corresponding labels. If \\'sample_weight\\' is not None, the yielded tuples are of the form (x, y, sample_weight) . If y is None, only the numpy array x is returned. flow_from_dataframe flow_from_dataframe(dataframe, directory, x_col=\\'filename\\', y_col=\\'class\\', has_ext=True, target_size=(256, 256), color_mode=\\'rgb\\', classes=None, class_mode=\\'categorical\\', batch_size=32, shuffle=True, seed=None, save_to_dir=None, save_prefix=\\'\\', save_format=\\'png\\', subset=None, interpolation=\\'nearest\\') Takes the dataframe and the path to a directory and generates batches of augmented/normalized data. A simple tutorial can be found at: http://bit.ly/keras_flow_from_dataframe Arguments dataframe: Pandas dataframe containing the filenames of the images in one column and classes in another, or column(s) that can be fed as raw target data. directory: string, path to the target directory that contains all the images mapped in the dataframe. x_col: string, column in the dataframe that contains the filenames of the target images. y_col: string or list of strings, columns in the dataframe that will be the target data. 
has_ext: bool, True if filenames in dataframe[x_col] have filename extensions, else False. target_size: tuple of integers `(height, width)`, default: `(256, 256)`. The dimensions to which all images found will be resized. color_mode: one of \"grayscale\", \"rgb\". Default: \"rgb\". Whether the images will be converted to have 1 or 3 color channels. classes: optional list of classes (e.g. `[\\'dogs\\', \\'cats\\']`). Default: None. If not provided, the list of classes will be automatically inferred from y_col (and the order of the classes, which will map to the label indices, will be alphanumeric). The dictionary containing the mapping from class names to class indices can be obtained via the attribute `class_indices`. class_mode: one of \"categorical\", \"binary\", \"sparse\", \"input\", \"other\" or None. Default: \"categorical\". Determines the type of label arrays that are returned: - `\"categorical\"` will be 2D one-hot encoded labels, - `\"binary\"` will be 1D binary labels, - `\"sparse\"` will be 1D integer labels, - `\"input\"` will be images identical to input images (mainly used to work with autoencoders). - `\"other\"` will be numpy array of y_col data - None, no labels are returned (the generator will only yield batches of image data, which is useful to use `model.predict_generator()`, `model.evaluate_generator()`, etc.). batch_size: size of the batches of data (default: 32). shuffle: whether to shuffle the data (default: True) seed: optional random seed for shuffling and transformations. save_to_dir: None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing). save_prefix: str. Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set). save_format: one of \"png\", \"jpeg\" (only relevant if `save_to_dir` is set). Default: \"png\". follow_links: whether to follow symlinks inside class subdirectories (default: False). subset: Subset of data (`\"training\"` or `\"validation\"`) if `validation_split` is set in `ImageDataGenerator`. interpolation: Interpolation method used to resample the image if the target size is different from that of the loaded image. Supported methods are `\"nearest\"`, `\"bilinear\"`, and `\"bicubic\"`. If PIL version 1.1.3 or newer is installed, `\"lanczos\"` is also supported. If PIL version 3.4.0 or newer is installed, `\"box\"` and `\"hamming\"` are also supported. By default, `\"nearest\"` is used. Returns A DataFrameIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels. flow_from_directory flow_from_directory(directory, target_size=(256, 256), color_mode=\\'rgb\\', classes=None, class_mode=\\'categorical\\', batch_size=32, shuffle=True, seed=None, save_to_dir=None, save_prefix=\\'\\', save_format=\\'png\\', follow_links=False, subset=None, interpolation=\\'nearest\\') Takes the path to a directory & generates batches of augmented data. Arguments directory : Path to the target directory. It should contain one subdirectory per class. Any PNG, JPG, BMP, PPM or TIF images inside each of the subdirectories directory tree will be included in the generator. See this script for more details. target_size : Tuple of integers (height, width) , default: (256, 256) . The dimensions to which all images found will be resized. color_mode : One of \"grayscale\", \"rgb\", \"rgba\". Default: \"rgb\". 
Whether the images will be converted to have 1, 3, or 4 channels. classes : Optional list of class subdirectories (e.g. [\\'dogs\\', \\'cats\\'] ). Default: None. If not provided, the list of classes will be automatically inferred from the subdirectory names/structure under directory , where each subdirectory will be treated as a different class (and the order of the classes, which will map to the label indices, will be alphanumeric). The dictionary containing the mapping from class names to class indices can be obtained via the attribute class_indices . class_mode : One of \"categorical\", \"binary\", \"sparse\", \"input\", or None. Default: \"categorical\". Determines the type of label arrays that are returned: \"categorical\" will be 2D one-hot encoded labels, \"binary\" will be 1D binary labels, \"sparse\" will be 1D integer labels, \"input\" will be images identical to input images (mainly used to work with autoencoders). If None, no labels are returned (the generator will only yield batches of image data, which is useful to use with model.predict_generator() , model.evaluate_generator() , etc.). Please note that in case of class_mode None, the data still needs to reside in a subdirectory of directory for it to work correctly. batch_size : Size of the batches of data (default: 32). shuffle : Whether to shuffle the data (default: True) seed : Optional random seed for shuffling and transformations. save_to_dir : None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing). save_prefix : Str. Prefix to use for filenames of saved pictures (only relevant if save_to_dir is set). save_format : One of \"png\", \"jpeg\" (only relevant if save_to_dir is set). Default: \"png\". follow_links : Whether to follow symlinks inside class subdirectories (default: False). subset : Subset of data ( \"training\" or \"validation\" ) if validation_split is set in ImageDataGenerator . interpolation : Interpolation method used to resample the image if the target size is different from that of the loaded image. Supported methods are \"nearest\" , \"bilinear\" , and \"bicubic\" . If PIL version 1.1.3 or newer is installed, \"lanczos\" is also supported. If PIL version 3.4.0 or newer is installed, \"box\" and \"hamming\" are also supported. By default, \"nearest\" is used. Returns A DirectoryIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels. get_random_transform get_random_transform(img_shape, seed=None) Generates random parameters for a transformation. Arguments seed : Random seed. img_shape : Tuple of integers. Shape of the image that is transformed. Returns A dictionary containing randomly chosen parameters describing the transformation. random_transform random_transform(x, seed=None) Applies a random transformation to an image. Arguments x : 3D tensor, single image. seed : Random seed. Returns A randomly transformed version of the input (same shape). standardize standardize(x) Applies the normalization configuration to a batch of inputs. Arguments x : Batch of inputs to be normalized. 
Returns The inputs, normalized.'), ('title', 'Image Preprocessing')]), OrderedDict([('location', 'preprocessing/image.html#image-preprocessing'), ('text', '[source]'), ('title', 'Image Preprocessing')]), OrderedDict([('location', 'preprocessing/image.html#imagedatagenerator-class'), ('text', 'keras.preprocessing.image.ImageDataGenerator(featurewise_center=False, samplewise_center=False, featurewise_std_normalization=False, samplewise_std_normalization=False, zca_whitening=False, zca_epsilon=1e-06, rotation_range=0, width_shift_range=0.0, height_shift_range=0.0, brightness_range=None, shear_range=0.0, zoom_range=0.0, channel_shift_range=0.0, fill_mode=\\'nearest\\', cval=0.0, horizontal_flip=False, vertical_flip=False, rescale=None, preprocessing_function=None, data_format=None, validation_split=0.0, dtype=None) Generate batches of tensor image data with real-time data augmentation. The data will be looped over (in batches). Arguments featurewise_center : Boolean. Set input mean to 0 over the dataset, feature-wise. samplewise_center : Boolean. Set each sample mean to 0. featurewise_std_normalization : Boolean. Divide inputs by std of the dataset, feature-wise. samplewise_std_normalization : Boolean. Divide each input by its std. zca_epsilon : epsilon for ZCA whitening. Default is 1e-6. zca_whitening : Boolean. Apply ZCA whitening. rotation_range : Int. Degree range for random rotations. width_shift_range : Float, 1-D array-like or int float: fraction of total width, if < 1, or pixels if >= 1. 1-D array-like: random elements from the array. int: integer number of pixels from interval (-width_shift_range, +width_shift_range) With width_shift_range=2 possible values are integers [-1, 0, +1] , same as with width_shift_range=[-1, 0, +1] , while with width_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0). height_shift_range : Float, 1-D array-like or int float: fraction of total height, if < 1, or pixels if >= 1. 1-D array-like: random elements from the array. int: integer number of pixels from interval (-height_shift_range, +height_shift_range) With height_shift_range=2 possible values are integers [-1, 0, +1] , same as with height_shift_range=[-1, 0, +1] , while with height_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0). brightness_range : Tuple or list of two floats. Range for picking a brightness shift value from. shear_range : Float. Shear intensity (shear angle in counter-clockwise direction in degrees). zoom_range : Float or [lower, upper]. Range for random zoom. If a float, [lower, upper] = [1-zoom_range, 1+zoom_range] . channel_shift_range : Float. Range for random channel shifts. fill_mode : One of {\"constant\", \"nearest\", \"reflect\" or \"wrap\"}. Default is \\'nearest\\'. Points outside the boundaries of the input are filled according to the given mode: \\'constant\\': kkkkkkkk|abcd|kkkkkkkk (cval=k) \\'nearest\\': aaaaaaaa|abcd|dddddddd \\'reflect\\': abcddcba|abcd|dcbaabcd \\'wrap\\': abcdabcd|abcd|abcdabcd cval : Float or Int. Value used for points outside the boundaries when fill_mode = \"constant\" . horizontal_flip : Boolean. Randomly flip inputs horizontally. vertical_flip : Boolean. Randomly flip inputs vertically. rescale : rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (after applying all other transformations). preprocessing_function : function that will be applied on each input. The function will run after the image is resized and augmented. 
The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape. data_format : Image data format, either \"channels_first\" or \"channels_last\". \"channels_last\" mode means that the images should have shape (samples, height, width, channels) , \"channels_first\" mode means that the images should have shape (samples, channels, height, width) . It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json . If you never set it, then it will be \"channels_last\". validation_split : Float. Fraction of images reserved for validation (strictly between 0 and 1). dtype : Dtype to use for the generated arrays. Examples Example of using .flow(x, y) : (x_train, y_train), (x_test, y_test) = cifar10.load_data() y_train = np_utils.to_categorical(y_train, num_classes) y_test = np_utils.to_categorical(y_test, num_classes) datagen = ImageDataGenerator( featurewise_center=True, featurewise_std_normalization=True, rotation_range=20, width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True) # compute quantities required for featurewise normalization # (std, mean, and principal components if ZCA whitening is applied) datagen.fit(x_train) # fits the model on batches with real-time data augmentation: model.fit_generator(datagen.flow(x_train, y_train, batch_size=32), steps_per_epoch=len(x_train) / 32, epochs=epochs) # here\\'s a more \"manual\" example for e in range(epochs): print(\\'Epoch\\', e) batches = 0 for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32): model.fit(x_batch, y_batch) batches += 1 if batches >= len(x_train) / 32: # we need to break the loop by hand because # the generator loops indefinitely break Example of using .flow_from_directory(directory) : train_datagen = ImageDataGenerator( rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True) test_datagen = ImageDataGenerator(rescale=1./255) train_generator = train_datagen.flow_from_directory( \\'data/train\\', target_size=(150, 150), batch_size=32, class_mode=\\'binary\\') validation_generator = test_datagen.flow_from_directory( \\'data/validation\\', target_size=(150, 150), batch_size=32, class_mode=\\'binary\\') model.fit_generator( train_generator, steps_per_epoch=2000, epochs=50, validation_data=validation_generator, validation_steps=800) Example of transforming images and masks together. 
# we create two instances with the same arguments data_gen_args = dict(featurewise_center=True, featurewise_std_normalization=True, rotation_range=90, width_shift_range=0.1, height_shift_range=0.1, zoom_range=0.2) image_datagen = ImageDataGenerator(**data_gen_args) mask_datagen = ImageDataGenerator(**data_gen_args) # Provide the same seed and keyword arguments to the fit and flow methods seed = 1 image_datagen.fit(images, augment=True, seed=seed) mask_datagen.fit(masks, augment=True, seed=seed) image_generator = image_datagen.flow_from_directory( \\'data/images\\', class_mode=None, seed=seed) mask_generator = mask_datagen.flow_from_directory( \\'data/masks\\', class_mode=None, seed=seed) # combine generators into one which yields image and masks train_generator = zip(image_generator, mask_generator) model.fit_generator( train_generator, steps_per_epoch=2000, epochs=50)'), ('title', 'ImageDataGenerator class')]), OrderedDict([('location', 'preprocessing/image.html#imagedatagenerator-methods'), ('text', ''), ('title', 'ImageDataGenerator methods')]), OrderedDict([('location', 'preprocessing/image.html#apply_transform'), ('text', \"apply_transform(x, transform_parameters) Applies a transformation to an image according to given parameters. Arguments x : 3D tensor, single image. transform_parameters : Dictionary with string - parameter pairs describing the transformation. Currently, the following parameters from the dictionary are used: 'theta' : Float. Rotation angle in degrees. 'tx' : Float. Shift in the x direction. 'ty' : Float. Shift in the y direction. 'shear' : Float. Shear angle in degrees. 'zx' : Float. Zoom in the x direction. 'zy' : Float. Zoom in the y direction. 'flip_horizontal' : Boolean. Horizontal flip. 'flip_vertical' : Boolean. Vertical flip. 'channel_shift_intensity' : Float. Channel shift intensity. 'brightness' : Float. Brightness shift intensity. Returns A transformed version of the input (same shape).\"), ('title', 'apply_transform')]), OrderedDict([('location', 'preprocessing/image.html#fit'), ('text', 'fit(x, augment=False, rounds=1, seed=None) Fits the data generator to some sample data. This computes the internal data stats related to the data-dependent transformations, based on an array of sample data. Only required if featurewise_center or featurewise_std_normalization or zca_whitening are set to True. Arguments x : Sample data. Should have rank 4. In case of grayscale data, the channels axis should have value 1, in case of RGB data, it should have value 3, and in case of RGBA data, it should have value 4. augment : Boolean (default: False). Whether to fit on randomly augmented samples. rounds : Int (default: 1). If using data augmentation ( augment=True ), this is how many augmentation passes over the data to use. seed : Int (default: None). Random seed.'), ('title', 'fit')]), OrderedDict([('location', 'preprocessing/image.html#flow'), ('text', 'flow(x, y=None, batch_size=32, shuffle=True, sample_weight=None, seed=None, save_to_dir=None, save_prefix=\\'\\', save_format=\\'png\\', subset=None) Takes data & label arrays, generates batches of augmented data. Arguments x : Input data. Numpy array of rank 4 or a tuple. If tuple, the first element should contain the images and the second element another numpy array or a list of numpy arrays that gets passed to the output without any modifications. Can be used to feed the model miscellaneous data along with the images. 
In case of grayscale data, the channels axis of the image array should have value 1, in case of RGB data, it should have value 3, and in case of RGBA data, it should have value 4. y : Labels. batch_size : Int (default: 32). shuffle : Boolean (default: True). sample_weight : Sample weights. seed : Int (default: None). save_to_dir : None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing). save_prefix : Str (default: \\'\\' ). Prefix to use for filenames of saved pictures (only relevant if save_to_dir is set). save_format : one of \"png\", \"jpeg\" (only relevant if save_to_dir is set). Default: \"png\". subset : Subset of data ( \"training\" or \"validation\" ) if validation_split is set in ImageDataGenerator . Returns An Iterator yielding tuples of (x, y) where x is a numpy array of image data (in the case of a single image input) or a list of numpy arrays (in the case with additional inputs) and y is a numpy array of corresponding labels. If \\'sample_weight\\' is not None, the yielded tuples are of the form (x, y, sample_weight) . If y is None, only the numpy array x is returned.'), ('title', 'flow')]), OrderedDict([('location', 'preprocessing/image.html#flow_from_dataframe'), ('text', \"flow_from_dataframe(dataframe, directory, x_col='filename', y_col='class', has_ext=True, target_size=(256, 256), color_mode='rgb', classes=None, class_mode='categorical', batch_size=32, shuffle=True, seed=None, save_to_dir=None, save_prefix='', save_format='png', subset=None, interpolation='nearest') Takes the dataframe and the path to a directory and generates batches of augmented/normalized data. A simple tutorial can be found at: http://bit.ly/keras_flow_from_dataframe\"), ('title', 'flow_from_dataframe')]), OrderedDict([('location', 'preprocessing/image.html#arguments'), ('text', 'dataframe: Pandas dataframe containing the filenames of the images in one column and classes in another, or column(s) that can be fed as raw target data. directory: string, path to the target directory that contains all the images mapped in the dataframe. x_col: string, column in the dataframe that contains the filenames of the target images. y_col: string or list of strings, columns in the dataframe that will be the target data. has_ext: bool, True if filenames in dataframe[x_col] have filename extensions, else False. target_size: tuple of integers `(height, width)`, default: `(256, 256)`. The dimensions to which all images found will be resized. color_mode: one of \"grayscale\", \"rgb\". Default: \"rgb\". Whether the images will be converted to have 1 or 3 color channels. classes: optional list of classes (e.g. `[\\'dogs\\', \\'cats\\']`). Default: None. If not provided, the list of classes will be automatically inferred from y_col (and the order of the classes, which will map to the label indices, will be alphanumeric). The dictionary containing the mapping from class names to class indices can be obtained via the attribute `class_indices`. class_mode: one of \"categorical\", \"binary\", \"sparse\", \"input\", \"other\" or None. Default: \"categorical\". Determines the type of label arrays that are returned: - `\"categorical\"` will be 2D one-hot encoded labels, - `\"binary\"` will be 1D binary labels, - `\"sparse\"` will be 1D integer labels, - `\"input\"` will be images identical to input images (mainly used to work with autoencoders). 
- `\"other\"` will be numpy array of y_col data - None, no labels are returned (the generator will only yield batches of image data, which is useful to use `model.predict_generator()`, `model.evaluate_generator()`, etc.). batch_size: size of the batches of data (default: 32). shuffle: whether to shuffle the data (default: True) seed: optional random seed for shuffling and transformations. save_to_dir: None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing). save_prefix: str. Prefix to use for filenames of saved pictures (only relevant if `save_to_dir` is set). save_format: one of \"png\", \"jpeg\" (only relevant if `save_to_dir` is set). Default: \"png\". follow_links: whether to follow symlinks inside class subdirectories (default: False). subset: Subset of data (`\"training\"` or `\"validation\"`) if `validation_split` is set in `ImageDataGenerator`. interpolation: Interpolation method used to resample the image if the target size is different from that of the loaded image. Supported methods are `\"nearest\"`, `\"bilinear\"`, and `\"bicubic\"`. If PIL version 1.1.3 or newer is installed, `\"lanczos\"` is also supported. If PIL version 3.4.0 or newer is installed, `\"box\"` and `\"hamming\"` are also supported. By default, `\"nearest\"` is used. Returns A DataFrameIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels.'), ('title', 'Arguments')]), OrderedDict([('location', 'preprocessing/image.html#flow_from_directory'), ('text', 'flow_from_directory(directory, target_size=(256, 256), color_mode=\\'rgb\\', classes=None, class_mode=\\'categorical\\', batch_size=32, shuffle=True, seed=None, save_to_dir=None, save_prefix=\\'\\', save_format=\\'png\\', follow_links=False, subset=None, interpolation=\\'nearest\\') Takes the path to a directory & generates batches of augmented data. Arguments directory : Path to the target directory. It should contain one subdirectory per class. Any PNG, JPG, BMP, PPM or TIF images inside each of the subdirectories directory tree will be included in the generator. See this script for more details. target_size : Tuple of integers (height, width) , default: (256, 256) . The dimensions to which all images found will be resized. color_mode : One of \"grayscale\", \"rgb\", \"rgba\". Default: \"rgb\". Whether the images will be converted to have 1, 3, or 4 channels. classes : Optional list of class subdirectories (e.g. [\\'dogs\\', \\'cats\\'] ). Default: None. If not provided, the list of classes will be automatically inferred from the subdirectory names/structure under directory , where each subdirectory will be treated as a different class (and the order of the classes, which will map to the label indices, will be alphanumeric). The dictionary containing the mapping from class names to class indices can be obtained via the attribute class_indices . class_mode : One of \"categorical\", \"binary\", \"sparse\", \"input\", or None. Default: \"categorical\". Determines the type of label arrays that are returned: \"categorical\" will be 2D one-hot encoded labels, \"binary\" will be 1D binary labels, \"sparse\" will be 1D integer labels, \"input\" will be images identical to input images (mainly used to work with autoencoders). 
If None, no labels are returned (the generator will only yield batches of image data, which is useful to use with model.predict_generator() , model.evaluate_generator() , etc.). Please note that in case of class_mode None, the data still needs to reside in a subdirectory of directory for it to work correctly. batch_size : Size of the batches of data (default: 32). shuffle : Whether to shuffle the data (default: True) seed : Optional random seed for shuffling and transformations. save_to_dir : None or str (default: None). This allows you to optionally specify a directory to which to save the augmented pictures being generated (useful for visualizing what you are doing). save_prefix : Str. Prefix to use for filenames of saved pictures (only relevant if save_to_dir is set). save_format : One of \"png\", \"jpeg\" (only relevant if save_to_dir is set). Default: \"png\". follow_links : Whether to follow symlinks inside class subdirectories (default: False). subset : Subset of data ( \"training\" or \"validation\" ) if validation_split is set in ImageDataGenerator . interpolation : Interpolation method used to resample the image if the target size is different from that of the loaded image. Supported methods are \"nearest\" , \"bilinear\" , and \"bicubic\" . If PIL version 1.1.3 or newer is installed, \"lanczos\" is also supported. If PIL version 3.4.0 or newer is installed, \"box\" and \"hamming\" are also supported. By default, \"nearest\" is used. Returns A DirectoryIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels.'), ('title', 'flow_from_directory')]), OrderedDict([('location', 'preprocessing/image.html#get_random_transform'), ('text', 'get_random_transform(img_shape, seed=None) Generates random parameters for a transformation. Arguments seed : Random seed. img_shape : Tuple of integers. Shape of the image that is transformed. Returns A dictionary containing randomly chosen parameters describing the transformation.'), ('title', 'get_random_transform')]), OrderedDict([('location', 'preprocessing/image.html#random_transform'), ('text', 'random_transform(x, seed=None) Applies a random transformation to an image. Arguments x : 3D tensor, single image. seed : Random seed. Returns A randomly transformed version of the input (same shape).'), ('title', 'random_transform')]), OrderedDict([('location', 'preprocessing/image.html#standardize'), ('text', 'standardize(x) Applies the normalization configuration to a batch of inputs. Arguments x : Batch of inputs to be normalized. Returns The inputs, normalized.'), ('title', 'standardize')]), OrderedDict([('location', 'preprocessing/sequence.html'), ('text', \"[source] TimeseriesGenerator keras.preprocessing.sequence.TimeseriesGenerator(data, targets, length, sampling_rate=1, stride=1, start_index=0, end_index=None, shuffle=False, reverse=False, batch_size=128) Utility class for generating batches of temporal data. This class takes in a sequence of data-points gathered at equal intervals, along with time series parameters such as stride, length of history, etc., to produce batches for training/validation. Arguments data : Indexable generator (such as list or Numpy array) containing consecutive data points (timesteps). The data should be 2D, and axis 0 is expected to be the time dimension. targets : Targets corresponding to timesteps in data . It should have same length as data . 
length : Length of the output sequences (in number of timesteps). sampling_rate : Period between successive individual timesteps within sequences. For rate r , timesteps data[i] , data[i-r] , ... data[i - length] are used to create a sample sequence. stride : Period between successive output sequences. For stride s , consecutive output samples would be centered around data[i] , data[i+s] , data[i+2*s] , etc. start_index : Data points earlier than start_index will not be used in the output sequences. This is useful to reserve part of the data for test or validation. end_index : Data points later than end_index will not be used in the output sequences. This is useful to reserve part of the data for test or validation. shuffle : Whether to shuffle output samples, or instead draw them in chronological order. reverse : Boolean: if true , timesteps in each output sample will be in reverse chronological order. batch_size : Number of timeseries samples in each batch (except maybe the last one). Returns A Sequence instance. Examples from keras.preprocessing.sequence import TimeseriesGenerator import numpy as np data = np.array([[i] for i in range(50)]) targets = np.array([[i] for i in range(50)]) data_gen = TimeseriesGenerator(data, targets, length=10, sampling_rate=2, batch_size=2) assert len(data_gen) == 20 batch_0 = data_gen[0] x, y = batch_0 assert np.array_equal(x, np.array([[[0], [2], [4], [6], [8]], [[1], [3], [5], [7], [9]]])) assert np.array_equal(y, np.array([[10], [11]])) pad_sequences keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32', padding='pre', truncating='pre', value=0.0) Pads sequences to the same length. This function transforms a list of num_samples sequences (lists of integers) into a 2D Numpy array of shape (num_samples, num_timesteps) . num_timesteps is either the maxlen argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than num_timesteps are padded with value. Sequences longer than num_timesteps are truncated so that they fit the desired length. The position where padding or truncation happens is determined by the arguments padding and truncating , respectively. Pre-padding is the default. Arguments sequences : List of lists, where each element is a sequence. maxlen : Int, maximum length of all sequences. dtype : Type of the output sequences. To pad sequences with variable length strings, you can use object . padding : String, 'pre' or 'post': pad either before or after each sequence. truncating : String, 'pre' or 'post': remove values from sequences larger than maxlen , either at the beginning or at the end of the sequences. value : Float or String, padding value. Returns x : Numpy array with shape (len(sequences), maxlen) Raises ValueError : In case of invalid values for truncating or padding , or in case of invalid shape for a sequences entry. skipgrams keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size, window_size=4, negative_samples=1.0, shuffle=True, categorical=False, sampling_table=None, seed=None) Generates skipgram word pairs. This function transforms a sequence of word indexes (list of integers) into tuples of words of the form: (word, word in the same window), with label 1 (positive samples). (word, random word from the vocabulary), with label 0 (negative samples). 
Read more about Skipgram in this gnomic paper by Mikolov et al.: Efficient Estimation of Word Representations in Vector Space Arguments sequence : A word sequence (sentence), encoded as a list of word indices (integers). If using a sampling_table , word indices are expected to match the rank of the words in a reference dataset (e.g. 10 would encode the 10-th most frequently occurring token). Note that index 0 is expected to be a non-word and will be skipped. vocabulary_size : Int, maximum possible word index + 1. window_size : Int, size of sampling windows (technically half-window). The window of a word w_i will be [i - window_size, i + window_size+1] . negative_samples : Float >= 0. 0 for no negative (i.e. random) samples. 1 for same number as positive samples. shuffle : Whether to shuffle the word couples before returning them. categorical : bool. if False, labels will be integers (e.g. [0, 1, 1 .. ] ), if True , labels will be categorical, e.g. [[1,0],[0,1],[0,1] .. ] . sampling_table : 1D array of size vocabulary_size where the entry i encodes the probability to sample a word of rank i. seed : Random seed. Returns couples, labels: where couples are int pairs and labels are either 0 or 1. Note By convention, index 0 in the vocabulary is a non-word and will be skipped. make_sampling_table keras.preprocessing.sequence.make_sampling_table(size, sampling_factor=1e-05) Generates a word rank-based probabilistic sampling table. Used for generating the sampling_table argument for skipgrams . sampling_table[i] is the probability of sampling the i-th most common word in a dataset (more common words should be sampled less frequently, for balance). The sampling probabilities are generated according to the sampling distribution used in word2vec: p(word) = (min(1, sqrt(word_frequency / sampling_factor) / (word_frequency / sampling_factor))) We assume that the word frequencies follow Zipf's law (s=1) to derive a numerical approximation of frequency(rank): frequency(rank) ~ 1/(rank * (log(rank) + gamma) + 1/2 - 1/(12*rank)) where gamma is the Euler-Mascheroni constant. Arguments size : Int, number of possible words to sample. sampling_factor : The sampling factor in the word2vec formula. Returns A 1D Numpy array of length size where the ith entry is the probability that a word of rank i should be sampled.\"), ('title', 'Sequence Preprocessing')]), OrderedDict([('location', 'preprocessing/sequence.html#timeseriesgenerator'), ('text', 'keras.preprocessing.sequence.TimeseriesGenerator(data, targets, length, sampling_rate=1, stride=1, start_index=0, end_index=None, shuffle=False, reverse=False, batch_size=128) Utility class for generating batches of temporal data. This class takes in a sequence of data-points gathered at equal intervals, along with time series parameters such as stride, length of history, etc., to produce batches for training/validation. Arguments data : Indexable generator (such as list or Numpy array) containing consecutive data points (timesteps). The data should be 2D, and axis 0 is expected to be the time dimension. targets : Targets corresponding to timesteps in data . It should have the same length as data . length : Length of the output sequences (in number of timesteps). sampling_rate : Period between successive individual timesteps within sequences. For rate r , timesteps data[i] , data[i-r] , ... data[i - length] are used to create a sample sequence. stride : Period between successive output sequences. 
For stride s , consecutive output samples would be centered around data[i] , data[i+s] , data[i+2*s] , etc. start_index : Data points earlier than start_index will not be used in the output sequences. This is useful to reserve part of the data for test or validation. end_index : Data points later than end_index will not be used in the output sequences. This is useful to reserve part of the data for test or validation. shuffle : Whether to shuffle output samples, or instead draw them in chronological order. reverse : Boolean: if true , timesteps in each output sample will be in reverse chronological order. batch_size : Number of timeseries samples in each batch (except maybe the last one). Returns A Sequence instance. Examples from keras.preprocessing.sequence import TimeseriesGenerator import numpy as np data = np.array([[i] for i in range(50)]) targets = np.array([[i] for i in range(50)]) data_gen = TimeseriesGenerator(data, targets, length=10, sampling_rate=2, batch_size=2) assert len(data_gen) == 20 batch_0 = data_gen[0] x, y = batch_0 assert np.array_equal(x, np.array([[[0], [2], [4], [6], [8]], [[1], [3], [5], [7], [9]]])) assert np.array_equal(y, np.array([[10], [11]]))'), ('title', 'TimeseriesGenerator')]), OrderedDict([('location', 'preprocessing/sequence.html#pad_sequences'), ('text', \"keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32', padding='pre', truncating='pre', value=0.0) Pads sequences to the same length. This function transforms a list of num_samples sequences (lists of integers) into a 2D Numpy array of shape (num_samples, num_timesteps) . num_timesteps is either the maxlen argument if provided, or the length of the longest sequence otherwise. Sequences that are shorter than num_timesteps are padded with value. Sequences longer than num_timesteps are truncated so that they fit the desired length. The position where padding or truncation happens is determined by the arguments padding and truncating , respectively. Pre-padding is the default. Arguments sequences : List of lists, where each element is a sequence. maxlen : Int, maximum length of all sequences. dtype : Type of the output sequences. To pad sequences with variable length strings, you can use object . padding : String, 'pre' or 'post': pad either before or after each sequence. truncating : String, 'pre' or 'post': remove values from sequences larger than maxlen , either at the beginning or at the end of the sequences. value : Float or String, padding value. Returns x : Numpy array with shape (len(sequences), maxlen) Raises ValueError : In case of invalid values for truncating or padding , or in case of invalid shape for a sequences entry.\"), ('title', 'pad_sequences')]), OrderedDict([('location', 'preprocessing/sequence.html#skipgrams'), ('text', 'keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size, window_size=4, negative_samples=1.0, shuffle=True, categorical=False, sampling_table=None, seed=None) Generates skipgram word pairs. This function transforms a sequence of word indexes (list of integers) into tuples of words of the form: (word, word in the same window), with label 1 (positive samples). (word, random word from the vocabulary), with label 0 (negative samples). Read more about Skipgram in this gnomic paper by Mikolov et al.: Efficient Estimation of Word Representations in Vector Space Arguments sequence : A word sequence (sentence), encoded as a list of word indices (integers). 
If using a sampling_table , word indices are expected to match the rank of the words in a reference dataset (e.g. 10 would encode the 10-th most frequently occurring token). Note that index 0 is expected to be a non-word and will be skipped. vocabulary_size : Int, maximum possible word index + 1. window_size : Int, size of sampling windows (technically half-window). The window of a word w_i will be [i - window_size, i + window_size+1] . negative_samples : Float >= 0. 0 for no negative (i.e. random) samples. 1 for same number as positive samples. shuffle : Whether to shuffle the word couples before returning them. categorical : bool. if False, labels will be integers (e.g. [0, 1, 1 .. ] ), if True , labels will be categorical, e.g. [[1,0],[0,1],[0,1] .. ] . sampling_table : 1D array of size vocabulary_size where the entry i encodes the probability to sample a word of rank i. seed : Random seed. Returns couples, labels: where couples are int pairs and labels are either 0 or 1. Note By convention, index 0 in the vocabulary is a non-word and will be skipped.'), ('title', 'skipgrams')]), OrderedDict([('location', 'preprocessing/sequence.html#make_sampling_table'), ('text', \"keras.preprocessing.sequence.make_sampling_table(size, sampling_factor=1e-05) Generates a word rank-based probabilistic sampling table. Used for generating the sampling_table argument for skipgrams . sampling_table[i] is the probability of sampling the i-th most common word in a dataset (more common words should be sampled less frequently, for balance). The sampling probabilities are generated according to the sampling distribution used in word2vec: p(word) = (min(1, sqrt(word_frequency / sampling_factor) / (word_frequency / sampling_factor))) We assume that the word frequencies follow Zipf's law (s=1) to derive a numerical approximation of frequency(rank): frequency(rank) ~ 1/(rank * (log(rank) + gamma) + 1/2 - 1/(12*rank)) where gamma is the Euler-Mascheroni constant. Arguments size : Int, number of possible words to sample. sampling_factor : The sampling factor in the word2vec formula. Returns A 1D Numpy array of length size where the ith entry is the probability that a word of rank i should be sampled.\"), ('title', 'make_sampling_table')]), OrderedDict([('location', 'preprocessing/text.html'), ('text', 'Text Preprocessing [source] Tokenizer keras.preprocessing.text.Tokenizer(num_words=None, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\', char_level=False, oov_token=None, document_count=0) Text tokenization utility class. This class allows you to vectorize a text corpus, by turning each text into either a sequence of integers (each integer being the index of a token in a dictionary) or into a vector where the coefficient for each token could be binary, based on word count, based on tf-idf... Arguments num_words : the maximum number of words to keep, based on word frequency. Only the most common num_words words will be kept. filters : a string where each element is a character that will be filtered from the texts. The default is all punctuation, plus tabs and line breaks, minus the \\' character. lower : boolean. Whether to convert the texts to lowercase. split : str. Separator for word splitting. char_level : if True, every character will be treated as a token. 
oov_token : if given, it will be added to word_index and used to replace out-of-vocabulary words during text_to_sequence calls. By default, all punctuation is removed, turning the texts into space-separated sequences of words (words may include the \\' character). These sequences are then split into lists of tokens. They will then be indexed or vectorized. 0 is a reserved index that won\\'t be assigned to any word. hashing_trick keras.preprocessing.text.hashing_trick(text, n, hash_function=None, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\') Converts a text to a sequence of indexes in a fixed-size hashing space. Arguments text : Input text (string). n : Dimension of the hashing space. hash_function : defaults to the python hash function, can be \\'md5\\' or any function that takes a string as input and returns an int. Note that \\'hash\\' is not a stable hashing function, so it is not consistent across different runs, while \\'md5\\' is a stable hashing function. filters : list (or concatenation) of characters to filter out, such as punctuation. Default: `!\"#$%&()*+,-./:;<=>?@[\\\\]^_ {|}~ ``, includes basic punctuation, tabs, and newlines. lower : boolean. Whether to set the text to lowercase. split : str. Separator for word splitting. Returns A list of integer word indices (unicity non-guaranteed). 0 is a reserved index that won\\'t be assigned to any word. Two or more words may be assigned to the same index, due to possible collisions by the hashing function. The probability of a collision is in relation to the dimension of the hashing space and the number of distinct objects. one_hot keras.preprocessing.text.one_hot(text, n, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\') One-hot encodes a text into a list of word indexes of size n. This is a wrapper to the hashing_trick function using hash as the hashing function; unicity of word to index mapping non-guaranteed. Arguments text : Input text (string). n : int. Size of vocabulary. filters : list (or concatenation) of characters to filter out, such as punctuation. Default: `!\"#$%&()*+,-./:;<=>?@[\\\\]^_ {|}~ ``, includes basic punctuation, tabs, and newlines. lower : boolean. Whether to set the text to lowercase. split : str. Separator for word splitting. Returns List of integers in [1, n]. Each integer encodes a word (unicity non-guaranteed). text_to_word_sequence keras.preprocessing.text.text_to_word_sequence(text, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\') Converts a text to a sequence of words (or tokens). Arguments text : Input text (string). filters : list (or concatenation) of characters to filter out, such as punctuation. Default: `!\"#$%&()*+,-./:;<=>?@[\\\\]^_ {|}~ ``, includes basic punctuation, tabs, and newlines. lower : boolean. Whether to convert the input to lowercase. split : str. Separator for word splitting. Returns A list of words (or tokens).'), ('title', 'Text Preprocessing')]), OrderedDict([('location', 'preprocessing/text.html#text-preprocessing'), ('text', '[source]'), ('title', 'Text Preprocessing')]), OrderedDict([('location', 'preprocessing/text.html#tokenizer'), ('text', 'keras.preprocessing.text.Tokenizer(num_words=None, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\', char_level=False, oov_token=None, document_count=0) Text tokenization utility class. 
This class allows you to vectorize a text corpus, by turning each text into either a sequence of integers (each integer being the index of a token in a dictionary) or into a vector where the coefficient for each token could be binary, based on word count, based on tf-idf... Arguments num_words : the maximum number of words to keep, based on word frequency. Only the most common num_words words will be kept. filters : a string where each element is a character that will be filtered from the texts. The default is all punctuation, plus tabs and line breaks, minus the \\' character. lower : boolean. Whether to convert the texts to lowercase. split : str. Separator for word splitting. char_level : if True, every character will be treated as a token. oov_token : if given, it will be added to word_index and used to replace out-of-vocabulary words during text_to_sequence calls. By default, all punctuation is removed, turning the texts into space-separated sequences of words (words may include the \\' character). These sequences are then split into lists of tokens. They will then be indexed or vectorized. 0 is a reserved index that won\\'t be assigned to any word.'), ('title', 'Tokenizer')]), OrderedDict([('location', 'preprocessing/text.html#hashing_trick'), ('text', 'keras.preprocessing.text.hashing_trick(text, n, hash_function=None, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\') Converts a text to a sequence of indexes in a fixed-size hashing space. Arguments text : Input text (string). n : Dimension of the hashing space. hash_function : defaults to the python hash function, can be \\'md5\\' or any function that takes a string as input and returns an int. Note that \\'hash\\' is not a stable hashing function, so it is not consistent across different runs, while \\'md5\\' is a stable hashing function. filters : list (or concatenation) of characters to filter out, such as punctuation. Default: `!\"#$%&()*+,-./:;<=>?@[\\\\]^_ {|}~ ``, includes basic punctuation, tabs, and newlines. lower : boolean. Whether to set the text to lowercase. split : str. Separator for word splitting. Returns A list of integer word indices (unicity non-guaranteed). 0 is a reserved index that won\\'t be assigned to any word. Two or more words may be assigned to the same index, due to possible collisions by the hashing function. The probability of a collision is in relation to the dimension of the hashing space and the number of distinct objects.'), ('title', 'hashing_trick')]), OrderedDict([('location', 'preprocessing/text.html#one_hot'), ('text', 'keras.preprocessing.text.one_hot(text, n, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\') One-hot encodes a text into a list of word indexes of size n. This is a wrapper to the hashing_trick function using hash as the hashing function; unicity of word to index mapping non-guaranteed. Arguments text : Input text (string). n : int. Size of vocabulary. filters : list (or concatenation) of characters to filter out, such as punctuation. Default: `!\"#$%&()*+,-./:;<=>?@[\\\\]^_ {|}~ ``, includes basic punctuation, tabs, and newlines. lower : boolean. Whether to set the text to lowercase. split : str. Separator for word splitting. Returns List of integers in [1, n]. 
Each integer encodes a word (unicity non-guaranteed).'), ('title', 'one_hot')]), OrderedDict([('location', 'preprocessing/text.html#text_to_word_sequence'), ('text', 'keras.preprocessing.text.text_to_word_sequence(text, filters=\\'!\"#$%&()*+,-./:;<=>?@[\\\\]^_`{|}~ \\', lower=True, split=\\' \\') Converts a text to a sequence of words (or tokens). Arguments text : Input text (string). filters : list (or concatenation) of characters to filter out, such as punctuation. Default: `!\"#$%&()*+,-./:;<=>?@[\\\\]^_ {|}~ ``, includes basic punctuation, tabs, and newlines. lower : boolean. Whether to convert the input to lowercase. split : str. Separator for word splitting. Returns A list of words (or tokens).'), ('title', 'text_to_word_sequence')])])])" }
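The sequence- and text-preprocessing entries indexed above fit together into a common pipeline: fit a Tokenizer on a corpus, convert texts to integer sequences, then pad them to a fixed length. The sketch below is an editorial illustration against the keras 2.2.x API documented above, not part of the packaged docs; the toy corpus and the maxlen value are invented.

    from keras.preprocessing.text import Tokenizer
    from keras.preprocessing.sequence import pad_sequences

    texts = ['The cat sat on the mat', 'The dog ate my homework']  # hypothetical corpus

    # Fit a vocabulary; index 0 stays reserved, as the Tokenizer entry notes.
    tokenizer = Tokenizer(num_words=100, oov_token='<unk>')
    tokenizer.fit_on_texts(texts)
    sequences = tokenizer.texts_to_sequences(texts)

    # Pre-padding is the default; padding='post' pads at the end instead.
    padded = pad_sequences(sequences, maxlen=8, padding='post', value=0.0)
    print(padded.shape)  # (2, 8)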
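Likewise, the skipgrams and make_sampling_table entries describe the word2vec-style sampling helpers. A minimal sketch with invented word indices follows; note that passing the table as sampling_table= only makes sense when indices are frequency ranks, and in a toy sentence it would discard most pairs, so it is built here but not passed.

    from keras.preprocessing.sequence import skipgrams, make_sampling_table

    vocabulary_size = 100
    # Rank-based sampling probabilities, per the word2vec formula above.
    table = make_sampling_table(vocabulary_size)

    sentence = [5, 12, 7, 33, 2]  # toy word indices; 0 is reserved and skipped
    couples, labels = skipgrams(sentence, vocabulary_size,
                                window_size=2, negative_samples=1.0, seed=1)
    # couples: [target, context] int pairs; labels: 1 = real context, 0 = negative sample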
    
Offset 1230, 31 lines modifiedOffset 1230, 36 lines modified
1230 ········{1230 ········{
1231 ············"location":·"backend.html#backend",1231 ············"location":·"backend.html#backend",
1232 ············"text":·"keras.backend.backend()·Publicly·accessible·method·for·determining·the·current·backend.·Returns·String,·the·name·of·the·backend·Keras·is·currently·using.·Example·>>>·keras.backend.backend()·'tensorflow'",1232 ············"text":·"keras.backend.backend()·Publicly·accessible·method·for·determining·the·current·backend.·Returns·String,·the·name·of·the·backend·Keras·is·currently·using.·Example·>>>·keras.backend.backend()·'tensorflow'",
1233 ············"title":·"backend"1233 ············"title":·"backend"
1234 ········},1234 ········},
1235 ········{1235 ········{
1236 ············"location":·"callbacks.html",1236 ············"location":·"callbacks.html",
1237 ············"text":·"Usage·of·callbacks·A·callback·is·a·set·of·functions·to·be·applied·at·given·stages·of·the·training·procedure.·You·can·use·callbacks·to·get·a·view·on·internal·states·and·statistics·of·the·model·during·training.·You·can·pass·a·list·of·callbacks·(as·the·keyword·argument·callbacks·)·to·the·.fit()·method·of·the·Sequential·or·Model·classes.·The·relevant·methods·of·the·callbacks·will·then·be·called·at·each·stage·of·the·training.·[source]·CSVLogger·keras.callbacks.CSVLogger(filename,·separator=',',·append=False)·Callback·that·streams·epoch·results·to·a·csv·file.·Supports·all·values·that·can·be·represented·as·a·string,·including·1D·iterables·such·as·np.ndarray.·Example·csv_logger·=·CSVLogger('training.log')·model.fit(X_train,·Y_train,·callbacks=[csv_logger])·Arguments·filename·:·filename·of·the·csv·file,·e.g.·'run/log.csv'.·separator·:·string·used·to·separate·elements·in·the·csv·file.·append·:·True:·append·if·file·exists·(useful·for·continuing·training).·False:·overwrite·existing·file,·[source]·LambdaCallback·keras.callbacks.LambdaCallback(on_epoch_begin=None,·on_epoch_end=None,·on_batch_begin=None,·on_batch_end=None,·on_train_begin=None,·on_train_end=None)·Callback·for·creating·simple,·custom·callbacks·on-the-fly.·This·callback·is·constructed·with·anonymous·functions·that·will·be·called·at·the·appropriate·time.·Note·that·the·callbacks·expects·positional·arguments,·as:·on_epoch_begin·and·on_epoch_end·expect·two·positional·arguments:·epoch·,·logs[·...·truncated·by·diffoscope;·len:·11554,·SHA:·6065bf115a8da85f77959b37e0ac7b395ffdd86fdb5dd7acd748f001bf64f8c3·...·]·Create·a·callback·You·can·create·a·custom·callback·by·extending·the·base·class·keras.callbacks.Callback·.·A·callback·has·access·to·its·associated·model·through·the·class·property·self.model·.·Here's·a·simple·example·saving·a·list·of·losses·over·each·batch·during·training:·class·LossHistory(keras.callbacks.Callback):·def·on_train_begin(self,·logs={}):·self.losses·=·[]·def·on_batch_end(self,·batch,·logs={}):·self.losses.append(logs.get('loss'))·Example:·recording·loss·history·class·LossHistory(keras.callbacks.Callback):·def·on_train_begin(self,·logs={}):·self.losses·=·[]·def·on_batch_end(self,·batch,·logs={}):·self.losses.append(logs.get('loss'))·model·=·Sequential()·model.add(Dense(10,·input_dim=784,·kernel_initializer='uniform'))·model.add(Activation('softmax'))·model.compile(loss='categorical_crossentropy',·optimizer='rmsprop')·history·=·LossHistory()·model.fit(x_train,·y_train,·batch_size=128,·epochs=20,·verbose=0,·callbacks=[history])·print(history.losses)·#·outputs·'''·[0.66047596406559383,·0.3547245744908703,·...,·0.25953155204159617,·0.25901699725311789]·'''·Example:·model·checkpoints·from·keras.callbacks·import·ModelCheckpoint·model·=·Sequential()·model.add(Dense(10,·input_dim=784,·kernel_initializer='uniform'))·model.add(Activation('softmax'))·model.compile(loss='categorical_crossentropy',·optimizer='rmsprop')·'''·saves·the·model·weights·after·each·epoch·if·the·validation·loss·decreased·'''·checkpointer·=·ModelCheckpoint(filepath='/tmp/weights.hdf5',·verbose=1,·save_best_only=True)·model.fit(x_train,·y_train,·batch_size=128,·epochs=20,·verbose=0,·validation_data=(X_test,·Y_test),·callbacks=[checkpointer])",1237 
············"text":·"Usage·of·callbacks·A·callback·is·a·set·of·functions·to·be·applied·at·given·stages·of·the·training·procedure.·You·can·use·callbacks·to·get·a·view·on·internal·states·and·statistics·of·the·model·during·training.·You·can·pass·a·list·of·callbacks·(as·the·keyword·argument·callbacks·)·to·the·.fit()·method·of·the·Sequential·or·Model·classes.·The·relevant·methods·of·the·callbacks·will·then·be·called·at·each·stage·of·the·training.·[source]·Callback·keras.callbacks.Callback()·Abstract·base·class·used·to·build·new·callbacks.·Properties·params·:·dict.·Training·parameters·(eg.·verbosity,·batch·size,·number·of·epochs...).·model·:·instance·of·keras.models.Model·.·Reference·of·the·model·being·trained.·The·logs·dictionary·that·callback·methods·take·as·argument·will·contain·keys·for·quantities·relevant·to·the·current·batch·or·epoch.·Currently,·the·.fit()·method·of·the·Sequential·model·class·will·include·the·following·quantities·in·the·logs·that·it·passes·to·its·callbacks:·on_epoch_end:·logs·include·acc·and·loss·,·and·optionally·include·val_loss·(if·validation·is·enabled·in·fit·),·and·val_acc·(if·validation·and·accuracy·monitoring·are·enabled).·on_batch_begin:·logs·include·size·,·the·number·of·samples·in·the·current·batch.·on_batch_end:·logs·include·loss·,·and·optionally·acc·(if·accuracy·monitoring·is·enabled).·[source]·BaseLogger·keras.callbacks.BaseLogger(stateful_metrics=None)·Callback·that·accumulates·epoch·averages·of·metrics.·This·callback·is·automa[·...·truncated·by·diffoscope;·len:·11554,·SHA:·ef1b9f5a546696aa6e1193a9fe993930d01bc1c548d1b82ab689eed2ddb78fcd·...·]·Create·a·callback·You·can·create·a·custom·callback·by·extending·the·base·class·keras.callbacks.Callback·.·A·callback·has·access·to·its·associated·model·through·the·class·property·self.model·.·Here's·a·simple·example·saving·a·list·of·losses·over·each·batch·during·training:·class·LossHistory(keras.callbacks.Callback):·def·on_train_begin(self,·logs={}):·self.losses·=·[]·def·on_batch_end(self,·batch,·logs={}):·self.losses.append(logs.get('loss'))·Example:·recording·loss·history·class·LossHistory(keras.callbacks.Callback):·def·on_train_begin(self,·logs={}):·self.losses·=·[]·def·on_batch_end(self,·batch,·logs={}):·self.losses.append(logs.get('loss'))·model·=·Sequential()·model.add(Dense(10,·input_dim=784,·kernel_initializer='uniform'))·model.add(Activation('softmax'))·model.compile(loss='categorical_crossentropy',·optimizer='rmsprop')·history·=·LossHistory()·model.fit(x_train,·y_train,·batch_size=128,·epochs=20,·verbose=0,·callbacks=[history])·print(history.losses)·#·outputs·'''·[0.66047596406559383,·0.3547245744908703,·...,·0.25953155204159617,·0.25901699725311789]·'''·Example:·model·checkpoints·from·keras.callbacks·import·ModelCheckpoint·model·=·Sequential()·model.add(Dense(10,·input_dim=784,·kernel_initializer='uniform'))·model.add(Activation('softmax'))·model.compile(loss='categorical_crossentropy',·optimizer='rmsprop')·'''·saves·the·model·weights·after·each·epoch·if·the·validation·loss·decreased·'''·checkpointer·=·ModelCheckpoint(filepath='/tmp/weights.hdf5',·verbose=1,·save_best_only=True)·model.fit(x_train,·y_train,·batch_size=128,·epochs=20,·verbose=0,·validation_data=(X_test,·Y_test),·callbacks=[checkpointer])",
1238 ············"title":·"Callbacks"1238 ············"title":·"Callbacks"
1239 ········},1239 ········},
1240 ········{1240 ········{
1241 ············"location":·"callbacks.html#usage-of-callbacks",1241 ············"location":·"callbacks.html#usage-of-callbacks",
1242 ············"text":·"A·callback·is·a·set·of·functions·to·be·applied·at·given·stages·of·the·training·procedure.·You·can·use·callbacks·to·get·a·view·on·internal·states·and·statistics·of·the·model·during·training.·You·can·pass·a·list·of·callbacks·(as·the·keyword·argument·callbacks·)·to·the·.fit()·method·of·the·Sequential·or·Model·classes.·The·relevant·methods·of·the·callbacks·will·then·be·called·at·each·stage·of·the·training.·[source]",1242 ············"text":·"A·callback·is·a·set·of·functions·to·be·applied·at·given·stages·of·the·training·procedure.·You·can·use·callbacks·to·get·a·view·on·internal·states·and·statistics·of·the·model·during·training.·You·can·pass·a·list·of·callbacks·(as·the·keyword·argument·callbacks·)·to·the·.fit()·method·of·the·Sequential·or·Model·classes.·The·relevant·methods·of·the·callbacks·will·then·be·called·at·each·stage·of·the·training.·[source]",
1243 ············"title":·"Usage·of·callbacks"1243 ············"title":·"Usage·of·callbacks"
1244 ········},1244 ········},
1245 ········{1245 ········{
1246 ············"location":·"callbacks.html#csvlogger",1246 ············"location":·"callbacks.html#callback",
1247 ············"text":·"keras.callbacks.CSVLogger(filename,·separator=',',·append=False)·Callback·that·streams·epoch·results·to·a·csv·file.·Supports·all·values·that·can·be·represented·as·a·string,·including·1D·iterables·such·as·np.ndarray.·Example·csv_logger·=·CSVLogger('training.log')·model.fit(X_train,·Y_train,·callbacks=[csv_logger])·Arguments·filename·:·filename·of·the·csv·file,·e.g.·'run/log.csv'.·separator·:·string·used·to·separate·elements·in·the·csv·file.·append·:·True:·append·if·file·exists·(useful·for·continuing·training).·False:·overwrite·existing·file,·[source]",1247 ············"text":·"keras.callbacks.Callback()·Abstract·base·class·used·to·build·new·callbacks.·Properties·params·:·dict.·Training·parameters·(eg.·verbosity,·batch·size,·number·of·epochs...).·model·:·instance·of·keras.models.Model·.·Reference·of·the·model·being·trained.·The·logs·dictionary·that·callback·methods·take·as·argument·will·contain·keys·for·quantities·relevant·to·the·current·batch·or·epoch.·Currently,·the·.fit()·method·of·the·Sequential·model·class·will·include·the·following·quantities·in·the·logs·that·it·passes·to·its·callbacks:·on_epoch_end:·logs·include·acc·and·loss·,·and·optionally·include·val_loss·(if·validation·is·enabled·in·fit·),·and·val_acc·(if·validation·and·accuracy·monitoring·are·enabled).·on_batch_begin:·logs·include·size·,·the·number·of·samples·in·the·current·batch.·on_batch_end:·logs·include·loss·,·and·optionally·acc·(if·accuracy·monitoring·is·enabled).·[source]",
1248 ············"title":·"CSVLogger"1248 ············"title":·"Callback"
1249 ········},1249 ········},
1250 ········{1250 ········{
1251 ············"location":·"callbacks.html#lambdacallback",1251 ············"location":·"callbacks.html#baselogger",
1252 ············"text":·"keras.callbacks.LambdaCallback(on_epoch_begin=None,·on_epoch_end=None,·on_batch_begin=None,·on_batch_end=None,·on_train_begin=None,·on_train_end=None)·Callback·for·creating·simple,·custom·callbacks·on-the-fly.·This·callback·is·constructed·with·anonymous·functions·that·will·be·called·at·the·appropriate·time.·Note·that·the·callbacks·expects·positional·arguments,·as:·on_epoch_begin·and·on_epoch_end·expect·two·positional·arguments:·epoch·,·logs·on_batch_begin·and·on_batch_end·expect·two·positional·arguments:·batch·,·logs·on_train_begin·and·on_train_end·expect·one·positional·argument:·logs·Arguments·on_epoch_begin·:·called·at·the·beginning·of·every·epoch.·on_epoch_end·:·called·at·the·end·of·every·epoch.·on_batch_begin·:·called·at·the·beginning·of·every·batch.·on_batch_end·:·called·at·the·end·of·every·batch.·on_train_begin·:·called·at·the·beginning·of·model·training.·on_train_end·:·called·at·the·end·of·model·training.·Example·#·Print·the·batch·number·at·the·beginning·of·every·batch.·batch_print_callback·=·LambdaCallback(·on_batch_[·...·truncated·by·diffoscope;·len:·719,·SHA:·fd9128fa6e3a0bcd532dcc0ec751df2a9a37ae0f2d6605b584f4289f2efedacb·...·]·[source]",1252 ············"text":·"keras.callbacks.BaseLogger(stateful_metrics=None)·Callback·that·accumulates·epoch·averages·of·metrics.·This·callback·is·automatically·applied·to·every·Keras·model.·Arguments·stateful_metrics·:·Iterable·of·string·names·of·metrics·that·should·not·be·averaged·over·an·epoch.·Metrics·in·this·list·will·be·logged·as-is·in·on_epoch_end·.·All·others·will·be·averaged·in·on_epoch_end·.·[source]",
1253 ············"title":·"LambdaCallback"1253 ············"title":·"BaseLogger"
 1254 ········},
 1255 ········{
 1256 ············"location":·"callbacks.html#terminateonnan",
 1257 ············"text":·"keras.callbacks.TerminateOnNaN()·Callback·that·terminates·training·when·a·NaN·loss·is·encountered.·[source]",
 1258 ············"title":·"TerminateOnNaN"
1254 ········},1259 ········},
1255 ········{1260 ········{
1256 ············"location":·"callbacks.html#progbarlogger",1261 ············"location":·"callbacks.html#progbarlogger",
1257 ············"text":·"keras.callbacks.ProgbarLogger(count_mode='samples',·stateful_metrics=None)·Callback·that·prints·metrics·to·stdout.·Arguments·count_mode·:·One·of·\"steps\"·or·\"samples\".·Whether·the·progress·bar·should·count·samples·seen·or·steps·(batches)·seen.·stateful_metrics·:·Iterable·of·string·names·of·metrics·that·should·not·be·averaged·over·an·epoch.·Metrics·in·this·list·will·be·logged·as-is.·All·others·will·be·averaged·over·time·(e.g.·loss,·etc).·Raises·ValueError·:·In·case·of·invalid·count_mode·.·[source]",1262 ············"text":·"keras.callbacks.ProgbarLogger(count_mode='samples',·stateful_metrics=None)·Callback·that·prints·metrics·to·stdout.·Arguments·count_mode·:·One·of·\"steps\"·or·\"samples\".·Whether·the·progress·bar·should·count·samples·seen·or·steps·(batches)·seen.·stateful_metrics·:·Iterable·of·string·names·of·metrics·that·should·not·be·averaged·over·an·epoch.·Metrics·in·this·list·will·be·logged·as-is.·All·others·will·be·averaged·over·time·(e.g.·loss,·etc).·Raises·ValueError·:·In·case·of·invalid·count_mode·.·[source]",
1258 ············"title":·"ProgbarLogger"1263 ············"title":·"ProgbarLogger"
1259 ········},1264 ········},
1260 ········{1265 ········{
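The reordered index entries above cover the Callback base class, BaseLogger, and the newly indexed TerminateOnNaN. As a quick editorial illustration of how these pieces combine (a minimal sketch: the model and the x_train/y_train arrays are assumed to exist; only the callback wiring follows the API documented above):

    import keras

    class LossHistory(keras.callbacks.Callback):
        # Collects the per-batch 'loss' entry from the logs dict described above.
        def on_train_begin(self, logs=None):
            self.losses = []

        def on_batch_end(self, batch, logs=None):
            self.losses.append((logs or {}).get('loss'))

    callbacks = [LossHistory(), keras.callbacks.TerminateOnNaN()]
    # model.fit(x_train, y_train, batch_size=128, epochs=20, callbacks=callbacks)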
Offset 1289, 27 lines modifiedOffset 1294, 22 lines modified
1289 ········},1294 ········},
1290 ········{1295 ········{
1291 ············"location":·"callbacks.html#reducelronplateau",1296 ············"location":·"callbacks.html#reducelronplateau",
1292 ············"text":·"keras.callbacks.ReduceLROnPlateau(monitor='val_loss',·factor=0.1,·patience=10,·verbose=0,·mode='auto',·min_delta=0.0001,·cooldown=0,·min_lr=0)·Reduce·learning·rate·when·a·metric·has·stopped·improving.·Models·often·benefit·from·reducing·the·learning·rate·by·a·factor·of·2-10·once·learning·stagnates.·This·callback·monitors·a·quantity·and·if·no·improvement·is·seen·for·a·'patience'·number·of·epochs,·the·learning·rate·is·reduced.·Example·reduce_lr·=·ReduceLROnPlateau(monitor='val_loss',·factor=0.2,·patience=5,·min_lr=0.001)·model.fit(X_train,·Y_train,·callbacks=[reduce_lr])·Arguments·monitor·:·quantity·to·be·monitored.·factor·:·factor·by·which·the·learning·rate·will·be·reduced.·new_lr·=·lr·*·factor·patience·:·number·of·epochs·with·no·improvement·after·which·learning·rate·will·be·reduced.·verbose·:·int.·0:·quiet,·1:·update·messages.·mode·:·one·of·{auto,·min,·max}.·In·min·mode,·lr·will·be·reduced·when·the·quantity·monitored·has·stopped·decreasing;·in·max·mode·it·will·be·reduced·when·the·quantity·monitored·has·stopped·increasing;·in·auto·mode,·the·direction·is·automatically·inferred·from·the·name·of·the·monitored·quantity.·min_delta·:·threshold·for·measuring·the·new·optimum,·to·only·focus·on·significant·changes.·cooldown·:·number·of·epochs·to·wait·before·resuming·normal·operation·after·lr·has·been·reduced.·min_lr·:·lower·bound·on·the·learning·rate.·[source]",1297 ············"text":·"keras.callbacks.ReduceLROnPlateau(monitor='val_loss',·factor=0.1,·patience=10,·verbose=0,·mode='auto',·min_delta=0.0001,·cooldown=0,·min_lr=0)·Reduce·learning·rate·when·a·metric·has·stopped·improving.·Models·often·benefit·from·reducing·the·learning·rate·by·a·factor·of·2-10·once·learning·stagnates.·This·callback·monitors·a·quantity·and·if·no·improvement·is·seen·for·a·'patience'·number·of·epochs,·the·learning·rate·is·reduced.·Example·reduce_lr·=·ReduceLROnPlateau(monitor='val_loss',·factor=0.2,·patience=5,·min_lr=0.001)·model.fit(X_train,·Y_train,·callbacks=[reduce_lr])·Arguments·monitor·:·quantity·to·be·monitored.·factor·:·factor·by·which·the·learning·rate·will·be·reduced.·new_lr·=·lr·*·factor·patience·:·number·of·epochs·with·no·improvement·after·which·learning·rate·will·be·reduced.·verbose·:·int.·0:·quiet,·1:·update·messages.·mode·:·one·of·{auto,·min,·max}.·In·min·mode,·lr·will·be·reduced·when·the·quantity·monitored·has·stopped·decreasing;·in·max·mode·it·will·be·reduced·when·the·quantity·monitored·has·stopped·increasing;·in·auto·mode,·the·direction·is·automatically·inferred·from·the·name·of·the·monitored·quantity.·min_delta·:·threshold·for·measuring·the·new·optimum,·to·only·focus·on·significant·changes.·cooldown·:·number·of·epochs·to·wait·before·resuming·normal·operation·after·lr·has·been·reduced.·min_lr·:·lower·bound·on·the·learning·rate.·[source]",
1293 ············"title":·"ReduceLROnPlateau"1298 ············"title":·"ReduceLROnPlateau"
1294 ········},1299 ········},
1295 ········{1300 ········{
1296 ············"location":·"callbacks.html#callback",1301 ············"location":·"callbacks.html#csvlogger",
1297 ············"text":·"keras.callbacks.Callback()·Abstract·base·class·used·to·build·new·callbacks.·Properties·params·:·dict.·Training·parameters·(eg.·verbosity,·batch·size,·number·of·epochs...).·model·:·instance·of·keras.models.Model·.·Reference·of·the·model·being·trained.·The·logs·dictionary·that·callback·methods·take·as·argument·will·contain·keys·for·quantities·relevant·to·the·current·batch·or·epoch.·Currently,·the·.fit()·method·of·the·Sequential·model·class·will·include·the·following·quantities·in·the·logs·that·it·passes·to·its·callbacks:·on_epoch_end:·logs·include·acc·and·loss·,·and·optionally·include·val_loss·(if·validation·is·enabled·in·fit·),·and·val_acc·(if·validation·and·accuracy·monitoring·are·enabled).·on_batch_begin:·logs·include·size·,·the·number·of·samples·in·the·current·batch.·on_batch_end:·logs·include·loss·,·and·optionally·acc·(if·accuracy·monitoring·is·enabled).·[source]",1302 ············"text":·"keras.callbacks.CSVLogger(filename,·separator=',',·append=False)·Callback·that·streams·epoch·results·to·a·csv·file.·Supports·all·values·that·can·be·represented·as·a·string,·including·1D·iterables·such·as·np.ndarray.·Example·csv_logger·=·CSVLogger('training.log')·model.fit(X_train,·Y_train,·callbacks=[csv_logger])·Arguments·filename·:·filename·of·the·csv·file,·e.g.·'run/log.csv'.·separator·:·string·used·to·separate·elements·in·the·csv·file.·append·:·True:·append·if·file·exists·(useful·for·continuing·training).·False:·overwrite·existing·file,·[source]",
1298 ············"title":·"Callback"1303 ············"title":·"CSVLogger"
1299 ········}, 
1300 ········{ 
1301 ············"location":·"callbacks.html#baselogger", 
1302 ············"text":·"keras.callbacks.BaseLogger(stateful_metrics=None)·Callback·that·accumulates·epoch·averages·of·metrics.·This·callback·is·automatically·applied·to·every·Keras·model.·Arguments·stateful_metrics·:·Iterable·of·string·names·of·metrics·that·should·not·be·averaged·over·an·epoch.·Metrics·in·this·list·will·be·logged·as-is·in·on_epoch_end·.·All·others·will·be·averaged·in·on_epoch_end·.·[source]", 
1303 ············"title":·"BaseLogger" 
1304 ········},1304 ········},
1305 ········{1305 ········{
1306 ············"location":·"callbacks.html#terminateonnan",1306 ············"location":·"callbacks.html#lambdacallback",
1307 ············"text":·"keras.callbacks.TerminateOnNaN()·Callback·that·terminates·training·when·a·NaN·loss·is·encountered.",1307 ············"text":·"keras.callbacks.LambdaCallback(on_epoch_begin=None,·on_epoch_end=None,·on_batch_begin=None,·on_batch_end=None,·on_train_begin=None,·on_train_end=None)·Callback·for·creating·simple,·custom·callbacks·on-the-fly.·This·callback·is·constructed·with·anonymous·functions·that·will·be·called·at·the·appropriate·time.·Note·that·the·callbacks·expects·positional·arguments,·as:·on_epoch_begin·and·on_epoch_end·expect·two·positional·arguments:·epoch·,·logs·on_batch_begin·and·on_batch_end·expect·two·positional·arguments:·batch·,·logs·on_train_begin·and·on_train_end·expect·one·positional·argument:·logs·Arguments·on_epoch_begin·:·called·at·the·beginning·of·every·epoch.·on_epoch_end·:·called·at·the·end·of·every·epoch.·on_batch_begin·:·called·at·the·beginning·of·every·batch.·on_batch_end·:·called·at·the·end·of·every·batch.·on_train_begin·:·called·at·the·beginning·of·model·training.·on_train_end·:·called·at·the·end·of·model·training.·Example·#·Print·the·batch·number·at·the·beginning·of·every·batch.·batch_print_callback·=·LambdaCallback(·on_batch_[·...·truncated·by·diffoscope;·len:·719,·SHA:·fd9128fa6e3a0bcd532dcc0ec751df2a9a37ae0f2d6605b584f4289f2efedacb·...·]",
1308 ············"title":·"TerminateOnNaN"1308 ············"title":·"LambdaCallback"
1309 ········},1309 ········},
1310 ········{1310 ········{
1311 ············"location":·"callbacks.html#create-a-callback",1311 ············"location":·"callbacks.html#create-a-callback",
1312 ············"text":·"You·can·create·a·custom·callback·by·extending·the·base·class·keras.callbacks.Callback·.·A·callback·has·access·to·its·associated·model·through·the·class·property·self.model·.·Here's·a·simple·example·saving·a·list·of·losses·over·each·batch·during·training:·class·LossHistory(keras.callbacks.Callback):·def·on_train_begin(self,·logs={}):·self.losses·=·[]·def·on_batch_end(self,·batch,·logs={}):·self.losses.append(logs.get('loss'))",1312 ············"text":·"You·can·create·a·custom·callback·by·extending·the·base·class·keras.callbacks.Callback·.·A·callback·has·access·to·its·associated·model·through·the·class·property·self.model·.·Here's·a·simple·example·saving·a·list·of·losses·over·each·batch·during·training:·class·LossHistory(keras.callbacks.Callback):·def·on_train_begin(self,·logs={}):·self.losses·=·[]·def·on_batch_end(self,·batch,·logs={}):·self.losses.append(logs.get('loss'))",
1313 ············"title":·"Create·a·callback"1313 ············"title":·"Create·a·callback"
1314 ········},1314 ········},
1315 ········{1315 ········{