Last Updated on May 31, 2022

Python is an interpreted language. This means an interpreter runs our program, rather than the code being compiled and run natively. In Python, a REPL (read-eval-print loop) can run commands line by line. Together with some inspection tools provided by Python, it helps you develop code.

In the following, you will see how to make use of the Python interpreter to inspect an object and develop a program.

After finishing this tutorial, you will learn:

- How to work in the Python interpreter
- How to use the inspection functions in Python
- How to develop a solution step by step with the help of inspection functions

Let’s get started!

## Tutorial Overview

This tutorial is in four parts; they are:

- PyTorch and TensorFlow
- Looking for Clues
- Learning from the Weights
- Making a Copier

## PyTorch and TensorFlow

PyTorch and TensorFlow are the two biggest neural network libraries in Python. Their code is different, but the things they can do are similar.

Consider the classic MNIST handwritten digit recognition problem; you can build a LeNet-5 model to classify the digits as follows:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision

# Load MNIST training data
transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor()
])
train = torchvision.datasets.MNIST('./datafiles/', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train, batch_size=32, shuffle=True)

# LeNet5 model
torch_model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=(5,5), stride=1, padding=2),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 120, kernel_size=5, stride=1, padding=0),
    nn.Tanh(),
    nn.Flatten(),
    nn.Linear(120, 84),
    nn.Tanh(),
    nn.Linear(84, 10),
    nn.Softmax(dim=1)
)

# Training loop
def training_loop(model, optimizer, loss_fn, train_loader, n_epochs=100):
    model.train()
    for epoch in range(n_epochs):
        for data, target in train_loader:
            output = model(data)
            loss = loss_fn(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    model.eval()

# Run training
optimizer = optim.Adam(torch_model.parameters())
loss_fn = nn.CrossEntropyLoss()
training_loop(torch_model, optimizer, loss_fn, train_loader, n_epochs=20)

# Save model
torch.save(torch_model, "lenet5.pt")
```

This is simplified code that skips validation and testing. The counterpart in TensorFlow is the following:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, AveragePooling2D, Flatten
from tensorflow.keras.datasets import mnist

# LeNet5 model
keras_model = Sequential([
    Conv2D(6, (5,5), input_shape=(28,28,1), padding="same", activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(16, (5,5), activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(120, (5,5), activation="tanh"),
    Flatten(),
    Dense(84, activation="tanh"),
    Dense(10, activation="softmax")
])

# Reshape data to shape of (n_sample, height, width, n_channel)
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.expand_dims(X_train, axis=3).astype('float32')

# Train
keras_model.compile(loss="sparse_categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
keras_model.fit(X_train, y_train, epochs=20, batch_size=32)

# Save
keras_model.save("lenet5.h5")
```

Running these programs gives you the file `lenet5.pt` from the PyTorch code and `lenet5.h5` from the TensorFlow code.

## Looking for Clues

If you understand what the above neural networks are doing, you should be able to tell that there is nothing but many multiply and add calculations in each layer. Mathematically, there is a matrix multiplication between the input and the **kernel** of each fully-connected layer before adding the **bias** to the result. In the convolutional layers, the kernel is multiplied element-wise with a portion of the input matrix, the products are summed, and the bias is added to give one output element of the feature map.
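The two calculations described above can be sketched in NumPy. This is only an illustration under assumptions: the array sizes are borrowed from the LeNet-5 model, the values are random, and the names `kernel`, `conv_kernel`, and `patch` are made up for this sketch rather than taken from either framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fully-connected layer: matrix multiplication, then add the bias
x = rng.random((1, 120))        # one sample with 120 input features
kernel = rng.random((120, 84))  # (input, output) layout, chosen for this sketch
bias = rng.random(84)
dense_out = x @ kernel + bias   # shape (1, 84)

# One output element of a convolutional feature map:
# multiply a 5x5 patch by the 5x5 kernel element-wise, sum, then add the bias
image = rng.random((28, 28))
conv_kernel = rng.random((5, 5))
conv_bias = 0.1
patch = image[0:5, 0:5]         # the top-left 5x5 portion of the input
conv_elem = (patch * conv_kernel).sum() + conv_bias

print(dense_out.shape, conv_elem)
```

Sliding the patch across the image and repeating the second calculation would fill in the rest of the feature map.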

Since the same LeNet-5 model is built with two different frameworks, it should be possible to make them work identically if their weights are the same. How can you copy the weights from one model to the other, given that their architectures are identical?

You can load the saved models as follows:

```python
import torch
import tensorflow as tf
torch_model = torch.load("lenet5.pt")
keras_model = tf.keras.models.load_model("lenet5.h5")
```

This probably does not tell you much. But if you run `python` in the command line without any parameters, you launch the REPL, in which you can type in the above code (you can leave the REPL with `quit()`):

```
Python 3.9.13 (main, May 19 2022, 13:48:47)
[Clang 13.1.6 (clang-1316.0.21.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import tensorflow as tf
>>> torch_model = torch.load("lenet5.pt")
>>> keras_model = tf.keras.models.load_model("lenet5.h5")
```

Nothing is printed by the above. But you can check the two models that were loaded using the `type()` built-in function:

```
>>> type(torch_model)
<class 'torch.nn.modules.container.Sequential'>
>>> type(keras_model)
<class 'keras.engine.sequential.Sequential'>
```

So here they are neural network models from PyTorch and Keras, respectively. Since they are trained models, the weights must be stored inside. So how can you find the weights in these models? Since they are objects, the easiest way is to use the `dir()` built-in function to inspect their members:

```
>>> dir(torch_model)
['T_destination', '__annotations__', '__call__', '__class__', '__delattr__',
'__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
...
'_slow_forward', '_state_dict_hooks', '_version', 'add_module', 'append', 'apply',
'bfloat16', 'buffers', 'children', 'cpu', 'cuda', 'double', 'dump_patches', 'eval',
'extra_repr', 'float', 'forward', 'get_buffer', 'get_extra_state', 'get_parameter',
'get_submodule', 'half', 'load_state_dict', 'modules', 'named_buffers',
'named_children', 'named_modules', 'named_parameters', 'parameters',
'register_backward_hook', 'register_buffer', 'register_forward_hook',
'register_forward_pre_hook', 'register_full_backward_hook', 'register_module',
'register_parameter', 'requires_grad_', 'set_extra_state', 'share_memory',
'state_dict', 'to', 'to_empty', 'train', 'training', 'type', 'xpu', 'zero_grad']
>>> dir(keras_model)
['_SCALAR_UPRANKING_ON', '_TF_MODULE_IGNORED_PROPERTIES', '__call__', '__class__',
'__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
...
'activity_regularizer', 'add', 'add_loss', 'add_metric', 'add_update', 'add_variable',
'add_weight', 'build', 'built', 'call', 'compile', 'compiled_loss', 'compiled_metrics',
'compute_dtype', 'compute_loss', 'compute_mask', 'compute_metrics',
'compute_output_shape', 'compute_output_signature', 'count_params',
'distribute_strategy', 'dtype', 'dtype_policy', 'dynamic', 'evaluate',
'evaluate_generator', 'finalize_state', 'fit', 'fit_generator', 'from_config',
'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_layer',
'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_weights', 'history',
'inbound_nodes', 'input', 'input_mask', 'input_names', 'input_shape', 'input_spec',
'inputs', 'layers', 'load_weights', 'loss', 'losses', 'make_predict_function',
'make_test_function', 'make_train_function', 'metrics', 'metrics_names', 'name',
'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'optimizer',
'outbound_nodes', 'output', 'output_mask', 'output_names', 'output_shape', 'outputs',
'pop', 'predict', 'predict_function', 'predict_generator', 'predict_on_batch',
'predict_step', 'reset_metrics', 'reset_states', 'run_eagerly', 'save', 'save_spec',
'save_weights', 'set_weights', 'state_updates', 'stateful', 'stop_training',
'submodules', 'summary', 'supports_masking', 'test_function', 'test_on_batch',
'test_step', 'to_json', 'to_yaml', 'train_function', 'train_on_batch', 'train_step',
'train_tf_function', 'trainable', 'trainable_variables', 'trainable_weights',
'updates', 'variable_dtype', 'variables', 'weights', 'with_name_scope']
```

There are many members in each object. Some are attributes, and some are methods of the class. By convention, those that begin with an underscore are internal members that you are not supposed to access in normal circumstances. If you want to see more of each member, you can use the `getmembers()` function from the `inspect` module:

```
>>> import inspect
>>> inspect.getmembers(torch_model)
[('T_destination', ~T_destination), ('__annotations__', {'_modules': typing.Dict[str, torch.nn.modules.module.Module]}),
 ('__call__', <bound method Module._call_impl of Sequential(
...
```

The output of the `getmembers()` function is a list of tuples, in which each tuple holds the name of the member and the member itself. From the above, for example, you can tell that `__call__` is a “bound method,” i.e., a method of the class bound to this object.
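The same inspection can be tried on a much smaller object. The following is a minimal sketch using a made-up `Greeter` class (not part of PyTorch or Keras) to show how `getmembers()` returns name and value pairs, how the underscore convention can be used as a filter, and how the optional `predicate` argument narrows the result to bound methods.

```python
import inspect

class Greeter:
    """A hypothetical class used only for this illustration."""

    def __init__(self, name):
        self.name = name      # a data attribute
        self._secret = 42     # leading underscore: internal by convention

    def greet(self):
        """Return a greeting."""
        return f"Hello, {self.name}"

obj = Greeter("REPL")

# getmembers() returns (name, value) tuples sorted by name;
# keep only the members that do not start with an underscore
public = [(n, v) for n, v in inspect.getmembers(obj) if not n.startswith("_")]
print([n for n, _ in public])

# A predicate narrows the list, e.g., to bound methods only
methods = inspect.getmembers(obj, predicate=inspect.ismethod)
print([n for n, _ in methods])
```

On a large object such as a neural network model, filtering this way can make the member list much easier to scan.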

By carefully looking at the members’ names, you can see that in the PyTorch model, names containing “state” should be of interest, while in the Keras model, there are some members with “weights” in their name. To shortlist them, you can do the following in the interpreter:

```
>>> [n for n in dir(torch_model) if 'state' in n]
['__setstate__', '_load_from_state_dict', '_load_state_dict_pre_hooks',
'_register_load_state_dict_pre_hook', '_register_state_dict_hook',
'_save_to_state_dict', '_state_dict_hooks', 'get_extra_state', 'load_state_dict',
'set_extra_state', 'state_dict']
>>> [n for n in dir(keras_model) if 'weight' in n]
['_assert_weights_created', '_captured_weight_regularizer',
'_check_sample_weight_warning', '_dedup_weights', '_handle_weight_regularization',
'_initial_weights', '_non_trainable_weights', '_trainable_weights',
'_undeduplicated_weights', 'add_weight', 'get_weights', 'load_weights',
'non_trainable_weights', 'save_weights', 'set_weights', 'trainable_weights', 'weights']
```
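If you expect to run such searches repeatedly, the list comprehension can be wrapped in a small helper. The function name `find_members` is hypothetical, and the case-insensitive match is a choice made for this sketch; it is tried here on a standard-library object so that it runs without either framework installed.

```python
from collections import OrderedDict

def find_members(obj, keyword):
    """Return the member names of obj that contain keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [name for name in dir(obj) if keyword in name.lower()]

d = OrderedDict()
print(find_members(d, "item"))
```

In the REPL session above, `find_members(torch_model, "state")` would play the same role as the first list comprehension.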

This might take some trial and error. But it is not too difficult, and you may discover that you can see the weights with `state_dict` in the torch model:

```
>>> torch_model.state_dict
<bound method Module.state_dict of Sequential(
  (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (1): Tanh()
  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (4): Tanh()
  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (6): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))
  (7): Tanh()
  (8): Flatten(start_dim=1, end_dim=-1)
  (9): Linear(in_features=120, out_features=84, bias=True)
  (10): Tanh()
  (11): Linear(in_features=84, out_features=10, bias=True)
  (12): Softmax(dim=1)
)>
>>> torch_model.state_dict()
OrderedDict([('0.weight', tensor([[[[ 0.1559,  0.1681,  0.2726,  0.3187,  0.4909],
          [ 0.1179,  0.1340, -0.0815, -0.3253,  0.0904],
          [ 0.2326, -0.2079, -0.8614, -0.8643, -0.0632],
          [ 0.3874, -0.3490, -0.7957, -0.5873, -0.0638],
          [ 0.2800,  0.0947,  0.0308,  0.4065,  0.6916]]],

        [[[ 0.5116,  0.1798, -0.1062, -0.4099, -0.3307],
          [ 0.1090,  0.0689, -0.1010, -0.9136, -0.5271],
          [ 0.2910,  0.2096, -0.2442, -1.5576, -0.0305],
...
```
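The session above first types the member name without parentheses and then with them. The difference can be sketched with a made-up `Model` class (a stand-in, not the real PyTorch class): without parentheses you get the bound method itself, which tells you it is callable; with parentheses you get whatever the method returns.

```python
class Model:
    """A hypothetical stand-in for a model object."""

    def state_dict(self):
        return {"0.weight": [0.1, 0.2], "0.bias": [0.0]}

model = Model()

member = model.state_dict      # no parentheses: the bound method itself
print(callable(member))        # callable, so it is a method rather than data
print(type(member).__name__)

result = model.state_dict()    # with parentheses: the value it returns
print(type(result).__name__, list(result))
```

Checking a member without calling it first is a cheap way to learn whether it is data or a function before you run anything.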

For the TensorFlow/Keras model, you can find the weights with `get_weights()`:

```
>>> keras_model.get_weights
<bound method Model.get_weights of <keras.engine.sequential.Sequential object at 0x159d93eb0>>
>>> keras_model.get_weights()
[array([[[[ 0.14078194,  0.04990018, -0.06204645, -0.03128023,
          -0.22033708,  0.19721672]],

        [[-0.06618818, -0.152075  ,  0.13130261,  0.22893831,
           0.08880515,  0.01917628]],

        [[-0.28716782, -0.23207009,  0.00505603,  0.2697424 ,
          -0.1916888 , -0.25858143]],

        [[-0.41863152, -0.20710683,  0.13254236,  0.18774481,
          -0.14866787, -0.14398652]],

        [[-0.25119543, -0.14405733, -0.048533  , -0.12108403,
           0.06704573, -0.1196835 ]]],


       [[[-0.2438466 ,  0.02499897, -0.1243961 , -0.20115352,
          -0.0241346 ,  0.15888865]],

        [[-0.20548582, -0.26495507,  0.21004884,  0.32183227,
          -0.13990627, -0.02996112]],
...
```

They are also available through the attribute `weights`:

```
>>> keras_model.weights
[<tf.Variable 'conv2d/kernel:0' shape=(5, 5, 1, 6) dtype=float32, numpy=
array([[[[ 0.14078194,  0.04990018, -0.06204645, -0.03128023,
          -0.22033708,  0.19721672]],

        [[-0.06618818, -0.152075  ,  0.13130261,  0.22893831,
           0.08880515,  0.01917628]],
...
         8.25365111e-02, -1.72486171e-01,  3.16280037e-01,
         4.12595004e-01]], dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=
array([-0.19007775,  0.14427921,  0.0571407 , -0.24149619, -0.03247226,
        0.18109408, -0.17159976,  0.21736498, -0.10254183,  0.02417901],
      dtype=float32)>]
```

Here, you can observe the following: In the PyTorch model, the function `state_dict()` gives an `OrderedDict`, which is a dictionary with the keys in a specified order. There are keys such as `0.weight`, and they are mapped to tensor values. In the Keras model, the `get_weights()` function returns a list. Each element in the list is a NumPy array. The `weights` attribute also holds a list, but the elements are of type `tf.Variable`.
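The practical difference between the two containers is how you address an element: by name in the `OrderedDict`, by position in the list. A minimal sketch, using plain Python lists as stand-ins for tensors and keys that imitate the ones seen above:

```python
from collections import OrderedDict

# Plain lists stand in for tensors; the keys imitate those seen above
states = OrderedDict()
states["0.weight"] = [[0.1, 0.2], [0.3, 0.4]]
states["0.bias"] = [0.0, 0.0]
states["3.weight"] = [[0.5, 0.6], [0.7, 0.8]]

# Lookup by name, as with the result of state_dict()
print(states["0.bias"])

# Positional access, as with the list from get_weights()
as_list = list(states.values())
print(as_list[1])

# Insertion order is preserved, so position i matches the i-th key
print(list(states.keys()))
```

Because the order is preserved, converting between the two forms with `list(states.values())` or `zip(states.keys(), some_list)` is straightforward.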

You can learn more by checking the shape of each tensor or array:

```
>>> [(key, val.shape) for key, val in torch_model.state_dict().items()]
[('0.weight', torch.Size([6, 1, 5, 5])), ('0.bias', torch.Size([6])),
('3.weight', torch.Size([16, 6, 5, 5])), ('3.bias', torch.Size([16])),
('6.weight', torch.Size([120, 16, 5, 5])), ('6.bias', torch.Size([120])),
('9.weight', torch.Size([84, 120])), ('9.bias', torch.Size([84])),
('11.weight', torch.Size([10, 84])), ('11.bias', torch.Size([10]))]
>>> [arr.shape for arr in keras_model.get_weights()]
[(5, 5, 1, 6), (6,), (5, 5, 6, 16), (16,), (5, 5, 16, 120), (120,), (120, 84), (84,), (84, 10), (10,)]
```

While you do not see the names of the layers from the Keras model above, you can use similar reasoning to find the layers and get their names:

```
>>> keras_model.layers
[<keras.layers.convolutional.conv2d.Conv2D object at 0x159ddd850>,
<keras.layers.pooling.average_pooling2d.AveragePooling2D object at 0x159ddd820>,
<keras.layers.convolutional.conv2d.Conv2D object at 0x15a12b1c0>,
<keras.layers.pooling.average_pooling2d.AveragePooling2D object at 0x15a1705e0>,
<keras.layers.convolutional.conv2d.Conv2D object at 0x15a1812b0>,
<keras.layers.reshaping.flatten.Flatten object at 0x15a194310>,
<keras.layers.core.dense.Dense object at 0x15a1947c0>,
<keras.layers.core.dense.Dense object at 0x15a194910>]
>>> [layer.name for layer in keras_model.layers]
['conv2d', 'average_pooling2d', 'conv2d_1', 'average_pooling2d_1', 'conv2d_2', 'flatten', 'dense', 'dense_1']
```

## Learning from the Weights

By comparing the result of `state_dict()` from the PyTorch model and that of `get_weights()` from the Keras model, you can see that they both contain 10 elements. From the shapes of the PyTorch tensors and NumPy arrays, you can further notice that they are similar. This is probably because both frameworks list the parts of a model in order from input to output. You can further confirm that by comparing the keys of the `state_dict()` output to the layer names from the Keras model.

You can check how to manipulate a PyTorch tensor by extracting one and inspecting it:

```
>>> torch_states = torch_model.state_dict()
>>> torch_states.keys()
odict_keys(['0.weight', '0.bias', '3.weight', '3.bias', '6.weight', '6.bias', '9.weight', '9.bias', '11.weight', '11.bias'])
>>> torch_states["0.weight"]
tensor([[[[ 0.1559,  0.1681,  0.2726,  0.3187,  0.4909],
          [ 0.1179,  0.1340, -0.0815, -0.3253,  0.0904],
          [ 0.2326, -0.2079, -0.8614, -0.8643, -0.0632],
          [ 0.3874, -0.3490, -0.7957, -0.5873, -0.0638],
          [ 0.2800,  0.0947,  0.0308,  0.4065,  0.6916]]],
...
        [[[ 0.0980,  0.0240,  0.3295,  0.4507,  0.4539],
          [-0.1530, -0.3991, -0.3834, -0.2716,  0.0809],
          [-0.4639, -0.5537, -1.0207, -0.8049, -0.4977],
          [ 0.1825, -0.1284, -0.0669, -0.4652, -0.2961],
          [ 0.3402,  0.4256,  0.4329,  0.1503,  0.4207]]]])
>>> dir(torch_states["0.weight"])
['H', 'T', '__abs__', '__add__', '__and__', '__array__', '__array_priority__',
'__array_wrap__', '__bool__', '__class__', '__complex__', '__contains__',
...
'trunc', 'trunc_', 'type', 'type_as', 'unbind', 'unflatten', 'unfold', 'uniform_',
'unique', 'unique_consecutive', 'unsafe_chunk', 'unsafe_split',
'unsafe_split_with_sizes', 'unsqueeze', 'unsqueeze_', 'values', 'var', 'vdot',
'view', 'view_as', 'vsplit', 'where', 'xlogy', 'xlogy_', 'xpu', 'zero_']
>>> torch_states["0.weight"].numpy()
array([[[[ 0.15587455,  0.16805592,  0.27259687,  0.31871665,
           0.49091515],
         [ 0.11791296,  0.13400094, -0.08148099, -0.32530317,
           0.09039831],
...
         [ 0.18252987, -0.12838107, -0.0669101 , -0.4652463 ,
          -0.2960882 ],
         [ 0.34022188,  0.4256311 ,  0.4328527 ,  0.15025541,
           0.4207182 ]]]], dtype=float32)
>>> torch_states["0.weight"].shape
torch.Size([6, 1, 5, 5])
>>> torch_states["0.weight"].numpy().shape
(6, 1, 5, 5)
```

From the output of `dir()` on a PyTorch tensor, you found a member named `numpy`, and calling that function appears to convert a tensor into a NumPy array. You can be quite confident about that because you see that the numbers match and the shape matches. In fact, you can be more confident by looking at the documentation:

```
>>> help(torch_states["0.weight"].numpy)
```

The `help()` function shows you the docstring of a function, which is usually its documentation.
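What `help()` displays can also be read programmatically, which is handy when you want the text without the interactive pager. A minimal sketch with a made-up function `scale` (hypothetical, for illustration only):

```python
import inspect

def scale(values, factor=2):
    """Multiply each element of values by factor and return a new list."""
    return [v * factor for v in values]

# help(scale) would page through the same information interactively
print(scale.__doc__)               # the raw docstring
print(inspect.getdoc(scale))       # the docstring with indentation cleaned up
print(inspect.signature(scale))    # the call signature
```

Note that some functions implemented in C may not expose a signature this way, in which case the docstring is the main source of information.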

Since this is the kernel of the first convolution layer, by comparing the shape of this kernel to that of the Keras model, you can see that their shapes are different:

```
>>> keras_weights = keras_model.get_weights()
>>> keras_weights[0].shape
(5, 5, 1, 6)
```

Recall that the input to the first layer is a 28×28×1 image array while the output is 6 feature maps. It is natural to match the 1 and 6 in the kernel shape to the number of channels in the input and output. Also, from our understanding of the mechanism of a convolutional layer, the kernel should be a 5×5 matrix.

At this point, you probably guessed that in the PyTorch convolutional layer, the kernel is represented as (output × input × height × width), while in Keras, it is represented as (height × width × input × output).

Similarly, you also see in the fully-connected layers that PyTorch presents the kernel as (output × input) while Keras uses (input × output):

```
>>> keras_weights[6].shape
(120, 84)
>>> list(torch_states.values())[6].shape
torch.Size([84, 120])
```
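If the guess about the two layouts is right, moving between them is only a matter of reordering axes. The sketch below checks this on dummy NumPy arrays with the shapes seen above; the values are arbitrary and the variable names are made up for the illustration.

```python
import numpy as np

# Dummy arrays with the shapes seen above; only the layout matters here
keras_conv = np.arange(5 * 5 * 1 * 6).reshape(5, 5, 1, 6)   # (height, width, input, output)
keras_dense = np.arange(120 * 84).reshape(120, 84)          # (input, output)

# Reorder axes into the PyTorch layout
torch_conv = keras_conv.transpose([3, 2, 0, 1])   # (output, input, height, width)
torch_dense = keras_dense.transpose()             # (output, input)
print(torch_conv.shape, torch_dense.shape)

# The inverse permutation restores the Keras layout
back = torch_conv.transpose([2, 3, 1, 0])
print(back.shape, np.array_equal(back, keras_conv))
```

The element at `keras_conv[h, w, i, o]` ends up at `torch_conv[o, i, h, w]`, so no values are changed, only their positions.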

Matching the weights and tensors and showing their shapes side by side should make these clearer:

```
>>> for k,t in zip(keras_weights, torch_states.values()):
...     print(f"Keras: {k.shape}, Torch: {t.shape}")
...
Keras: (5, 5, 1, 6), Torch: torch.Size([6, 1, 5, 5])
Keras: (6,), Torch: torch.Size([6])
Keras: (5, 5, 6, 16), Torch: torch.Size([16, 6, 5, 5])
Keras: (16,), Torch: torch.Size([16])
Keras: (5, 5, 16, 120), Torch: torch.Size([120, 16, 5, 5])
Keras: (120,), Torch: torch.Size([120])
Keras: (120, 84), Torch: torch.Size([84, 120])
Keras: (84,), Torch: torch.Size([84])
Keras: (84, 10), Torch: torch.Size([10, 84])
Keras: (10,), Torch: torch.Size([10])
```

And you can also match the names of the Keras weights and PyTorch tensors:

```
>>> for k, t in zip(keras_model.weights, torch_states.keys()):
...     print(f"Keras: {k.name}, Torch: {t}")
...
Keras: conv2d/kernel:0, Torch: 0.weight
Keras: conv2d/bias:0, Torch: 0.bias
Keras: conv2d_1/kernel:0, Torch: 3.weight
Keras: conv2d_1/bias:0, Torch: 3.bias
Keras: conv2d_2/kernel:0, Torch: 6.weight
Keras: conv2d_2/bias:0, Torch: 6.bias
Keras: dense/kernel:0, Torch: 9.weight
Keras: dense/bias:0, Torch: 9.bias
Keras: dense_1/kernel:0, Torch: 11.weight
Keras: dense_1/bias:0, Torch: 11.bias
```

## Making a Copier

Since you learned what the weights look like in each model, it does not seem difficult to create a program to copy weights from one to the other. The key is to answer:

- How to set the weights in each model
- What the weights are supposed to look like (shape and data type) in each model

The first question can be answered from the previous inspection using the `dir()` built-in function. You saw the `load_state_dict` member in the PyTorch model, and it seems to be the tool. Similarly, in the Keras model, you saw a member named `set_weights` that is exactly the counterpart of `get_weights`. You can further confirm this by checking their documentation online or via the `help()` function:

```
>>> keras_model.set_weights
<bound method Layer.set_weights of <keras.engine.sequential.Sequential object at 0x159d93eb0>>
>>> torch_model.load_state_dict
<bound method Module.load_state_dict of Sequential(
  (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (1): Tanh()
  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (4): Tanh()
  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (6): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))
  (7): Tanh()
  (8): Flatten(start_dim=1, end_dim=-1)
  (9): Linear(in_features=120, out_features=84, bias=True)
  (10): Tanh()
  (11): Linear(in_features=84, out_features=10, bias=True)
  (12): Softmax(dim=1)
)>
>>> help(torch_model.load_state_dict)

>>> help(keras_model.set_weights)
```

You confirmed that these are both functions, and their documentation explained that they do what you believed them to do. From the documentation, you further learned that the `load_state_dict()` function of the PyTorch model expects its argument to be in the same format as that returned from the `state_dict()` function; the `set_weights()` function of the Keras model expects the same format as returned from the `get_weights()` function.

Now you have finished your adventure with the Python REPL (you can enter `quit()` to leave).

By researching a bit on how to **reshape** the weights and **cast** from one data type to another, you come up with the following program:

```python
import torch
import tensorflow as tf

# Load the models
torch_model = torch.load("lenet5.pt")
keras_model = tf.keras.models.load_model("lenet5.h5")

# Extract weights from Keras model
keras_weights = keras_model.get_weights()

# Transform shape from Keras to PyTorch
for idx in [0, 2, 4]:
    # conv layers: (out, in, height, width)
    keras_weights[idx] = keras_weights[idx].transpose([3, 2, 0, 1])
for idx in [6, 8]:
    # dense layers: (out, in)
    keras_weights[idx] = keras_weights[idx].transpose()

# Set weights
torch_states = torch_model.state_dict()
for key, weight in zip(torch_states.keys(), keras_weights):
    torch_states[key] = torch.tensor(weight)
torch_model.load_state_dict(torch_states)

# Save new model
torch.save(torch_model, "lenet5-keras.pt")
```

The other way around, copying weights from the PyTorch model to the Keras model, can be done similarly:

```python
import torch
import tensorflow as tf

# Load the models
torch_model = torch.load("lenet5.pt")
keras_model = tf.keras.models.load_model("lenet5.h5")

# Extract weights from PyTorch model
torch_states = torch_model.state_dict()
weights = list(torch_states.values())

# Transform tensor to numpy array
weights = [w.numpy() for w in weights]

# Transform shape from PyTorch to Keras
for idx in [0, 2, 4]:
    # conv layers: (height, width, in, out)
    weights[idx] = weights[idx].transpose([2, 3, 1, 0])
for idx in [6, 8]:
    # dense layers: (in, out)
    weights[idx] = weights[idx].transpose()

# Set weights
keras_model.set_weights(weights)

# Save new model
keras_model.save("lenet5-torch.h5")
```
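The programs above hard-code which list positions hold convolution kernels and which hold dense kernels. As a variation (not what the programs above do), the layout could be chosen from the number of dimensions of each array instead. The helper name `keras_to_torch_layout` is hypothetical, and the sketch assumes the model contains only 2-D convolution kernels, dense kernels, and bias vectors; it is exercised on zero-filled arrays with the shapes found earlier.

```python
import numpy as np

def keras_to_torch_layout(weights):
    """Reorder Keras-style arrays into the PyTorch layout, chosen by rank."""
    converted = []
    for w in weights:
        if w.ndim == 4:      # conv kernel: (h, w, in, out) -> (out, in, h, w)
            converted.append(w.transpose([3, 2, 0, 1]))
        elif w.ndim == 2:    # dense kernel: (in, out) -> (out, in)
            converted.append(w.transpose())
        else:                # bias vectors need no change
            converted.append(w)
    return converted

# Shapes as reported by keras_model.get_weights() earlier
shapes = [(5, 5, 1, 6), (6,), (5, 5, 6, 16), (16,), (5, 5, 16, 120), (120,),
          (120, 84), (84,), (84, 10), (10,)]
dummy = [np.zeros(s, dtype="float32") for s in shapes]
print([w.shape for w in keras_to_torch_layout(dummy)])
```

A rank-based rule like this would break for layers whose weights have other layouts, so the explicit index lists used above remain the safer choice when the architecture is known.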

Then, you can verify that they work the same by passing a random array as input, in which case you can expect the outputs to match:

```python
import numpy as np
import torch
import tensorflow as tf

# Load the models
torch_orig_model = torch.load("lenet5.pt")
keras_orig_model = tf.keras.models.load_model("lenet5.h5")
torch_converted_model = torch.load("lenet5-keras.pt")
keras_converted_model = tf.keras.models.load_model("lenet5-torch.h5")

# Create a random input
sample = np.random.random((28,28))

# Convert sample to torch input shape
torch_sample = torch.Tensor(sample.reshape(1,1,28,28))

# Convert sample to keras input shape
keras_sample = sample.reshape(1,28,28,1)

# Check output
keras_converted_output = keras_converted_model.predict(keras_sample, verbose=0)
keras_orig_output = keras_orig_model.predict(keras_sample, verbose=0)
torch_converted_output = torch_converted_model(torch_sample).detach().numpy()
torch_orig_output = torch_orig_model(torch_sample).detach().numpy()

np.set_printoptions(precision=4)
print(keras_orig_output)
print(torch_converted_output)
print()
print(torch_orig_output)
print(keras_converted_output)
```

In our case, the output is:

```
[[9.8908e-06 2.4246e-07 3.1996e-04 8.2742e-01 1.6853e-10 1.7212e-01
  3.6018e-10 1.5521e-06 1.3128e-04 2.2083e-06]]
[[9.8908e-06 2.4245e-07 3.1996e-04 8.2742e-01 1.6853e-10 1.7212e-01
  3.6018e-10 1.5521e-06 1.3128e-04 2.2083e-06]]

[[4.1505e-10 1.9959e-17 1.7399e-08 4.0302e-11 9.5790e-14 3.7395e-12
  1.0634e-10 1.7682e-16 1.0000e+00 8.8126e-10]]
[[4.1506e-10 1.9959e-17 1.7399e-08 4.0302e-11 9.5791e-14 3.7395e-12
  1.0634e-10 1.7682e-16 1.0000e+00 8.8127e-10]]
```

The outputs in each pair agree with each other to sufficient precision. Note that your result may not be exactly the same due to the random nature of training. Also, due to the nature of floating-point calculation, the PyTorch and TensorFlow/Keras models would not produce the exact same output even if the weights were the same.
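Because the outputs are expected to be close rather than identical, a tolerance-based comparison is more appropriate than `==`. The sketch below applies the standard library's `math.isclose` to the first pair of printed outputs; the relative tolerance of `1e-4` is an arbitrary choice for this illustration, not a recommended threshold.

```python
import math

# First pair of outputs printed above (original Keras vs. converted PyTorch)
keras_orig = [9.8908e-06, 2.4246e-07, 3.1996e-04, 8.2742e-01, 1.6853e-10,
              1.7212e-01, 3.6018e-10, 1.5521e-06, 1.3128e-04, 2.2083e-06]
torch_converted = [9.8908e-06, 2.4245e-07, 3.1996e-04, 8.2742e-01, 1.6853e-10,
                   1.7212e-01, 3.6018e-10, 1.5521e-06, 1.3128e-04, 2.2083e-06]

# Exact equality fails because one element differs in the last printed digit
exact = keras_orig == torch_converted

# Element-wise comparison with a relative tolerance succeeds
close = all(math.isclose(a, b, rel_tol=1e-4)
            for a, b in zip(keras_orig, torch_converted))
print(exact, close)
```

When working with the NumPy arrays directly, `np.allclose()` performs a similar element-wise check in a single call.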

However, the objective here is to show you how you can make use of Python’s inspection tools to understand something you did not know and develop a solution.

## Summary

In this tutorial, you learned how to work in the Python REPL and use the inspection functions to develop a solution. Specifically,

- You learned how to use the inspection functions in the REPL to learn the internal members of an object
- You learned how to use the REPL to experiment with Python code
- As a result, you developed a program converting between a PyTorch and a Keras model