Last Updated on May 31, 2022
Python is an interpreted language: an interpreter runs our program rather than compiling it to native code. Python's REPL (read-eval-print loop) can run commands line by line, and together with the inspection tools that Python provides, it is a handy environment for developing code.
In the following, you will see how to use the Python interpreter to inspect an object and develop a program.
After finishing this tutorial, you will learn:
- How to work in the Python interpreter
- How to use the inspection functions in Python
- How to develop a solution step by step with the help of inspection functions
Let's get started!

Developing a Python Program Using Inspection Tools.
Photo by Tekton. Some rights reserved.
Tutorial Overview
This tutorial is in four parts; they are:
- PyTorch and TensorFlow
- Looking for Clues
- Learning from the Weights
- Making a Copier
PyTorch and TensorFlow
PyTorch and TensorFlow are the two biggest neural network libraries in Python. Their code is different, but the things they can do are similar.
Consider the classic MNIST handwritten digit recognition problem; you can build a LeNet-5 model to classify the digits as follows:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision

# Load MNIST training data
transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor()
])
train = torchvision.datasets.MNIST('./datafiles/', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train, batch_size=32, shuffle=True)

# LeNet5 model
torch_model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=(5,5), stride=1, padding=2),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 120, kernel_size=5, stride=1, padding=0),
    nn.Tanh(),
    nn.Flatten(),
    nn.Linear(120, 84),
    nn.Tanh(),
    nn.Linear(84, 10),
    nn.Softmax(dim=1)
)

# Training loop
def training_loop(model, optimizer, loss_fn, train_loader, n_epochs=100):
    model.train()
    for epoch in range(n_epochs):
        for data, target in train_loader:
            output = model(data)
            loss = loss_fn(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    model.eval()

# Run training
optimizer = optim.Adam(torch_model.parameters())
loss_fn = nn.CrossEntropyLoss()
training_loop(torch_model, optimizer, loss_fn, train_loader, n_epochs=20)

# Save model
torch.save(torch_model, "lenet5.pt")
This is simplified code without any validation or testing. The counterpart in TensorFlow is the following:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, AveragePooling2D, Flatten
from tensorflow.keras.datasets import mnist

# LeNet5 model
keras_model = Sequential([
    Conv2D(6, (5,5), input_shape=(28,28,1), padding="same", activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(16, (5,5), activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(120, (5,5), activation="tanh"),
    Flatten(),
    Dense(84, activation="tanh"),
    Dense(10, activation="softmax")
])

# Reshape data to shape of (n_sample, height, width, n_channel)
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.expand_dims(X_train, axis=3).astype('float32')

# Train
keras_model.compile(loss="sparse_categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
keras_model.fit(X_train, y_train, epochs=20, batch_size=32)

# Save
keras_model.save("lenet5.h5")
Running this program will give you the file lenet5.pt from the PyTorch code and lenet5.h5 from the TensorFlow code.
Looking for Clues
If you understand what the above neural networks are doing, you should be able to tell that there is nothing but many multiply-and-add calculations in each layer. Mathematically, there is a matrix multiplication between the input and the kernel of each fully-connected layer before adding the bias to the result. In the convolutional layers, there is an element-wise multiplication of the kernel with a portion of the input matrix before taking the sum of the result and adding the bias as one output element of the feature map.
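To make this concrete, here is a minimal NumPy sketch of what both kinds of layers compute; the array names and sizes are made up for illustration and are not part of either model:

import numpy as np

# Hypothetical fully-connected layer: 120 inputs -> 84 outputs
x = np.random.rand(120)            # input vector
W = np.random.rand(84, 120)        # kernel (weight matrix)
b = np.random.rand(84)             # bias
dense_out = W @ x + b              # matrix multiplication, then add the bias

# One output element of a convolutional feature map
patch = np.random.rand(5, 5)       # a 5x5 portion of the input
kernel = np.random.rand(5, 5)      # a 5x5 convolution kernel
bias = 0.1
conv_out = np.sum(patch * kernel) + bias   # element-wise multiply, sum, add bias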
Since you created the same LeNet-5 model in two different frameworks, it should be possible to make them work identically if their weights are the same. How can you copy the weights from one model to the other, given that their architectures are identical?
You can load the saved models as follows:
import torch
import tensorflow as tf
torch_model = torch.load("lenet5.pt")
keras_model = tf.keras.models.load_model("lenet5.h5")
This probably does not tell you much. But if you run python in the command line without any parameters, you launch the REPL, in which you can type in the above code (you can leave the REPL with quit()):
Python 3.9.13 (main, May 19 2022, 13:48:47) [Clang 13.1.6 (clang-1316.0.21.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import tensorflow as tf
>>> torch_model = torch.load("lenet5.pt")
>>> keras_model = tf.keras.models.load_model("lenet5.h5")
Nothing will be printed in the above. But you can check the two models that were loaded using the type() built-in function:
>>> type(torch_model)
<class 'torch.nn.modules.container.Sequential'>
>>> type(keras_model)
<class 'keras.engine.sequential.Sequential'>
So here they are neural network models from PyTorch and Keras, respectively. Since they are trained models, the weights must be stored inside them. So how can you find the weights in these models? Since they are objects, the easiest way is to use the dir() built-in function to inspect their members:
>>> dir(torch_model)
['T_destination', '__annotations__', '__call__', '__class__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
...
'_slow_forward', '_state_dict_hooks', '_version', 'add_module', 'append', 'apply', 'bfloat16', 'buffers', 'children', 'cpu', 'cuda', 'double', 'dump_patches', 'eval', 'extra_repr', 'float', 'forward', 'get_buffer', 'get_extra_state', 'get_parameter', 'get_submodule', 'half', 'load_state_dict', 'modules', 'named_buffers', 'named_children', 'named_modules', 'named_parameters', 'parameters', 'register_backward_hook', 'register_buffer', 'register_forward_hook', 'register_forward_pre_hook', 'register_full_backward_hook', 'register_module', 'register_parameter', 'requires_grad_', 'set_extra_state', 'share_memory', 'state_dict', 'to', 'to_empty', 'train', 'training', 'type', 'xpu', 'zero_grad']
>>> dir(keras_model)
['_SCALAR_UPRANKING_ON', '_TF_MODULE_IGNORED_PROPERTIES', '__call__', '__class__', '__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
...
'activity_regularizer', 'add', 'add_loss', 'add_metric', 'add_update', 'add_variable', 'add_weight', 'build', 'built', 'call', 'compile', 'compiled_loss', 'compiled_metrics', 'compute_dtype', 'compute_loss', 'compute_mask', 'compute_metrics', 'compute_output_shape', 'compute_output_signature', 'count_params', 'distribute_strategy', 'dtype', 'dtype_policy', 'dynamic', 'evaluate', 'evaluate_generator', 'finalize_state', 'fit', 'fit_generator', 'from_config', 'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_layer', 'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_weights', 'history', 'inbound_nodes', 'input', 'input_mask', 'input_names', 'input_shape', 'input_spec', 'inputs', 'layers', 'load_weights', 'loss', 'losses', 'make_predict_function', 'make_test_function', 'make_train_function', 'metrics', 'metrics_names', 'name', 'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'optimizer', 'outbound_nodes', 'output', 'output_mask', 'output_names', 'output_shape', 'outputs', 'pop', 'predict', 'predict_function', 'predict_generator', 'predict_on_batch', 'predict_step', 'reset_metrics', 'reset_states', 'run_eagerly', 'save', 'save_spec', 'save_weights', 'set_weights', 'state_updates', 'stateful', 'stop_training', 'submodules', 'summary', 'supports_masking', 'test_function', 'test_on_batch', 'test_step', 'to_json', 'to_yaml', 'train_function', 'train_on_batch', 'train_step', 'train_tf_function', 'trainable', 'trainable_variables', 'trainable_weights', 'updates', 'variable_dtype', 'variables', 'weights', 'with_name_scope']
There are many members in each object. Some are attributes, and some are methods of the class. By convention, those that begin with an underscore are internal members that you are not supposed to access in normal circumstances. If you want to see more of each member, you can use the getmembers() function from the inspect module:
>>> import inspect
>>> inspect.getmembers(torch_model)
[('T_destination', ~T_destination),
 ('__annotations__', {'_modules': typing.Dict[str, torch.nn.modules.module.Module]}),
 ('__call__', <bound method Module._call_impl of Sequential(
...
The output of the getmembers() function is a list of tuples, in which each tuple is the name of the member and the member itself. From the above, for example, you know that __call__ is a "bound method," i.e., a member method of a class.
By carefully looking at the members' names, you can see that in the PyTorch model, the "state" should be your interest, while in the Keras model, you have some members with "weights" in the name. To shortlist them, you can do the following in the interpreter:
>>> [n for n in dir(torch_model) if 'state' in n]
['__setstate__', '_load_from_state_dict', '_load_state_dict_pre_hooks', '_register_load_state_dict_pre_hook', '_register_state_dict_hook', '_save_to_state_dict', '_state_dict_hooks', 'get_extra_state', 'load_state_dict', 'set_extra_state', 'state_dict']
>>> [n for n in dir(keras_model) if 'weight' in n]
['_assert_weights_created', '_captured_weight_regularizer', '_check_sample_weight_warning', '_dedup_weights', '_handle_weight_regularization', '_initial_weights', '_non_trainable_weights', '_trainable_weights', '_undeduplicated_weights', 'add_weight', 'get_weights', 'load_weights', 'non_trainable_weights', 'save_weights', 'set_weights', 'trainable_weights', 'weights']
This might take some trial and error. But it is not too difficult, and you may discover that you can see the weights with state_dict in the torch model:
>>> torch_model.state_dict
<bound method Module.state_dict of Sequential(
  (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (1): Tanh()
  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (4): Tanh()
  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (6): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))
  (7): Tanh()
  (8): Flatten(start_dim=1, end_dim=-1)
  (9): Linear(in_features=120, out_features=84, bias=True)
  (10): Tanh()
  (11): Linear(in_features=84, out_features=10, bias=True)
  (12): Softmax(dim=1)
)>
>>> torch_model.state_dict()
OrderedDict([('0.weight', tensor([[[[ 0.1559,  0.1681,  0.2726,  0.3187,  0.4909],
          [ 0.1179,  0.1340, -0.0815, -0.3253,  0.0904],
          [ 0.2326, -0.2079, -0.8614, -0.8643, -0.0632],
          [ 0.3874, -0.3490, -0.7957, -0.5873, -0.0638],
          [ 0.2800,  0.0947,  0.0308,  0.4065,  0.6916]]],

        [[[ 0.5116,  0.1798, -0.1062, -0.4099, -0.3307],
          [ 0.1090,  0.0689, -0.1010, -0.9136, -0.5271],
          [ 0.2910,  0.2096, -0.2442, -1.5576, -0.0305],
...
For the TensorFlow/Keras model, you can find the weights with get_weights():
>>> keras_model.get_weights
<bound method Model.get_weights of <keras.engine.sequential.Sequential object at 0x159d93eb0>>
>>> keras_model.get_weights()
[array([[[[ 0.14078194,  0.04990018, -0.06204645, -0.03128023, -0.22033708,
            0.19721672]],

        [[-0.06618818, -0.152075  ,  0.13130261,  0.22893831,  0.08880515,
            0.01917628]],

        [[-0.28716782, -0.23207009,  0.00505603,  0.2697424 , -0.1916888 ,
           -0.25858143]],

        [[-0.41863152, -0.20710683,  0.13254236,  0.18774481, -0.14866787,
           -0.14398652]],

        [[-0.25119543, -0.14405733, -0.048533  , -0.12108403,  0.06704573,
           -0.1196835 ]]],

       [[[-0.2438466 ,  0.02499897, -0.1243961 , -0.20115352, -0.0241346 ,
            0.15888865]],

        [[-0.20548582, -0.26495507,  0.21004884,  0.32183227, -0.13990627,
           -0.02996112]],
...
The weights are also available via the weights attribute:
>>> keras_model.weights
[<tf.Variable 'conv2d/kernel:0' shape=(5, 5, 1, 6) dtype=float32, numpy=
array([[[[ 0.14078194,  0.04990018, -0.06204645, -0.03128023, -0.22033708,
            0.19721672]],

        [[-0.06618818, -0.152075  ,  0.13130261,  0.22893831,  0.08880515,
            0.01917628]],
...
          8.25365111e-02, -1.72486171e-01,  3.16280037e-01,
          4.12595004e-01]], dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=
array([-0.19007775,  0.14427921,  0.0571407 , -0.24149619, -0.03247226,
        0.18109408, -0.17159976,  0.21736498, -0.10254183,  0.02417901],
      dtype=float32)>]
Here, you can observe the following: In the PyTorch model, the function state_dict() gives an OrderedDict, which is a dictionary with its keys in a specified order. There are keys such as 0.weight, and they are mapped to tensor values. In the Keras model, the get_weights() function returns a list, and each element in the list is a NumPy array. The weights attribute also holds a list, but its elements are of tf.Variable type.
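If you ever need plain arrays out of the weights attribute, each tf.Variable exposes a .numpy() method. This is a small aside, not part of the original session, and it assumes TensorFlow 2's default eager execution mode:

first_kernel = keras_model.weights[0]   # a tf.Variable of shape (5, 5, 1, 6)
print(type(first_kernel))               # the tf.Variable (ResourceVariable) class
print(first_kernel.numpy().shape)       # the same values as a plain NumPy array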
You can learn more by checking the shape of each tensor or array:
>>> [(key, val.shape) for key, val in torch_model.state_dict().items()]
[('0.weight', torch.Size([6, 1, 5, 5])), ('0.bias', torch.Size([6])), ('3.weight', torch.Size([16, 6, 5, 5])), ('3.bias', torch.Size([16])), ('6.weight', torch.Size([120, 16, 5, 5])), ('6.bias', torch.Size([120])), ('9.weight', torch.Size([84, 120])), ('9.bias', torch.Size([84])), ('11.weight', torch.Size([10, 84])), ('11.bias', torch.Size([10]))]
>>> [arr.shape for arr in keras_model.get_weights()]
[(5, 5, 1, 6), (6,), (5, 5, 6, 16), (16,), (5, 5, 16, 120), (120,), (120, 84), (84,), (84, 10), (10,)]
While you don't see the names of the layers from the Keras model above, you can in fact use similar reasoning to find the layers and get their names:
>>> keras_model.layers
[<keras.layers.convolutional.conv2d.Conv2D object at 0x159ddd850>, <keras.layers.pooling.average_pooling2d.AveragePooling2D object at 0x159ddd820>, <keras.layers.convolutional.conv2d.Conv2D object at 0x15a12b1c0>, <keras.layers.pooling.average_pooling2d.AveragePooling2D object at 0x15a1705e0>, <keras.layers.convolutional.conv2d.Conv2D object at 0x15a1812b0>, <keras.layers.reshaping.flatten.Flatten object at 0x15a194310>, <keras.layers.core.dense.Dense object at 0x15a1947c0>, <keras.layers.core.dense.Dense object at 0x15a194910>]
>>> [layer.name for layer in keras_model.layers]
['conv2d', 'average_pooling2d', 'conv2d_1', 'average_pooling2d_1', 'conv2d_2', 'flatten', 'dense', 'dense_1']
>>>
Learning from the Weights
By comparing the result of state_dict() from the PyTorch model and that of get_weights() from the Keras model, you can see that they both contain 10 elements. From the shapes of the PyTorch tensors and the NumPy arrays, you can further notice that they have similar shapes. This is probably because both frameworks store a model's parameters in order from input to output. You can further confirm that by comparing the keys of the state_dict() output to the layer names from the Keras model.
You can check how to manipulate a PyTorch tensor by extracting one and inspecting it:
>>> torch_states = torch_model.state_dict()
>>> torch_states.keys()
odict_keys(['0.weight', '0.bias', '3.weight', '3.bias', '6.weight', '6.bias', '9.weight', '9.bias', '11.weight', '11.bias'])
>>> torch_states["0.weight"]
tensor([[[[ 0.1559,  0.1681,  0.2726,  0.3187,  0.4909],
          [ 0.1179,  0.1340, -0.0815, -0.3253,  0.0904],
          [ 0.2326, -0.2079, -0.8614, -0.8643, -0.0632],
          [ 0.3874, -0.3490, -0.7957, -0.5873, -0.0638],
          [ 0.2800,  0.0947,  0.0308,  0.4065,  0.6916]]],
...
        [[[ 0.0980,  0.0240,  0.3295,  0.4507,  0.4539],
          [-0.1530, -0.3991, -0.3834, -0.2716,  0.0809],
          [-0.4639, -0.5537, -1.0207, -0.8049, -0.4977],
          [ 0.1825, -0.1284, -0.0669, -0.4652, -0.2961],
          [ 0.3402,  0.4256,  0.4329,  0.1503,  0.4207]]]])
>>> dir(torch_states["0.weight"])
['H', 'T', '__abs__', '__add__', '__and__', '__array__', '__array_priority__', '__array_wrap__', '__bool__', '__class__', '__complex__', '__contains__',
...
 'trunc', 'trunc_', 'type', 'type_as', 'unbind', 'unflatten', 'unfold', 'uniform_', 'unique', 'unique_consecutive', 'unsafe_chunk', 'unsafe_split', 'unsafe_split_with_sizes', 'unsqueeze', 'unsqueeze_', 'values', 'var', 'vdot', 'view', 'view_as', 'vsplit', 'where', 'xlogy', 'xlogy_', 'xpu', 'zero_']
>>> torch_states["0.weight"].numpy()
array([[[[ 0.15587455,  0.16805592,  0.27259687,  0.31871665,  0.49091515],
         [ 0.11791296,  0.13400094, -0.08148099, -0.32530317,  0.09039831],
...
         [ 0.18252987, -0.12838107, -0.0669101 , -0.4652463 , -0.2960882 ],
         [ 0.34022188,  0.4256311 ,  0.4328527 ,  0.15025541,  0.4207182 ]]]],
      dtype=float32)
>>> torch_states["0.weight"].shape
torch.Size([6, 1, 5, 5])
>>> torch_states["0.weight"].numpy().shape
(6, 1, 5, 5)
From the output of dir() on a PyTorch tensor, you found a member named numpy, and by calling it, the tensor appears to be converted into a NumPy array. You can be quite confident about that since the numbers and the shapes match. In fact, you can be more confident by looking at the documentation:
>>> help(torch_states["0.weight"].numpy)
The help() function shows you the docstring of a function, which usually serves as its documentation.
Since this is the kernel of the first convolutional layer, by comparing its shape to that of the Keras model, you can observe that the shapes are different:
>>> keras_weights = keras_model.get_weights()
>>> keras_weights[0].shape
(5, 5, 1, 6)
You know that the input to the first layer is a 28×28×1 image array while the output is 6 feature maps. It is natural to match the 1 and 6 in the kernel shape to the number of channels in the input and output. Also, from our understanding of the mechanism of a convolutional layer, the kernel should be a 5×5 matrix.
At this point, you have probably guessed that in the PyTorch convolutional layer, the kernel is represented as (output × input × height × width), while in Keras, it is represented as (height × width × input × output).
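As a quick sanity check of that guess, you could transpose the first Keras kernel into the PyTorch layout and compare only the shapes. This is a small sketch, not from the original session; the values themselves will differ because the two models were trained independently:

k = keras_weights[0]                         # Keras layout: (height, width, input, output)
k_in_torch_layout = k.transpose(3, 2, 0, 1)  # reorder to (output, input, height, width)
print(k_in_torch_layout.shape)               # (6, 1, 5, 5), matching torch_states["0.weight"].shape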
Similarly, you can also see in the fully-connected layers that PyTorch presents the kernel as (output × input) while Keras presents it as (input × output):
>>> keras_weights[6].shape
(120, 84)
>>> list(torch_states.values())[6].shape
torch.Size([84, 120])
Matching up the weights and tensors and showing their shapes side by side should make this clearer:
>>> for k, t in zip(keras_weights, torch_states.values()):
...     print(f"Keras: {k.shape}, Torch: {t.shape}")
...
Keras: (5, 5, 1, 6), Torch: torch.Size([6, 1, 5, 5])
Keras: (6,), Torch: torch.Size([6])
Keras: (5, 5, 6, 16), Torch: torch.Size([16, 6, 5, 5])
Keras: (16,), Torch: torch.Size([16])
Keras: (5, 5, 16, 120), Torch: torch.Size([120, 16, 5, 5])
Keras: (120,), Torch: torch.Size([120])
Keras: (120, 84), Torch: torch.Size([84, 120])
Keras: (84,), Torch: torch.Size([84])
Keras: (84, 10), Torch: torch.Size([10, 84])
Keras: (10,), Torch: torch.Size([10])
And we can also match the names of the Keras weights and the PyTorch tensors:
>>> for k, t in zip(keras_model.weights, torch_states.keys()):
...     print(f"Keras: {k.name}, Torch: {t}")
...
Keras: conv2d/kernel:0, Torch: 0.weight
Keras: conv2d/bias:0, Torch: 0.bias
Keras: conv2d_1/kernel:0, Torch: 3.weight
Keras: conv2d_1/bias:0, Torch: 3.bias
Keras: conv2d_2/kernel:0, Torch: 6.weight
Keras: conv2d_2/bias:0, Torch: 6.bias
Keras: dense/kernel:0, Torch: 9.weight
Keras: dense/bias:0, Torch: 9.bias
Keras: dense_1/kernel:0, Torch: 11.weight
Keras: dense_1/bias:0, Torch: 11.bias
Making a Copier
Now that you know what the weights look like in each model, it does not seem difficult to create a program to copy weights from one to another. The key is to answer:
- How to set the weights in each model
- What the weights are supposed to look like (shape and data type) in each model
The first question can be answered from the previous inspection using the dir() built-in function. You saw the load_state_dict member in the PyTorch model, and it seems to be the tool. Similarly, in the Keras model, you saw a member named set_weights that is exactly the counterpart of get_weights. You can further confirm this by checking their documentation online or via the help() function:
>>> keras_model.set_weights
<bound method Layer.set_weights of <keras.engine.sequential.Sequential object at 0x159d93eb0>>
>>> torch_model.load_state_dict
<bound method Module.load_state_dict of Sequential(
  (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (1): Tanh()
  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (4): Tanh()
  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)
  (6): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))
  (7): Tanh()
  (8): Flatten(start_dim=1, end_dim=-1)
  (9): Linear(in_features=120, out_features=84, bias=True)
  (10): Tanh()
  (11): Linear(in_features=84, out_features=10, bias=True)
  (12): Softmax(dim=1)
)>
>>> help(torch_model.load_state_dict)

>>> help(keras_model.set_weights)
You confirmed that these are both functions, and their documentation explains that they are what you believed them to be. From the documentation, you further learned that the load_state_dict() function of the PyTorch model expects an argument in the same format as that returned from the state_dict() function, and the set_weights() function of the Keras model expects the same format as returned from the get_weights() function.
Now you have finished your journey with the Python REPL (you can enter quit() to leave).
By researching a bit on how to reshape the weights and cast from one data type to another, you come up with the following program:
import torch
import tensorflow as tf

# Load the models
torch_model = torch.load("lenet5.pt")
keras_model = tf.keras.models.load_model("lenet5.h5")

# Extract weights from Keras model
keras_weights = keras_model.get_weights()

# Transform shape from Keras to PyTorch
for idx in [0, 2, 4]:
    # conv layers: (out, in, height, width)
    keras_weights[idx] = keras_weights[idx].transpose([3, 2, 0, 1])
for idx in [6, 8]:
    # dense layers: (out, in)
    keras_weights[idx] = keras_weights[idx].transpose()

# Set weights
torch_states = torch_model.state_dict()
for key, weight in zip(torch_states.keys(), keras_weights):
    torch_states[key] = torch.tensor(weight)
torch_model.load_state_dict(torch_states)

# Save new model
torch.save(torch_model, "lenet5-keras.pt")
And the other way around, copying weights from the PyTorch model to the Keras model can be done similarly:
import torch
import tensorflow as tf

# Load the models
torch_model = torch.load("lenet5.pt")
keras_model = tf.keras.models.load_model("lenet5.h5")

# Extract weights from PyTorch model
torch_states = torch_model.state_dict()
weights = list(torch_states.values())

# Transform tensors to numpy arrays
weights = [w.numpy() for w in weights]

# Transform shape from PyTorch to Keras
for idx in [0, 2, 4]:
    # conv layers: (height, width, in, out)
    weights[idx] = weights[idx].transpose([2, 3, 1, 0])
for idx in [6, 8]:
    # dense layers: (in, out)
    weights[idx] = weights[idx].transpose()

# Set weights
keras_model.set_weights(weights)

# Save new model
keras_model.save("lenet5-torch.h5")
Then, you can verify that they work the same by passing a random array as input, for which you can expect the outputs to match exactly:
import numpy as np
import torch
import tensorflow as tf

# Load the models
torch_orig_model = torch.load("lenet5.pt")
keras_orig_model = tf.keras.models.load_model("lenet5.h5")
torch_converted_model = torch.load("lenet5-keras.pt")
keras_converted_model = tf.keras.models.load_model("lenet5-torch.h5")

# Create a random input
sample = np.random.random((28,28))

# Convert sample to torch input shape
torch_sample = torch.Tensor(sample.reshape(1,1,28,28))

# Convert sample to keras input shape
keras_sample = sample.reshape(1,28,28,1)

# Check output
keras_converted_output = keras_converted_model.predict(keras_sample, verbose=0)
keras_orig_output = keras_orig_model.predict(keras_sample, verbose=0)
torch_converted_output = torch_converted_model(torch_sample).detach().numpy()
torch_orig_output = torch_orig_model(torch_sample).detach().numpy()

np.set_printoptions(precision=4)
print(keras_orig_output)
print(torch_converted_output)
print()
print(torch_orig_output)
print(keras_converted_output)
In our case, the output is:
[[9.8908e-06 2.4246e-07 3.1996e-04 8.2742e-01 1.6853e-10 1.7212e-01
  3.6018e-10 1.5521e-06 1.3128e-04 2.2083e-06]]
[[9.8908e-06 2.4245e-07 3.1996e-04 8.2742e-01 1.6853e-10 1.7212e-01
  3.6018e-10 1.5521e-06 1.3128e-04 2.2083e-06]]

[[4.1505e-10 1.9959e-17 1.7399e-08 4.0302e-11 9.5790e-14 3.7395e-12
  1.0634e-10 1.7682e-16 1.0000e+00 8.8126e-10]]
[[4.1506e-10 1.9959e-17 1.7399e-08 4.0302e-11 9.5791e-14 3.7395e-12
  1.0634e-10 1.7682e-16 1.0000e+00 8.8127e-10]]
The outputs agree with each other at sufficient precision. Note that your result may not be exactly the same due to the random nature of training. Also, due to the nature of floating point calculation, the PyTorch and TensorFlow/Keras models would not produce exactly the same output even if the weights were the same.
However, the objective here is to show you how you can make use of Python's inspection tools to understand something you did not know and develop a solution.
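If you prefer a programmatic check over eyeballing the printed numbers, you could compare the outputs with a tolerance. This is a short sketch, assuming the arrays from the verification script above are still in scope; the tolerance value is an arbitrary choice:

import numpy as np

# True if every element agrees within the given absolute tolerance
print(np.allclose(keras_orig_output, torch_converted_output, atol=1e-6))
print(np.allclose(torch_orig_output, keras_converted_output, atol=1e-6))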
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Summary
In this tutorial, you learned how to work in the Python REPL and use the inspection functions to develop a solution. Specifically,
- You learned how to use the inspection functions in the REPL to learn the internal members of an object
- You learned how to use the REPL to experiment with Python code
- As a result, you developed a program to convert between a PyTorch and a Keras model