
Tuesday 25 April 2023

Weight and Activation Quantization

The experiments showed that the trade-off between accuracy and quantization differs from layer to layer; we call this layer sensitivity. For example, the middle layers are much more robust to quantization while also housing the majority of the weights.
As shown in the figure, the majority of the MAC operations are performed in the second, third and fourth convolutional layers. Using the mixed2 scheme, more than a threefold saving in MAC operations is achieved. Hence, the energy consumption and latency can also be reduced significantly while sacrificing only 0.2% accuracy.
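To make the idea concrete, below is a minimal sketch of per-layer mixed-precision weight quantization using symmetric uniform quantization. The layer names and bit-width assignment are illustrative assumptions, not the actual mixed2 configuration used in these experiments.

import numpy as np

def quantize_uniform(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform fake-quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8 bits, 7 for 4 bits
    scale = np.max(np.abs(w)) / qmax
    if scale == 0:                             # guard against an all-zero tensor
        scale = 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                           # dequantized weights at reduced precision

# Hypothetical per-layer bit widths: robust middle layers get fewer bits,
# while the more sensitive first and last layers keep higher precision.
bit_widths = {"conv1": 8, "conv2": 4, "conv3": 4, "conv4": 4, "dense1": 4, "dense2": 8}

rng = np.random.default_rng(0)
weights = {name: rng.standard_normal((64, 64)).astype(np.float32) for name in bit_widths}
quantized = {name: quantize_uniform(w, bit_widths[name]) for name, w in weights.items()}

for name, w in weights.items():
    err = np.mean(np.abs(w - quantized[name]))
    print(f"{name}: {bit_widths[name]}-bit, mean abs error {err:.4f}")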
The next figure shows the distribution of weights per layer. The majority of the weights reside in the first dense layer, followed by the fourth convolutional layer. These are also among the least sensitive layers, which means that by deeply quantizing them, most of the memory footprint can be saved.
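As a rough illustration of the memory argument, the sketch below compares a 32-bit baseline against a mixed-precision assignment that quantizes the weight-heavy, less sensitive layers most aggressively. The parameter counts and bit widths are placeholder assumptions, not the numbers behind the figure.

# Estimate model size under full precision vs. a mixed-precision assignment.
param_counts = {                      # hypothetical per-layer parameter counts
    "conv1": 1_728, "conv2": 36_864, "conv3": 73_728, "conv4": 147_456,
    "dense1": 1_048_576, "dense2": 40_960,
}
bit_widths = {                        # deep quantization for the weight-heavy layers
    "conv1": 8, "conv2": 8, "conv3": 8, "conv4": 2, "dense1": 2, "dense2": 8,
}

fp32_bits = sum(n * 32 for n in param_counts.values())
mixed_bits = sum(n * bit_widths[name] for name, n in param_counts.items())

print(f"FP32 weights     : {fp32_bits / 8 / 1024:.1f} KiB")
print(f"Mixed precision  : {mixed_bits / 8 / 1024:.1f} KiB")
print(f"Compression      : {fp32_bits / mixed_bits:.1f}x")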
