I've implemented my detector not from scratch, but based on an SSD port to Keras/TensorFlow (from here), and I have already trained it in several variations (Belgium from scratch, pretrained on MS COCO, transferred to Germany, convolutional layers frozen, fine-tuned to Germany). After weeks of training I can say that Belgium with random weights from scratch converges fastest (after only 40 epochs/2 days, my custom SSD loss function is down to a value of 3), while all other variations need much more time and more epochs, and their loss never falls below a value of 9.

I also found pretrained weights for traffic sign classification with VGG16, which I thought would be the ideal base for transfer learning on this topic, but that detector performed worst of all so far (the loss stagnated at 11 even when the learning rate was changed, and after 100 epochs it overfitted). It seems that transfer learning or fine-tuning has no advantage at all on these detectors.

It's likely that I am doing something wrong, or that I misunderstand the purpose of transfer learning (I thought it should speed up training, since most layers are not trainable and therefore no weight updates are computed for them). I don't know if this is the proper platform for discussing this topic; perhaps you know a Slack or Gitter channel where it belongs. I just don't know if I am stuck, or if I am doing something horribly wrong.

Transfer learning is when a model developed for one task is reused to work on a second task. Fine-tuning is one approach to transfer learning, in which you change the model's output to fit the new task and train only that output part.

In transfer learning or domain adaptation, we train the model on one dataset. Then we train the same model on another dataset that has a different distribution of classes, or even classes that did not appear in the first training dataset.

In fine-tuning, an approach to transfer learning, we have a dataset and use, say, 90% of it for training. Then we train the same model on the remaining 10%. Usually we switch to a smaller learning rate, so the new training does not have a significant impact on the already-adjusted weights.

You can also take a base model that works on a similar task and freeze some of its layers to keep the old knowledge while running the new training session on the new data. The output layer can also be different, with parts of it frozen during training.

In my experience, learning from scratch leads to better results, but it is much more costly than the other approaches, especially in time and resource consumption.
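To make the fine-tuning recipe above concrete, here is a minimal Keras sketch under stated assumptions: a VGG16 base pretrained on ImageNet, its convolutional layers frozen to keep the old knowledge, a new output head for the new task, and a reduced learning rate. The input size and `num_classes` value are hypothetical placeholders, not settings taken from the experiments described above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

num_classes = 43  # hypothetical: e.g. the number of traffic-sign classes

# Pretrained convolutional base, without the original ImageNet head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the base so this training phase only fits the new head.
base.trainable = False

# New output layers that fit the new task.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])

# A small learning rate, so later unfrozen layers are not disturbed much.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```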
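If the frozen features turn out to be insufficient, a common follow-up (continuing from the sketch above) is to unfreeze only the top convolutional block and keep training with an even smaller learning rate. The `block5` prefix relies on Keras's VGG16 layer naming; the choice of which layers to unfreeze is an assumption, not something prescribed by the post.

```python
# Second phase (continuing from the sketch above): unfreeze only the
# last convolutional block of the base, so most pretrained knowledge
# stays intact while the top features adapt to the new data.
base.trainable = True
for layer in base.layers:
    # Keras names VGG16 layers block1_conv1 ... block5_conv3, block5_pool.
    if not layer.name.startswith("block5"):
        layer.trainable = False

# Recompile with an even smaller learning rate so the change in
# trainable weights takes effect.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```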