For the experimentation process, a baseline training run was executed to provide a point of reference and a starting point. After the baseline, the first decision was whether to use transfer learning or keep training from scratch. Transfer learning yielded the biggest improvement in network performance: we used a SlowFast model from Torch Hub pretrained on the [https://www.deepmind.com/open-source/kinetics Kinetics] 400 dataset, which achieved twice the performance in one fourth of the time compared to training from scratch. Transfer learning thus improved both the training time and the network performance. The following plots show the difference between the transfer learning and baseline confusion matrices.
<gallery widths=350px heights=350px mode=packed>
File:Assembly_baseline_conf_matrix.png|Baseline confusion matrix
File:Assembly_transfer_learning_conf_matrix.png|Transfer learning confusion matrix
</gallery>
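The fine-tuning pattern behind this step can be sketched as follows. The Torch Hub call shown in the comment is the documented PyTorchVideo entry point for the pretrained SlowFast model; the tiny stand-in backbone, the head replacement, and names such as <code>NUM_CLASSES</code> are illustrative placeholders, not the project's actual code.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 12  # placeholder for the number of assembly activity labels

# In the actual experiments a SlowFast model pretrained on Kinetics 400
# was pulled from Torch Hub, e.g.:
#   model = torch.hub.load("facebookresearch/pytorchvideo",
#                          "slowfast_r50", pretrained=True)
# The fine-tuning pattern is the same regardless of backbone, so a tiny
# stand-in backbone keeps this sketch self-contained and fast to run.
backbone = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 8 * 8, 64),
    nn.ReLU(),
)
head = nn.Linear(64, NUM_CLASSES)  # new classification head for our label set
model = nn.Sequential(backbone, head)

# Freeze the pretrained weights so only the new head is updated at first.
for p in backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

clip = torch.randn(4, 3, 8, 8)  # dummy batch standing in for video clips
logits = model(clip)            # shape: (4, NUM_CLASSES)
```

Freezing the backbone preserves the Kinetics-learned features while the new head adapts to the assembly activity labels, which is what cuts the training time so sharply.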
The next problem to solve was the class imbalance in the dataset. As the original data distribution plot shows, some classes were underrepresented. The first technique tested was dataset subsampling: instead of using all available samples, only selected samples from each class were used in order to keep a balanced distribution. This was not optimal, since a lot of useful data was left out. Next, data replication was introduced, where samples were selected with replacement; this was also not ideal, since samples from the underrepresented classes were repeated many times in the dataset. Finally, different loss functions were tested, in particular weighted cross entropy and focal loss, both of which account for the class distribution when calculating the loss. This yielded the best results and led to the use of focal loss for all subsequent experiments. The following plots show the original baseline dataset distribution and the final distribution used for most of the experiments, which also reflects the removal of underrepresented labels.
<gallery widths=350px heights=250px mode=packed>
File:Assembly_baseline_dataset_distribution.png|Baseline dataset
File:Assembly_balanced_dataset.png|Dataset distribution after class balancing
</gallery>
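Focal loss down-weights easy, well-classified examples so the gradient is dominated by hard ones, which tend to come from underrepresented classes. A minimal PyTorch sketch of the idea is below; the function name, signature, and defaults are illustrative assumptions, not the project's implementation.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=None, gamma=2.0):
    """Illustrative focal loss.

    alpha: optional per-class weight tensor (as in weighted cross entropy).
    gamma: focusing parameter; gamma=0 with alpha=None reduces to plain
           cross entropy.
    """
    # Per-sample cross entropy (unweighted, so pt is a true probability).
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)  # model's probability for the true class
    # Optional per-class weighting, applied outside the focusing term.
    w = alpha[targets] if alpha is not None else 1.0
    return (w * (1.0 - pt) ** gamma * ce).mean()
```

With `gamma > 0`, samples the model already classifies confidently (`pt` near 1) contribute almost nothing, so the loss naturally focuses on the minority classes without discarding or duplicating data.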
To address this, the first experiment was training with more data, specifically the complete dataset; this, however, did not reduce the overfitting. The next experiment removed the underrepresented classes such as Part Removal, which improved the network, but the overfitting remained. Finally, the training approach was changed to use cross-validation, which solved the overfitting issue and was kept for the final training. The following image shows the result of cross-validation training, where the two loss curves no longer cross.
<gallery widths=350px heights=350px mode=packed>
File:Assembly_overfitting_plot.png|Loss plots overfitting behaviour
File:Assembly_no_overfit_training_plot.png|Cross validation training (no overfit)
</gallery>
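The cross-validation scheme can be sketched as a k-fold split where each fold serves once as the validation set while the remaining folds train the network. The helper below is an illustrative stand-in; the project's actual fold count and split code are not shown on this page.

```python
import torch
from torch.utils.data import Subset

def kfold_indices(n, k, seed=0):
    """Partition range(n) into k roughly equal folds (illustrative helper)."""
    g = torch.Generator().manual_seed(seed)
    idx = torch.randperm(n, generator=g).tolist()
    return [idx[i::k] for i in range(k)]

dataset = list(range(100))  # stand-in for the video clip dataset
folds = kfold_indices(len(dataset), 5)

for i, val_idx in enumerate(folds):
    # Every fold except fold i goes into the training split.
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    train_set, val_set = Subset(dataset, train_idx), Subset(dataset, val_idx)
    # ... train on train_set, validate on val_set, average metrics ...
```

Because every sample is eventually seen in both roles, the validation loss tracks generalization much more faithfully than a single fixed split, which is consistent with the disappearance of the crossing loss curves above.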
In addition, the following plots show the training and validation loss for the best-performing network, as well as the test confusion matrix. The matrix shows predominance along its diagonal, indicating a match between the network's predictions and the samples' ground truth.
<gallery widths=350px heights=350px mode=packed>
File:Assembly-training-loss.svg|Loss plots
File:Assembly-test-confusion-matrix.svg|Confusion matrix (5354 samples)
</gallery>