[cs231n_review#Lecture 3-1] loss functions

코딩상륙작전 2023. 6. 18. 17:01

there exists some parameter matrix, W which will take this long column vector representing the image pixels, and convert this and give you 10 numbers giving scores for each of the 10 classes in the case of CIFAR-10. Where we kind of had this interpretation where larger values of those scores, so a larger value for the cat class means the classifier thinks that the cat is more likely for that image, and lower values for maybe the dog or car class indicate lower probabilities of those classes being present in the image.

to formalize this a little bit, usually when we talk about a loss function, we imagine that we have some training data set of xs and ys, usually N examples of these where the xs are the inputs to the algorithm in the image classification case, the xs would be the actually pixel values of your images, and the ys will be the things you want your algorithm to predict, we usually call these the labels or the targets.

So in the case of image classification, remember we're trying to categorize each image for CIFAR-10 to one of 10 categories, so the label y here will be an integer between one and 10 or maybe between zero and nine depending on what programming language you're using, but it'll be an integer telling you what is the correct category for each one of those images x.

* Loss function

And now our loss function will denote L_i to denote the, so then we have this prediction function x which takes in our example x and our weight matrix W and makes some prediction for y, in the case of image classification these will be our 10 numbers. Then we'll define some loss function L_i which will take in the predicted scores coming out of the function f together with the true target or label Y and give us some quantitative value for how bad those predictions are for that training example. And now the final loss L will be the average of these losses summed over the entire data set over each of the N examples in our data set.