_to1hot function incompatibility with non-incremental labels
Ketos' overall implementation assumes (or expects) that background samples have a label of 0, and all of the examples and systems we have developed so far include background in the training set. However, suppose we have an application with two signal classes and no background. The Ketos standardization step will map them to labels 1 and 2, which breaks the one-hot encoding we do later in the batch generator.
    def _to1hot(cls, class_label, n_classes=2):
        one_hot = np.zeros(n_classes)
        one_hot[class_label]=1.0
        return one_hot
one_hot = np.zeros(n_classes)
creates an array of length 2 (valid indices 0 and 1, given n_classes=2), but we only have class labels 1 and 2, so the next line raises an IndexError for label 2.
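A minimal sketch reproducing the failure outside of Ketos, assuming labels 1 and 2 and n_classes=2. Here `to1hot` mirrors the quoted method as a free function, while `to1hot_mapped` is a hypothetical alternative (not part of Ketos) that uses each label's position in the sorted label list as the index, rather than the label value itself.

```python
import numpy as np

def to1hot(class_label, n_classes=2):
    # Same logic as the quoted method, written as a free function.
    one_hot = np.zeros(n_classes)
    one_hot[class_label] = 1.0
    return one_hot

def to1hot_mapped(class_label, labels):
    # Hypothetical alternative: index by the label's position in the
    # sorted list of known labels, so label values need not start at 0.
    ordered = sorted(set(labels))
    one_hot = np.zeros(len(ordered))
    one_hot[ordered.index(class_label)] = 1.0
    return one_hot

print(to1hot(1))  # label 1 is written to slot 1
try:
    to1hot(2)     # label 2 is outside the length-2 array
except IndexError as err:
    print("label 2 fails:", err)

print(to1hot_mapped(1, [1, 2]))
print(to1hot_mapped(2, [1, 2]))
```

Note that with the original function label 1 does not fail; it is silently written to the second slot, and only label 2 raises an error.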
See Issue #149 (closed). In that case the user had the signal class assigned to label 1 and the background assigned to label 0. However, since she didn't specify that background labels had a value of 0, Ketos assumed that both were signal labels, which I think is an oversight.
My main question is: why do we need to assume that label 0 is background and everything else is signal? At the end of the day, background is just another class. Why do we need to make this distinction? Couldn't we just add a parameter to the functions that create background (such as create random background) that lets the user define which class is assigned to the generated samples?
Alternatively, we could move the standardization to later in the pipeline, or at the very least make the standardization keep label 0 mapped to label 0.
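To illustrate the last option, here is a rough sketch of a standardization that leaves label 0 at 0 and numbers every other label from 1. The function name `standardize_labels` and its behavior are assumptions made for illustration; this is not Ketos' actual standardization code.

```python
def standardize_labels(labels):
    # Hypothetical helper: build a mapping from original labels to
    # standardized labels, keeping 0 fixed when it is present.
    unique = sorted(set(labels))
    signals = [lab for lab in unique if lab != 0]
    # Non-zero labels are numbered 1, 2, ... in sorted order.
    mapping = {lab: i for i, lab in enumerate(signals, start=1)}
    if 0 in unique:
        mapping[0] = 0  # label 0 continues to map to 0
    return mapping

print(standardize_labels([0, 1, 1, 0]))  # 0 stays 0, 1 stays 1
print(standardize_labels([3, 7, 3]))     # no 0 present: 3 and 7 become 1 and 2
```

This would cover the case from Issue #149, but a label set without 0 still ends up as 1 and 2, so the one-hot step would still need to handle labels that do not start at 0.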