Typically, c-class classifiers of I-dimensional inputs are trained on N examples, with input and target matrix sizes
[I N] = size(input)
[c N] = size(target)
where each target column is a c-dimensional unit vector of 0s with a single 1 in the row of the true class.
For example, with three distinct classes the targets are columns of the 3-by-3 identity matrix:
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
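As a minimal sketch of building such a target matrix from class indices: the label vector classIdx below is made up for illustration, and only base functions are used. (If you have the Neural Network Toolbox, ind2vec returns the same matrix in sparse form.)

```matlab
% Hypothetical class labels for N = 5 examples drawn from c = 3 classes
classIdx = [1 3 2 2 1];
c = 3;

% Column j of the target is column classIdx(j) of the c-by-c identity matrix
U = eye(c);
target = U(:, classIdx);

% Size check: c rows, N columns
[nRows, N] = size(target);
```

Each column of target then contains exactly one 1, in the row of that example's class.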
Using SOFTMAX as the output function, each output column is a non-negative vector that sums to one, and its entries are interpreted as the a posteriori probabilities of the input belonging to each class.
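To make the unit-sum property concrete, here is a rough base-MATLAB sketch of the softmax computation. It is not the toolbox SOFTMAX source; the activation matrix z is a made-up example, and repmat is used so that it also runs on older releases.

```matlab
% Hypothetical raw output-layer activations: c = 3 rows, N = 2 columns
z = [2.0 0.5;
     1.0 0.5;
     0.1 0.5];

% Subtract each column's maximum before exponentiating.
% This only guards against overflow; the result is unchanged.
zs = z - repmat(max(z, [], 1), size(z, 1), 1);

% Exponentiate and normalize each column to unit sum
e = exp(zs);
p = e ./ repmat(sum(e, 1), size(z, 1), 1);
```

Every entry of p is non-negative and every column sums to one. The second column, whose activations are all equal, comes out as 1/3 for each class.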
Each input is assigned to the class with the largest a posteriori probability, i.e. the row index of the maximum in its output column:
[ 0.70 0.40 0.01
  0.10 0.05 0.66    ===>    [ 1 3 2 ]
  0.20 0.55 0.33 ]
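In code, the class assignment is just the row index returned by max along the first dimension. A small sketch (the variable names are mine):

```matlab
% Output matrix from the example; the first entry of column 3 is taken
% as 0.01 so that every column sums to one
output = [0.70 0.40 0.01;
          0.10 0.05 0.66;
          0.20 0.55 0.33];

% max along dimension 1 returns, for each column, the largest value
% and the row in which it occurs
[maxProb, assignedClass] = max(output, [], 1);
```

Here assignedClass is [1 3 2] and maxProb is [0.70 0.55 0.66].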
THE EXCEPTION IS WHEN THERE ARE ONLY 2 CLASSES. Then only one output in the interval [0, 1] is necessary. The output for the other class is obtained by subtracting that output from one.
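A short sketch of the 2-class case. The output values y are invented, and both the 0.5 threshold and the tie going to class 1 are my assumptions; the threshold simply follows from picking the larger of y and 1 - y.

```matlab
% Hypothetical single-output values for N = 4 inputs, each in [0, 1]
y = [0.92 0.35 0.50 0.08];

% Recover both class probabilities: row 1 is class 1, row 2 is class 2
pBoth = [y; 1 - y];

% Class 1 when y >= 0.5, otherwise class 2 (a tie at 0.5 goes to class 1)
assignedClass = 2 - (y >= 0.5);
```

With these values assignedClass is [1 2 1 2], and every column of pBoth sums to one.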
Hope this helps.