A detection device has a neural network process section performing a neural network process using parameters to calculate and output a classification result and a regression result of each of frames in an input image. The classification result shows a presence of a person in the input image. The regression result shows a position of the person in the input image. The parameters are determined based on a learning process using a plurality of positive samples and negative samples. The positive samples have segments of a sample image containing at least a part of the person and a true value of the position of the person in the sample image. The negative samples have segments of the sample image containing no person.