A noise signal is estimated from an audio signal captured by a sound capture unit. It is then determined whether the estimated noise signal indicates a noiseless state. If the noiseless state is determined, the captured audio signal is analyzed as a target sound signal, and a characteristic obtained by the analysis is learned and modeled to generate a target sound model.
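
The flow described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the source: the noise estimator (minimum frame energy), the noiseless threshold, the analyzed characteristic (log energy and zero-crossing rate), the model (running mean and variance), and all names such as `estimate_noise`, `is_noiseless`, `analyze`, `TargetSoundModel`, and `update_model` are hypothetical.

```python
import math
from dataclasses import dataclass, field


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def estimate_noise(frames):
    """Hypothetical noise estimator: the minimum frame energy in the window.

    Assumption: the quietest frame approximates the noise floor.
    """
    return min(frame_energy(f) for f in frames)


def is_noiseless(noise_energy, threshold=1e-4):
    """Treat the estimated noise as noiseless below an assumed threshold."""
    return noise_energy < threshold


def analyze(frame):
    """Illustrative characteristic: (log energy, zero-crossing rate)."""
    log_e = math.log(frame_energy(frame) + 1e-12)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (log_e, crossings / (len(frame) - 1))


@dataclass
class TargetSoundModel:
    """Running per-feature mean and variance (Welford's algorithm)."""
    count: int = 0
    mean: list = field(default_factory=lambda: [0.0, 0.0])
    m2: list = field(default_factory=lambda: [0.0, 0.0])

    def learn(self, feature):
        self.count += 1
        for i, x in enumerate(feature):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        if self.count < 2:
            return [0.0, 0.0]
        return [m / (self.count - 1) for m in self.m2]


def update_model(signal, model, frame_len=160, threshold=1e-4):
    """Learn from the captured signal only when the noise estimate is noiseless.

    Returns True if the model was updated, False if the signal was skipped.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    if not frames or not is_noiseless(estimate_noise(frames), threshold):
        return False
    for frame in frames:
        model.learn(analyze(frame))
    return True
```

In this sketch, a capture whose quietest frame still carries energy above the threshold is skipped entirely, so the model is trained only on captures judged to be free of background noise. A real system would likely use a more robust noise estimator and a richer characteristic than the two features assumed here.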