The eating and drinking action detection apparatus: acquires vibration produced inside the body of a subject and generates a vibration signal corresponding to the vibration; divides the vibration signal into frames and calculates the power of the vibration signal for each frame; determines, for each frame, whether the frame is a stationary signal having periodicity or a non-stationary signal having no periodicity; detects, based on the power of each frame and the stationary/non-stationary determination result for each frame, a period during which the non-stationary signal continues while the power of the vibration signal is equal to or larger than a power threshold, and acquires a continuation time of the period; and determines, based on the continuation time, whether the subject performed swallowing or mastication in the period during which the non-stationary signal continued.
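
The processing chain described above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the apparatus itself: the function names, the use of a normalized autocorrelation peak as the periodicity test, the non-overlapping framing, all threshold values, and in particular the rule that a longer continuation time indicates swallowing and a shorter one indicates mastication.

```python
from typing import List, Sequence, Tuple


def frame_power(signal: Sequence[float], frame_len: int) -> List[float]:
    """Split the signal into non-overlapping frames; return mean-square power per frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def is_stationary(frame: Sequence[float], min_lag: int, peak_threshold: float) -> bool:
    """Assumed periodicity test: a frame counts as stationary (periodic) when its
    autocorrelation, normalized by the frame energy, reaches peak_threshold
    at some lag in [min_lag, len(frame) // 2)."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return False
    best = max(
        (
            sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy
            for lag in range(min_lag, len(frame) // 2)
        ),
        default=0.0,
    )
    return best >= peak_threshold


def detect_periods(
    powers: Sequence[float],
    stationary: Sequence[bool],
    power_threshold: float,
    frame_sec: float,
) -> List[Tuple[int, float]]:
    """Find runs of consecutive frames that are non-stationary and whose power is
    at or above power_threshold. Returns (start_frame, continuation_time_sec)."""
    n = min(len(powers), len(stationary))
    periods: List[Tuple[int, float]] = []
    run_start = None
    for i in range(n):
        active = powers[i] >= power_threshold and not stationary[i]
        if active and run_start is None:
            run_start = i
        elif not active and run_start is not None:
            periods.append((run_start, (i - run_start) * frame_sec))
            run_start = None
    if run_start is not None:  # close a run that reaches the end of the signal
        periods.append((run_start, (n - run_start) * frame_sec))
    return periods


def classify(continuation_sec: float, swallow_min_sec: float) -> str:
    """Hypothetical decision rule (an assumption, not stated in the source):
    long periods are labeled swallowing, short ones mastication."""
    return "swallowing" if continuation_sec >= swallow_min_sec else "mastication"
```

In use, `frame_power` and `is_stationary` would be applied to each frame of the vibration signal, their outputs passed to `detect_periods`, and each resulting continuation time passed to `classify`. The thresholds would need to be tuned on real recordings.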