The present disclosure relates to a method for emotion-triggered capturing of audio and/or image data by an audio and/or image capturing device. The method includes receiving and analyzing a time-sequential set of data to determine whether a simultaneous change of emotional state of a first person and a second person occurs, and transmitting a trigger signal to the capturing device when such a change is determined. The set of data includes first physiological data representing a first physiological parameter of the first person, second physiological data representing a second physiological parameter of the second person, and voice audio data including a voice of at least one of the first person and the second person. The present disclosure also relates to a corresponding apparatus and to a system comprising the apparatus.
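
The analysis and trigger steps described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the `Sample` layout, the function names `changed` and `detect_and_trigger`, the sliding-window baseline, and all threshold values are hypothetical choices made for illustration. In particular, the sketch assumes that a "change of emotional state" can be approximated by a physiological value deviating from its recent mean, and that the voice feature must change in the same time step; the disclosure does not specify either rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Sample:
    """One time step of the time-sequential data set (hypothetical layout)."""
    t: float              # timestamp in seconds
    first_physio: float   # assumed: e.g. heart rate of the first person
    second_physio: float  # assumed: e.g. heart rate of the second person
    voice_level: float    # assumed: e.g. a loudness feature from the voice audio


def changed(values: Sequence[float], threshold: float) -> bool:
    """True if the newest value deviates from the mean of the earlier ones
    by more than `threshold` (an assumed stand-in for an emotional change)."""
    if len(values) < 2:
        return False
    baseline = sum(values[:-1]) / (len(values) - 1)
    return abs(values[-1] - baseline) > threshold


def detect_and_trigger(
    samples: Sequence[Sample],
    send_trigger: Callable[[float], None],
    window: int = 4,
    physio_threshold: float = 10.0,
    voice_threshold: float = 5.0,
) -> List[float]:
    """Scan the samples in time order and call `send_trigger` whenever both
    persons' physiological values and the voice feature change together."""
    trigger_times: List[float] = []
    for i in range(len(samples)):
        recent = samples[max(0, i - window + 1): i + 1]
        if len(recent) < window:
            continue  # not enough history to form a baseline yet
        first = changed([s.first_physio for s in recent], physio_threshold)
        second = changed([s.second_physio for s in recent], physio_threshold)
        voice = changed([s.voice_level for s in recent], voice_threshold)
        if first and second and voice:
            # Simultaneous change detected: notify the capturing device.
            send_trigger(recent[-1].t)
            trigger_times.append(recent[-1].t)
    return trigger_times
```

In this sketch `send_trigger` stands in for whatever transport carries the trigger signal to the capturing device; passing a callback keeps the detection logic independent of that transport. A trigger is produced only when both persons' values change in the same step, so a change in one person alone does not start a capture.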