Disclosed is a method for emotion-triggered capturing of audio and/or image data by an audio and/or image capturing device (202). The method includes receiving (402) and analyzing (404) a time-sequential set of data to determine whether a simultaneous change of emotional state of a first person (302) and a second person (304) occurs. The set of data includes first physiological data representing a first physiological parameter of the first person (302), second physiological data representing a second physiological parameter of the second person (304), and voice audio data including a voice of at least one of the first person (302) and the second person (304). The method further includes transmitting (406) a trigger signal to the capturing device (202) when such a simultaneous change is determined. Also disclosed are a corresponding apparatus (100) and a system (200) comprising the apparatus (100).
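The receive (402), analyze (404), and transmit (406) steps can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the names `Sample`, `changed`, `analyze`, and `process`, the choice of a deviation-from-baseline test, the threshold values, and the rule that both physiological signals and the voice feature must all change within the same window.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Sample:
    """One time step of the time-sequential data set (field names are hypothetical)."""
    t: float            # timestamp in seconds
    phys_first: float   # physiological parameter of the first person, e.g. heart rate
    phys_second: float  # physiological parameter of the second person
    voice_level: float  # a voice feature, e.g. normalized loudness


def changed(values: Sequence[float], threshold: float) -> bool:
    """Return True if the newest value deviates from the mean of the
    earlier values by more than `threshold` (an assumed change test)."""
    if len(values) < 2:
        return False
    baseline = sum(values[:-1]) / (len(values) - 1)
    return abs(values[-1] - baseline) > threshold


def analyze(window: Sequence[Sample],
            phys_threshold: float = 10.0,
            voice_threshold: float = 0.2) -> bool:
    """Step 404: decide whether both persons change state in the same window.
    Requiring the voice feature to change as well is an assumption."""
    first = changed([s.phys_first for s in window], phys_threshold)
    second = changed([s.phys_second for s in window], phys_threshold)
    voice = changed([s.voice_level for s in window], voice_threshold)
    return first and second and voice


def process(window: Sequence[Sample], send_trigger: Callable[[], None]) -> bool:
    """Steps 402 to 406: take a received window, analyze it, and call
    `send_trigger` (a stand-in for signalling the capturing device)
    only when a simultaneous change is detected."""
    if analyze(window):
        send_trigger()
        return True
    return False
```

In this sketch `send_trigger` is a plain callback, so the same logic could be wired to any transport toward the capturing device; a real analysis step would likely use a more robust emotion classifier than a single threshold.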