An apparatus for analyzing a sound signal is based on an ear model that derives from the sound signal, for each of a number of inner hair cells, an estimate of the time-varying concentration of transmitter substance in the cleft between the inner hair cell and its associated auditory nerve, so that an estimated inner hair cell cleft contents map over time is obtained. A pitch analyzer analyzes this map to obtain a pitch line over time, the pitch line indicating the pitch of the sound signal at respective time instants. A rhythm analyzer analyzes envelopes of the estimates for selected inner hair cells, the inner hair cells being selected in accordance with the pitch line, to obtain segmentation instants, wherein a segmentation instant indicates an end of a preceding note or a start of a succeeding note. In this way, a reliable sound signal analysis modeled on human hearing can be obtained.
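The data flow described above (cleft contents map, then pitch line, then segmentation instants) can be illustrated with a minimal sketch. It assumes the cleft contents map has already been computed by some ear model and is given as one list of estimates per inner hair cell. The function names, the peak-picking pitch rule, and the `rise` threshold are hypothetical simplifications chosen for illustration; the abstract does not specify how the pitch analyzer or the rhythm analyzer operate internally.

```python
from typing import List

# Assumed layout: cleft_map[cell][frame] = estimated transmitter concentration.
CleftMap = List[List[float]]


def pitch_line(cleft_map: CleftMap) -> List[int]:
    """Per frame, pick the hair cell index with the largest estimate.

    Peak picking is an assumed stand-in for the pitch analyzer.
    """
    n_cells = len(cleft_map)
    n_frames = len(cleft_map[0])
    return [
        max(range(n_cells), key=lambda cell: cleft_map[cell][frame])
        for frame in range(n_frames)
    ]


def envelope_along_pitch(cleft_map: CleftMap, line: List[int]) -> List[float]:
    """Read the estimate of the cell selected by the pitch line at each frame.

    This is a crude stand-in for the envelope of the selected cells.
    """
    return [cleft_map[cell][frame] for frame, cell in enumerate(line)]


def segmentation_instants(
    envelope: List[float], line: List[int], rise: float = 0.2
) -> List[int]:
    """Mark a frame as a note boundary when the pitch changes or the
    envelope rises by more than `rise` (hypothetical threshold)."""
    instants = []
    for frame in range(1, len(envelope)):
        pitch_changed = line[frame] != line[frame - 1]
        sharp_rise = envelope[frame] - envelope[frame - 1] > rise
        if pitch_changed or sharp_rise:
            instants.append(frame)
    return instants


# Toy map: one note on cell 0, then two notes repeated on cell 1.
toy_map = [
    [1.0, 0.9, 0.8, 0.7, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 1.0, 0.8, 0.6, 1.0, 0.8, 0.6],
]
toy_line = pitch_line(toy_map)
toy_envelope = envelope_along_pitch(toy_map, toy_line)
toy_instants = segmentation_instants(toy_envelope, toy_line)
print(toy_line)      # [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(toy_instants)  # [4, 7]
```

In the toy data, the boundary at frame 4 coincides with a pitch change, while the boundary at frame 7 is found only through the envelope, since the same pitch is struck again. This is the case in which a pitch line alone would not separate the notes.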