Systems and methods are provided for handling concurrent speech, in which temporally overlapping first speech data and second speech data are received from respective first and second participants of a session. A speech policy applied to the speech data specifies dropping the second speech data when it interrupts the first speech data within a first interval of the first speech data. The first interval is temporally bounded by the beginning of the first speech data and a first predetermined amount of time after that beginning. The speech policy further specifies outputting the first speech data and then outputting the second speech data when the second speech data interrupts the first speech data within a second interval of the first speech data. The second interval is temporally bounded by a second predetermined amount of time before the end of the first speech data and the end of the first speech data.
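
The speech policy described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `Speech`, `Action`, `apply_speech_policy`, `head_window`, and `tail_window` are hypothetical, times are assumed to be in seconds, and interval boundaries are assumed to be inclusive. The abstract does not say what happens when the interruption falls between the two intervals or when the two intervals overlap, so the sketch returns `UNSPECIFIED` in the former case and, as an arbitrary choice, checks the first interval before the second in the latter.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DROP_SECOND = "drop_second"              # discard the interrupting speech
    FIRST_THEN_SECOND = "first_then_second"  # output first speech, then second
    UNSPECIFIED = "unspecified"              # case not covered by the abstract


@dataclass(frozen=True)
class Speech:
    start: float  # seconds from session start (assumed unit)
    end: float


def apply_speech_policy(first: Speech, second: Speech,
                        head_window: float, tail_window: float) -> Action:
    """Classify an interruption of `first` by `second`.

    head_window: the first predetermined amount of time, measured
                 from the beginning of the first speech.
    tail_window: the second predetermined amount of time, measured
                 back from the end of the first speech.
    """
    # The second speech must begin while the first is still in progress.
    if not (first.start <= second.start <= first.end):
        return Action.UNSPECIFIED

    # First interval: [first.start, first.start + head_window].
    if second.start <= first.start + head_window:
        return Action.DROP_SECOND

    # Second interval: [first.end - tail_window, first.end].
    if second.start >= first.end - tail_window:
        return Action.FIRST_THEN_SECOND

    # Interruption in the middle of the first speech: not addressed here.
    return Action.UNSPECIFIED
```

For example, with a first speech spanning 0 to 10 seconds and both windows set to 2 seconds, an interruption starting at 1 second is dropped, while one starting at 9 seconds is queued and output after the first speech completes.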