The invention relates to a portable system that allows blind or visually impaired persons to interpret the surrounding environment by sound or touch, said system comprising: two cameras (3i, 3d) separate from one another and configured to capture an image of the environment simultaneously, and means (4i, 4d) for generating sound and/or touch output signals. Advantageously, the system also comprises processing means (2) connected to the cameras (3i, 3d) and to the means (4i, 4d) for generating sound and/or touch signals. The processing means are configured to combine the images captured in real time and to process the information associated with at least one vertical band with information relating to the depth of the elements in the combined image, said processing means (2) also being configured to: divide the vertical band into one or more regions; define a sound or touch signal, in each region, according to the depth of the region and the height of the region; and define a sound or touch output signal based
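As a rough illustration of the band-processing steps named above (dividing a vertical band into regions, then defining one signal per region from its depth and height), the following minimal Python sketch may help. It is not the claimed implementation. The names `RegionSignal` and `band_to_signals`, the parameters `n_regions`, `f_low`, `f_high` and `max_depth`, and the specific mappings (height to pitch, depth to loudness) are all assumptions made for illustration only; the text above does not state how depth and height are encoded.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RegionSignal:
    """Hypothetical description of the tone assigned to one region."""
    frequency_hz: float  # assumed encoding of the region's height in the band
    amplitude: float     # assumed encoding of the region's depth (nearer = louder)


def band_to_signals(depths, n_regions=4, f_low=200.0, f_high=1600.0, max_depth=5.0):
    """Split a top-to-bottom column of depth values into regions and
    describe one tone per region.

    depths: depth samples (e.g. in metres) along one vertical band,
            ordered from the top of the image to the bottom.
    """
    total = len(depths)
    if n_regions < 1 or total < n_regions:
        raise ValueError("need at least one depth sample per region")

    signals = []
    for i in range(n_regions):
        # Integer boundaries guarantee every region gets at least one sample.
        chunk = depths[i * total // n_regions:(i + 1) * total // n_regions]

        # Representative depth of the region, clamped to [0, max_depth].
        depth = min(max(mean(chunk), 0.0), max_depth)

        # Assumed mapping: region 0 (top of the band) gets the highest pitch.
        height = 0.5 if n_regions == 1 else 1.0 - i / (n_regions - 1)
        frequency = f_low + height * (f_high - f_low)

        # Assumed mapping: closer elements produce a louder tone.
        amplitude = 1.0 - depth / max_depth

        signals.append(RegionSignal(frequency, amplitude))
    return signals


if __name__ == "__main__":
    column = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 10.0, 10.0]
    for signal in band_to_signals(column):
        print(signal)
```

In this sketch, a region at or beyond `max_depth` is given zero amplitude, so distant elements stay silent. A touch output could reuse the same per-region values, for example as vibration intensity, under the same assumptions.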