An exemplary system includes 1) a fitting facility configured to maintain data representative of a library of one or more sounds and data representative of a library of one or more environments, and 2) a detection facility configured to detect a selection by a user of a sound included in the library of one or more sounds and an environment included in the library of one or more environments. The fitting facility is further configured to generate, based on the selected sound and the selected environment, an audio signal representative of an acoustic scene and use the audio signal to fit a cochlear implant system to a patient. Corresponding systems and methods are also disclosed.
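
The generation of an acoustic scene from a selected sound and a selected environment may be illustrated by the following minimal sketch. It assumes, purely for illustration, that each environment is modeled as a short impulse response and that the scene is obtained by convolving the selected sound with that response; the abstract does not specify how the audio signal is generated. The names `SOUND_LIBRARY`, `ENVIRONMENT_LIBRARY`, `Environment`, and `generate_acoustic_scene`, and all sample values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Environment:
    """Hypothetical environment model: a short impulse response (assumption)."""
    name: str
    impulse_response: List[float]


# Hypothetical libraries; names and sample values are illustrative only.
SOUND_LIBRARY: Dict[str, List[float]] = {
    "click": [1.0, 0.0, 0.0],
    "tone": [0.5, -0.5, 0.5, -0.5],
}
ENVIRONMENT_LIBRARY: Dict[str, Environment] = {
    "dry": Environment("dry", [1.0]),
    "echo": Environment("echo", [1.0, 0.0, 0.5]),
}


def generate_acoustic_scene(sound_name: str, environment_name: str) -> List[float]:
    """Return an audio signal for the selected sound placed in the selected environment.

    Assumption: the scene is the discrete convolution of the sound samples
    with the environment's impulse response.
    """
    sound = SOUND_LIBRARY[sound_name]
    response = ENVIRONMENT_LIBRARY[environment_name].impulse_response
    scene = [0.0] * (len(sound) + len(response) - 1)
    for i, sample in enumerate(sound):
        for j, weight in enumerate(response):
            scene[i + j] += sample * weight
    return scene


if __name__ == "__main__":
    # A user selection of one sound and one environment yields one scene signal.
    print(generate_acoustic_scene("click", "echo"))
```

In this sketch the "dry" environment leaves the sound unchanged, while the "echo" environment adds a delayed, attenuated copy. How the resulting signal would then be used to fit a cochlear implant system is outside the scope of the sketch.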