A method for training a visual prosthesis includes presenting, to a visual prosthesis patient, a non-visual reference stimulus corresponding to a reference image. Training data sets are generated by presenting a series of stimulation patterns to the patient through the visual prosthesis. Each stimulation pattern in the series is determined based at least in part on a received user perception input and a fitness function optimization algorithm. The presented stimulation patterns and the corresponding user perception inputs are stored and subsequently presented off-line to a neural network to determine a vision solution.
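
The data-collection loop described above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that a stimulation pattern is a list of per-electrode amplitudes in [0, 1], that the user perception input is a single score in [0, 1], and that the fitness function optimization is a simple elitist evolutionary search. The names `simulated_rating`, `mutate`, and `collect_training_data` are hypothetical, and the patient's response is replaced here by a simulated rating against a hypothetical target pattern.

```python
import random


def simulated_rating(pattern, target):
    """Hypothetical stand-in for the patient's perception input.

    Returns a score in [0, 1]; higher means the percept is judged closer
    to the reference. A real system would obtain this from the patient.
    """
    err = sum((p - t) ** 2 for p, t in zip(pattern, target)) / len(target)
    return 1.0 - err  # amplitudes lie in [0, 1], so err also lies in [0, 1]


def mutate(pattern, rng, sigma=0.1):
    """Perturb each amplitude with Gaussian noise, clamped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pattern]


def collect_training_data(target, generations=20, pop_size=8, seed=0):
    """Present patterns, record ratings, and pick the next patterns by fitness.

    Returns (log, best_per_generation), where log is a list of
    (pattern, rating) pairs that could later be used off-line to train
    a neural network.
    """
    rng = random.Random(seed)
    n = len(target)
    population = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    log = []
    best_per_generation = []
    for _ in range(generations):
        scored = []
        for pattern in population:
            rating = simulated_rating(pattern, target)  # user perception input
            log.append((list(pattern), rating))         # store the pair
            scored.append((rating, pattern))
        scored.sort(key=lambda s: s[0], reverse=True)
        best_per_generation.append(scored[0][0])
        # Elitist selection: keep the top half unchanged, then fill the
        # rest of the next series with mutated copies of those parents.
        parents = [p for _, p in scored[: pop_size // 2]]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return log, best_per_generation
```

Calling `collect_training_data([0.2, 0.8, 0.5, 0.1])` returns the stored pattern and rating pairs together with the best rating seen in each generation. Because the top-rated patterns are carried over unchanged and the simulated rating is deterministic, the best rating never decreases between generations in this sketch; with a real patient's noisy responses that guarantee would not hold.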