The present invention is an improved fitting and training system for a visual prosthesis. A patient using the visual prosthesis observes a display and indicates the location, movement, shape, or other properties of the displayed image, and these responses are used to improve fitting and training. In one embodiment, the patient uses a touch screen monitor that displays an image. The patient touches the monitor at the location where the patient perceives the image. The system then corrects the image to the location indicated by the patient. In another embodiment, the patient observes an image moving across the touch screen monitor and indicates, by moving a hand across the monitor, the direction in which the image is perceived to move. The system can then rotate the image to match the image perceived by the patient.
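The two embodiments above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function names, the pixel coordinate convention, and the use of a simple average over trials are all assumptions made for illustration. It only computes the measured differences (a positional offset between displayed and touched locations, and an angular difference between displayed and indicated motion directions); how a fitting system would apply these values to the image is left open here.

```python
import math

# Hypothetical helpers; names and conventions are illustrative assumptions.
# Points are (x, y) tuples in screen pixels.

def location_offset(displayed, touched):
    """Return (dx, dy) from the displayed location to the touched location."""
    return (touched[0] - displayed[0], touched[1] - displayed[1])

def mean_offset(trials):
    """Average the offsets over several (displayed, touched) trials."""
    if not trials:
        raise ValueError("at least one trial is required")
    offsets = [location_offset(d, t) for d, t in trials]
    n = len(offsets)
    return (sum(o[0] for o in offsets) / n, sum(o[1] for o in offsets) / n)

def direction_deg(start, end):
    """Direction of travel from start to end, in degrees."""
    return math.degrees(math.atan2(end[1] - start[1], end[0] - start[0]))

def rotation_difference_deg(shown_start, shown_end, swipe_start, swipe_end):
    """Signed angle from the displayed motion direction to the direction
    the patient indicated, wrapped into the range [-180, 180)."""
    diff = direction_deg(swipe_start, swipe_end) - direction_deg(shown_start, shown_end)
    return (diff + 180.0) % 360.0 - 180.0
```

For example, if an image is displayed at (100, 100) and the patient touches (120, 90), `location_offset` returns (20, -10). If the image moves along the positive x axis and the patient's hand moves along the positive y axis, `rotation_difference_deg` reports a difference of 90 degrees in this coordinate convention.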