A scanning confocal endoscope system includes: a point source that scans a subject with excitation light by periodically moving in a two-dimensional plane; a point source control means that controls the point source so that the irradiation density of the excitation light is less than or equal to a predetermined density over the whole scanning area; a confocal pinhole arranged at a position conjugate with a converging point of the excitation light; an image signal detection means that detects an image signal by receiving, via the confocal pinhole, fluorescence emitted from the subject excited by the excitation light; and an image generation means that generates a confocal image using the detected image signal.
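
The density limit applied by the point source control means can be illustrated with a minimal sketch. The abstract does not state the scan pattern or the control method, so everything here is an assumption made for illustration: a spiral trajectory sampled at a constant rate (which dwells longer near the center), a square grid used to estimate the energy delivered per unit area, and the hypothetical names `scan_positions`, `limit_irradiation_density`, and `cell_densities`. The sketch lowers the source power only for samples that fall in grid cells where the unmodified scan would exceed the predetermined density.

```python
import math


def scan_positions(n_samples, n_turns, radius):
    """Hypothetical spiral scan: radius grows linearly while the angle advances
    at a constant rate, so samples are packed more densely near the center."""
    points = []
    for i in range(n_samples):
        t = i / n_samples
        r = radius * t
        theta = 2.0 * math.pi * n_turns * t
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def limit_irradiation_density(points, power, dwell_time, cell_size, max_density):
    """Return (per-sample powers, per-sample grid cells) such that the energy
    per unit area in every grid cell does not exceed max_density."""
    cells = []
    counts = {}
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells.append(cell)
        counts[cell] = counts.get(cell, 0) + 1

    area = cell_size ** 2
    powers = []
    for cell in cells:
        # Density this cell would receive at full power.
        density = counts[cell] * power * dwell_time / area
        # Attenuate only where the limit would otherwise be exceeded.
        scale = min(1.0, max_density / density)
        powers.append(power * scale)
    return powers, cells


def cell_densities(powers, cells, dwell_time, cell_size):
    """Accumulate the delivered energy per unit area for each grid cell."""
    area = cell_size ** 2
    densities = {}
    for p, cell in zip(powers, cells):
        densities[cell] = densities.get(cell, 0.0) + p * dwell_time / area
    return densities


if __name__ == "__main__":
    pts = scan_positions(n_samples=2000, n_turns=10, radius=1.0)
    powers, cells = limit_irradiation_density(
        pts, power=1.0, dwell_time=1e-3, cell_size=0.1, max_density=1.0
    )
    densities = cell_densities(powers, cells, dwell_time=1e-3, cell_size=0.1)
    print("lowest power used:", min(powers))
    print("highest cell density:", max(densities.values()))
```

In this sketch the power is reduced near the center of the spiral, where samples cluster, and left unchanged toward the periphery. A real system could equally limit density by changing the scan speed or sampling pattern; the abstract does not say which approach is used.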