An object is to detect a lesion region with high accuracy without being significantly affected by differences in the environment in which an in-vivo image is taken. To achieve this object, a suspected-lesion-region extracting unit (16) extracts suspected convexity lesion regions and suspected concavity lesion regions from pixels whose pixel values differ from those of the surrounding pixels. A groove determining unit (17) determines, as a groove region, any of the suspected concavity lesion regions that corresponds to the shadow of a groove formed between body-cavity organ walls. A lesion-region extracting unit (18) extracts a lesion region by excluding the determined groove region from the suspected concavity lesion regions.
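The flow through units (16), (17), and (18) can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: the comparison against the mean of the 8 neighbouring pixels, the reading of "convexity" and "concavity" as pixel values above and below that mean, the threshold `thresh`, and in particular the hypothetical groove test `is_groove`, which treats any suspected concavity region whose bounding box is at least `min_length` pixels long as a groove shadow. The abstract does not state how the groove determination is actually made. All function and parameter names are likewise hypothetical.

```python
from collections import deque


def suspect_masks(img, thresh):
    """Unit (16), sketched: flag pixels that differ from their surroundings.

    Assumption: "surroundings" is the mean of the up-to-8 adjacent pixels.
    """
    h, w = len(img), len(img[0])
    convex = [[False] * w for _ in range(h)]
    concave = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))
                    if (j, i) != (y, x)]
            diff = img[y][x] - sum(nbrs) / len(nbrs)
            convex[y][x] = diff > thresh     # higher than surroundings
            concave[y][x] = diff < -thresh   # lower than surroundings
    return convex, concave


def label_regions(mask):
    """Group flagged pixels into 8-connected regions (lists of (row, col))."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, region = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(region)
    return regions


def is_groove(region, min_length):
    """Unit (17), sketched. Hypothetical criterion: a long bounding box."""
    ys = [p[0] for p in region]
    xs = [p[1] for p in region]
    longest_side = max(max(ys) - min(ys), max(xs) - min(xs)) + 1
    return longest_side >= min_length


def extract_lesions(img, thresh=30, min_length=5):
    """Unit (18), sketched: drop groove regions from the concavity suspects."""
    convex_mask, concave_mask = suspect_masks(img, thresh)
    concave_regions = label_regions(concave_mask)
    grooves = [r for r in concave_regions if is_groove(r, min_length)]
    lesions = [r for r in concave_regions if not is_groove(r, min_length)]
    return label_regions(convex_mask), lesions, grooves
```

`extract_lesions` takes a grayscale image as a list of rows and returns three lists of regions: the suspected convexity regions, the suspected concavity regions kept as lesion regions, and the regions discarded as grooves. A practical system would replace the bounding-box test with whatever groove criterion the full specification defines.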