A candidate-region for an object-region is set in an image. A graph is constructed that includes point-S corresponding to the object-region, point-T corresponding to a background-region, a point corresponding to each pixel in the image, an S-link connecting each pixel and point-S, a T-link connecting each pixel and point-T, and an N-link connecting each pair of adjacent pixels. A cost is set for each link, and graph-cut is performed. It is then judged whether the graph contains a pixel connected to point-S by a link. If no such pixel is present, the costs set for all the S-links connecting the pixels in the candidate-region and point-S are increased stepwise, in increments equal to or less than a predetermined threshold, and graph-cut is performed at each stage, until a pixel connected to point-S by a link appears. Whether each pixel in the image belongs to the object-region or the background-region is determined based on the result of the graph-cut performed last.
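The procedure described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes that graph-cut is computed as a minimum s-t cut with a plain Edmonds-Karp max-flow, and that "a pixel connected to point-S by a link" means a pixel still reachable from point-S in the residual graph after the cut. The names `min_cut_source_side`, `segment`, `step`, and `max_rounds`, as well as the toy costs, are hypothetical and chosen only for illustration.

```python
from collections import deque


def min_cut_source_side(n_pixels, s_cost, t_cost, n_links):
    """Run a min cut and return the pixels that stay attached to point-S.

    Assumption: graph-cut is computed with Edmonds-Karp max-flow, and a
    pixel counts as attached to point-S if it is reachable from S in the
    residual graph once no augmenting path remains.
    """
    S, T = n_pixels, n_pixels + 1
    cap = [dict() for _ in range(n_pixels + 2)]

    def add(u, v, c_uv, c_vu):
        cap[u][v] = cap[u].get(v, 0) + c_uv
        cap[v][u] = cap[v].get(u, 0) + c_vu

    for p in range(n_pixels):
        add(S, p, s_cost[p], 0)  # S-link
        add(p, T, t_cost[p], 0)  # T-link
    for (p, q), c in n_links.items():
        add(p, q, c, c)          # N-link between adjacent pixels

    while True:
        # Breadth-first search for a shortest augmenting path from S.
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if T not in parent:
            # No path left: everything reached from S is on the object side.
            return {v for v in parent if v < n_pixels}
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= flow
            cap[w][u] += flow


def segment(n_pixels, candidate, s_cost, t_cost, n_links, step, max_rounds=100):
    """Repeat graph-cut, raising candidate-region S-link costs by `step`.

    `step` is assumed to be no larger than the predetermined threshold.
    `max_rounds` is a safety cap added for this sketch only.
    """
    s_cost = list(s_cost)
    for _ in range(max_rounds):
        obj = min_cut_source_side(n_pixels, s_cost, t_cost, n_links)
        if obj:  # a pixel connected to point-S has appeared
            return obj, s_cost
        for p in candidate:
            s_cost[p] += step
    return set(), s_cost


# Toy example: a row of four pixels, with pixels 1 and 2 as the candidate-region.
obj, costs = segment(
    n_pixels=4,
    candidate={1, 2},
    s_cost=[0, 2, 2, 0],
    t_cost=[5, 3, 3, 5],
    n_links={(0, 1): 1, (1, 2): 4, (2, 3): 1},
    step=1,
)
print(sorted(obj), costs)  # prints [1, 2] [0, 5, 5, 0]
```

In this toy case the first three cuts leave no pixel attached to point-S, so the S-link costs of the candidate pixels are raised from 2 to 5 before the last cut assigns pixels 1 and 2 to the object-region and the remaining pixels to the background-region.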