A method of processing images captured using a capsule camera is disclosed. According to one embodiment, two images designated as a reference image and a float image are received, where the float image corresponds to a captured capsule image and the reference image corresponds to a previously composited image or another capsule image captured prior to the float image. Automatic segmentation is applied to the float image and the reference image to detect any non-GI (non-gastrointestinal) regions. The non-GI regions are excluded from the match measure between the reference image and a deformed float image during the registration process. The two images are stitched together by rendering them in a common coordinate system. In another embodiment, large areas of non-GI regions are removed directly from the input image, and the remaining portions are stitched together to form a new image without performing image registration.
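The exclusion of non-GI regions from the match measure can be illustrated with a minimal sketch. The abstract does not specify which match measure is used, so the mean squared difference below is an assumption chosen for simplicity. The function name `masked_match_measure` and the boolean mask layout (`True` marks a non-GI pixel) are likewise hypothetical.

```python
def masked_match_measure(reference, deformed_float, ref_non_gi, float_non_gi):
    """Mean squared difference over pixels that are GI tissue in both images.

    All four arguments are equally sized 2-D lists. The mask arguments hold
    booleans, where True marks a pixel that segmentation labeled as non-GI.
    The choice of mean squared difference is an assumption for illustration.
    """
    total = 0.0
    count = 0
    rows = zip(reference, deformed_float, ref_non_gi, float_non_gi)
    for ref_row, flt_row, ref_mask_row, flt_mask_row in rows:
        for r, f, r_bad, f_bad in zip(ref_row, flt_row, ref_mask_row, flt_mask_row):
            if r_bad or f_bad:
                continue  # skip pixels flagged as non-GI in either image
            total += (r - f) ** 2
            count += 1
    if count == 0:
        return float("inf")  # no usable overlap, so treat as the worst match
    return total / count


# Hypothetical 2x3 images: the last column of the float image is non-GI content.
reference = [[10, 10, 10], [10, 10, 10]]
deformed_float = [[10, 10, 200], [10, 10, 200]]
no_mask = [[False, False, False], [False, False, False]]
float_mask = [[False, False, True], [False, False, True]]

with_exclusion = masked_match_measure(reference, deformed_float, no_mask, float_mask)
without_exclusion = masked_match_measure(reference, deformed_float, no_mask, no_mask)
```

In this example the measure computed with the mask depends only on the GI pixels, so the bright non-GI column no longer penalizes an otherwise good alignment. A registration loop would evaluate such a measure for each candidate deformation of the float image.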