A method and apparatus for processing images captured from the human gastrointestinal (GI) tract by a capsule camera are disclosed. High frame-rate images captured from the GI tract by the capsule camera are received for processing. The high frame-rate images comprise first images at a first spatial resolution, corresponding to a regular frame rate, and second images at a second spatial resolution. The first images and the second images are interleaved, and the second spatial resolution is lower than the first spatial resolution. Motion models among the high frame-rate images are derived by applying image registration to the high frame-rate images. The high frame-rate images are then stitched according to the motion models to generate stitching outputs comprising stitched images and non-stitched images. The stitching outputs are provided.
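
For illustration only, the flow of receiving interleaved frames, registering them, and stitching them can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the disclosed method: the abstract does not specify the motion model, the registration algorithm, or the stitching rule. Here the motion model is assumed to be a pure integer translation, registration is done by phase correlation, lower-resolution frames are brought to the first resolution by nearest-neighbour upsampling, and stitching is a simple average of aligned frames using circular shifts. The names `upsample`, `estimate_shift`, `stitch_sequence`, and the `min_score` threshold are hypothetical.

```python
import numpy as np


def upsample(img, factor):
    """Nearest-neighbour upsampling by an integer factor (assumed resampler)."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def estimate_shift(ref, mov):
    """Phase correlation: find (dy, dx) so that np.roll(mov, (dy, dx)) ~ ref.

    Returns the shift and the correlation peak, used here as a confidence score.
    """
    cross = np.fft.fft2(ref) * np.conj(np.fft.fft2(mov))
    cross /= np.abs(cross) + 1e-12          # keep phase only
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Convert wrapped peak indices to signed shifts.
    shift = tuple(int(p) if p <= s // 2 else int(p) - s
                  for p, s in zip(peak, corr.shape))
    return shift, float(corr.max())


def _emit(outputs, group):
    """Append one output: a stitched image if several frames were merged."""
    kind = "stitched" if len(group) > 1 else "non-stitched"
    outputs.append((kind, np.mean(group, axis=0)))


def stitch_sequence(frames, full_shape, min_score=0.5):
    """Register consecutive frames and stitch those that register reliably.

    frames: list of 2-D arrays, first-resolution and lower-resolution frames
    interleaved. Lower-resolution frames are assumed to be smaller by an
    integer factor in both dimensions.
    """
    full = [f if f.shape == full_shape
            else upsample(f, full_shape[0] // f.shape[0])
            for f in frames]
    outputs = []
    group = [full[0]]      # frames aligned to the first frame of the group
    dy, dx = 0, 0          # cumulative shift back to the group's first frame
    for prev, cur in zip(full, full[1:]):
        (sy, sx), score = estimate_shift(prev, cur)
        if score >= min_score:
            dy, dx = dy + sy, dx + sx
            group.append(np.roll(cur, (dy, dx), axis=(0, 1)))
        else:
            # Registration unreliable: close the group and start a new one.
            _emit(outputs, group)
            group, dy, dx = [cur], 0, 0
    _emit(outputs, group)
    return outputs


# Synthetic demo: one first-resolution frame, one shifted lower-resolution
# frame of the same scene, and one unrelated first-resolution frame.
rng = np.random.default_rng(0)
hi = upsample(rng.random((16, 16)), 2)                # 32x32 scene
low = np.roll(hi, (4, -6), axis=(0, 1))[::2, ::2]     # 16x16, shifted view
other = rng.random((32, 32))                          # unrelated content
outputs = stitch_sequence([hi, low, other], full_shape=hi.shape)
print([kind for kind, _ in outputs])
```

In this sketch the first two frames register with a high score and are merged into one stitched output, while the unrelated third frame fails the score test and is passed through as a non-stitched output. A practical system would likely need a richer motion model (for example affine or deformable) and a blending step that handles non-overlapping regions, neither of which is attempted here.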