Techniques are provided for fusing image frames to generate a panoramic background image using color and depth data provided by a 3D camera. An example system may include a partitioning circuit configured to partition an image frame into segments and objects, each segment comprising a group of pixels that share common features associated with the color and depth data, and each object comprising one or more related segments. The system may also include an object consistency circuit configured to assign either a 2D or a 3D transformation type to each of the segments and objects in order to transform them to the coordinate system of a reference image frame. The system may further include a segment recombination circuit to combine the transformed objects and segments into a transformed image frame, and an integration circuit to integrate the transformed image frame with the reference image frame to generate the panoramic background image.
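
The four stages named above (partitioning, transformation-type assignment, recombination, integration) can be pictured as a simple data flow. The following is only a minimal sketch under stated assumptions, not the disclosed implementation: the names `Segment`, `partition`, `assign_transform`, `transform_and_recombine`, and `integrate` are hypothetical, segments are formed by plain depth binning, the 2D/3D choice uses an assumed depth-spread threshold, and a single integer translation stands in for the real 2D or 3D transformation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]  # (row, col)


@dataclass
class Segment:
    pixels: List[Pixel]
    transform: str = ""  # "2D" or "3D", filled in by assign_transform


def partition(depth: List[List[float]], bin_width: float) -> List[Segment]:
    """Group pixels into segments by depth bin (assumed criterion)."""
    bins: Dict[int, List[Pixel]] = {}
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            bins.setdefault(int(d // bin_width), []).append((r, c))
    return [Segment(pixels=p) for _, p in sorted(bins.items())]


def assign_transform(seg: Segment, depth: List[List[float]],
                     spread_threshold: float) -> None:
    """Label a segment 3D if its depth spread is large, else 2D (assumed rule)."""
    ds = [depth[r][c] for r, c in seg.pixels]
    seg.transform = "3D" if max(ds) - min(ds) > spread_threshold else "2D"


def transform_and_recombine(segments: List[Segment], color: List[List[str]],
                            shift: Pixel) -> Dict[Pixel, str]:
    """Move every segment into reference coordinates and merge the results.

    Placeholder: both transformation types use the same translation here.
    """
    out: Dict[Pixel, str] = {}
    for seg in segments:
        for r, c in seg.pixels:
            out[(r + shift[0], c + shift[1])] = color[r][c]
    return out


def integrate(reference: Dict[Pixel, str],
              transformed: Dict[Pixel, str]) -> Dict[Pixel, str]:
    """Merge the transformed frame into the reference frame.

    Assumption: where both frames cover a pixel, the reference value is kept.
    """
    panorama = dict(transformed)
    panorama.update(reference)
    return panorama


# Tiny 2x2 frame: the top row is near the camera, the bottom row is far.
depth = [[1.0, 1.2], [5.0, 5.1]]
color = [["a", "b"], ["c", "d"]]
reference = {(0, 0): "x", (0, 1): "y", (1, 0): "z", (1, 1): "w"}

segments = partition(depth, bin_width=2.0)
for seg in segments:
    assign_transform(seg, depth, spread_threshold=0.15)

frame = transform_and_recombine(segments, color, shift=(0, 1))
panorama = integrate(reference, frame)
```

In this toy run the new frame overlaps the reference frame by one column, so the resulting panorama is three columns wide. A real system would replace the translation with a per-segment 2D warp or a depth-based 3D reprojection, and would blend overlapping pixels rather than simply keeping the reference value.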