A system and method for real-time estimation of heart rate (HR) from one or more face videos acquired in a non-invasive manner. The system receives face videos and obtains several blocks, consisting of facial skin areas, as the ROI (Region of Interest). Subsequently, temporal fragments are extracted from the blocks and filtered to minimize noise. In the next stage, several temporal fragments are extracted from the video, and those corrupted by noise are identified using an image processing range filter and pruned before further processing. The HR of each remaining temporal fragment, referred to as the local HR, is estimated along with its quality. Finally, a quality-based fusion is applied to estimate a global HR corresponding to the received face videos. In addition, the disclosure herein is also applicable to frontal, profile, and multiple faces, and operates in real time.
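The local HR estimation and quality-based fusion steps described above can be illustrated with a minimal sketch. The abstract does not specify how the local HR or its quality is computed, so everything below is an assumption made for illustration: the local HR is taken as the dominant spectral peak within a hypothetical 0.7 to 4.0 Hz band, the quality is taken as the fraction of in-band power concentrated at that peak, and the fusion is a quality-weighted average. The function names `local_hr` and `global_hr` and the parameters `fps`, `lo_hz`, and `hi_hz` are hypothetical and are not taken from the disclosure.

```python
import numpy as np


def local_hr(fragment, fps, lo_hz=0.7, hi_hz=4.0):
    """Return (hr_bpm, quality) for one temporal fragment.

    Assumption: HR is the strongest frequency in [lo_hz, hi_hz], and
    quality is that peak's share of the total in-band power (0 to 1).
    """
    x = np.asarray(fragment, dtype=float)
    x = x - x.mean()  # remove the DC component
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)

    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    band_power = spectrum[band]
    total = band_power.sum()
    if band_power.size == 0 or total == 0:
        return 0.0, 0.0  # no usable signal in this fragment

    peak = int(np.argmax(band_power))
    hr_bpm = 60.0 * freqs[band][peak]
    quality = band_power[peak] / total
    return float(hr_bpm), float(quality)


def global_hr(fragments, fps):
    """Fuse local HRs into one global HR, weighting each by its quality."""
    estimates = [local_hr(f, fps) for f in fragments]
    hrs = np.array([hr for hr, _ in estimates])
    qualities = np.array([q for _, q in estimates])
    if qualities.sum() == 0:
        return float("nan")  # every fragment was unusable
    return float(np.sum(hrs * qualities) / np.sum(qualities))


if __name__ == "__main__":
    fps = 30
    t = np.arange(300) / fps  # a 10-second fragment
    clean = np.sin(2 * np.pi * 1.2 * t)  # 1.2 Hz pulse-like signal
    # A lower-quality fragment whose power is split across two frequencies.
    mixed = np.sin(2 * np.pi * 2.0 * t) + 0.9 * np.sin(2 * np.pi * 3.0 * t)
    print(local_hr(clean, fps))
    print(local_hr(mixed, fps))
    print(global_hr([clean, mixed], fps))
```

Under these assumptions, a fragment whose spectral power is spread over several frequencies receives a lower quality and therefore pulls the global HR less than it would in an unweighted mean. Other quality measures or fusion rules could be substituted without changing the overall structure.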