Traditional electrocardiography (ECG) and photoplethysmography (PPG) based heart rate (HR) estimation requires skin contact, which is not only uncomfortable for the user but also infeasible when multiple users must be monitored or when extremely sensitive conditions are a prime concern, as in monitoring neonates, sleeping subjects, and patients with damaged skin. In non-contact, camera-based estimation, temporal signals depicting the motion or color variations in the frames across time are estimated from a region of interest using Eulerian or Lagrangian approaches. However, the Eulerian approach fails under improper illumination, inappropriate camera focus, or human factors such as skin color. Likewise, the Lagrangian approach is highly time-consuming and may fail when few or poorly discriminative features are available for tracking. The present disclosure provides a poorness measure that indicates when an approach fails and facilitates serial fusion of the two approaches. Switching to the appropriate approach results in accurate heart rate estimation.
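The serial fusion described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the function names (`band_spectrum`, `estimate_hr`, `serial_fusion`), the 42 to 240 bpm search band, the threshold of 0.5, the choice to try the Eulerian signal first, and above all the poorness measure itself, which is taken here to be one minus the fraction of in-band spectral power lying within 5 bpm of the spectral peak.

```python
import math


def band_spectrum(signal, fps, lo_bpm=42, hi_bpm=240):
    """Power of `signal` at each candidate heart rate (1 bpm steps), by direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component
    spectrum = {}
    for bpm in range(lo_bpm, hi_bpm + 1):
        w = 2 * math.pi * (bpm / 60.0) / fps  # radians per frame
        re = sum(x * math.cos(w * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(w * t) for t, x in enumerate(centered))
        spectrum[bpm] = re * re + im * im
    return spectrum


def estimate_hr(signal, fps, peak_halfwidth_bpm=5):
    """Return (hr_bpm, poorness) for one temporal signal.

    ASSUMPTION: poorness = 1 - (power near the spectral peak / total band power).
    This is a hypothetical stand-in for the disclosure's poorness measure.
    """
    spectrum = band_spectrum(signal, fps)
    total = sum(spectrum.values())
    if total == 0:
        return None, 1.0  # flat signal: no estimate, maximally poor
    peak = max(spectrum, key=spectrum.get)
    near = sum(p for bpm, p in spectrum.items()
               if abs(bpm - peak) <= peak_halfwidth_bpm)
    return peak, 1.0 - near / total


def serial_fusion(eulerian_signal, lagrangian_signal_fn, fps, threshold=0.5):
    """Try the Eulerian signal first; switch to the Lagrangian one if it is poor.

    `lagrangian_signal_fn` is a callable so that the time-consuming feature
    tracking runs only when the Eulerian estimate is rejected.
    """
    hr, poorness = estimate_hr(eulerian_signal, fps)
    if hr is not None and poorness <= threshold:
        return hr, "eulerian"
    hr, _ = estimate_hr(lagrangian_signal_fn(), fps)
    return hr, "lagrangian"
```

Passing the Lagrangian signal as a callable reflects the point that the Lagrangian approach is time-consuming: in this sketch its cost is paid only when the poorness of the Eulerian estimate exceeds the assumed threshold.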