A method (600) for media content tracking includes receiving a user identifier (12) and instructing display systems (120) to display media content (20) based on the user identifier. Each display system has a corresponding screen (122). The method also includes receiving image data (312) from an imaging system (300) having a field of view (Fv) arranged to capture images (310) of a user (10). The method further includes determining, based on the image data, gaze characteristics of the user, including a gaze target (GT) of the user, and determining whether the gaze target corresponds to one of the screens. When the gaze target corresponds to one of the screens, the method includes determining a time period (tGE) of gaze engagement with the corresponding screen. The method also includes storing at least one of the gaze characteristics and the media content displayed on the screen corresponding to the gaze target.
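
The gaze-matching and timing steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes, purely for illustration, that each screen (122) is an axis-aligned rectangle in a shared 2D coordinate frame, that the gaze target (GT) has already been estimated from the image data as a point in that frame, and that gaze samples arrive as timestamped pairs. The names `Screen`, `Engagement`, `match_screen`, and `track_engagement` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Screen:
    """Hypothetical model of a screen (122) as an axis-aligned rectangle."""
    screen_id: str
    x: float
    y: float
    width: float
    height: float
    media: str  # media content (20) shown on this screen (assumed static here)

    def contains(self, point: Point) -> bool:
        px, py = point
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass
class Engagement:
    """One stored record: who looked at which screen and content, and for how long."""
    user_id: str
    screen_id: str
    media: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        # Time period of gaze engagement (tGE), in the units of the timestamps.
        return self.end - self.start


def match_screen(gaze_target: Point, screens: Iterable[Screen]) -> Optional[Screen]:
    """Return the screen the gaze target falls on, or None if it hits no screen."""
    for screen in screens:
        if screen.contains(gaze_target):
            return screen
    return None


def track_engagement(user_id: str,
                     samples: Iterable[Tuple[float, Point]],
                     screens: List[Screen]) -> List[Engagement]:
    """Group time-ordered (timestamp, gaze_target) samples into engagement records."""
    log: List[Engagement] = []
    current: Optional[Engagement] = None
    for timestamp, gaze_target in samples:
        screen = match_screen(gaze_target, screens)
        # Close the open record when the gaze leaves its screen.
        if current is not None and (screen is None
                                    or screen.screen_id != current.screen_id):
            log.append(current)
            current = None
        if screen is not None:
            if current is None:
                current = Engagement(user_id, screen.screen_id, screen.media,
                                     timestamp, timestamp)
            else:
                current.end = timestamp  # extend the ongoing engagement
    if current is not None:
        log.append(current)
    return log
```

In this sketch an engagement ends at the last sample that still fell on the screen, so a single isolated sample yields a duration of zero. A real system would likely choose a different policy, for example a minimum dwell time or tolerance for brief glances away.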