A method for media content tracking is disclosed. The method includes receiving a user identifier and instructing display systems to display media content based on the user identifier. Each display system has a corresponding screen. The method also includes receiving image data from an imaging system whose field of view is arranged to capture images of a user. The method further includes determining, based on the image data, gaze characteristics of the user, including a gaze target of the user. The method further includes determining whether the gaze target corresponds to one of the screens. When the gaze target corresponds to one of the screens, the method includes determining a time period of gaze engagement with the corresponding screen. The method also includes storing at least one of the gaze characteristics or the media content (or an identifier of the media content) displayed on the screen corresponding to the gaze target.
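
The matching and timing steps can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It assumes that gaze targets have already been estimated from the image data as points in a coordinate plane shared with the screens, and that each screen is an axis-aligned rectangle in that plane. All names (`Screen`, `GazeSample`, `EngagementRecord`, `screen_for_gaze`, `track_engagement`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Screen:
    """A screen modeled as a rectangle in a shared plane (an assumption)."""
    screen_id: str
    media_id: str  # identifier of the media content shown on this screen
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        return (self.x <= gx < self.x + self.width
                and self.y <= gy < self.y + self.height)


@dataclass(frozen=True)
class GazeSample:
    """One gaze target estimate; t is a timestamp in seconds."""
    t: float
    gx: float
    gy: float


@dataclass
class EngagementRecord:
    """Stored result: who looked at which screen and media, and for how long."""
    user_id: str
    screen_id: str
    media_id: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def screen_for_gaze(screens: List[Screen], sample: GazeSample) -> Optional[Screen]:
    """Return the screen the gaze target falls on, or None if it hits no screen."""
    for screen in screens:
        if screen.contains(sample.gx, sample.gy):
            return screen
    return None


def track_engagement(user_id: str, screens: List[Screen],
                     samples: List[GazeSample]) -> List[EngagementRecord]:
    """Group consecutive samples on the same screen into engagement periods.

    Samples are assumed to be sorted by timestamp.
    """
    records: List[EngagementRecord] = []
    current: Optional[EngagementRecord] = None
    for sample in samples:
        screen = screen_for_gaze(screens, sample)
        if current is not None and screen is not None \
                and screen.screen_id == current.screen_id:
            current.end = sample.t  # still on the same screen: extend the period
            continue
        if current is not None:
            records.append(current)  # gaze left the screen: close the period
            current = None
        if screen is not None:
            current = EngagementRecord(user_id, screen.screen_id,
                                       screen.media_id, sample.t, sample.t)
    if current is not None:
        records.append(current)
    return records
```

Calling `track_engagement("user-1", screens, samples)` returns one record per uninterrupted period of gaze on a screen, each carrying the user identifier, the media identifier, and the start and end times from which the engagement duration follows. A real system would likely also need to tolerate blinks and noisy gaze estimates, which this sketch does not attempt.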