An image display device 4 includes: an image extraction unit 451 that extracts, from an acquired image group, a main image group including main images satisfying a predetermined condition; an image association unit 453 that associates sub images not extracted by the image extraction unit 451 with each of the main images included in the main image group; and a display control unit 454 that generates a display screen in which the main image group is arranged in a first area and a sub image group including the sub images is arranged in a second area different from the first area, and displays the display screen on a display unit 44. The main image group is arranged in the first area in a time-series manner along a first direction. The sub image group is arranged in the second area in a time-series manner along a second direction orthogonal to the first direction, such that each sub image is aligned with the main image with which the image association unit 453 has associated it. The display control unit 454 changes the display screen in two modes: one in which the main image group is moved along the first direction, and one in which at least part of the main image group and at least part of the sub image group are moved along the second direction.
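The arrangement described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `Image` fields, the `score`-based condition, the rule that attaches a sub image to the latest main image taken at or before it, and the choice to shift every cell in the first mode are all assumptions made for illustration. The first direction is modeled as the column index and the second direction as the row index, with row 0 standing in for the first area.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Image:
    name: str
    timestamp: float
    score: float  # hypothetical attribute used by the example condition


def extract_main(images: List[Image],
                 condition: Callable[[Image], bool]) -> List[Image]:
    """Main image group: images satisfying the condition, in time order."""
    return sorted((im for im in images if condition(im)),
                  key=lambda im: im.timestamp)


def associate_subs(images: List[Image],
                   mains: List[Image]) -> Dict[str, List[Image]]:
    """Attach every non-extracted image to exactly one main image.

    Assumed rule: use the latest main image taken at or before the sub
    image, falling back to the first main image.
    """
    groups: Dict[str, List[Image]] = {m.name: [] for m in mains}
    if not mains:
        return groups
    main_names = {m.name for m in mains}
    times = [m.timestamp for m in mains]
    for im in sorted(images, key=lambda im: im.timestamp):
        if im.name in main_names:
            continue
        idx = max(bisect_right(times, im.timestamp) - 1, 0)
        groups[mains[idx].name].append(im)
    return groups


def layout(mains: List[Image],
           groups: Dict[str, List[Image]]) -> Dict[str, Tuple[int, int]]:
    """Map each image name to a (column, row) cell.

    Main images occupy row 0, one per column in time order. Sub images
    occupy rows 1 and up in the column of their associated main image.
    """
    cells: Dict[str, Tuple[int, int]] = {}
    for col, main in enumerate(mains):
        cells[main.name] = (col, 0)
        for row, sub in enumerate(groups[main.name], start=1):
            cells[sub.name] = (col, row)
    return cells


def scroll_first(cells: Dict[str, Tuple[int, int]],
                 step: int) -> Dict[str, Tuple[int, int]]:
    """First mode: move along the first direction (all cells, by assumption)."""
    return {name: (c + step, r) for name, (c, r) in cells.items()}


def scroll_second(cells: Dict[str, Tuple[int, int]],
                  column: int, step: int) -> Dict[str, Tuple[int, int]]:
    """Second mode: move one column (main plus its subs) along the second direction."""
    return {name: (c, r + step) if c == column else (c, r)
            for name, (c, r) in cells.items()}
```

Calling `extract_main`, then `associate_subs`, then `layout` yields the initial screen arrangement; `scroll_first` and `scroll_second` return shifted copies that correspond to the two display-change modes. Shifting a whole column in the second mode is one way to keep each sub image aligned with its main image while both move.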