The present invention provides a multi-viewpoint video audiovisual system that allows a viewer to view multi-viewpoint video content and that removes in advance, from the candidate video images, those in which the hands are hidden. A first invention of the present application selects in advance the video images to be displayed on a video image display section and presents the selected video images to a viewer. To this end, an information processing apparatus performs image recognition targeting hands on moving image (video image) data, which includes image data captured with a plurality of video cameras and a sequence of frames arranged along the time axis, and either selects the video images that show a hand or excludes the video images that show no hand.
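The selection step described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function name `select_candidate_views`, the per-camera frame layout, and the `shows_hand` predicate are all hypothetical, and the predicate merely stands in for whatever hand-target image recognition the information processing apparatus actually performs.

```python
from typing import Callable, Dict, Hashable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def select_candidate_views(
    frames_by_camera: Dict[Hashable, Sequence[Frame]],
    shows_hand: Callable[[Frame], bool],
) -> List[Dict[Hashable, Frame]]:
    """For each time index, keep only the camera views whose frame shows a hand.

    `shows_hand` is an assumed placeholder for the hand recognition step.
    """
    if not frames_by_camera:
        return []
    # Only compare time indices that every camera has a frame for.
    n_frames = min(len(frames) for frames in frames_by_camera.values())
    selected: List[Dict[Hashable, Frame]] = []
    for t in range(n_frames):
        # Views with a hidden hand are dropped before anything is displayed.
        candidates = {
            camera: frames[t]
            for camera, frames in frames_by_camera.items()
            if shows_hand(frames[t])
        }
        selected.append(candidates)
    return selected


# Toy data: each "frame" carries a precomputed flag instead of real pixels.
demo = {
    "cam_a": [{"hand": True}, {"hand": False}],
    "cam_b": [{"hand": False}, {"hand": False}],
    "cam_c": [{"hand": True}, {"hand": True}],
}
result = select_candidate_views(demo, lambda frame: frame["hand"])
camera_ids_per_frame = [sorted(views) for views in result]
```

In this toy run, `cam_b` never shows a hand and is therefore never offered as a candidate, while `cam_a` is offered only for the first frame. A real system would replace the lambda with an actual hand detector and would still need a rule for choosing among the remaining candidates.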