The present technology relates to an information processing apparatus, an information processing method, and an endoscope system capable of providing an optimal video image to an operator in accordance with the surgical scene. A processing mode determination unit determines, in accordance with the surgical scene, a processing mode for an in-vivo image captured by an imaging apparatus that includes an imaging element arranged so as to enable pixel shift processing, and an image combining unit processes an image output from the imaging apparatus in accordance with the processing mode. The present technology is applicable to, for example, an endoscope system for imaging a living body with an endoscope.
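
The two-stage flow described above (a mode chosen from the surgical scene, then image combining performed according to that mode) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the source: the scene labels, the two mode names, the fallback mode, and the reduction of each captured image to a single row of samples, with the second row assumed to be offset by half a pixel from the first.

```python
from enum import Enum


class ProcessingMode(Enum):
    # Hypothetical mode names; the source does not enumerate the modes.
    HIGH_DEFINITION = "high_definition"
    NOISE_REDUCTION = "noise_reduction"


# Hypothetical scene labels mapped to modes (illustrative only).
SCENE_TO_MODE = {
    "suturing": ProcessingMode.HIGH_DEFINITION,
    "scope_movement": ProcessingMode.NOISE_REDUCTION,
}


def determine_processing_mode(scene):
    """Pick a processing mode for the given surgical scene label.

    Unknown scenes fall back to NOISE_REDUCTION (an arbitrary choice
    made for this sketch).
    """
    return SCENE_TO_MODE.get(scene, ProcessingMode.NOISE_REDUCTION)


def combine_images(row_a, row_b, mode):
    """Combine two rows of samples according to the processing mode.

    row_b is assumed to be sampled half a pixel to the right of row_a,
    which stands in for the pixel-shifted arrangement.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same length")
    if mode is ProcessingMode.HIGH_DEFINITION:
        # Interleave the shifted samples to double the horizontal sampling.
        combined = []
        for a, b in zip(row_a, row_b):
            combined.extend((a, b))
        return combined
    # Otherwise average the two rows, trading resolution for lower noise.
    return [(a + b) / 2 for a, b in zip(row_a, row_b)]


if __name__ == "__main__":
    mode = determine_processing_mode("suturing")
    print(mode.value, combine_images([1, 3], [2, 4], mode))
```

Keeping mode determination separate from image combining mirrors the split between the processing mode determination unit and the image combining unit, so the scene-to-mode mapping can change without touching the combining logic.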