A method and apparatus for cross-domain medical image synthesis are disclosed. A source domain medical image is received. A synthesized target domain medical image is generated using a trained contextual deep network (CtDN) to predict intensities of voxels of the target domain medical image based on intensities and contextual information of voxels in the source domain medical image. The contextual deep network is a multi-layer network in which hidden nodes of at least one layer are modeled as products of intensity responses and contextual responses.
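The product-form hidden layer described above can be illustrated with a minimal sketch. The abstract does not specify the activation functions, the form of the contextual information, or the layer sizes, so everything concrete here is an assumption made for illustration: sigmoid activations for both responses, a flattened 3x3x3 source patch as the intensity input, normalized voxel coordinates as the contextual input, a single linear output node, and random untrained weights. The names `contextual_hidden_layer`, `predict_voxel`, and `params` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def contextual_hidden_layer(patch, context, W, b, V, c):
    """Hidden nodes modeled as products of two responses.

    patch   : source-domain voxel intensities, shape (n_in,)
    context : contextual information for the voxel, shape (n_ctx,)
              (assumed here to be normalized coordinates)
    """
    intensity_response = sigmoid(W @ patch + b)      # depends on intensities only
    contextual_response = sigmoid(V @ context + c)   # depends on context only
    # Elementwise product: each hidden node is gated by its contextual response.
    return intensity_response * contextual_response


def predict_voxel(patch, context, params):
    """Predict one target-domain voxel intensity (assumed linear output node)."""
    h = contextual_hidden_layer(
        patch, context, params["W"], params["b"], params["V"], params["c"]
    )
    return float(params["w_out"] @ h + params["b_out"])


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_ctx, n_hidden = 27, 3, 8
params = {
    "W": rng.normal(size=(n_hidden, n_in)),
    "b": np.zeros(n_hidden),
    "V": rng.normal(size=(n_hidden, n_ctx)),
    "c": np.zeros(n_hidden),
    "w_out": rng.normal(size=n_hidden),
    "b_out": 0.0,
}

patch = rng.random(n_in)               # flattened 3x3x3 source patch
context = np.array([0.2, 0.5, 0.7])    # assumed normalized (x, y, z) position
hidden = contextual_hidden_layer(
    patch, context, params["W"], params["b"], params["V"], params["c"]
)
prediction = predict_voxel(patch, context, params)
```

Under these assumptions, the same intensity patch can yield different predictions at different positions, because the contextual response scales each hidden node independently of the intensities.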