A method and device for high-resolution three-dimensional (3-D) imaging which obtains camera pose using defocusing is disclosed. The device comprises a lens obstructed by a mask having two sets of apertures. The first set of apertures produces a plurality of defocused images of the object, which are used to obtain camera pose. The second set of apertures produces a plurality of defocused images of a projected pattern of markers on the object. The images produced by the second set of apertures are differentiable from the images used to determine pose, and are used to construct a detailed 3-D image of the object. Using the known change in camera pose between captured images, the 3-D images produced from successive captures can be overlaid in a common coordinate frame to produce a high-resolution 3-D image of the object.
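The final overlay step can be illustrated with a minimal sketch. It assumes each capture has already yielded a set of 3-D marker points in its own camera frame, and that each camera pose is given as a rotation matrix R and translation vector t mapping camera-frame points into a common frame (p' = R p + t). The function names `apply_pose`, `overlay_clouds`, and `rot_z` are hypothetical and do not come from the disclosure; the pose convention is likewise an assumption made for illustration.

```python
import math

def apply_pose(points, rotation, translation):
    """Map points from one camera frame into the common frame: p' = R p + t.

    Assumed convention (not from the disclosure): `rotation` is a 3x3
    row-major matrix and `translation` a length-3 vector.
    """
    out = []
    for p in points:
        out.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return out

def overlay_clouds(clouds, poses):
    """Merge per-capture point sets after moving each into the common frame."""
    merged = []
    for points, (rotation, translation) in zip(clouds, poses):
        merged.extend(apply_pose(points, rotation, translation))
    return merged

def rot_z(angle_rad):
    """Rotation about the z axis, used here only to build an example pose."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Two captures of the same marker, seen from two different poses.
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
second = (rot_z(math.pi / 2), [0.0, 0.0, 1.0])
merged = overlay_clouds([[(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]], [identity, second])
```

Because every capture contributes its own markers to the merged set, the density of points in the common frame grows with the number of captures, which is the sense in which the overlaid result has higher resolution than any single capture.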