An ultrasonic volume data processing device that forms a three-dimensional image of a target tissue in a living body is provided. The range over which the rendering process is applied is limited by a three-dimensional region of interest (3D-ROI). The three-dimensional region of interest has a clipping plane that serves as the rendering start surface. The clipping plane can be deformed into a convex or concave shape by a user operation, and it can be freely inclined in two dimensions. With this configuration, the clipping plane can, for example, be suitably positioned in the gap between the face of a fetus and the placenta. When a curved clipping plane is used, striped pattern noise tends to appear in the three-dimensional image. To eliminate or reduce this noise, a special voxel calculation is applied to the final voxel of each ray during the per-ray voxel calculation.
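
The following is a minimal sketch of the two ideas described above, not the device's actual method. It assumes a quadratic surface for the clipping plane and assumes that the "special voxel calculation" scales the opacity of the final sample by the fraction of a step it covers; the abstract does not specify either. The names `clip_start_depth`, `sample`, `render_ray`, `bulge`, `tilt_x`, `tilt_y`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math


def clip_start_depth(x, y, base, bulge, tilt_x, tilt_y):
    """Rendering start depth for the ray at lateral position (x, y).

    x and y are normalized ROI coordinates in [-1, 1]. The sign of
    `bulge` selects convex or concave (which sign is which depends on
    the depth-axis convention); `tilt_x` and `tilt_y` incline the
    surface along the two lateral directions. Assumed surface model.
    """
    r2 = min(1.0, (x * x + y * y) / 2.0)  # 0 at the centre, 1 at the corners
    return base + bulge * (1.0 - r2) + tilt_x * x + tilt_y * y


def sample(values, z):
    """Linearly interpolate one ray's voxel values at fractional depth z >= 0."""
    i = min(int(math.floor(z)), len(values) - 2)
    t = z - i
    return values[i] * (1.0 - t) + values[i + 1] * t


def render_ray(values, start, end, alpha=0.3):
    """Front-to-back opacity compositing from `start` to `end` in unit steps."""
    out, remaining = 0.0, 1.0
    z = start
    while z < end:
        # Assumption: the final sample may cover only part of a step, so
        # its opacity is scaled by the covered length instead of being
        # treated as a full step.
        step = min(1.0, end - z)
        a = alpha * step
        out += remaining * a * sample(values, z)
        remaining *= 1.0 - a
        z += 1.0
    return out
```

In this sketch, a ray's start depth would come from `clip_start_depth`, so neighbouring rays under a curved surface travel slightly different path lengths. Without the fractional weight, the ray's output would jump each time the path length crosses a whole step, which is one plausible source of stripes; with it, the output changes continuously as the path length grows.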