A method of the invention comprises obtaining training dental CT scans, identifying individual teeth and jaw bone in each of these CT scans, and training a deep neural network with training input data obtained from these CT scans and training target data. A further method of the invention comprises obtaining (203) a patient dental CT scan, identifying (205) individual teeth and jaw bone in this CT scan, and using (207) the trained deep neural network to determine or verify a desired final position per tooth from input data obtained from this CT scan. The (training) input data represents all teeth and the entire alveolar process and identifies the individual teeth and the jaw bone. The determined or verified desired final positions are used to determine a sequence of desired intermediate positions per tooth, and the intermediate and final positions and attachment types are used to create three-dimensional representations of teeth and/or aligners.
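
As a purely illustrative sketch of the step in which a sequence of desired intermediate positions per tooth is derived from a desired final position, the following Python fragment interpolates linearly between a starting and a final position. Everything specific in it is an assumption and is not taken from the description: positions are reduced to 3D translations in millimetres (rotations are ignored), the interpolation is linear, the limit of 0.25 mm per step is a hypothetical value, and the names `intermediate_positions`, `plan_sequences` and the tooth labels are invented for the example.

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def intermediate_positions(start: Vec3, final: Vec3,
                           max_step_mm: float = 0.25) -> List[Vec3]:
    """Return positions from the first step up to and including `final`.

    Assumption: linear interpolation of a translation only, with each
    step moving the tooth by at most `max_step_mm` (hypothetical limit).
    """
    distance = math.dist(start, final)
    # At least one step, so that a tooth that does not move still gets
    # its final position as the single entry of the sequence.
    n_steps = max(1, math.ceil(distance / max_step_mm))
    positions: List[Vec3] = []
    for k in range(1, n_steps):
        positions.append(tuple(s + (f - s) * k / n_steps
                               for s, f in zip(start, final)))
    # Append the final position itself to avoid rounding drift.
    positions.append(tuple(final))
    return positions


def plan_sequences(teeth: Dict[str, Tuple[Vec3, Vec3]],
                   max_step_mm: float = 0.25) -> Dict[str, List[Vec3]]:
    """Compute one sequence per tooth from (start, final) pairs."""
    return {tooth_id: intermediate_positions(start, final, max_step_mm)
            for tooth_id, (start, final) in teeth.items()}


if __name__ == "__main__":
    # Illustrative tooth labels and coordinates, not real patient data.
    example = {
        "11": ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
        "21": ((2.0, 0.0, 0.0), (2.0, 0.0, 0.0)),
    }
    for tooth_id, sequence in plan_sequences(example).items():
        print(tooth_id, len(sequence), sequence[-1])
```

In this sketch the final position would come from the trained deep neural network, while the starting position would come from the teeth identified in the patient dental CT scan; a practical planner would presumably also handle rotations and collisions between neighbouring teeth, which are left out here.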