An arrangement for determining the positions of the teats of an animal is provided in a robot-based milking system. The arrangement comprises a camera pair directed towards the teats of the animal for repeatedly recording pairs of images, and an image processing device for repeatedly detecting the teats of the animal and determining their positions by a stereoscopic calculation method based on the recorded pairs of images. The cameras of the camera pair are arranged vertically, one above the other. For each teat and for each pair of images, the image processing device is configured to define the positions of the lower tip of the teat contour in the two images of the pair as conjugate points for said stereoscopic calculation, and to find said conjugate points along a substantially vertical epipolar line.
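The stereoscopic calculation for vertically stacked cameras can be illustrated with a minimal sketch. It is not the method of the arrangement itself: it assumes an idealized pinhole model, rectified images, identical intrinsics for both cameras, and teat contours already extracted as lists of pixel coordinates. All names (`VerticalStereoRig`, `lower_tip`, `triangulate_tip`) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Pixel = Tuple[float, float]  # (u, v) = (column, row); v grows downward


@dataclass(frozen=True)
class VerticalStereoRig:
    """Assumed idealized rig: two identical, rectified pinhole cameras."""
    focal_px: float    # focal length in pixels (assumed equal for both cameras)
    baseline_m: float  # vertical distance between the camera centres, in metres
    cx: float          # principal point column, in pixels
    cy: float          # principal point row, in pixels


def lower_tip(contour: Sequence[Pixel]) -> Pixel:
    """Return the lowest contour point, i.e. the one with the largest row v."""
    return max(contour, key=lambda p: p[1])


def triangulate_tip(
    rig: VerticalStereoRig,
    tip_upper: Pixel,
    tip_lower: Pixel,
    max_column_offset_px: float = 2.0,
) -> Tuple[float, float, float]:
    """Triangulate a teat tip from its conjugate points in the two images.

    Returns (x, y, z) in metres in the upper camera's frame
    (x to the right, y downward, z along the optical axis).
    """
    u_up, v_up = tip_upper
    u_lo, v_lo = tip_lower

    # With a vertical baseline the epipolar line is (substantially) vertical,
    # so conjugate points should lie in nearly the same image column.
    if abs(u_up - u_lo) > max_column_offset_px:
        raise ValueError("points do not lie on a common vertical epipolar line")

    # The tip appears lower (larger v) in the upper camera's image,
    # so the vertical disparity is positive for points in front of the rig.
    disparity = v_up - v_lo
    if disparity <= 0:
        raise ValueError("non-positive disparity; cannot triangulate")

    z = rig.focal_px * rig.baseline_m / disparity
    x = (u_up - rig.cx) * z / rig.focal_px
    y = (v_up - rig.cy) * z / rig.focal_px
    return (x, y, z)


# Hypothetical rig and contours, for illustration only.
rig = VerticalStereoRig(focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
upper_contour = [(350.0, 380.0), (360.0, 400.0), (370.0, 380.0)]
lower_contour = [(350.0, 300.0), (360.0, 320.0), (370.0, 300.0)]
position = triangulate_tip(rig, lower_tip(upper_contour), lower_tip(lower_contour))
```

Because the cameras are stacked vertically, the search for the matching tip is confined to a narrow band of columns, and depth follows from the row difference alone. The lower tip of a hanging teat is a well-defined extremum along that vertical direction, which is what makes it usable as a conjugate point in this sketch.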