Embodiments are disclosed for health assessment and diagnosis implemented in an artificial intelligence (AI) system. In an embodiment, a method comprises: obtaining, using one or more processors of a device, a speech sample from a user uttering a first sentence; processing the speech sample through a neural network to predict a first set of one or more disease-related symptoms of the user; and generating, using the one or more processors, a second sentence to predict a second set of one or more disease-related symptoms or confirm the first set of disease-related symptoms.
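The three steps of the method (obtain a speech sample, predict a first set of symptoms with a neural network, generate a second sentence) can be illustrated with a minimal sketch. The disclosure above does not specify the network architecture, the audio features, the symptom labels, or how the second sentence is selected, so everything named here is a hypothetical stand-in: the labels in `SYMPTOMS`, the prompts in `FOLLOW_UPS`, the untrained placeholder weights, the toy feature extractor, and the 0.5 threshold. The single-layer scorer stands in for whatever trained network an embodiment would use.

```python
import math

# Hypothetical symptom labels and follow-up prompts; none come from the source.
SYMPTOMS = ["cough", "hoarseness", "shortness_of_breath"]
FOLLOW_UPS = {
    "cough": "Please say: 'The quick brown fox jumps over the lazy dog.'",
    "hoarseness": "Please hold the vowel 'ah' and then say: 'Good morning.'",
    "shortness_of_breath": "Please count aloud from one to twenty in one breath.",
}
DEFAULT_SENTENCE = "Please say: 'Today is a sunny day.'"

# Placeholder, untrained parameters: one weight row and one bias per symptom.
WEIGHTS = [[2.0, -1.0, 0.5], [-0.5, 3.0, 1.0], [1.0, 1.0, -2.0]]
BIASES = [-0.5, -1.0, 0.0]


def extract_features(samples):
    """Toy features: mean absolute amplitude, zero-crossing rate, RMS energy."""
    if not samples:
        raise ValueError("speech sample is empty")
    n = len(samples)
    mean_abs = sum(abs(s) for s in samples) / n
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    zcr = crossings / max(n - 1, 1)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return [mean_abs, zcr, rms]


def predict_symptoms(samples):
    """Score each symptom with a single sigmoid layer (stand-in for the network)."""
    feats = extract_features(samples)
    scores = {}
    for name, row, bias in zip(SYMPTOMS, WEIGHTS, BIASES):
        z = sum(w * f for w, f in zip(row, feats)) + bias
        scores[name] = 1.0 / (1.0 + math.exp(-z))
    return scores


def generate_second_sentence(scores, threshold=0.5):
    """Pick a follow-up sentence targeting the highest-scoring predicted symptom."""
    predicted = [name for name, p in scores.items() if p >= threshold]
    if not predicted:
        return DEFAULT_SENTENCE
    top = max(predicted, key=lambda name: scores[name])
    return FOLLOW_UPS[top]
```

In this sketch the second sentence is chosen from a fixed table keyed by the strongest first-round prediction. An embodiment could then process the user's utterance of that sentence through the same pipeline, either to confirm the first set of symptoms or to predict a second set.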