Biomarkers and methods for predicting the risk of a disease, in particular rheumatoid arthritis (RA), are provided. DNA sequences are obtained from DNA that may be extracted from a sample collected from a subject. A relative abundance of a biomarker is then calculated from those DNA sequences, where the biomarker comprises a DNA sequence in the genome of Lactobacillus salivarius. A probability that the subject has the disease is then obtained based on the relative abundance.
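
The two computational steps described above (calculating a relative abundance, then deriving a probability from it) can be sketched as follows. This is a minimal illustration only, not the method claimed here: the abstract does not state how relative abundance is normalized or which model maps it to a probability. The sketch assumes relative abundance is the fraction of sequencing reads assigned to the biomarker, and it assumes a logistic function for the probability. The function names and the `intercept` and `slope` parameters are hypothetical and would have to be fitted on real cohort data.

```python
import math


def relative_abundance(biomarker_reads: int, total_reads: int) -> float:
    """Fraction of all reads assigned to the biomarker sequence.

    Assumption: abundance is defined as a simple read-count ratio.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= biomarker_reads <= total_reads:
        raise ValueError("biomarker_reads must lie between 0 and total_reads")
    return biomarker_reads / total_reads


def disease_probability(abundance: float, intercept: float, slope: float) -> float:
    """Map a relative abundance to a probability in (0, 1).

    Assumption: a logistic model is used; intercept and slope are
    hypothetical parameters, not values taken from the source.
    """
    z = intercept + slope * abundance
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative counts and coefficients only; none of these are real data.
abundance = relative_abundance(biomarker_reads=50, total_reads=1000)
probability = disease_probability(abundance, intercept=-2.0, slope=30.0)
print(f"relative abundance = {abundance:.3f}, probability = {probability:.3f}")
```

With a positive `slope`, a higher relative abundance of the biomarker yields a higher predicted probability; whether that direction holds in practice depends on the fitted model.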