A computer receives human-generated reference strings and determines their character, n-gram, type switch, and subtype switch distributions. Each distribution includes corresponding statistical data, such as an average frequency, a maximum frequency, a minimum frequency, and a standard deviation. The computer then receives one or more test strings and computes the same statistical data for each of the same distributions. The computer compares the distributions of the test strings with those of the reference strings and, based on how far the test string distributions deviate from the reference string distributions, determines whether the test strings are human generated or machine generated.
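
The flow described above can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration and are not taken from the description: a "type" is taken to be letter, digit, or symbol; a "subtype" is taken to be vowel or consonant for letters; a "switch" is a change of class between adjacent characters; the n-gram length is 2; and the deviation score and its threshold of 1.5 are arbitrary choices. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def char_type(c):
    # Assumed coarse types: letter, digit, or other symbol.
    if c.isalpha():
        return "letter"
    if c.isdigit():
        return "digit"
    return "symbol"


def char_subtype(c):
    # Assumed subtypes: vowel/consonant for letters, else the coarse type.
    if c.isalpha():
        return "vowel" if c.lower() in "aeiou" else "consonant"
    return char_type(c)


def switch_rate(s, classify):
    # Fraction of adjacent character pairs whose class differs.
    if len(s) < 2:
        return 0.0
    switches = sum(classify(a) != classify(b) for a, b in zip(s, s[1:]))
    return switches / (len(s) - 1)


def relative_frequencies(counter):
    # Relative frequency of each distinct item in a Counter.
    total = sum(counter.values())
    return [v / total for v in counter.values()] if total else [0.0]


def summarize(values):
    # The four statistics named in the description.
    return {"mean": mean(values), "max": max(values),
            "min": min(values), "std": pstdev(values)}


def profile(strings, n=2):
    # Character and n-gram counts are pooled over all strings;
    # switch rates are computed per string and then summarized.
    chars = Counter(c for s in strings for c in s)
    grams = Counter(s[i:i + n] for s in strings
                    for i in range(len(s) - n + 1))
    return {
        "character": summarize(relative_frequencies(chars)),
        "ngram": summarize(relative_frequencies(grams)),
        "type_switch": summarize(
            [switch_rate(s, char_type) for s in strings]),
        "subtype_switch": summarize(
            [switch_rate(s, char_subtype) for s in strings]),
    }


def deviation(ref_profile, test_profile):
    # Assumed score: sum of absolute differences over every statistic.
    return sum(abs(ref_profile[name][stat] - test_profile[name][stat])
               for name in ref_profile for stat in ref_profile[name])


def looks_machine_generated(reference, tests, threshold=1.5):
    # The threshold is an illustrative value, not one from the description.
    return deviation(profile(reference), profile(tests)) > threshold


reference = ["example", "weather", "shopping", "mailserver", "newsportal"]
human_like = ["travel", "bookstore"]
machine_like = ["x7k2q9z4p1", "9f3j8w2m5t"]

print(looks_machine_generated(reference, human_like))
print(looks_machine_generated(reference, machine_like))
```

In this sketch the alternating letter and digit pattern of the machine-like strings dominates the score through the type switch statistics, because the all-letter reference strings have a type switch rate of zero. A practical system would likely weight or normalize each statistic rather than summing raw differences.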