Systems and methods for extracting text from images rendered on a display screen. The method comprises capturing a color image rendered on a display screen, and transforming the color image into a binary color image, preserving text-like graphical components and filtering out non-text-like graphical components. The transforming comprises scanning one or more areas of the color image, and detecting continuous bi-tonal regions in the scanned one or more areas, wherein the continuous bi-tonal regions have large variances.
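
The transforming step can be illustrated with a minimal sketch. It is not the claimed implementation, and it rests on several assumptions that do not appear in the abstract: the image is a list of rows of (r, g, b) tuples, each scanned "area" is a fixed-size square block, a block counts as bi-tonal when it contains exactly two distinct colors, "large variance" means the luminance variance meets a hypothetical threshold `min_variance`, and the less frequent color in a block is treated as text. The names `binarize_text_regions`, `block`, and `min_variance` are illustrative only.

```python
def luminance(pixel):
    """Approximate perceived brightness of an (r, g, b) pixel (BT.601 weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def binarize_text_regions(image, block=4, min_variance=1000.0):
    """Return a 0/1 image keeping only high-contrast two-color blocks.

    Assumptions (not from the source): block-wise scanning, exactly two
    distinct colors per bi-tonal block, and a fixed variance threshold.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    binary = [[0] * width for _ in range(height)]

    for top in range(0, height, block):
        for left in range(0, width, block):
            # Collect the coordinates and pixels of the current block.
            coords = [
                (y, x)
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            pixels = [image[y][x] for y, x in coords]

            # Count how often each distinct color occurs in the block.
            counts = {}
            for p in pixels:
                counts[p] = counts.get(p, 0) + 1

            # Skip blocks that are not bi-tonal (e.g. photos, gradients).
            if len(counts) != 2:
                continue
            # Skip low-contrast bi-tonal blocks.
            if variance([luminance(p) for p in pixels]) < min_variance:
                continue

            # Treat the more frequent color as background, the other as text.
            background = max(counts, key=counts.get)
            for (y, x), p in zip(coords, pixels):
                if p != background:
                    binary[y][x] = 1

    return binary
```

In this sketch, a block of black glyph pixels on a white background passes both checks and its glyph pixels are set to 1, while a gradient block (many colors) or a faint two-color block (low variance) is left as 0. A practical system would likely need tolerance for anti-aliased text, which produces more than two colors per block.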