Rules are automatically learned via machine-learning techniques to deduce the semantic roles of extracted information elements, as well as, compute the respective levels of certainty that the semantic roles are indeed as deduced. Such a process is referred to herein as “tagging” the information elements. The tagged information elements are then associated, in a database, with their respective deduced semantic roles and levels of certainty. The machine-learning techniques provided herein include supervised, unsupervised, and semi-supervised techniques. Embodiments described herein may be applied to data leakage prevention, cyber security, quality-of-service analysis, lawful interception, or any other relevant application.