Live Chat Software for Business

Most systems for consumer-generated content use keywords to classify verbatims. Keyword classification rests largely on human intuition and experience. As valuable as that may be, it’s no match for the statistical machine learning techniques used by OpenMic. Many of the clues to categorization in a verbatim message are counterintuitive. For example, let's say the category is “legal issues.” The obvious keywords pointing to “legal issues” are “lawsuit,” “lawyer,” “brief,” etc. But when you examine a set of verbatims that you know are about “legal issues,” you'll see that many of them don't contain any of these keywords. It turns out that two of the best clues to messages about legal issues are “sir” and “madam.” Messages beginning with the phrase “Dear Sir or Madam” are far more likely to pertain to legal issues than not. Statistical techniques pick up valuable clues that wouldn’t occur to a human (like “Dear Sir or Madam”), as well as clues that at first glance would seem to indicate one category but actually indicate the opposite. For example, “I couldn't be happier” contains a negation, yet it expresses strong satisfaction, and that throws a wrench in most keyword classification schemes.
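
To make the contrast concrete, here is a minimal Python sketch of the general idea. It is not OpenMic's actual method, which is not described here. The messages, the keyword list, and the function names (keyword_match, token_log_odds) are all invented for illustration. The sketch scores each word by smoothed log-odds, a simple statistical measure of how much more often a word appears in messages inside a category than outside it.

```python
import math
from collections import Counter

# Toy, invented messages for illustration only (not real OpenMic data).
# Each entry is (message text, True if the message is about legal issues).
LABELED = [
    ("dear sir or madam i am writing about my contract dispute", True),
    ("dear sir or madam please respond to the attached notice", True),
    ("dear madam my lawyer will contact you", True),
    ("my order arrived late and the box was damaged", False),
    ("i love the new app it works great", False),
    ("please reset my password i am locked out", False),
]

# Hypothetical hand-picked keyword list, as a human might write it.
KEYWORDS = {"lawsuit", "lawyer", "brief"}


def keyword_match(text):
    """Keyword classification: flag a message if it contains any keyword."""
    return any(token in KEYWORDS for token in text.split())


def token_log_odds(labeled, alpha=1.0):
    """Score each token by log(P(token | in category) / P(token | not in category)).

    Probabilities are estimated from the fraction of messages containing the
    token, with add-alpha smoothing so unseen tokens do not divide by zero.
    """
    pos_docs = [set(text.split()) for text, label in labeled if label]
    neg_docs = [set(text.split()) for text, label in labeled if not label]
    pos_counts = Counter(tok for doc in pos_docs for tok in doc)
    neg_counts = Counter(tok for doc in neg_docs for tok in doc)

    scores = {}
    for tok in set(pos_counts) | set(neg_counts):
        p_pos = (pos_counts[tok] + alpha) / (len(pos_docs) + 2 * alpha)
        p_neg = (neg_counts[tok] + alpha) / (len(neg_docs) + 2 * alpha)
        scores[tok] = math.log(p_pos / p_neg)
    return scores


scores = token_log_odds(LABELED)

# List the five tokens that most strongly indicate the category.
for tok, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{tok:10s} {score:+.2f}")
```

On this toy data, the keyword list misses the first two legal messages entirely, while the learned scores rank “madam” and “sir” above “lawyer.” A production system would use far more data and a proper classifier, but the principle is the same: the clues come from the labeled messages rather than from intuition.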