Artificial Intelligence For Language? Programming Content And Intuition In Semantics
    By News Staff | June 10th 2013 11:22 AM

    For a human, knowing the difference between the "charge" of a battery and being charged in a crime is easy. Any three-year-old can look at a cartoon of a chicken and say "That's a chicken" but for computers those are still daunting tasks.

    Language might seem easier than visual recognition, yet linguists and programmers have spent 50 years trying to program semantics as software. While IBM's Jeopardy-winning Watson system and Google Translate are high-profile, successful applications of language technologies, the humorous answers and mistranslations they sometimes produce are evidence of the continuing difficulty of the problem: computers lack a lifetime of experience. Using the context in which a word is used, an intrinsic understanding of syntax and logic, and a sense of the speaker's intention, we can intuit what another person is telling us. Programs cannot.

    "In the past, people have tried to hand-code all of this knowledge," explains Katrin Erk, a professor of linguistics at The University of Texas at Austin focusing on lexical semantics. "I think it's fair to say that this hasn't been successful. There are just too many little things that humans know."

    Other efforts have tried to use dictionary meanings to train computers to better understand language, but these attempts have also faced obstacles. Dictionaries have their own sense distinctions, which are crystal clear to the dictionary-maker but murky to the dictionary reader, and no two dictionaries provide the same set of meanings.

    Watching annotators struggle to make sense of conflicting definitions led Erk to try a different tactic. Instead of hard-coding human logic or deciphering dictionaries, why not mine a vast body of texts (which are a reflection of human knowledge) and use the implicit connections between the words to create a weighted map of relationships — a dictionary without a dictionary?

    "An intuition for me was that you could visualize the different meanings of a word as points in space," she said. "You could think of them as sometimes far apart, like a battery charge and criminal charges, and sometimes close together, like criminal charges and accusations ("the newspaper published charges..."). The meaning of a word in a particular context is a point in this space. Then we don't have to say how many senses a word has. Instead we say: 'This use of the word is close to this usage in another sentence, but far away from the third use.'"

    A sentence is translated to logic for inference with the Markov Logic Network and its words are translated to points in space. Here "fix" should be close to "correct" and far away from "attach." Credit: Katrin Erk, The University of Texas at Austin

    Creating a model that can accurately recreate the intuitive ability to distinguish word meaning requires a lot of text and a lot of analytical horsepower.

    "The lower end for this kind of research is a text collection of 100 million words," she explained. "If you can give me a few billion words, I'd be much happier. But how can we process all of that information? That's where supercomputers and Hadoop come in."

    Erk initially conducted her research on desktop computers, but around 2009, she began using the parallel computing systems at the Texas Advanced Computing Center (TACC). Access to a special Hadoop-optimized subsystem on TACC's Longhorn supercomputer allowed Erk and her collaborators to expand the scope of their research. Hadoop is a software architecture well suited to text analysis and the data mining of unstructured data that can also take advantage of large computer clusters. Computational models that take weeks to run on a desktop computer can run in hours on Longhorn. This opened up new possibilities.

    "In a simple case we count how often a word occurs in close proximity to other words. If you're doing this with one billion words, do you have a couple of days to wait to do the computation? It's no fun," Erk said. "With Hadoop on Longhorn, we could get the kind of data that we need to do language processing much faster. That enabled us to use larger amounts of data and develop better models."
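    The "simple case" Erk describes, counting how often a word occurs near other words, can be sketched in a few lines. This is only an illustration on a single made-up sentence, not the researchers' code: the function name `cooccurrence_counts`, the window size of 2 and the example text are all assumptions chosen for the sketch. A Hadoop job would presumably spread the same kind of counting across many machines; this version runs in one process.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, window=2):
    """For each word, count the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # do not count a token as its own neighbor
                counts[word][tokens[j]] += 1
    return counts


# Made-up example text (an assumption for illustration only).
text = "the police filed a charge and the battery lost its charge"
tokens = text.split()
counts = cooccurrence_counts(tokens, window=2)

# The neighbors of "charge" mix both senses because the counts are per word,
# not per usage.
print(sorted(counts["charge"].items()))
```

    Each word's row of counts can be read as a vector, with one dimension per neighboring word. On a real corpus of hundreds of millions of words these rows become very long, which is where the need for cluster computing comes from.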

    Treating words in a relational, non-fixed way corresponds to emerging psychological notions of how the mind deals with language and concepts in general, according to Erk. Instead of rigid definitions, concepts have "fuzzy boundaries" where the meaning, value and limits of the idea can vary considerably according to the context or conditions. Erk takes this idea of language and recreates a model of it from hundreds of thousands of documents.

    Say That Another Way

    So how can we describe word meanings without a dictionary? One way is to use paraphrases. A good paraphrase is one that is "close to" the word meaning in that high-dimensional space that Erk described.

    "We use a gigantic 10,000-dimensional space with all these different points for each word to predict paraphrases," Erk explained. "If I give you a sentence such as, 'This is a bright child,' the model can tell you automatically what are good paraphrases ('an intelligent child') and what are bad paraphrases ('a glaring child'). This is quite useful in language technology."
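    One common way to measure "closeness" between points in such a space is cosine similarity, and a paraphrase ranker can be sketched on top of it. The sketch below is a simplification and not Erk's actual model: the three-dimensional vectors are invented for illustration (the real space has about 10,000 dimensions), and the names `cosine` and `rank_paraphrases` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity: near 1.0 for similar directions, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_paraphrases(target_vec, candidates):
    """Return (word, similarity) pairs, most similar first."""
    scored = [(word, cosine(target_vec, vec)) for word, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented toy vectors (assumptions, not real data). Think of the three
# dimensions as co-occurrence with "child", "light" and "student".
bright_in_context = [4.0, 1.0, 3.0]  # "bright" as used in "a bright child"
candidates = {
    "intelligent": [3.0, 0.0, 4.0],
    "glaring": [0.0, 5.0, 0.0],
}

ranked = rank_paraphrases(bright_in_context, candidates)
for word, score in ranked:
    print(f"{word}: {score:.2f}")
```

    With these toy numbers "intelligent" ranks above "glaring", mirroring the good and bad paraphrases in Erk's example. The point of the design is that no list of senses for "bright" is needed; only the position of this particular usage matters.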

    A "charge" can be a criminal charge, an accusation, a battery charge, or a person in your care. Some of those meanings are closer together, others further apart. Credit: Katrin Erk, The University of Texas at Austin

    Language technology already helps millions of people perform practical and valuable tasks every day via web searches and question-answer systems, but it is poised for even more widespread applications.

    Automatic information extraction is an application where Erk's paraphrasing research may be critical. Say, for instance, you want to extract a list of diseases, their causes, symptoms and cures from millions of pages of medical information on the web.

    "Researchers use slightly different formulations when they talk about diseases, so knowing good paraphrases would help," Erk said.

    In a paper to appear in ACM Transactions on Intelligent Systems and Technology, Erk and her collaborators illustrated they could achieve state-of-the-art results with their automatic paraphrasing approach.

    Recently, Erk and Ray Mooney, a computer science professor also at The University of Texas at Austin, were awarded a grant from the Defense Advanced Research Projects Agency to combine Erk's distributional, high dimensional space representation of word meanings with a method of determining the structure of sentences based on Markov logic networks.

    "Language is messy," said Mooney. "There is almost nothing that is true all the time. When we ask, 'How similar is this sentence to another sentence?' our system turns that question into a probabilistic theorem-proving task and that task can be very computationally complex."

    In their paper, "Montague Meets Markov: Deep Semantics with Probabilistic Logical Form," presented at the Second Joint Conference on Lexical and Computational Semantics (STARSEM2013) in June, Erk, Mooney and colleagues announced their results on a number of challenge problems from the field of artificial intelligence.

    In one problem, Longhorn was given a sentence and had to infer whether another sentence was true based on the first. Using an ensemble of different sentence parsers, word meaning models and Markov logic implementations, Mooney and Erk's system predicted the correct answer with 85% accuracy. This is near the top results in this challenge. They continue to work to improve the system.

    There is a common saying in the machine-learning world that goes: "There's no data like more data." While more data helps, taking advantage of that data is key.

    "We want to get to a point where we don't have to learn a computer language to communicate with a computer. We'll just tell it what to do in natural language," Mooney said. "We're still a long way from having a computer that can understand language as well as a human being does, but we've made definite progress toward that goal."


    For a slightly different perspective on this you might want to watch .

    Gerhard Adam
    Despite all this effort, does any of this actually convey meanings?  It certainly can address usage, but I'm not clear that there's anything resembling meaning being derived here. 
    Mundus vult decipi
    Natural language processing is basically a human-computer interaction field. Endowing computers with meaning is not what NLP does, despite the flagrant use of the word "meaning" near descriptions of NLP. No meaning is "conveyed" but you might get a shallow equivalent--if you tell the computer to do something and the desired behavior results, it's all the same to the user.

    This article appears to be describing good ol' semantic nets, except now they've big data-ized it. The power of semantic nets depends on human users to provide the actual meaning.
    Most modern natural language processing is statistical, built on machine learning, often using Markov models and naive Bayes, and recently much progress has also been made with deep belief nets. I think all of these methods, as long as they are not logic based, do allow the algorithm to assign meaning to words and phrases. If these algorithms are not assigning meaning I would argue that the human mind is also not assigning meaning.
    The important thing to understand about machine learning is that the programmer does not tell the algorithm how to learn the data. They do build the architecture that is capable of learning, but the learning is done by the algorithm.

    Thank you Captain ML. But what exactly is it that an ML program learns? Is it learning what things mean? Or is it learning weights for what are essentially nonlinear functions? Is the association of human chosen symbols to other human chosen symbols resulting in meaning within the program? Is categorization of images letting the program know what the images represent? If a program believes that a string of bytes is spam, does it understand what the content of the email that those bytes represent means?
    Assuming an NB learner, it knows the probability of the string being spam and not ham based on the individual probabilities of the words in the string. The decision is based on a threshold value. Other models will be slightly different. But the point is I believe that is quite similar to what the human or animal brain does. The tech is different, but from a high-level vantage point I think it is very similar. Therefore, my statement is that if learning algorithms are not assigning meaning, what would make a person believe that the human mind is assigning meaning rather than just performing complicated nonlinear modeling? What is the biological brain doing that is fundamentally different from the learning algorithm?

    Gerhard Adam
    I suspect that if you defined "meaning" you'd be closer to your answer.

    If I say "AI researchers are idiots", tell me what the difference is in the way your brain is responding versus a learning algorithm.
    Mundus vult decipi
    Gerhard Adam
    BTW, I'm not trying to be insulting, but rather I wanted to say something sufficiently provocative to get the emotional connection to show.

    My point is that no learning algorithm would come close to assigning the meaning to that statement that a human being would.  That's the difference in what the biological brain does.

    This is the same problem I mentioned regarding Watson on Jeopardy.  It may have won, but it didn't know what the meaning of "winning" was, nor did it have any emotional connection to the concept.  Therefore, win or lose didn't make any difference.
    Mundus vult decipi
    KickBorn is an attempt to solve this problem. This robot can understand 70% of english language.

    Almost any tool can do that. What they can't do is actually understand anything. It is the same brute force approach used 30 years ago, just with more data and higher speed. Google Translate is over 70% right so digitizing a voice is nothing special.
    Thanks for replying. But could you explain what you mean by "What they can't do is actually understand anything"? Consider any software which can read content, interpret it, learn from it and then provide information accordingly next time.
    Isn't that AI?

    Using the term "learn" in different contexts doesn't make them equivalent (any more than using the term "memory" does), i.e. how animals (such as humans) learn vs. narrow AI (learning to train a classifier) vs. simple state changes.

    I think what Hank was getting at is that statements like "This robot can understand 70% of english language" imply the use of "understand" as many quacks and overconfident researchers use it, which is to say not at all like human understanding. To "understand" something implies meaningfulness. In simple reflexive organisms, and in that layer that still exists in more complex animals like humans, you might say a behavioral response is equivalent to understanding (or representation) without there even needing to be any state or representation. But beyond that you need to do a lot to prove that understanding or meaningfulness is happening. Does your program know what it's like to sit in a chair? Can it identify even non-chairs (such as a rock) which afford sitting? Does it understand that good is up and bad is down? Does it understand what it feels like to be angry?
    Gerhard Adam
    Samuel ... I really like you.  While I certainly can't speak to all AI researchers, you seem to be someone that truly does understand the problems and issues.

    That may sound presumptuous coming from me, but hopefully you'll take it in the spirit intended.
    Mundus vult decipi
    Thanks for replying.

    Definitely not, my software will not be aware of what it feels like to sit on a chair. Do humans know how it feels like to sit on clouds ? If no, then what will they do ? They will experience it or learn from someone. most of us, 80% of people in this world learn from the content. And KickBorn is an attempt to do the same.

    I would appreciate it if you could explain more about what is missing.

    Gerhard Adam
    Do humans know how it feels like to sit on clouds ? If no, then what will they do ? They will experience it or learn from someone. most of us, 80% of people in this world learn from the content.
    Sorry, but that's simply wrong.  Humans don't know what it feels like to sit on clouds, because such a thing isn't possible.  There is no experience to be gained.

    However, you make an even more serious mistake and that's in presuming that simply hearing about someone else's experience is comparable to having the experience yourself.  That is certainly not true for humans and it isn't true for anything else.

    You can never know what it feels like to fly like a bird.  You can never know what it feels like to swim like a shark.  These are things beyond your ability to experience because we don't have the same senses, nor do we have the same motivations.  We can know nothing about these things except to put our own anthropomorphic interpretations on them.

    People do NOT learn from content, they learn from experience and by doing. 
    Mundus vult decipi
    Computer based analysis of the Semantics of language expressed as text is an AI level problem. Existing methods almost universally use Models of Language (Dictionaries, Grammars, Word Nets, Taxonomies, and Ontologies). The two simplest and most pervasive Models claim that Languages have Words and that those Words have Meanings. While acknowledging that good alternatives do not yet exist, this talk attempts to make plausible that these two "obvious" but fatally incorrect Models result, automatically, in a cascading series of forced engineering decisions that each discard a fraction of the available semantics until we end up with brittle systems that fail in catastrophic and memorable ways. The proposed alternative to word-centric Model Based methods of language analysis is Understanding Machines - capable of learning languages the way humans learn languages in babyhood - using new classes of algorithms based on Model Free Methods.

    For more about this, click the link in the first comment I made above.