Answers.com

Semantic similarity

 
Wikipedia: Semantic similarity

Semantic similarity is a concept whereby a set of documents, or terms within term lists, is assigned a metric based on the likeness of their meaning or semantic content.

Some authors distinguish semantic similarity from semantic relatedness, because relatedness includes relations such as antonymy and meronymy while similarity does not. However, much of the literature uses these terms interchangeably, along with terms like semantic distance. In essence, semantic similarity, semantic distance, and semantic relatedness all ask, "How much does term A have to do with term B?"

The answer to this question, as given by the many automatic measures of semantic similarity/relatedness, is usually a number between -1 and 1, or between 0 and 1, where 1 signifies extremely high similarity/relatedness and 0 signifies little to none.

An intuitive way of displaying terms according to their semantic similarity is to group closely related terms together and to space more distantly related ones farther apart. This is common, if sometimes subconscious, practice in mind maps and concept maps.

Concretely, this can be achieved, for instance, by defining a topological similarity, using ontologies to define a distance between words, or by using statistical means such as a vector space model to correlate words and textual contexts from a suitable text corpus (co-occurrence). For terms arranged as nodes in a directed acyclic graph such as a hierarchy, a naive metric is the minimal distance, counted in separating edges, between the two term nodes.
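
Both approaches can be illustrated with a minimal Python sketch. The toy hierarchy, the two-sentence corpus in the usage note, and all function names below are invented for illustration and do not come from any particular tool. The first function counts the fewest edges separating two terms in a hierarchy; the other two build co-occurrence count vectors and compare them with cosine similarity, which is one common choice among several.

```python
import math
from collections import Counter, deque

# Hypothetical toy hierarchy (a DAG): each term maps to its parent terms.
PARENTS = {
    "dog": ["mammal"],
    "cat": ["mammal"],
    "mammal": ["animal"],
    "sparrow": ["bird"],
    "bird": ["animal"],
    "animal": [],
}


def edge_distance(parents, a, b):
    """Fewest edges separating a and b, ignoring edge direction (None if unreachable)."""
    # Turn the child -> parents map into an undirected adjacency map.
    neighbours = {}
    for child, ps in parents.items():
        neighbours.setdefault(child, set())
        for p in ps:
            neighbours[child].add(p)
            neighbours.setdefault(p, set()).add(child)
    if a not in neighbours or b not in neighbours:
        return None
    # Breadth-first search visits nodes in order of increasing distance.
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0
```

With this hierarchy, `edge_distance(PARENTS, "dog", "cat")` is 2 and `edge_distance(PARENTS, "dog", "sparrow")` is 4, so "dog" comes out closer to "cat" than to "sparrow". For the corpus `["dog barks loudly", "cat meows loudly"]` with `window=1`, "barks" and "meows" share the context word "loudly" and get a cosine of 0.5. Because co-occurrence counts are never negative, this particular measure stays between 0 and 1.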

Semantic similarity measures have recently been developed for and applied to biomedical ontologies, notably the Gene Ontology (GO). They are mainly used to compare genes and proteins based on the similarity of their functions rather than on their sequence similarity [1]. These comparisons can be done using tools freely available on the web:

  • ProteInOn can be used to find interacting proteins, to find assigned GO terms, and to calculate the functional semantic similarity of proteins. It can also report the information content of GO terms and calculate their functional semantic similarity.
  • FuSSiMeG provides a functional similarity measure between two proteins, based on the semantic similarity between the GO terms with which the proteins are annotated.
  • CESSM provides a tool for the automated evaluation of GO-based semantic similarity measures.
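
How these tools compute their scores is not described here. As a rough illustration of what an information-content-based similarity between ontology terms can look like, the sketch below implements Resnik's measure, one widely used option: the similarity of two terms is the information content of their most informative common ancestor. The mini-ontology, the term names, and the annotation counts are all hypothetical, and the code assumes every term has at least one annotation among itself and its descendants.

```python
import math

# Hypothetical mini-ontology: each term maps to its parent terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 10,
    "dna_binding": 4,
    "rna_binding": 4,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    found = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def information_content(term):
    """IC(t) = log(total / freq(t)), i.e. -log of the term's annotation probability."""
    total = sum(DIRECT_COUNTS.values())
    # A term's frequency counts annotations to itself and to all its descendants.
    freq = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return math.log(total / freq)


def resnik(a, b):
    """Information content of the most informative ancestor shared by a and b."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(t) for t in common)
```

In this toy data, `resnik("dna_binding", "rna_binding")` equals the information content of "binding" (log 2, about 0.69), while `resnik("dna_binding", "catalysis")` is 0 because the only shared ancestor is the root, which carries no information.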

Besides its applications in bioinformatics, semantic similarity is also used to find similar geographic features or feature types. For instance, the SIM-DL similarity server can be used to compute similarities between concepts stored in geographic feature type ontologies.

References

  1. ^ Pesquita C, Faria D, Falcão AO, Lord P, Couto FM (2009). "Semantic Similarity in Biomedical Ontologies". PLoS Computational Biology 5 (7). doi:10.1371/journal.pcbi.1000443. PMID 19649320. http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000443. 

Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Semantic similarity".