Similarity measure

Topic: Similarity measure (Read 566 times)

David Lucifer
Archon

Posts: 2642
Reputation: 8.40

Enlighten me.

Similarity measure
« on: 2005-09-07 23:16:02 »

I've been toying with the idea for a couple of years now that there may exist an algorithm for determining a quantitative measure of how similar two digital objects are. All information can be encoded as a sequence of ones and zeros, a.k.a. a bit string. Obviously computer programs and email are encoded as bit strings, but so are music, photos, books, movies, and DNA; anything that can be recorded can be encoded.

What if there were a machine where you could feed in any two bit strings and it would reply with a number between 0 and 1, where 0 means they have no similarity and 1 means they are identical (0.5 means they are 50% similar)? Determining the identical case is pretty easy: just make sure the bit strings match exactly, every one and zero. Cases close to identical should also be easy, such as when the bit strings are identical except for a few positions where one string has a zero and the other has a one.
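The easy cases described above can be sketched as a position-by-position bit comparison. This is only an illustration, not a proposed answer to the question: the function name `bit_similarity` is made up here, and counting every bit past the end of the shorter string as a mismatch is an assumption, since the post does not say how strings of different lengths should be handled.

```python
def bit_similarity(a: bytes, b: bytes) -> float:
    """Fraction of bit positions on which a and b agree, from 0.0 to 1.0."""
    total_bits = 8 * max(len(a), len(b))
    if total_bits == 0:
        return 1.0  # two empty strings are identical
    # XOR leaves a 1 wherever the two bytes differ; count those bits.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    # Assumption: bits beyond the shorter string all count as mismatches.
    differing += 8 * abs(len(a) - len(b))
    return 1.0 - differing / total_bits


print(bit_similarity(b"abc", b"abc"))            # identical strings
print(bit_similarity(b"\x00\x00", b"\x00\x01"))  # one bit out of 16 differs
print(bit_similarity(b"\x00", b"\xff"))          # every bit differs
```

One sign that this measure is too naive: two unrelated random bit strings agree on about half their positions by chance, so they would score around 0.5 rather than 0.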

It gets a lot trickier when similar things don't look similar at the bit level. Compare all these Mona Lisa pictures. They are different sizes, different shadings, some are cropped, many are altered, but they are undeniably similar to some extent. Or imagine two copies of the same Mona Lisa picture where one is upside down. We recognize them as very similar but it would be difficult for a machine to recognize the fact because all the ones and zeros are in different places.
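One known technique that does not depend on the bits lining up is normalized compression distance: if two strings share structure, compressing them together costs little more than compressing one alone. The sketch below is a rough illustration using zlib as the compressor. The names `compressed_size` and `ncd_similarity` are made up for this example, and turning the distance into a 0 to 1 score by taking `1 - NCD` and clamping is an assumption. It would not recognize an upside-down Mona Lisa; it only shows that a one-byte shift, which scrambles every bit position, need not hide the similarity.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd_similarity(a: bytes, b: bytes) -> float:
    """1 minus the normalized compression distance, clamped to [0, 1]."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + b)
    ncd = (cab - min(ca, cb)) / max(ca, cb)
    # Real compressors are imperfect, so the raw value can stray slightly
    # outside [0, 1]; clamp it (an assumption made for this sketch).
    return min(1.0, max(0.0, 1.0 - ncd))


text = b"the quick brown fox jumps over the lazy dog. " * 20
shifted = b"X" + text  # same content, every byte moved one position
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

print(ncd_similarity(text, shifted))  # high: the shift barely matters
print(ncd_similarity(text, noise))    # near zero: nothing shared
```

The result is only as good as the compressor at spotting shared structure, so a general-purpose compressor like zlib will miss the kinds of similarity between photos or songs discussed here.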

The same song played by the same artist at two consecutive live shows will likely be very similar. One of those live recordings compared to the album version will be similar but less so. The same song recorded by a different artist will be less similar, but still more similar than a different song by the same artist. Two songs in the same genre (e.g. polka or heavy metal) should (generally) be more similar than comparing a polka song to a heavy metal song. And any song compared to a photo would be highly dissimilar.

Does such an algorithm exist? It must because humans do this all the time automatically. The act of recognition means that what we are perceiving is similar (in some salient way) to something we already know about. Finding this algorithm may very well be one of the keys to real AI.

Walter Watts
Archon

Posts: 1571
Reputation: 8.02

Just when I thought I was out-they pull me back in

Re:Similarity measure
« Reply #1 on: 2005-09-08 00:01:39 »

[David Lucifer]

<snip>
Does such an algorithm exist? It must because humans do this all the time automatically. The act of recognition means that what we are perceiving is similar (in some salient way) to something we already know about. Finding this algorithm may very well be one of the keys to real AI.
<snip>

[Walter Watts]

Interesting question, David.
I see the human "act of recognition" as a fairly quick pattern matching process performed on multiple streams of highly dynamic massively parallel processed streaming data.

To digitally encode this "comparison snapshot" (for lack of a better term) raises the question of where to mark the beginning and the end of the comparator data. That would require headers (field information), which vary not only within each individual but also from individual to individual.

Hmmm.....

Walter
<who has many more questions than answers>

Walter Watts
Tulsa Network Solutions, Inc.

No one gets to see the Wizard! Not nobody! Not no how!