Fuzzy Searches?



wie
07-05-2004, 02:23 AM
I am wondering if it would be possible to implement some kind of fuzzy search in BW?
Having searched a lot in the Medline literature database lately, I found the "Related articles" button very helpful. The program tries to find articles that share the same keywords and largely agree overall.
Now I could imagine for BW a button "Find similar verses" and "Find similar string". The program then checks for agreements in lemmata and word order. This would be very helpful to find parallels. Perhaps one could add a manual entry for the "cut-off" point.

almather
07-05-2004, 06:57 AM
If this were possible, my initial reaction is: this would be very useful.

Al

Ben Spackman
07-05-2004, 03:36 PM
Isn't this what the Greek semantic-domain lists do? Of course, it's only for the Greek...

Charlie
07-06-2004, 10:53 AM
If I understand you correctly, this is what you are looking for:
You are looking at a passage and you want to see other passages that contain n number of words in common with the current passage.

This functionality was suggested by Dr. Jan Verbruggen last fall and was added to an update after the initial release. Here's what I think is the simplest way to use it.
1. Select the entire verse you are looking at in the corresponding morphology version (WTM for Hebrew), right click on the selected text and choose Copy String to Command Line.
2. Then with the text on the Command Line open the ASE and double click the merge box (the AND) to set the option "only require n of the attached word boxes".
3. Turn off hit highlighting by double clicking the hlt/hit box in the ASE status bar. Generally you will also want to delete the word boxes containing articles, conjunctions, prepositions, etc.
4. Then click Go.
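
The "only require n of the attached word boxes" idea in the steps above can be sketched outside BibleWorks as a plain n-of-m match. This is only a minimal Python sketch under stated assumptions, not a description of how the ASE works internally: the function name `n_of_m_matches`, the toy English verses, the references, and the stop-word list are all invented for illustration.

```python
def n_of_m_matches(query_words, verses, n, stop_words=frozenset()):
    """Return (reference, shared words) for verses sharing >= n query words.

    verses maps a reference string to a list of words (ideally lemmata).
    Words in stop_words are dropped from the query first, which mirrors
    deleting the word boxes for articles, conjunctions, and prepositions.
    """
    wanted = {w.lower() for w in query_words} - set(stop_words)
    hits = []
    for ref, words in verses.items():
        shared = wanted & {w.lower() for w in words}
        if len(shared) >= n:
            hits.append((ref, sorted(shared)))
    return hits


# Toy data, invented for illustration only (not real verse text).
verses = {
    "A 1:1": "the fisherman cast his net into the sea".split(),
    "A 1:2": "he cast the net and caught many fish".split(),
    "A 1:3": "they walked along the road".split(),
}
stop_words = {"the", "his", "into", "and", "he"}
query = "he cast his net into the sea".split()

result = n_of_m_matches(query, verses, n=2, stop_words=stop_words)
for ref, shared in result:
    print(ref, shared)
```

Raising `n` narrows the list toward near-complete matches; lowering it widens the list, which is roughly the trade-off the merge-box option exposes.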

wie
07-06-2004, 11:34 AM
This is not exactly what I wanted (too complicated and not "fuzzy"). Let me repeat:
I could imagine for BW an option "Find similar verses" and "Find similar string" (using the right mouse button). The program then checks for agreements in lemmata and word order. This would be very helpful to find parallels. Perhaps one could add a manual entry for the "cut-off" point.

The best hits appear at the top, and the percentage of agreement falls off as you go down the list. The program has to calculate a "global agreement factor" that involves such things as word order, agreement in lemmata, agreement in form, distance from string, etc.
I think this is not very easy to program, but it would be extremely helpful.
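
To make the ranking idea concrete, here is one possible shape for such a "global agreement factor". It is a hedged sketch, not a BibleWorks feature: it combines only two of the criteria named above (shared vocabulary and word order), the equal weights and the default cut-off are arbitrary assumptions, and the names `agreement` and `find_similar` as well as the placeholder tokens are hypothetical. Word order is approximated with the standard library's `difflib.SequenceMatcher`, applied to token lists rather than characters.

```python
from difflib import SequenceMatcher


def agreement(query, candidate, w_overlap=0.5, w_order=0.5):
    """Score two token lists between 0.0 and 1.0.

    w_overlap weights shared vocabulary (Jaccard index on the token sets);
    w_order weights order-preserving agreement (difflib's ratio on the
    token sequences). Both weights are arbitrary assumptions.
    """
    q, c = set(query), set(candidate)
    overlap = len(q & c) / len(q | c) if (q | c) else 0.0
    order = SequenceMatcher(None, query, candidate, autojunk=False).ratio()
    return w_overlap * overlap + w_order * order


def find_similar(query, verses, cutoff=0.3):
    """Return (score, reference) pairs at or above cutoff, best first."""
    scored = [(agreement(query, words), ref) for ref, words in verses.items()]
    return sorted((pair for pair in scored if pair[0] >= cutoff), reverse=True)


# Placeholder tokens standing in for lemmata (not real verse text).
query = "alpha beta gamma delta".split()
verses = {
    "X 1": "alpha beta gamma delta".split(),  # same words, same order
    "X 2": "delta gamma beta alpha".split(),  # same words, reversed order
    "X 3": "alpha beta omega sigma".split(),  # half the words shared
    "X 4": "kappa lambda mu nu".split(),      # nothing shared
}

ranked = find_similar(query, verses)
for score, ref in ranked:
    print(f"{ref}: {score:.0%}")
```

The `cutoff` parameter plays the role of the manual "cut-off" entry: anything scoring below it is left out of the list. Agreement in form and distance from the string would need further terms in the score.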

JVerbruggen
07-06-2004, 01:53 PM
If I understand Wieland correctly, he would like BW to check links between one verse and any other verse in BW that has similar words or word order. In the search I suggested, one would specify the words one is looking for, but would also find verses that were not a complete match.

Jan Verbruggen

vr8ce
07-06-2004, 05:39 PM
Charlie wrote:
If I understand you correctly, this is what you are looking for:
You are looking at a passage and you want to see other passages that contain n number of words in common with the current passage.
This functionality was suggested by Dr. Jan Verbruggen last fall and was added to an update after the initial release. Here's what I think is the simplest way to use it.

Cool! Is there a way to do that on the command line? If not, can we please have one? :) Pretty please? I haven't had the need to learn the ASE yet (the command line is the #1 reason I chose BW), and I would use this capability if it were available on the CL.

Thanks!

Vince

vr8ce
07-06-2004, 05:44 PM
wie wrote:
This is not exactly what I wanted (too complicated and not "fuzzy"). Let me repeat:
I could imagine for BW an option "Find similar verses" and "Find similar string" (using the right mouse button). The program then checks for agreements in lemmata and word order. This would be very helpful to find parallels. Perhaps one could add a manual entry for the "cut-off" point.
...

The definition of insanity is doing the same thing and expecting a different result, so repeating yourself isn't going to help much. :)

"Checks for agreements in lemmata and word order." What does that mean, specifically? Can you say more about what you want that to do? Can you give some examples in English (i.e. searching in English translations) and in the original languages (i.e. searching in Greek or Hebrew)? So, for example, if I'm searching for ".peter fish;2" (find peter and fish in adjoining verses; yes, I know that's not a good technical description, but I'm not trying to be technical :)), what would you want your new capability to do? Feel free to provide a better example.

Thanks,

Vince

wie
07-07-2004, 03:34 AM
I don't know what "insanity" has to do with my request?

Here is an example:
BGT Mark 6:6 KAI EQAUMAZEN DIA THN APISTIAN AUTWN. KAI PERIHGEN TAS KWMAS KUKLW DIDASKWN.

I select the string "KAI EQAUMAZEN DIA THN APISTIAN AUTWN", do a right click and select "Find similar strings".
Then those verses are displayed in the verse list that show any similarity with this string with those having the highest agreement being on top.
Agreement is defined by word order, agreement in lemmata, agreement in form, distance of words, distance from string etc.
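
Of the criteria listed here, "distance of words" is perhaps the easiest to pin down. The following Python sketch shows one possible reading of it, using only the Mark 6:6 text quoted above: count how many distinct words of the selected string occur in a verse, and how wide the stretch of the verse is that contains them. The function names `tokens` and `match_span` are hypothetical, and the comparison is on the transliterated surface forms exactly as posted; matching on lemmata would need a morphologically tagged text, which is not shown.

```python
def tokens(text):
    """Split on whitespace and strip trailing sentence punctuation."""
    return [t.strip(".,;") for t in text.split()]


def match_span(query, verse):
    """Return (matched, span) for a query against a verse, both token lists.

    matched is how many distinct query tokens occur in the verse; span is
    the number of verse tokens from the first matching token to the last
    one (0 if nothing matches). For the same matched count, a smaller span
    means the shared words sit closer together.
    """
    wanted = set(query)
    positions = [i for i, tok in enumerate(verse) if tok in wanted]
    if not positions:
        return 0, 0
    matched = len({verse[i] for i in positions})
    return matched, positions[-1] - positions[0] + 1


verse = tokens(
    "KAI EQAUMAZEN DIA THN APISTIAN AUTWN. "
    "KAI PERIHGEN TAS KWMAS KUKLW DIDASKWN."
)
selected = tokens("KAI EQAUMAZEN DIA THN APISTIAN AUTWN")

with_kai = match_span(selected, verse)
without_kai = match_span([t for t in selected if t != "KAI"], verse)
print(with_kai, without_kai)
```

With KAI included, all six words match but the span stretches to seven tokens, because the second KAI in the verse also counts as a hit. Dropping KAI gives five matches in a span of five. That is one reason very common words such as conjunctions are usually better left out of this kind of comparison.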

vr8ce
07-07-2004, 04:27 PM
wie wrote:
I don't know what "insanity" has to do with my request?

Insanity has to do with your repeating yourself and expecting someone to read it differently the second time than they did the first. I do this all the time as well, and it never works for me either. :) If they didn't understand what I said the first time, then I need to re-word it, or provide more details, or something.


wie wrote:
Here is an example:
BGT Mark 6:6 KAI EQAUMAZEN DIA THN APISTIAN AUTWN. KAI PERIHGEN TAS KWMAS KUKLW DIDASKWN. I select the string "KAI EQAUMAZEN DIA THN APISTIAN AUTWN", do a right click and select "Find similar strings".
Then those verses are displayed in the verse list that show any similarity with this string with those having the highest agreement being on top.

Excellent, we're off to a good start.


wie wrote:
Agreement is defined by word order, agreement in lemmata, agreement in form, distance of words, distance from string etc.

It's this part that's still fuzzy (sorry, gratuitous pun). What do you mean by "word order, agreement in lemmata, etc."? Can you give some examples of other verses that would match, and why (according to your criteria)? Do you only mean for this to work in original languages? And so on.

Thanks!

Vince