[Q] How to find all verses in which a word occurs N or more times?



mtp1032
07-19-2013, 10:49 AM
Can this be done via the menus (in which case I've not yet found it :confused:) or is there a morphology string that could accomplish this task? Thanks in advance, Michael

Jim Wert
07-19-2013, 12:11 PM
A. I am not one of the Michaels. Sorry.
B. I found your question ambiguous. I assume you want to find multiple occurrences of one specific word.

In version NRS I used this search:
'earth *99 earth *99 earth
and got 4 verses where "earth" was used 3 times (in 26.5 seconds).
'earth *99 earth
got 71 verses with 2 or 3 occurrences (in 0.46 seconds).
'lord *99 lord *99 lord
got 115 verses with 3 or 4 occurrences (in 205 seconds).
'lord *99 lord *99 lord *99 lord
after 15+ minutes I Aborted the search; status bar said it was compiling.

If I understand correctly, *99 means there may be 0-99 words between the earths (or lords).
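
As a rough illustration of that reading of *99 (this is only a sketch of the idea, not how BW evaluates the query), the Python below checks whether a word occurs at least n times in a verse with no more than max_gap other words between consecutive occurrences. The function name and the sample sentence are made up for the example.

```python
def has_repeats(verse, word, n, max_gap=99):
    """True if `word` occurs at least n times in `verse`, with at most
    max_gap other words between consecutive occurrences."""
    # Crude tokenizer (an assumption): split on whitespace, trim punctuation.
    tokens = [t.strip('.,;:!?"\'()').lower() for t in verse.split()]
    positions = [i for i, t in enumerate(tokens) if t == word.lower()]

    # Longest chain of occurrences whose neighbours are close enough.
    run = best = 1 if positions else 0
    for prev, cur in zip(positions, positions[1:]):
        run = run + 1 if cur - prev - 1 <= max_gap else 1
        best = max(best, run)
    return best >= n


# Made-up sentence, not a quotation from any Bible version.
sample = "The earth is wide, and the earth is old, and the earth endures."
print(has_repeats(sample, "earth", 3))             # True
print(has_repeats(sample, "earth", 2, max_gap=3))  # False: 4 words lie between
```

With a gap as large as 99 the limit rarely matters inside a single verse, so the search amounts to "n or more occurrences".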

Hope this helps.

--Jim

P.S. From a little more experimenting, it appears that 3 is the practical limit for this technique.
I tried using the "lord" search, with the hits from 3 lords as the search limits.
With a "PHRASE search" ['] it didn't seem to get out of the 'Compile' stage.
With a "Linear PHRASE" search [;], which was much slower for 2 or 3 hits, it did seem to keep chugging. The status text looked like it had eventually quit, so I aborted the search, only to discover that it must have kept going. It reported 8 verses in 565 seconds and had gotten into 2Ch before I aborted.

mtp1032
07-19-2013, 01:12 PM
I want to be able to execute a computer-based search whose objective is to find all verses in which a specified Hebrew word, or Strong’s Number, occurs N times (where N is an integer between 1 and 99).
For example, suppose I specify the Hebrew word אָרֶץ:


If N = * (or is not specified), the search will find and display exactly 853 verses in which אָרֶץ occurs 1 or more times. This is the current behavior.
If N = 3, the search will find and display 21 verses with exactly 3 occurrences of אָרֶץ
If N = 4, the search will find and display 1 verse with exactly 4 occurrences of אָרֶץ
If N = 5, the search will find and display 1 verse with exactly 5 occurrences of אָרֶץ
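
Outside of BW, the behavior described in the list above could be sketched roughly as follows in Python. It assumes the text is already available as a mapping from verse reference to a list of lemmas (or Strong's numbers); the function name, the references, and the toy data are all hypothetical.

```python
from collections import Counter


def verses_by_count(verses, word):
    """Group verse references by how many times `word` occurs in each.
    Verses without the word are left out."""
    buckets = {}
    for ref, tokens in verses.items():
        n = Counter(tokens)[word]
        if n:
            buckets.setdefault(n, []).append(ref)
    return buckets


TARGET = "אָרֶץ"
# Toy data only; "x" and "y" stand in for other words.
verses = {
    "verse A": [TARGET, "x", TARGET, "y", TARGET],
    "verse B": ["x", TARGET],
    "verse C": ["x", "y"],
}

result = verses_by_count(verses, TARGET)
print(result[3])  # ['verse A']  (exactly 3 occurrences)
print(result[1])  # ['verse B']  (exactly 1 occurrence)
```

Asking for "N or more" instead of "exactly N" would just mean merging the buckets whose key is at least N.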

I hope this clears up any confusion you may have,
Blessings,
Michael

MBushell
07-19-2013, 06:41 PM
Hint: enter the search on the command line and then open the Advanced Search Engine. It is much quicker for this kind of search.
Mike

Jim Wert
07-19-2013, 07:11 PM
Thank you, that is a very helpful elaboration.

I am not aware of any automated way in BW to do what you request. (By the way, I could not replicate your statistics using WTT/WTM; form gave fewer hits, lemma gave more.)

The way I showed above can be used with some extra steps.

Do the normal search, create a verse list from the result, using the Verse List Manager (VLM).
Do the search for two or more occurrences. Using the VLM, load that result and compare it to the normal one to create a list of verses with only 1 hit.
Do a search for three or more occurrences. Using the VLM, load that result and compare it to the two-or-more list to create a list of verses with 2 and only 2 hits.
Manually inspect the results pane of the three-or-more search to see the verses that are marked as *4 or *10 (for some reason BW thinks that the 5-occurrence verse has 10 hits in it).
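
The list comparisons in the steps above are plain set differences. As a sketch only (the VLM does this through its own dialogs, and the verse names below are placeholders), the logic looks like this in Python:

```python
def exact_lists(at_least):
    """at_least[k] is the set of verses with k or more hits.
    Returns a dict where exact[k] holds verses with exactly k hits.
    The highest k has nothing to subtract, so it stays 'k or more'."""
    exact = {}
    for k in sorted(at_least):
        next_level = at_least.get(k + 1, set())
        exact[k] = at_least[k] - next_level
    return exact


# Placeholder verse names standing in for three saved verse lists.
at_least = {
    1: {"a", "b", "c", "d"},  # normal search
    2: {"b", "c"},            # two or more occurrences
    3: {"c"},                 # three or more occurrences
}

exact = exact_lists(at_least)
print(sorted(exact[1]))  # ['a', 'd']
print(sorted(exact[2]))  # ['b']
print(sorted(exact[3]))  # ['c']
```

The last list still needs the manual inspection described above, since it mixes verses with three hits and verses with more.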

My results, using the lemma search, were:
5 times = 1 verse
4 times = 3 verses
3 times = 32 verses
2 times = 238 verses
1 time = 1916 verses

I did not do the VLM steps, but depended on the statistics that BW reported. For non-standard searches I have in a few situations found the VLM counts to be more reliable than the reported search statistics.

There is a small learning curve to use the VLM; once you've done it a few times it will go quickly. One way to open it is the F7 key.

--Jim

MGVH
07-20-2013, 12:06 AM
Reading the first post, I thought initially that the intent was to find any verse that used any word N # of times.

As MBushell noted, use the Graphical Search Engine (GSE). The search will run much more quickly.

If you are looking for any word repeated N # of times (and you are not interested in specifying what that word is), try the attached QF files. One is for Hebrew WTM, the other for Greek BGM.
I started with this command line entry: '#1 *25 #1 *25 #1

You will note that using "#1" is a way of asking that all the words be the same. The *25 is arbitrary. You can in/decrease that as you wish.
With that command line, I then opened the GSE which set up my initial parameters.
Running a search mostly returned hits of verses with multiple conjunctions, prepositions, particles, or pronouns.
Note in my GSE query that I eliminated those by double-clicking the first word box and adding inclusion/exclusion elements. (That's why the first box has the +/- in it.)
As you seek to limit your results you may choose to add more exclusion words/forms.

Use that pattern to look for 2 or 4 or 5 or ... more words.
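
For anyone who wants to see the idea outside of BW, here is a rough Python sketch of "any word repeated N or more times, minus an exclusion list". It ignores the *25 distance limit and assumes each verse is already split into lemmas; the function name, the sample tokens, and the exclusion list are invented for the example.

```python
from collections import Counter


def repeated_words(tokens, n, exclude=frozenset()):
    """Return {word: count} for words occurring n or more times,
    skipping anything in `exclude` (conjunctions, prepositions, etc.)."""
    counts = Counter(t for t in tokens if t not in exclude)
    return {w: c for w, c in counts.items() if c >= n}


# Invented sample, not a quotation.
tokens = "and the king said to the king that the king and the people".split()
skip = {"and", "the", "to", "that"}

print(repeated_words(tokens, 3))        # {'the': 4, 'king': 3}
print(repeated_words(tokens, 3, skip))  # {'king': 3}
```

The first call shows the same noise from common words mentioned above; the exclusion set plays the role of the +/- word box.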
