Number of words in a particular book of the Bible



acheung
11-12-2012, 11:40 PM
Does anyone know of an easy way to get the total number of words in a particular book, say, Isaiah in WTT?

At present, the title page of each version shows the number of chapters and verses, the total number of words, and the number of unique words for the whole Bible (or the NT if the OT is not available for that version). But individual book stats are not available.

For a particular book, the context menu shows the number of verses and unique words, but not the total number of words. Of course, I can copy the list to Word, convert it to a table, extract the number column to Excel, and add the numbers together to get the total number of words, but this seems a very clumsy way of doing it.

Any suggestions?

MGVH
11-13-2012, 09:47 AM
To find all the words in Isaiah that are in the WTT (=BHS) version:
Select WTT in the command line: wtt (press ENTER after each command)
Limit search to Isaiah: l isa
Search for every word: .*
Open Word List Manager via button bar or Tools menu > Analyzing the text
Click at bottom on "Load or Generate Word List"

For version, make sure it is WTT
For Source, choose "Load highlighted words from last query"
Click on "Create List"

Results are displayed at the bottom of the Word List pane: 7066 unique words occurring a total of 17197 times
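
For anyone who wants to sanity-check these two figures outside BibleWorks, here is a minimal Python sketch. It assumes you have the text of the book as a plain string and that splitting on whitespace is an acceptable definition of a "word"; BibleWorks' own counting rules may differ, and the function name and sample sentence are made up for illustration.

```python
from collections import Counter


def word_stats(text: str) -> tuple[int, int]:
    """Return (unique_words, total_words) for whitespace-separated tokens."""
    counts = Counter(text.split())
    return len(counts), sum(counts.values())


# Toy English sample standing in for an exported book text.
sample = "in the beginning was the word and the word was with god"
unique, total = word_stats(sample)
print(f"{unique} unique words occurring a total of {total} times")
```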

Peter
11-13-2012, 10:50 AM
Thank you, Mark ! That's the easiest way to do such a query.
Peter, Germany

Donald Cobb
11-13-2012, 01:05 PM
You can also use the Context tab in BW9. One of the three panes there is labeled "Book context"; that gives you the number of verses and, just after that, the number of unique words. So as I look at Romans, it automatically gives me the figures of 433 verses and 1050 words. You can also right-click and choose "Export list to Word list manager." That will immediately build the list of the words of that book in the manager, which will also specify the number of words imported, i.e. (for Romans), 1050 words in 7111 occurrences.

That's even easier than the other solution and uses functions in the way for which they were designed. :)

Blessings,

Donald Cobb
Aix-en-Provence, France

acheung
11-13-2012, 02:55 PM
Thank you very much, Mark and Donald, for your prompt help. Donald's solution works particularly well with the morphological texts WTM and BNM, as it shows the true number of unique words, taking morphological differences into account.

I have a further question: how can you find the (true) number of unique words that occur fewer than X times in the whole or part of the Bible (e.g. words in Isaiah that occur fewer than 5 times in the OT; words in Romans that occur fewer than 10 times in the NT, the LXX, or NT+LXX)? It would be nice to also have the number excluding proper nouns (as proper nouns usually present no difficulty even though they might be rare words).

I am involved in Bible translation, and stats like that would help me gauge the difficulty and time required to translate a particular book of the Bible, and also estimate the time required for the whole Bible based on samples with various stats.

Alex Cheung

MGVH
11-13-2012, 03:48 PM
Good thinking, Donald! I had been thinking there was something like that, but I looked under Words rather than Context as you rightly did.

MGVH
11-13-2012, 05:11 PM
I can think of a number of ways of finding the words in a book that occur fewer than X times in the OT or NT. The Word List Manager does this well; I describe that later. For now, here are a few other ways:

Flashcard Module Method


1. Open the Flashcard Module (use button or Tools > Vocabulary Flashcard Module).
2. File > Open > select the gntvoc.vrc file for the Greek NT or hotvoc.vrc for the Hebrew OT
3. Go to Tools > Filter:
   a. Uncheck Filter by Chapter Range / Filter by Frequency Range
   b. Choose “Include only words in this verse range” (near bottom)
   c. Choose “Calculate frequencies from verse range”
   d. In the verse range box, enter the book and chapter:verse range, e.g. Matthew 1:1-28:20 (i.e., you will need to know the last chapter:verse in the book)
   e. Click Apply
4. Back in the Flashcard Module, click on the “Freq” column head to sort by frequency (i.e., frequency of usage in the whole NT, not in your passage)
5. You now have a list of words with an indication of how many times they occur in this passage compared to how many times they occur in the OT or NT: e.g., 15/5777 means it occurs 15 times in this book and 5777 times in the whole Testament
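
To illustrate what the passage/Testament pair means, here is a small Python sketch of the same comparison. It is not how the Flashcard Module works internally; it just assumes you have two lists of lemmas (one for the passage, one for the whole Testament), and the transliterated toy data and function name are invented for the example.

```python
from collections import Counter


def passage_vs_corpus(passage_lemmas, corpus_lemmas):
    """Return (lemma, count_in_passage, count_in_corpus) rows, rarest in corpus first."""
    passage = Counter(passage_lemmas)
    corpus = Counter(corpus_lemmas)
    rows = [(lemma, n, corpus[lemma]) for lemma, n in passage.items()]
    # Sort by whole-corpus frequency (ascending), then alphabetically for stable output.
    return sorted(rows, key=lambda row: (row[2], row[0]))


# Toy data: the "corpus" stands in for the whole Testament, the "passage" for one book.
corpus_lemmas = ["kai", "kai", "kai", "logos", "logos", "theos"]
passage_lemmas = ["kai", "logos", "kai"]

for lemma, in_passage, in_corpus in passage_vs_corpus(passage_lemmas, corpus_lemmas):
    print(f"{lemma}: {in_passage}/{in_corpus}")
```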


Report Generator Method (only works for Greek)


1. Open the Report Generator Module (use button or Tools > Report Generator).
2. Choose BNM (for just NT) or BGM (for LXX + NT) as your version.
3. In the Range box, enter the book and chapter:verse range, e.g. Matthew 1:1-28:20 (i.e., you will need to know the last chapter:verse in the book)
4. You need to select at least one lexicon (but it won't matter since we are not going to ask to display it)
5. Leave the “Include Biblical Text…” box empty
6. In the “Analyze these Greek…” box, enter BNT or BGT depending on the version chosen in #2 above
7. In the Report Options, the only box you need to check is “Include Frequency Lists (Grk)”
8. If you want your report to come out by frequency instead of alphabetically, check “Sort Frequency List by Frequency”
9. Click “Build Report”


Use Tab View
This really won't give you the summary overview you want, but remember that as you move your mouse over a text with the "Use" tab open on the right, it automatically tells you how many times that word occurs, and in how many verses, either in the book or in the whole version.

MGVH
11-13-2012, 05:39 PM
The Word List Manager is really the best way:

First we need to create a full vocab list
Open the Word List Manager
Click on Load or Generate Word List
In Version, choose WTM, BGM (for LXX and NT), or BNM
In Source, use "Load words from a Bible version"
To be safe, click on Reset Verse Range
Uncheck boxes at bottom
Create List
You now have a listing of all the words with their frequency in that version
To make things easier for later, you can save that word list: File > Save the IEL and label it clearly

To open it later, you will click on Load or Generate Word List, then choose "Load words from an inclusion/exclusion file" and create the list


Now you want to edit the list to your desired frequency range. E.g., to create a list of words that are used 50 times or fewer: with the list ordered from most to least frequent, click on the first word in the list, scroll down until you find the last word that is used 51 times, then SHIFT and click on that word to choose all the words used 51 or more times. Now use Edit > Delete selected.

Save this list with an appropriate title for later


Now you want to create a list of words in the book you are interested in.
Choose the Secondary Word List, then Load or Generate Word List
In Version, choose WTM, BGM (for LXX and NT), or BNM
In Source, use "Load words from a Bible version"
In Verse Range, type the first three letters of the book (the first three letters of a book name are the default abbreviation used in BibleWorks for all but Judges/Jdg and Philemon/Phm)
Uncheck boxes at bottom
Create the list
You now have the list of words with the desired frequency in the main list and the list of words in the particular book in the secondary list. Use Select > Select words common to both lists
The highlighted words are the words in the book that occur with the given frequency
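
In set terms, the procedure above is an intersection: the words rare in the whole version, intersected with the words found in the book. A minimal Python sketch of that idea follows. It assumes you have lemma lists for the whole version and for the book; the function name and toy data are hypothetical, and it says nothing about how the Word List Manager is implemented.

```python
from collections import Counter


def rare_words_in_book(book_lemmas, corpus_lemmas, max_freq):
    """Words of the book that occur at most max_freq times in the whole corpus."""
    corpus_freq = Counter(corpus_lemmas)
    # Main list: the full vocabulary with the frequent words deleted.
    rare_in_corpus = {w for w, n in corpus_freq.items() if n <= max_freq}
    # "Select words common to both lists" is a set intersection.
    return rare_in_corpus & set(book_lemmas)


# Toy data standing in for a whole version and one book of it.
corpus_lemmas = ["and"] * 5 + ["king"] * 3 + ["vine"] * 2 + ["owl"]
book_lemmas = ["and", "and", "vine", "owl"]

rare = rare_words_in_book(book_lemmas, corpus_lemmas, max_freq=2)
print(sorted(rare))
```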

Mark Eddy
11-13-2012, 11:47 PM
Thanks for going through all the steps to come up with the list of words, Mark H. The only shortcoming with this is that it does not seem to exclude proper nouns. Using the Word List Manager I produced a list of all lemmas in BGM used 50 times and under, but then I had to go through the list one by one to exclude the proper nouns. I started with hapax legomena, and I'm up to excluding those used twice. It is going to take a long, long time to exclude all of the proper nouns. But I do not know a way to exclude them automatically using the Word List Manager. I also noticed that especially the Old Testament portion of BGM (=BLM) has some mistakes. For example, it may have a genitive or accusative form listed as a separate lemma from the nominative. So there are actually fewer lemmas than what BGM states. I have reported some of these cases to BW when I have found them, so perhaps in future releases this will be cleaned up a bit. There are not a lot of these cases, but there are some, especially when it comes to proper nouns.
For what it's worth.
Mark Eddy

acheung
11-14-2012, 12:40 AM
Thank you very much for your kind help. For now, I found an approximate method to exclude proper nouns by searching for them in the Morphological Assistant:
1. type wtm [enter] in command line
2. l isa
3. go to the Morphological Assistant, select POS: noun, type: Proper name, lemma: *, which results in the code @np--*; then click Lookup, which yields 191 forms, 1239 hits
4. I can subtract the number of forms from the number of unique words to get the number of unique words minus proper nouns.
5. Since there are some high-frequency proper nouns, if I set the frequency low, it will result in considerable double counting, so I will have to look at the exported word list and make a rough estimate for adjustment. The result is not exact, but reasonably good for my purpose.

Alex Cheung

Donald Cobb
11-14-2012, 01:29 AM
Hello Alex,

I'm doing this with New Testament texts, so the same thing in a Hebrew text may yield results that are a little different. Here's what I've done to build a list from Romans, excluding proper names from a search:

1. In the context tab, right click on "Export list to Word list Manager"; this will send the list to the "Main word list" (i.e., left-hand column).
2. With the Word List Manager (WLM) still open, select BGM as your search text and limit your search to Romans. Then type in the command line the following: .*@n???p (The "p" at the end is for "proper name"). Type enter. This will give you a list of all proper names in Romans.
3. In the WLM, select "Secondary word list", then "load or create word list"
4. In the new window that opens, select BGM as your search version, then "load highlighted words from last query" (as said in a previous post, I take the precaution of selecting "use search window limits"). Then deselect "keep Greek accents and Hebrew vowel points". This seems to be necessary, as the list created from the context tab doesn't have accents (I realized that the hard way! More on that below). Then create list.
5. In the WLM, you now have all the words in Romans in your Main word list, and all the proper names in Romans in your Secondary word list.
6. Select "Main Word List" (radio button), then click on "select" => "select words common to both lists"
7. Then "Edit" => "delete selected." That will give you your list of all the words in Romans in the "Main Word list", minus all the proper names.
8. You can then select the words that occur more than 50 times and delete them from the list.

A couple things: in between steps 2 and 3, BW will automatically send the results of your query on proper names to the WLM, in the Secondary word list. This will not help, though, because it sends the words with their accents, and the WLM doesn't seem to be able to compare the two lists (i.e., Παυλος and Παῦλος, for instance, are seen as two different words). So you'll have to discount that automatically generated list and proceed to step 3.
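
For what it's worth, the accent mismatch is easy to reproduce and work around outside BW. The Python sketch below uses the standard `unicodedata` module to strip combining marks before comparing, so that an accented and an unaccented spelling count as the same word. The helper names and the two toy lists are invented for illustration; this only mimics the effect of deselecting "keep Greek accents and Hebrew vowel points", not BW's actual mechanism.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Remove combining marks (accents, breathings) from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", bare)


def common_words(main_list, secondary_list):
    """Words of main_list that also appear in secondary_list, ignoring accents."""
    secondary_keys = {strip_accents(w) for w in secondary_list}
    return [w for w in main_list if strip_accents(w) in secondary_keys]


# Toy lists: unaccented main list (as exported from the context tab),
# accented secondary list (as sent automatically from a search).
main_list = ["Παυλος", "δουλος"]
secondary_list = ["Παῦλος"]
print(common_words(main_list, secondary_list))
```

A plain string comparison of the two spellings fails, which matches the behaviour described above; comparing the stripped forms succeeds.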

One other thing: Did you know that you can also create a lexicon for that list? This can be helpful for learning the rarer words. Go to "File" => "make lexicon from selected words", then follow the steps there.

I'd be interested in hearing if this works as well in Hebrew texts (WTM). Please let me know!

Donald Cobb
Aix-en-Provence, France

acheung
11-14-2012, 09:26 PM
Hi Donald,

Thanks for the very helpful advice with clear steps. The method also works well for Hebrew (with the appropriate changes from BNM to WTM and .*@n???p to .*@np*). I did need to modify step 8 as the frequencies of words shown are those peculiar to that book and may not be representative of the NT or OT as a whole (e.g. a common NT word may occur rarely in Romans). The modified steps are:
8a. clear the secondary list
8b. type l [enter] to remove limit to the book, then .*@* to generate a word list for the whole NT BNM or OT WTM
8c. save the list for later use, naming it, say, "BNM word frequency list", and from it create lists for the desired excluded frequencies by deleting the words that occur fewer than X times. This will leave the words occurring X or more times in the NT/OT on the secondary word list.
8d. Select "Main Word List" (radio button), then click on "select" => "select words common to both lists"
8e. Then "Edit" => "delete selected." That will give the list of all the words in e.g. Romans in the "Main Word list", minus all the proper names and the words occurring X or more times.
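
Put as set operations, the whole procedure (Donald's steps 1 to 7 plus the modified step 8) is two subtractions from the book's vocabulary. Here is a rough Python sketch under the assumption that you have the book's lemmas, its proper names, and the lemmas of the whole NT or OT as lists. The names and the English toy data are hypothetical, and the cut-off is taken as "X or more occurrences", which is my reading of the steps.

```python
from collections import Counter


def difficult_words(book_lemmas, proper_names, corpus_lemmas, x):
    """Book vocabulary minus proper names and minus words occurring x+ times in the corpus."""
    corpus_freq = Counter(corpus_lemmas)
    # Secondary list after step 8c: words occurring x or more times in the whole corpus.
    frequent = {w for w, n in corpus_freq.items() if n >= x}
    return set(book_lemmas) - set(proper_names) - frequent


# Toy data standing in for the whole NT and for one book.
corpus_lemmas = ["and"] * 6 + ["grace"] * 3 + ["Paul"] * 2 + ["wild-olive"]
book_lemmas = ["Paul", "and", "grace", "wild-olive", "and"]
proper_names = ["Paul"]

print(sorted(difficult_words(book_lemmas, proper_names, corpus_lemmas, x=3)))
```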

With Isaiah as my selected book and X = 50, the results are:
1291 verses, 2068 unique words, 23248 total words, and 2946 more difficult words (unique words minus proper nouns and words occurring 50+ times), yielding a ratio of 12.7% against the total number of words.

For Ruth, the numbers are, respectively, 81, 301, 1823, and 125, yielding a ratio of 6.9%, showing that it is a much easier book to read than Isaiah.

For Job, the ratio is 16.1%, somewhat more difficult than Isaiah, a result that is expected.

For Ezekiel, the ratio is 8.4%, between Ruth and Isaiah, again as expected. But it is good to be able to quantify the difficulty level with an objective measure like this.
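
The ratio here is just the count of more difficult words divided by the total number of words in the book, expressed as a percentage. A minimal Python sketch using the Isaiah and Ruth figures quoted in this post (the function name is made up for illustration):

```python
def difficulty_ratio(difficult_words: int, total_words: int) -> float:
    """Difficult words as a percentage of total words, rounded to one decimal."""
    return round(100 * difficult_words / total_words, 1)


# (difficult words, total words) as reported in the post.
figures = {"Isaiah": (2946, 23248), "Ruth": (125, 1823)}

for book, (difficult, total) in figures.items():
    print(f"{book}: {difficulty_ratio(difficult, total)}%")
```

This reproduces the 12.7% and 6.9% figures given for Isaiah and Ruth.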