
Finding Word Count in a Book


joelmadasu


Is there a way to find out how many words are in a book in OT through Accordance? For instance, I want to find out the percentage of a word that appeared in the book of Genesis. How can I do that?


Hi Joel,

 

Entering this search should do it : * <AND> [RANGE Gen]

I got 38262 hits in the KJV.

 

Bear in mind this counts actual words, not distinct words, so "God" alone accounts for 230 of those.
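For the percentage Joel asked about, the arithmetic is simply the word's hit count divided by the total hits for the book. A minimal Python sketch using the two KJV figures quoted above; the function name is only illustrative and nothing here is an Accordance feature:

```python
def percent_of_book(word_hits: int, total_hits: int) -> float:
    """Return one word's share of all the words in a book, as a percentage."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return 100.0 * word_hits / total_hits


# Figures from the post above (KJV): 230 hits for "God",
# 38262 hits for the search * <AND> [RANGE Gen].
share = percent_of_book(230, 38262)
print(f"{share:.2f}%")  # prints 0.60%
```

Swap in the hit count of whatever word you searched for; the total stays the same as long as the text and range do.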

 

Thx

D


The Table feature in Analytics also gives you word counts for books and chapters, but there it includes punctuation, so it won't be as accurate.


Thank you both for your help! :) So that search method "* <AND> [RANGE Gen]" should work with Hebrew Text as well, correct?

 

Also, can someone please let me know what "Total Hits" and "Total Words" mean in the "Table Bar Chart"?

 



  • 2 months later...

I need to know specific word counts for my dissertation (so they need to be accurate!). Could you tell me how I can get data on the following?

 

Words in Matthew:

Words in Mark:

Unique words in Matthew (Size of Matthew’s vocabulary):

Unique words in Mark:


Hi Jared,

Choose your text (I am using GNT-T).

 

* <AND> [RANGE Mat] (gives 18363 hits)

 

* <AND> [RANGE Mar] (gives 11312 hits)

These are word counts, not including punctuation.

 

To get the vocab size you can just do an analysis (word count) from the little graph icon.


 

This analysis will show you that there are 1334 lexemes that Matthew uses, i.e. his vocab size.


 

If you are looking for how many different inflected forms are used, then with the analysis tab active (clicked in), press CMD-T and a sheet will drop down.

Drag INFLECT into the first column and delete LEX (just click LEX and press the delete key).


 

Then, when you press OK, you will see that there are 3035 different inflected words in Matthew.


 

Does that make sense to you?

 

You can also fairly easily find vocab that Matthew uses that Mark doesn't, or vice versa, hence "unique" in that sense.
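To make the three different counts concrete (total words, lexemes, inflected forms) and the "vocab one author uses that the other doesn't" idea, here is a small Python sketch. It assumes you have somehow got (inflected form, lexeme) pairs out of a tagged text; the data below is a made-up toy sample, not real figures for either Gospel, and the function names are hypothetical.

```python
# Hypothetical toy data: (inflected form, lexeme) pairs.
# These are NOT real counts for Matthew or Mark.
matthew = [
    ("λέγει", "λέγω"),
    ("εἶπεν", "λέγω"),
    ("μαθητὰς", "μαθητής"),
    ("λέγει", "λέγω"),
]
mark = [
    ("λέγει", "λέγω"),
    ("εὐθὺς", "εὐθύς"),
]


def summarize(tokens):
    """Return (total words, distinct lexemes, distinct inflected forms)."""
    forms = {form for form, _ in tokens}
    lexemes = {lexeme for _, lexeme in tokens}
    return len(tokens), len(lexemes), len(forms)


def lexemes_only_in(a, b):
    """Lexemes that occur in a but never in b."""
    return {lexeme for _, lexeme in a} - {lexeme for _, lexeme in b}


print(summarize(matthew))              # (4, 2, 3)
print(lexemes_only_in(matthew, mark))  # {'μαθητής'}
print(lexemes_only_in(mark, matthew))  # {'εὐθύς'}
```

The first number corresponds to the hit count from the * search, the second to the lexeme count in the analysis, and the third to what you get after swapping LEX for INFLECT.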


Now the question is: can you do it for yourself on other pericopes! :D


Re Joel's question. I am not sure. There are a few confusing issues here.

 

However, one issue I do see is that your range is incorrect. When searching a Hebrew Bible, if you want the whole OT, Gen-Mal won't give it all to you, as Mal is the last of the Prophets and the Writings still follow. In the HMT-W4 the final book is Nehemiah; in the BHS-T the final book is 2 Chronicles. (Best to double-check that in the menu DISPLAY => LIST ALL BOOK NAMES.)

 

One confusion comes in the difference between a word count and a hits count. As Helen mentions above, a hit is a word (or in Hebrew a lexical prefix or suffix) whereas a word count in this analysis includes punctuation (I don't know why that is the case, but there must be a good reason there somewhere).

 

That means that the counts per 1000 words look strange. For example, I selected every word in Mark and then did a table graph analysis, and it told me that there were 829 words per 1000 words. That isn't the natural way you would expect this to be reported, but once you realise that the total includes punctuation, at least you can see why it is reported that way.
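A rough Python sketch of why a "per 1000" figure can come out below 1000 even when every word is selected: the denominator is words plus punctuation. The total below is NOT a figure from Accordance; it is back-calculated from the 829 reported above, on the assumption that the same text as the earlier GNT-T search of Mark (11312 hits) was used.

```python
def per_thousand(hits: int, total_tokens: int) -> float:
    """Hits per 1000 tokens, where tokens = words plus punctuation marks."""
    return 1000.0 * hits / total_tokens


hits = 11312  # words in Mark from the GNT-T search earlier in the thread

# ASSUMPTION: the total token count is inferred from the reported rate of
# 829 per 1000; it is not a number read out of the program.
implied_total = round(hits * 1000 / 829)

print(implied_total)                             # 13645
print(round(per_thousand(hits, implied_total)))  # 829
```

If that assumption holds, the gap between the two totals would be the punctuation the table is counting.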

 

I know I haven't helped a great deal, but it may spark some further discussion.

 

Sorry it has taken so long to get back to you Joel.


  • 5 years later...

Maybe I have missed this in one of the replies on this topic, but is there also a way of searching within a range, e.g. I want to know how many Greek words there are between Mark 1–10 and separately between Mark 11–16 (leaving aside for now the question of whether Mark 16:9–20 were part of the original manuscript! . . .)? Thanks for your help (I've followed the distinction between "hits" and "words").


Yep. Just do this :

 

* [RANGE mk 1-10]

 

and

 

* [RANGE mk 11-16]

 

for the Greek text of your choice.

 

Thx

D


Thank you! That was very helpful – and easy when you know how!


  • 3 years later...

Do any of you know how to do a similar search for the number of letters in a book? Say the book of Acts in the NA28?

 



 

Hi Rex, as far as I know there is no way to do this directly in Accordance, however, you can brute force it (in a way).

 

Set the range to Acts (use the Range criterion in the + button), and of course set NA28 as the text in the search box.

 

Then type "=??" and press return. (The quotation marks are important here and must be included in the search bar.)

 

This gives you 614 hits for 2-letter words (1228 letters). Note that this includes breathings and accents, which Accordance counts as "letters".

 

Then add a ? to make "=???", which gives the hits for 3-"letter" words (1912 words, i.e. 5736 letters).

 

And rinse and repeat. I think it tops out at 21 ?'s. 
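Once you have the hit count for each word length, the letter total is each length multiplied by its hits, all added up. A minimal Python sketch with only the two figures given above filled in; the function name is illustrative, and you would add an entry for every other length you search:

```python
def total_letters(hits_by_length):
    """Sum of (word length x number of hits) over all word lengths."""
    return sum(length * hits for length, hits in hits_by_length.items())


# Word length -> hits, from the searches above (Acts, NA28).
# Only two lengths are filled in; the rest still have to be searched.
hits_by_length = {
    2: 614,   # "=??"  -> 2 * 614  = 1228 characters
    3: 1912,  # "=???" -> 3 * 1912 = 5736 characters
}

print(total_letters(hits_by_length))  # 6964
```

Remember that these are "letters" as Accordance counts them in this text, so breathings and accents are included in the lengths.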

 

I can't yet figure out how to exclude the breathings and accents in NA28. I'll keep working on it though.

 

Alternatively you could use a Tregelles' NT from the Accordance Exchange which does seem to ignore the accents and breathings in a letter count. 

 

You could just go through the same process as above. The TNT is based on the Tyndale House text (before SBLGNT). That might get you closer.

 

Though it's an interesting question, given the diversity of texts, why letter counts would matter - but I guess that's up to you 🙂

 

Hope that helps somewhat.

 

Please get back to me if this is unclear.


Thanks Ken! Letters matter in the reconstruction of fragmentary papyri: letters per line, lines per page, pages per codex. That sort of thing. I found another brute-force way to do it: I copied each chapter in NA28 and pasted it into Microsoft Word, then used the Word Count tool. Not ideal, but effective. Word does not count punctuation as characters, and you can get the count with no spaces, which is helpful in running script. A word and character count tool in Accordance would be great for those of us working in papyrology and codicology!
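Another brute-force option, if the text has been copied out as Unicode, is a short script that counts only base letters and ignores accents, breathings, punctuation and spaces. This is a sketch using Python's standard unicodedata module, not anything built into Accordance; the function name and the sample phrase are just for illustration.

```python
import unicodedata


def count_letters(text: str) -> int:
    """Count base letters, ignoring accents, breathings, punctuation and spaces."""
    # NFD splits each accented character into a base letter followed by
    # combining marks (accents, breathings, iota subscript).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters whose Unicode category is a letter (Lu, Ll, ...).
    return sum(1 for ch in decomposed if unicodedata.category(ch).startswith("L"))


sample = "Ἐν ἀρχῇ ἦν ὁ λόγος,"
print(count_letters(sample))  # 14
```

Note that an iota subscript becomes a combining mark after decomposition and so is not counted as a letter here; if the manuscript would have written it as a full iota, you would need to add those back in.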



Thanks Rex. That makes sense, of course! I should have thought of that. Word count is very easy (as outlined above). Pop your request into the "Feature Requests" forum. I agree, it sounds like a good idea 👍

 

Is Tregelles' NT a possibility? Though it sounds like your Word workaround is more efficient. You can copy words without accents and breathing marks. That might reassure you that the Word character count is accurate.
 

 


