Titles included in word counts?

January 5, 2018

Hi all - I'm a new Accordance user as of 2018. I've got a computational stylistics project that I'm working on, for which accurate word counts per text are relatively important. I'm using the NA28 Greek text, and it appears that the titles of the texts are included in the overall word count for each text. For instance, προς Ρωμαιους is included in the word count for Romans. Can anyone verify whether this is the case? If it is, then the word count for every book will need to be reduced by the number of words in its title. 1 Corinthians will have three fewer words, etc.

Ben

January 5, 2018

I also just noticed that the Holmes version of the Greek Apostolic Fathers includes not only the title but also any colophon material at the end of the text. Since these are scribal additions, they should not be included in the word count for the text itself.

January 6, 2018

Yep. Seems like they are. I did a search for προς and got 18 hits. The concordance shows the title in the list, and counting them up shows that the title hit was included in the count.

And a search for * against Romans returns 7116 words, while * <NOT> "ΠΡΟΣ ΡΩΜΑΙΟΥΣ" returns 7114.

As the titles are in verse 0, you should be able to exclude them, though it would be a hassle to have to define all the ranges. For the AF, I'm not sure whether that would work.
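For anyone doing the counting outside Accordance, here is a minimal sketch of the verse-0 exclusion. It assumes the text has been exported as (chapter, verse, text) rows with the title sitting in verse 0; the row layout, the sample data, and the `count_words` name are all hypothetical, not an actual Accordance export format.

```python
from typing import Iterable, Tuple

# Hypothetical row layout: (chapter, verse, text). Not a real Accordance export format.
Row = Tuple[int, int, str]

def count_words(rows: Iterable[Row], include_titles: bool = True) -> int:
    """Count whitespace-separated words, optionally skipping verse 0 (titles)."""
    total = 0
    for chapter, verse, text in rows:
        if verse == 0 and not include_titles:
            continue  # skip the title row
        total += len(text.split())
    return total

# Illustrative sample only: a title row plus the opening words of a verse.
sample = [
    (1, 0, "ΠΡΟΣ ΡΩΜΑΙΟΥΣ"),
    (1, 1, "Παῦλος δοῦλος Χριστοῦ Ἰησοῦ"),
]

with_titles = count_words(sample)
without_titles = count_words(sample, include_titles=False)
# The difference is the number of words in the title.
print(with_titles, without_titles, with_titles - without_titles)
```

The difference between the two counts is the size of the title, which is the same kind of gap as the 7116 vs. 7114 result for Romans.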

Thx

D

January 6, 2018

Daniel - Thanks for your reply. This is something that Accordance should be aware of, since many scholars get their numbers for word-counts from this software.

Ben

January 6, 2018

Accordance is just drawing the word counts from the published texts, though. If you have an issue with titles or other material being included, isn’t that an issue with the text as published by its editors? If you were using a Greek text that happened not to have titles included, Accordance wouldn’t include them in the count. So it’s a content problem, not a software problem. Just saying.

January 7, 2018

It’s definitely a data question, though the software needs to provide support, and whether it’s an issue or not depends upon what you are doing. It is like the issue of pericope headings, which keeps coming up. Various people, myself included, while not interested in them ourselves, do not see a need to prevent them being added so long as they do not interfere with search results. These book titles are similar. It is easy enough to exclude these titles (see above), but it poses a larger question of whether there ought to be a way to identify these separate bits of text (layers is how I think of them in my text model). In the end you can get different counts for a variety of reasons - if you look at the different scribal hands in a text, you’ll get different answers. If you look at different texts (WH, EPT, NA, etc.), you’ll get different counts too. If you look at different verse boundaries, you get different answers. Some things, like bracketed words, are already handled.

Something like the Codex Sinaiticus XML file goes to considerable lengths to identify all these different types of text in the document. You can then query what you want, though it can get complex. Again, it depends what you are trying to do. FWIW, looking at the beginning page of Romans in the online Codex Sinaiticus images, προς Ρωμαιους does appear at the start, though clearly in a different hand.
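As a rough sketch of what querying by layer could look like, assume each word is wrapped in a `<w>` element carrying a `layer` attribute. That markup is invented for illustration and is not the actual Codex Sinaiticus schema; the sample words and the `words_by_layer` name are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: every word is a <w> element with a "layer" attribute.
# This is NOT the real Codex Sinaiticus XML schema.
SAMPLE = """
<book name="Sample">
  <w layer="title">ΠΡΟΣ</w>
  <w layer="title">ΡΩΜΑΙΟΥΣ</w>
  <w layer="text">Παῦλος</w>
  <w layer="text">δοῦλος</w>
  <w layer="colophon">ἐγράφη</w>
</book>
"""

def words_by_layer(xml_text: str) -> Counter:
    """Tally words per layer; untagged words default to the main text."""
    root = ET.fromstring(xml_text)
    return Counter(w.get("layer", "text") for w in root.iter("w"))

counts = words_by_layer(SAMPLE)
body_only = counts["text"]  # excludes title and colophon words
print(dict(counts), body_only)
```

With the layers tagged in the data, the choice of what to count becomes a query decision rather than something baked into the text.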

While the NA and EPT texts in Acc both include verse 0 with the titles, Westcott-Hort does not. The data always gets you in the end.

Thx

D


benjaminlwhite


benjaminlwhite


Λύχνις Δαν


benjaminlwhite


gbjohnston


Λύχνις Δαν

