
Thread: Creating User Greek database (module) from UTF8 Unicode text

  1. #1

    Creating User Greek database (module) from UTF8 Unicode text

    Hi, I'm trying to make my first BW database (user Bible module), which is the unaccented Greek WH text from 1881. (The one in BW is from 1885. I want to use WH1881 to compare with the TR and with WH1885.)

    I have the text, and all the references are correct. It compiles correctly. However, it looks like gibberish. I assume this is because it's in Unicode UTF-8.

    [Screenshot: WH1881_sample_gibberish.png]

    I searched the forums, but can't figure out how to convert this unicode to something BW10 will understand. I tried opening the UTF-8 file in Notepad++ and changing the encoding to ANSI or to some other Greek encoding, but I can't seem to find what works.

    [Screenshot: EncodingOptionsInNotepad++.png]

    EDIT: I noticed the two options to import: BW Format & CCAT Format. I tried both with no success. I found mention on the forums that BW now supports importing Unicode, but I can't figure out how to do it.

    Do I actually need to run some kind of macro to convert the text? If so, where can I find the tools for that?

    Thank you!
    Eric
    Last edited by ETC; 02-12-2017 at 12:57 PM.

  2. #2

    hmm... usually the issue has been going from legacy fonts to unicode. Maybe one of the converters I list here will work for you moving from unicode to BW or CCAT.
    Mark G. Vitalis Hoffman
    Professor of Biblical Studies
    Lutheran Theological Seminary at Gettysburg
    ltsg.edu - CrossMarks.com
    Biblical Studies and Technological Tools

  3. #3

    Here's what BibleWorks support said

    I asked BW support about this and got this answer:

    In response to your question we can point out that BibleWorks will EXPORT in Unicode, but to import a version, via the Version Database Compiler, the base text must be in ASCII format. For Greek and Hebrew, the database must be in CCAT format. The Version Database Compiler will not support the import of a database in Unicode text.

    Information about creating version databases for use in BibleWorks can be found in Chapter 66 of the BibleWorks Help. A quick way to access this chapter is to open the Version Database Compiler, move the mouse pointer into the window, and press the F1 function key.

    It is hoped that this information will be helpful to you in this matter. We thank you for using BibleWorks.
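
    Since the compiler only accepts ASCII input, a quick check of the source file before compiling can save a failed attempt. The following is a minimal sketch, not anything supplied by BibleWorks; the function name `non_ascii_lines` is made up for illustration.

```python
def non_ascii_lines(text: str):
    """Return (line_number, line) pairs for lines containing non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()  # str.isascii() needs Python 3.7 or later
    ]


if __name__ == "__main__":
    sample = "MAT 1:9 oziaj de\nMAT 1:9 οζιας δε"
    for number, line in non_ascii_lines(sample):
        print(f"line {number} still contains Unicode characters: {line}")
```

    An empty result means every character is plain ASCII. It says nothing about whether the text follows the CCAT conventions.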

  4. #4

    Thank you, Mark!

    @Mark G. Vitalis Hoffman

    Thank you very much for those links. The Unicode to BWGRKL font macro linked on your site (it's by John Kendall and it's downloadable here: https://www.bibleworks.com/forums/sh...=2176#post2176) worked great to convert the text (as far as I know; I've not checked very thoroughly yet).

    @Anyone

    I do have one problem. In the text there are embedded notes in {curly brackets}. They came through as in the following screenshot. I looked at the BW documentation, and I thought they were formatted correctly.

    Mt 1:9 was originally like this: MAT 1:9 οζιας δε εγεννησεν τον ιωαθαμ ιωαθαμ δε εγεννησεν τον αχας αχας{UBS4: αχαζ αχαζ} δε εγεννησεν τον εζεκιαν
    Mt 1:9 was this way in the imported text (bwgrkl): MAT 1:9 oziaj de egennhsen ton iwaqam iwaqam de egennhsen ton acaj acaj{UBS4: acaz acaz} de egennhsen ton ezekian
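
    For anyone who would rather script the conversion than run the Word macro, the two Mt 1:9 lines above already show the letter mapping. This is only a rough sketch built from those two lines, and it is not John Kendall's macro: the table covers just the letters that occur in Mt 1:9, and the names `UNICODE_TO_BWGRKL` and `to_bwgrkl` are my own. Any Greek letter not in the table raises an error instead of being guessed.

```python
# Letter pairs attested by the Mt 1:9 sample in this post. Letters that do not
# occur in that verse are deliberately left out rather than guessed.
UNICODE_TO_BWGRKL = {
    "α": "a", "γ": "g", "δ": "d", "ε": "e", "ζ": "z", "η": "h",
    "θ": "q", "ι": "i", "κ": "k", "μ": "m", "ν": "n", "ο": "o",
    "σ": "s", "ς": "j", "τ": "t", "χ": "c", "ω": "w",
}


def to_bwgrkl(text: str) -> str:
    """Convert unaccented lower-case Unicode Greek to BWGRKL-style ASCII."""
    out = []
    for ch in text:
        if ch in UNICODE_TO_BWGRKL:
            out.append(UNICODE_TO_BWGRKL[ch])
        elif ch.isascii():
            out.append(ch)  # references, spaces, braces, Latin note labels
        else:
            raise ValueError(f"no mapping for {ch!r}")
    return "".join(out)


if __name__ == "__main__":
    print(to_bwgrkl("MAT 1:9 οζιας δε εγεννησεν τον ιωαθαμ"))
```

    The remaining letters would have to be added from the BWGRKL font documentation before running this over a whole New Testament.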

    Best I can tell from Help Chapter 66, this is exactly the right way to embed the notes. "Embedded notes" was selected, and the database was compiled in CCAT format.

    [Screenshot: Mt1_9_embedded_note_problem.png]
    Last edited by ETC; 02-13-2017 at 03:24 PM. Reason: added the bwgrkl text for Mt 1:9 and reference to Help Chapter 66, etc.

  5. #5

    Inserting spaces around {embedded note} does not help

    I got to thinking that maybe the problem was because the {embedded note} in the .txt file did not have spaces around it on both sides. So I changed Mt 1:9 in the source .txt file.

    As can be seen in this screenshot, it did not fix the problem. All it did was make the spacing better. The opening { is being interpreted as a rough breathing mark and an accent. The closing } is being interpreted as a Greek full stop.

    [Screenshot: Mt1_9_embedded_note_problem_after_inserting_space.png]

    I have contacted support again. I sent them the .txt file.

    EDIT: Actually there appears to be another issue. There should be no diacritical marks in this text, but there is also one on the alpha in οζια in Matt. 1:9 rather than presenting it as οζιας. Maybe I didn't do the conversion right. Seems there were several macros in the downloaded file I mentioned above. Maybe I launched the wrong one. Hmm.
    Last edited by ETC; 02-14-2017 at 04:22 AM.

  6. #6
    Join Date
    Apr 2004
    Posts
    616

    There may be other issues in your text (as indicated by the accent mark that you discussed), but I think that you will have to use End notes instead of Embedded notes for the Greek. I doubt that anyone has ever tried including Embedded notes in Greek. In fact, the ability to add any notes at all to Greek text was only added in the last two BibleWorks versions. You will need to make sure that you use the correct format for the end notes. I suggest exporting a section of the WHT Greek text in CCAT format to see the format and how the notes are formatted. (There are some notes in the WHT, though not extensively. Mark 15:1 has a brief note, for example.)

    As you have found out, the text must be completely in ANSI, not Unicode format, or the file will not compile correctly. Any deviation from the specifications described in the Help file will likely result in an unstable version, which may result in program crashes. Exporting a similar version and following that pattern is a very helpful approach to creating versions.

    Blessings,
    Glenn
    Glenn Weaver

    For technical support, please contact Customer Support.

  7. #7

    Thanks, Glenn!

    Thanks for helping me with my newbie questions on this.

    It's good to know that I must not use Embedded notes, but End notes. I have read chapter 66 of the Help more than once, and it's printed out on my desk with highlighting. I don't recall anything in it about Embedded Notes not working for Greek, so I guess that was just a newbie mistake.

    Also, in my first attempt to import, I did not use CCAT format. I guess I wasn't going for "best practice". The help says, "It is best to use the CCAT format, especially for Hebrew versions." Based on the feedback I've received from BW support, I now understand it is essential to use CCAT for Greek and Hebrew versions.

    I'll see if I can get closer to what I need. Thanks for the help!

  8. #8

    Glenn Weaver from BW support has provided more valuable information. I'm posting it so that it'll be available for others and so that I can find it again the next time I need to do this.

    I looked at the text file that you provided, and I have a couple of items for you to consider when compiling a version. (I also responded earlier on the Forum, but I will go into more detail here.)
    1) As you know, the files cannot be in Unicode format. BibleWorks can export to Unicode, but the Version Database Compiler cannot accept Unicode input files.

    2) Unicode is not only the format of the input text file, but also the actual font encoding. The characters are very different. The characters need to be converted from Unicode to the proper ANSI text input format. (You did this by using the Word macro.)

    3) The format of your text file is in the BWGRKL font format, not in CCAT format. The CCAT format is the format developed by the Center for the Computer Analysis of Texts. The Greek CCAT format is similar to the BWGRKL format, but is not exactly identical. (The Hebrew CCAT format is vastly different from the BWHEBB format!) For example, CCAT does not provide a final sigma in the first word of Mat 1:9. The BWGRKL format specifies the letter 'j' as a final sigma, while the CCAT text uses the letter 's', which is also the medial form. When using the Version Database Compiler, it is better to import text in the CCAT format instead of the BWGRKL or BWHEBB format.

    4) If you wish to include notes with the Greek text, you have to specify the CCAT format. The notes will not compile correctly using the BWGRKL format.

    5) The embedded notes probably will not work. Until recent versions, it was not possible to include notes at all with Greek texts. I expect that only Endnotes are supported for Greek texts. You will need to make sure you use the proper endnote format as specified in the Help file. (While it is useful to export versions to see what the files look like, please keep in mind that the input specification has changed since some of the files have been compiled, and you will have to use the newer specification as listed in the help file.)

    In your example of Mat 1:9, your text compiles using the following endnote format, when selecting the CCAT input format. You have to move the notes to the end of the verse, and there is no indicator in the text where the note belongs. You have to include the { <p><nsup>1</nsup> } content, and you have to use the <g> and </g> tags to show Greek in the notes.

    Mat 1:9 oziaj de egennhsen ton iwaqam iwaqam de egennhsen ton acaj acaj de egennhsen ton ezekian { <p><nsup>1</nsup>UBS4: <g>acaz acaz</g>}

    (Notice that I did not change the BWGRKL format of the final sigma to the CCAT specification. You will still need to convert your text to CCAT instead of BWGRKL format to show the proper characters.)

    6) The first line that you added to the top of the file should be removed before compiling.

    7) The 3-letter abbreviations in BW are not all-caps, but that does not appear to hinder the compiler from compiling the file.

    8) The text file format is in UNIX format. Ordinarily it should be in Dos/Windows format, but it does not appear to be hindering the compiler.
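
    On item 8: if the UNIX line endings ever do trip up the compiler, converting them is a one-liner. A minimal sketch, assuming the file is read and written in binary mode; the function name is made up for illustration.

```python
def to_dos_line_endings(data: bytes) -> bytes:
    """Convert LF (UNIX) line endings to CRLF (DOS/Windows)."""
    # Collapse any existing CRLF to LF first so already-converted lines
    # do not end up with a doubled carriage return.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


if __name__ == "__main__":
    print(to_dos_line_endings(b"MAT 1:8 ...\nMAT 1:9 ...\n"))
```

    Editors such as Notepad++ can do the same conversion from their menus.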
    I forgot to add that you should also check the box that the text “Has Superscripts”, even though it does not have superscripts. This addresses an undocumented bug in the compiler.
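
    Items 3 and 5 can be scripted together for the Mat 1:9 case. The sketch below rests on several assumptions: one verse per line in the form "BOOK chapter:verse text", at most one embedded note per verse, notes shaped like "label: greek", and final sigma as the only BWGRKL/CCAT difference handled (there may be others). The function name `to_endnote_line` is hypothetical. The Greek inside the note is left in BWGRKL form, as in the example line above.

```python
import re

# An embedded note such as {UBS4: acaz acaz}, with any surrounding spaces.
NOTE_RE = re.compile(r"\s*\{([^{}]*)\}\s*")


def to_endnote_line(line: str) -> str:
    """Move one embedded note to the end of the verse and use 's' for final sigma."""
    notes = NOTE_RE.findall(line)
    if len(notes) > 1:
        # The thread shows the format for a single note only.
        raise ValueError("more than one note in a verse is not handled")
    # Drop the note from the running text and tidy the spacing.
    text = " ".join(NOTE_RE.sub(" ", line).split())
    book, ref, verse = text.split(" ", 2)
    # CCAT uses 's' where BWGRKL uses 'j' (verse text only, not the note).
    verse = verse.replace("j", "s")
    out = f"{book} {ref} {verse}"
    for note in notes:
        label, _, greek = note.partition(": ")
        out += f" {{ <p><nsup>1</nsup>{label}: <g>{greek}</g>}}"
    return out


if __name__ == "__main__":
    print(to_endnote_line(
        "MAT 1:9 oziaj de egennhsen ton iwaqam iwaqam de egennhsen ton "
        "acaj acaj{UBS4: acaz acaz} de egennhsen ton ezekian"
    ))
```

    The endnote format here is copied from the single example in this post, so it should be checked against the Help file before being applied to a whole text.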
    One thing I have not found is the specs for the CCAT format. If anyone knows where exactly they are, please let me know. Thank you!
    Last edited by ETC; 02-14-2017 at 03:28 PM.

  9. #9
    Join Date
    Jul 2004
    Posts
    174

    If I had to compile WHH1885 myself, I would do it this way:
    In BW there is WHT database (the New Testament in the Original Greek, The Text Revised by Brooke Foss Westcott D.D. and Fenton John Anthony Hort D.D. (Macmillan: Cambridge and London, 1885).

    1) Export “WHT” database (from Mat 1:1 to Rev 22:21) using the BW tool ”Export database” and save as “rtf” format (“Whh85.rtf”) and as “CCAT” format “Whh85.cat”;

    2) Convert your Unicode text using the “Unicode to BWGRKL font macro” (originally created 08/2004 by John Kendall, Cardiff, Wales). Save the converted text twice: as Whh-81.docs (or doc format) and as “Whh-81.txt” (txt format). If the text contains any wrong characters, you must correct them. Import “Whh-81.txt” with the BW tool “Version Database Compiler” and edit Whh-81.ddf as follows: Description: WHH – 1881; Version ID: WHH-81; Version #: 7000 (or another number); Content: NT; Language: Greek; select: Input in BW format; End Notes* (if you edit notes in the Greek text); Install after Compiling; Print Blank References.

    *Note on compiling End Notes in a Greek Bible: “You can add end notes to Greek compiled versions. You must set the endnote option in the DDF file. The note is appended to each line (at the end of the Bible verse) preceded by <<<. (Export WHT for an example.) Other than the endnote setting, the DDF is the same as a normal Greek text DDF. Any Greek in the note must be in the bwgrkl format and the verse text must be in CCAT format.” Pattern: <<<English text in “Arial” font <g>Greek text in “bwgrkl” font</g> English text in “Arial” font.

    FIRST RAW CORRECTION

    3) Compare WHH-81 against WHT with the BW tool “Text comparisons setting” and correct the differences. (At this point you can fix conversion errors and differences from your printed edition; “Whh85.rtf” is useful only for copying the correct words into your Whh-81.docs (or doc) text.) Save the “final” text as “whh-81.txt” and compile the GNT again in the Version Database Compiler with the corrected “whh-81.txt”.
    4) Export the resulting text (WHH-81 database) with the BW tool “Export database” and save it as whh-81.cat (CCAT format).
    5) Import “whh-81.cat” into BW (it replaces the previous database compiled from “whh-81.txt”); select Input in CCAT format.

    SECOND CORRECTION/FINAL CORRECTION
    6) Compare WHH-81 against WHT again with the BW tool “Text comparisons setting” and correct the differences (conversion errors and differences from your printed edition). “Whh85.cat” is now useful for copying the correct words into your whh-81.cat. Save the “final” text as “whh-81.cat” and compile the GNT again in the Version Database Compiler with the latest “whh-81.cat” (select Input in CCAT format).
    7) Compile your whh-81.vmf (you can copy wht.vmf and rename it as whh-81.vmf).
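
    The comparison in steps 3 and 6 is done inside BibleWorks, but a rough pre-check on the two text files can be run outside it as well. This is only a sketch and does not replace the BW comparison tool: it assumes both files hold one verse per line in the form "Book chapter:verse text", and the function names are made up for illustration.

```python
def parse_verses(lines):
    """Map 'book chapter:verse' (lower-cased) to whitespace-normalized text."""
    verses = {}
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) == 3:
            book, ref, text = parts
            verses[f"{book.lower()} {ref}"] = " ".join(text.split())
    return verses


def differing_refs(lines_a, lines_b):
    """List references whose text differs, or that exist in only one file."""
    a, b = parse_verses(lines_a), parse_verses(lines_b)
    changed = [ref for ref in a if a[ref] != b.get(ref)]
    missing = [ref for ref in b if ref not in a]
    return changed + missing


if __name__ == "__main__":
    mine = ["MAT 1:9 ton acas acas de"]
    other = ["Mat 1:9 ton acaz acaz de"]
    print(differing_refs(mine, other))
```

    Book abbreviations are compared case-insensitively, since the thread notes that BW uses mixed-case three-letter abbreviations while the source file here uses all caps.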

    Pasquale
    Last edited by pasquale; 02-15-2017 at 07:51 AM.

  10. #10

    Thank you, Pasquale, for the detailed procedure.

    I'm sure your approach is a lot better than anything I'd thought up so far. Unfortunately, I'm going to have to put this on hold for a while since I have some other more pressing matters. I had thought importing a version into BW10 was going to be straightforward. I was incorrect. But it is doable, and I hope to follow your instructions when I can get back to it.

    Thanks again.
    Eric
    Last edited by ETC; 02-28-2017 at 10:25 AM. Reason: Change "you're" to "your" :)
