The BibleWorks Hebraica/Judaica project



bkMitchell
10-07-2010, 12:46 AM
Hello BibleWorks users and forum members,

Would you like to see more Hebrew modules?
Would you be happy to use any of the following (or any other Hebraica) in BibleWorks?
If we work together, I think it is possible to increase the number of Hebrew modules available to BibleWorks users. Here are a few things that might be handy to have within BibleWorks.


The Siddur / Siddurim
Jacob ben Chajim ibn Adonijah's Tanakh
The Mishnah, Talmud, Mishneh Torah
The Masorah Parva
The Masorah Magna
The Masorah Finalis
Rashi's commentary on the Pentateuch


Entering Hebrew text into BibleWorks is not that hard, but it is not as easy as copying and pasting Unicode Hebrew text into the BW compiler. One must first transcribe Hebrew/Greek texts into CCAT format.


What is CCAT?
CCAT stands for the Center for Computer Analysis of Texts.


On the BibleWorks forums, CCAT most often means a transliteration scheme (for Hebrew and Greek) called Beta Code. It uses only ASCII characters, which are readily available on all computers. Special Hebrew, Greek, Coptic, etc. fonts are then mapped onto the CCAT/Beta codes.


What then is Beta Code?
Beta Code was created by David Woodley Packard, a linguist, humanist, and professor, more famously known as the son of a co-founder of the Hewlett-Packard Company. It was first adopted in the 1970s for the Thesaurus Linguae Graecae (TLG), then (in 1976, I assume) by the GRAMCORD Institute, in 1987 by the Perseus Project, by BibleWorks, and by many others.

What does CCAT/Beta Code look like?

Here is the first line from the first tractate of the Mishnah Database I have started working on:


Mishnah:
Seder Zeraim; Tractate Berakhot


With vowel points
מֵאֵימָתַי קוֹרִין אֶת שְׁמַע בְּעַרְבִית
M")"YMFTAY QWORIYN )ET $:MA( B.:(AR:BIYT

Without vowel points
מאימתי קורין את שמע בערבית
M)YMTY QWRYN )T $M( B(RBYT




מאימתי קורין את שמע בערבית
משעה שהכהנים נכנסים לאכל בתרומתן עד סוף האשמורה הראשונה דברי רבי אליעזר
וחכמים אומרים עד חצות
רבן גמליאל אומר עד שיעלה עמוד השחר
מעשה שבאו בניו מבית המשתה אמרו לו לא קרינו את שמע
אמר להם אם לא עלה עמוד השחר חיבין אתם לקרות
ולא זו בלבד אלא כל מה שאמרו חכמים עד חצות מצותן עד שיעלה עמוד השחר
הקטר חלבים ואברים מצותן עד שיעלה עמוד השחר
וכל הנאכלין ליום אחד מצותן עד שיעלה עמוד השחר
אם כן למה אמרו חכמים עד חצות כדי להרחיק את האדם מן העבירה




M$NH ZW BMHDWRH HMBW)RT
M)YMTY QWRYN )T $M( B(RBYT
M$(H $HKHNYM NKNSYM L)KL BTRWMTN (D SWP H)$MWRH HR)$WNH DBRY RBY )LY(ZR
WXKMYM )WMRYM (D XCWT
RBN GMLY)L )WMR (D $Y(LH (MWD H$XR
M($H $B)W BNYW MBYT HM$TH )MRW LW L) QRYNW )T $M(
)MR LHM )M L) (LH (MWD H$XR XYBYN )TM LQRWT
WL) ZW BLBD )L) KL MH $)MRW XKMYM (D XCWT MCWTN (D $Y(LH (MWD H$XR
HQ+R XLBYM W)BRYM MCWTN (D $Y(LH (MWD H$XR
WKL HN)KLYN LYWM )XD MCWTN (D $Y(LH (MWD H$XR
)M KN LMH )MRW XKMYM (D XCWT KDY LHRXYQ )T H)DM MN H(BYRH

It only took about 4 minutes to transcribe this from Unicode. It is easier to enter unpointed Hebrew this way than it is to enter the vowel points and (in some texts) the accents.
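For anyone curious how regular the unpointed transcription is, here is a minimal Python sketch of the letter-for-letter mapping. It is only an illustration: the table is inferred from the examples in this post (final forms share the letter of their regular form, and unpointed shin is written `$`), and the names `CCAT_LETTERS`, `CONSONANTS`, and `to_ccat_unpointed` are made up for this sketch, not part of BibleWorks.

```python
# Assumed mapping, inferred from the unpointed examples in this thread.
# The 27 code points U+05D0..U+05EA (alef..tav, including final forms)
# are paired in order with their CCAT letters.
CCAT_LETTERS = ")BGDHWZX+YKKLMMNNS(PPCCQR$T"
CONSONANTS = {chr(0x05D0 + i): c for i, c in enumerate(CCAT_LETTERS)}

def to_ccat_unpointed(text):
    """Replace each Hebrew consonant; leave every other character untouched."""
    return "".join(CONSONANTS.get(ch, ch) for ch in text)

print(to_ccat_unpointed("מאימתי קורין את שמע בערבית"))
# prints: M)YMTY QWRYN )T $M( B(RBYT
```

Spaces and punctuation pass through unchanged, and so do vowel points, so a sketch like this is only suitable for text that is already unpointed.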

Would you be interested in the Mishnah or other Jewish and Hebraica texts?
Would you be interested in helping to type, or in proofreading? Are there any other BW users working on Hebrew databases? If so, let's work together.

Grace and Peace

jdarlack
10-21-2010, 05:58 AM
Great initiative, BK! Someone with the time, patience and know-how could conceivably create some kind of program that would 'convert' Unicode Hebrew into CCAT. (Unfortunately I am lacking in all of the above right now.) If you have a non-copyrighted (or a copyleft) Unicode text such a tool could ease your pain considerably. Of course copyright opens a whole thorny mess of issues, so in many ways the transcription of text from a public domain text would save a bit of hassle.

bkMitchell
10-21-2010, 09:02 AM
... If you have a non-copyrighted (or a copyleft) Unicode text such a tool could ease your pain considerably. Of course copyright opens a whole thorny mess of issues, so in many ways the transcription of text from a public domain text would save a bit of hassle.

I am sure you are familiar with the term, "Free content (http://en.wikipedia.org/wiki/Free_content)"
I know of the Free Content Mishnah (http://en.wikisource.org/wiki/Wikisource:Open_Mishnah_Project) (both Hebrew and English) and the public domain Mikraot Gedolot (http://en.wikisource.org/wiki/Mikraot_Gedolot) project, both in progress and in Unicode. The text of the Mishnah I am working with is primarily the one just mentioned. What do you think of these? Would there be any thorny issues to deal with? I am new to this.

(I do know of safe public domain Hebrew Texts in PDF format)

jdarlack
10-22-2010, 10:30 AM
I would say that in both cases, these would be fine to use (given that the websites pretty much say as much). As a courtesy, I'd offer whatever you compile to the administrators of the site. They may appreciate having their material made available in other formats.

Several years ago, I tossed around the idea of converting all of the rabbinic texts available at http://www.mechon-mamre.org/. I contacted the website administrators, and unfortunately they forbade me to use their texts (even though their texts were available to all freely online). I am not sure if it was because of a copyright issue as much as it might have been a "theological" issue for them.

bkMitchell
10-25-2010, 02:56 AM
Several years ago, I tossed around the idea of converting all of the rabbinic texts available at http://www.mechon-mamre.org/. I contacted the website administrators, and unfortunately they forbade me to use their texts (even though their texts were available to all freely online). I am not sure if it was because of a copyright issue as much as it might have been a "theological" issue for them.

Now, this is truly intriguing and somewhat disturbing, because according to the Wikisource Open Mishnah Project:

"Mechon Mamre has graciously granted us permission to use its text of the Mishnah according to Maimonides version" (link) (http://en.wikisource.org/wiki/Wikisource:Open_Mishnah_Project/Permissions)

Not to mention that Biola University's Unbound Bible (link) (http://unbound.biola.edu/index.cfm?method=unbound.welcome) has the Aleppo Codex in electronic form, also from Mechon Mamre:

The Aleppo Codex without Vowel Points or Punctuation. Based on the electronic edition at http://www.mechon-mamre.org (http://www.mechon-mamre.org/). Imported from the CrossWire Bible Society's (http://www.crosswire.org/) "The Sword Project" (http://www.crosswire.org/sword/) Bible Modules (http://www.crosswire.org/sword/modules/).

I don't understand why you weren't allowed, when a secular organization, a Christian university's online Bible program, and another Christian software program seem to be legally allowed to use texts from MM, and yet they refused you. In fact, I remember you once mentioning the Aleppo Codex on the forums as an addition for BW.

jdarlack
10-25-2010, 10:29 AM
Now, this is truly intriguing and somewhat disturbing

Intriguing and disturbing indeed!

Michael Hanel
10-25-2010, 11:49 AM
I don't understand why you weren't allowed, when a secular organization, a Christian university's online Bible program, and another Christian software program seem to be legally allowed to use texts from MM, and yet they refused you. In fact, I remember you once mentioning the Aleppo Codex on the forums as an addition for BW.

My *guess* is because Jim specifically asked, whereas the other parties might not have. The problem is that there are no crystal clear lines when it comes to this e-universe. If a website uses a public domain text, but they made the e-edition themselves, do they hold some "right" over that text?

The other wrinkle is that some people don't want their text associated with a commercial product, but they are okay with the text being available for freeware programs. For instance, I believe there are copyrighted texts available in e-sword that are only there because the copyright holders allowed their text to be part of it gratis, which they only did because it was a freeware program (proviso: I do not claim to know the specific arrangements, I'm just inferring that a free program is not paying licensing costs otherwise it really is a labor of love.). Commercial products attract more attention. If there is someone out there who thinks someone is infringing on copyrights, that person may use a lawyer to tell that person to remove the e-text or give proper attribution. But when a commercial business is involved, a warning letter could quickly turn into a lawsuit.

BigJayOneill
10-26-2010, 11:30 AM
I would love to see BibleWorks and the BW community go into this direction. To be able to interact with the Talmud in a manner that BW allows would be a great blessing. I would also like to add another suggestion: the siddur, the Jewish prayer book, would be a fantastic module for BW. Many elements within the liturgy can be traced to the era of the second temple. I have thought about doing it myself and have made a few small modules for my personal usage. However, I do know from experience, thanks to my Hebrew Amidah and Pirke Avot user database modules, that typing these texts into BibleWorks format is time consuming and difficult on the eyes!


BTW- I do know that Spertus College of Judaica has BibleWorks available for those who have electronic access to their library. Market BibleWorks within Jewish circles and I bet we would get all kinds of interesting contributions!

bkMitchell
10-27-2010, 12:00 AM
I would love to see BibleWorks and the BW community go into this direction. To be able to interact with the Talmud in a manner that BW allows would be a great blessing. I would also like to add another suggestion: the siddur, the Jewish prayer book, would be a fantastic module for BW...
BTW- I do know that Spertus College of Judaica has BibleWorks available for those who have electronic access to their library. Market BibleWorks within Jewish circles and I bet we would get all kinds of interesting contributions!

Hey, BigJayOneill

thank you for voicing your support and making some good suggestions, too!

BigJayOneill
10-28-2010, 10:24 AM
It is my pleasure. Thank you for your contribution to the BibleWorks community!

bkMitchell
10-28-2010, 08:52 PM
It is my pleasure. Thank you for your contribution to the BibleWorks community!

Thank me later, I haven't finished or released anything, yet. However, I have added your Siddur suggestion to the master list and as a priority. I think I will be able to complete the entering of a small siddur faster than any of the other projects. I have located a few public domain Siddurim in good condition.

I'll be back,
Brian

BigJayOneill
10-29-2010, 03:45 PM
Thank me later, I haven't finished or released anything, yet. However, I have added your Siddur suggestion to the master list and as a priority. I think I will be able to complete the entering of a small siddur faster than any of the other projects. I have located a few public domain Siddurim in good condition.

I'll be back,
Brian


This is great news!

I know of a few online siddurim that I have utilized in order to compile a small "morning service" collection within my Kindle (PDF files with large font). If I could learn of a way in which to convert unicode docs into CCAT/ BW format then I would "go nuts" making modules! I would LOVE to get the Texts you mentioned into BibleWorks. :D

benelchi
11-13-2010, 10:19 PM
Converting a Hebrew text into CCAT would be very easy to do programmatically. If you have access to the Hebrew text (assuming it is in a documented format like Unicode), a conversion program would be very easy to write, and it would save many hours of typing and of proofreading for transcription errors.

I am very much interested in having these resources available in BibleWorks, so let me know if I can help with the software for converting the text.
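To give a rough idea of what such a conversion program might look like for pointed text, here is a minimal Python sketch. It is not the program discussed in this thread. The mapping tables are assumptions inferred from the CCAT examples posted here (dagesh written as `.` before the vowel, shewa as `:`, tsere as `"`, sin as `&`, shin as `$`), accents and meteg are simply dropped, and the name `to_ccat_pointed` is hypothetical.

```python
import unicodedata

# Consonants U+05D0..U+05EA paired in order with their assumed CCAT letters.
CONSONANTS = {chr(0x05D0 + i): c
              for i, c in enumerate(")BGDHWZX+YKKLMMNNS(PPCCQR$T")}
# Vowel points (assumed codes, inferred from the thread's examples).
VOWELS = {
    "\u05B0": ":",  "\u05B1": ":E", "\u05B2": ":A", "\u05B3": ":F",
    "\u05B4": "I",  "\u05B5": '"',  "\u05B6": "E",  "\u05B7": "A",
    "\u05B8": "F",  "\u05B9": "O",  "\u05BB": "U",
}
DAGESH, SIN_DOT = "\u05BC", "\u05C2"

def to_ccat_pointed(text):
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        i += 1
        if ch not in CONSONANTS:
            # Keep spaces and punctuation; drop stray combining marks.
            if unicodedata.combining(ch) == 0:
                out.append(ch)
            continue
        # Gather every combining mark attached to this consonant.
        marks = []
        while i < len(text) and unicodedata.combining(text[i]) != 0:
            marks.append(text[i])
            i += 1
        out.append("&" if SIN_DOT in marks else CONSONANTS[ch])
        if DAGESH in marks:
            out.append(".")          # dagesh goes before the vowel
        out.extend(VOWELS[m] for m in marks if m in VOWELS)
    return "".join(out)

# shin + shewa + shin dot, mem + patah, ayin
print(to_ccat_pointed("\u05E9\u05B0\u05C1\u05DE\u05B7\u05E2"))  # prints: $:MA(
```

Collecting the marks per consonant first means the result does not depend on the order in which the source text stores dagesh, vowel, and shin/sin dot, which can differ between Unicode sources.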

benelchi
11-13-2010, 10:23 PM
This is great news!

I know of a few online siddurim that I have utilized in order to compile a small "morning service" collection within my Kindle (PDF files with large font). If I could learn of a way in which to convert unicode docs into CCAT/ BW format then I would "go nuts" making modules! I would LOVE to get the Texts you mentioned into BibleWorks. :D


Kindle documents (and some PDFs) are encrypted. Additionally, PDFs may contain images of the text rather than true Hebrew Unicode. If you cannot select the text and paste it into another document like MS Word, then it likely is not a usable source for conversion. If the source does allow the text to be selected and copied, it could be placed in a document that is easy to convert to CCAT. PDF is not the easiest document format to convert from directly.
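A quick way to check whether pasted text really is Hebrew Unicode (rather than garbage from a font-mapped or image-based PDF) is to count how many of its characters fall in the Unicode Hebrew block, U+0590 to U+05FF. The Python sketch below is only a rough heuristic, and the function name `hebrew_ratio` is made up for illustration.

```python
def hebrew_ratio(text):
    """Fraction of non-space characters that lie in the Hebrew block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    hebrew = sum(1 for ch in chars if "\u0590" <= ch <= "\u05FF")
    return hebrew / len(chars)

# A value near 1.0 suggests real Hebrew Unicode; a value near 0.0 suggests
# the copy produced Latin letters or symbols from a legacy font mapping.
print(hebrew_ratio("\u05E9\u05DE\u05E2"))  # prints: 1.0
```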

bkMitchell
11-15-2010, 01:03 AM
If you cannot select the text and paste it into another document like MS Word, then it likely is not a usable source for conversion. If the source does allow the text to be selected and copied, it could be placed in a document that was easy to convert to CCAT. The PDF format is not the most easy document format to convert from directly.

I am using texts that are in Unicode and that are 'free content'.

benelchi
11-15-2010, 07:09 PM
I am using texts that are in Unicode and that are 'free content'.

If you can point me to the source, I should be able to make a conversion program.

BigJayOneill
11-15-2010, 08:54 PM
If you can point me to the source, I should be able to make a conversion program.


This would be great!

bkMitchell
11-17-2010, 09:39 PM
If you can point me to the source, I should be able to make a conversion program.

"Ask, and it shall be given you;" (Mat 7:7)

A Free Content Mishnah
The Open Mishnah Project at Wikisource (http://he.wikisource.org/wiki/%D7%9E%D7%A9%D7%A0%D7%94)

The following link has lots of Judaica
Hebrew Wikisource (http://he.wikisource.org/wiki/%D7%A2%D7%9E%D7%95%D7%93_%D7%A8%D7%90%D7%A9%D7%99)

I am not using this one, but it is pretty nice:
Free Siddur Project (http://siddur.arielbenjamin.com/texts)

BigJayOneill
11-18-2010, 01:14 PM
I like the following: http://www.onlinesiddur.com/

In addition, if you copy some text from any of the prayers and paste it into a search bar you will find numerous sources for siddurim texts. I am sure we could do the same thing with other Jewish texts as well.

If you create the program, please let us know. I am sure we will come up with a few ways to utilize it!

bkMitchell
11-21-2010, 07:29 PM
Some Good news, AND a challenge!

I have just finished converting or rather re-transcribing a small Ashkenazi Siddur with vowel points into the CCAT format.
When a 12-point font is used, it takes about 69 pages in OpenOffice Writer.
(Sorry, I have been busy at work, so I was a little on the slow side with this project. From now on I should have more time.)

Now, I need help in thinking about how to arrange/organize the text for use with BibleWorks database compiler.

The issue here is the fact that:
Prayer books are usually not numbered and versified like Bible texts. However, to import the Siddur text into the database compiler, it will need to be versified.

Does anyone have any suggestions?

bkMitchell
11-22-2010, 01:59 AM
Just for fun
Here is a famous song/Prayer from the siddur
presented in both Unicode and CCAT BetaCode


אֲדוֹן עוֹלָם

)DWN (WLM


אֲדוֹן עוֹלָם אֲשֶׁר מָלַךְ
):ADWON (WOLFM ):A$ER MFLAK:


בְטֶֽרֶם כָּל יְצִיר נִבְרָא
B:+EREM K.FL Y:CIYR NIB:RF)


לְעֵת נַעֲשָׂה בְחֶפְצוֹ כֹּל
L:("T NA(:A&FH B:XEP:CWO K.OL


אֲזַי מֶֽלֶךְ שְׁמוֹ נִקְרָא
):AZAY MELEK: $:MWO NIQ:RF)


וְאַחֲרֵי כִּכְלוֹת הַכֹּל
W:)AX:AR"Y K.IK:LWOT HAK.OL


לְבַדּוֹ יִמְלוֹךְ נוֹרָא
L:BAD.WO YIM:LWOK: NWORF)


וְהוּא הָיָה וְהוּא הֹוֶה
W:HW.) HFYFH W:HW.) HOWEH


וְהוּא יִהְיֶה בְּתִפְאָרָה
W:HW.) YIH:YEH B.:TIP:)FRFH


וְהוּא אֶחָד וְאֵין שֵׁנִי
W:HW.) )EXFD W:)"YN $"NIY


לְהַמְשִׁיל לוֹ לְהַחְבִּֽירָה
L:HAM:$IYL LWO L:HAX:B.IYRFH


בְּלִי רֵאשִׁית בְּלִי תַכְלִית
B.:LIY R")$IYT B.:LIY TAK:LIYT


וְלוֹ הָעֹז וְהַמִּשְׂרָה
W:LWO HF(OZ W:HAM.I&:RFH


וְהוּא אֵלִי וְחַי גֹּאֲלִי
W:HW.) )"LIY W:XAY G.O):ALIY


וְצוּר חֶבְלִי בְּעֵת צָרָה
W:CW.R XEB:LIY B.:("T CFRFH


וְהוּא נִסִּי וּמָנוֹס לִי
W:HW.) NIS.IY W.MFNWOS LIY


מְנָת כּוֹסִי בְּיוֹם אֶקְרָא
M:NFT K.WOSIY B.:YWOM )EQ:RF)


בְּיָדוֹ אַפְקִיד רוּחִי
B.:YFDWO )AP:QIYD RW.XIY


בְּעֵת אִישַׁן וְאָעִֽירָה
B.:("T )IY$AN W:)F(IYRFH


וְעִם רוּחִי גְּוִיָּתִי
W:(IM RW.XIY G.:WIY.FTIY


יהוה לִי וְלֹא אִירָא
YHWH LIY W:LO) )IYRF)

ISalzman
11-22-2010, 12:44 PM
Adon Olam. Now you are talking memories, Brian! By the way, you might be interested to know that the hymn/song Adon Olam has been put to many melodies, all of them nice, in my humble opinion.

BigJayOneill
11-22-2010, 11:05 PM
Some Good news, AND a challenge!

I have just finished converting or rather re-transcribing a small Ashkenazi Siddur with vowel points into the CCAT format.
When a 12-point font is used, it takes about 69 pages in OpenOffice Writer.
(Sorry, I have been busy at work, so I was a little on the slow side with this project. From now on I should have more time.)

Now, I need help in thinking about how to arrange/organize the text for use with BibleWorks database compiler.

The issue here is the fact that:
Prayer books are usually not numbered and versified like Bible texts. However, to import the Siddur text into the database compiler, it will need to be versified.

Does anyone have any suggestions?


This is great news! I can't wait to see it! Thanks for your efforts!

I think I would treat each core piece of the liturgy as a book (like books of the bible), and give each main section within each piece of the liturgy a chapter number. From there, verse numbers. Each Kaddish could be one book, with one chapter, with whatever number of verses. The Shema could be one book, with three chapters, each with so many verses. The Amidah could be one book, with 19 or more chapters, each with so many verses. You will have to come up with some "book names" for the various prayers! However, this could become our standard layout for future projects! There are a few English Siddurim available within the public domain that could easily become B.W. modules.
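The book/chapter/verse layout described above could be generated mechanically once the text is split into sections and lines. The Python sketch below is only an illustration under stated assumptions: it assumes the compiler accepts one verse per line in the form `Abb chapter:verse text`, and the abbreviation `Shm` and the placeholder lines are hypothetical, not an agreed standard.

```python
def versify(book_abbr, chapters):
    """Number a prayer as a 'book': each section becomes a chapter,
    each line within a section becomes a verse."""
    lines = []
    for chapter_no, verses in enumerate(chapters, start=1):
        for verse_no, text in enumerate(verses, start=1):
            lines.append(f"{book_abbr} {chapter_no}:{verse_no} {text}")
    return lines

# Hypothetical example: a prayer with two sections of placeholder text.
sections = [["first line", "second line"], ["third line"]]
for line in versify("Shm", sections):
    print(line)
# prints:
# Shm 1:1 first line
# Shm 1:2 second line
# Shm 2:1 third line
```

A prayer with a single section, such as one Kaddish, would simply be passed as a list containing one chapter.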

Nevertheless, please do what you think is best!

Hey Irv, you better know Adon Olam! :D Shalom my friend!

ISalzman
11-23-2010, 12:31 PM
Hey Irv, you better know Adon Olam! :D Shalom my friend!

Absolutely, Jay! Adon Olam, Lord of the Universe, please bring back the Whale for my friend Jay! Stamkos is amazing by the way!

Mark Eddy
11-23-2010, 11:45 PM
Just a couple hints about dividing up new databases. When you come up with "book name" abbreviations, they have to have a combination of three letters or numbers each, and they have to be different from any other three-character abbreviation used for any other database in BibleWorks. I think that the BibleWorks blog should have a books name file (ending in .bna) available, which contains all the abbreviations in use thus far for publicly available databases.
Concerning the "verse" number, some of Michael Hanel's classical Greek databases use line numbers instead of coming up with a non-standard verse division. Such line numbers work fine for displaying the text in BW, but if a translation is envisioned to go along with it, it is sometimes difficult to match up the lines of the translation exactly with the original. At present, verse map files (which allow the user to view multiple verses or lines together) do not work for non-biblical books. But God willing, that can be remedied if the programmers ever have time to branch out farther from the biblical books, which are their focus.
If you want to bounce your decisions about abbreviations and numbering systems off of other users, just ask.
Mark Eddy

Dale A. Brueggemann
11-24-2010, 11:00 AM
..."book name" abbreviations, they have to have a combination of three letters or numbers each (I think the first character must be a letter), and they have to be different from any other

Biblical book name abbreviations include 1ki, 2ki, 1jo, etc.
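The two rules mentioned here (exactly three letters or digits, and no clash with an abbreviation already in use) are easy to check before compiling. The Python sketch below is only an illustration: the function name `is_valid_abbreviation` is made up, the list of taken abbreviations is a tiny sample rather than the real .bna contents, and treating upper and lower case as the same abbreviation is my assumption.

```python
import re

def is_valid_abbreviation(abbr, taken):
    """True if abbr is three letters/digits and not already in use."""
    if not re.fullmatch(r"[A-Za-z0-9]{3}", abbr):
        return False
    # Assumption: abbreviations are compared without regard to case.
    return abbr.lower() not in {t.lower() for t in taken}

taken = ["Gen", "1Ki", "2Ki", "1Jo"]           # sample only
print(is_valid_abbreviation("Sid", taken))     # prints: True
print(is_valid_abbreviation("gen", taken))     # prints: False
print(is_valid_abbreviation("Sidd", taken))    # prints: False
```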

BigJayOneill
11-24-2010, 12:00 PM
Absolutely, Jay! Adon Olam, Lord of the Universe, please bring back the Whale for my friend Jay! Stamkos is amazing by the way!


Thanks Irv! I say AMEN!

Stamkos is playing well right now.

bkMitchell
11-29-2010, 10:52 PM
...Concerning the "verse" number, some of Michael Hanel's classical Greek databases use line numbers instead of coming up with a non-standard verse division. Such line numbers work fine for displaying the text in BW, but if a translation is envisioned to go along with it, sometimes it is difficult to match up the lines of translation exactly with the original...
If you want to bounce your decisions about abbreviations and numbering systems off of other users, just ask...

Thanks, Mark. I will be asking a few questions very soon.


...
I think I would treat each core piece of the liturgy as a book (like books of the bible), and give each main section within each piece of the liturgy a chapter number. From there, verse numbers...

Thank you for your advice BigJayOneill.

bkMitchell
12-16-2010, 08:15 PM
Just to let you (plural) know, I haven't abandoned this project.
I am still working on it in my free time. If anyone wants any of the files I have converted so far, let me know and I will either post them here or send them by e-mail.

BigJayOneill
12-18-2010, 11:17 PM
Great news. I would be glad to check out your files.

Please feel free to take my Amidah files and include them in any type of Jewish liturgy module.

What happened to the guy who was going to make that conversion program?

Enjoy your week before Christmas!

bkMitchell
07-11-2011, 01:11 AM
This project has neither been forgotten nor abandoned.
I apologize for my absence from this thread. I have been both busy and a little lazy.



The first thing I would like to finish is the Siddur. The Siddur (http://he.wikisource.org/wiki/%D7%A1%D7%99%D7%93%D7%95%D7%A8_%D7%A0%D7%95%D7%A1%D7%97_%D7%90%D7%A9%D7%9B%D7%A0%D7%96) I am working with has been released under a Creative Commons license (http://he.wikipedia.org/wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94:%D7%A8%D7%99%D7%A9%D7%99%D7%95%D7%9F_Creative_Commons_%D7%99%D7%99%D7%97%D7%95%D7%A1-%D7%A9%D7%99%D7%AA%D7%95%D7%A3_%D7%96%D7%94%D7%94_3.0_%D7%9C%D7%90_%D7%9E%D7%95%D7%AA%D7%90%D7%9D) and should be free for anyone to use. From now on I will start posting some of my work, in the hope that it can receive constructive criticism. The first step in creating a custom Hebrew text version for use with BibleWorks is to enter the text in the CCAT format, since Unicode is not yet supported for Hebrew text. Rather than post the entire Siddur here, I have attached a draft of the CCAT Siddur to this post as a PDF.