moonryul

01-09-2005, 08:30 PM

This is a continuation of the thread "Search Limitations of BibleWorks". I think the issue is important, but there has been no response to my last post, so I am reposting it under a more specific title.

[Michael]

I will take your carefully described request and put it on our list of possible changes. I do see the merit to your idea,

=> Thanks.

but I should warn you that it would not be simple to fully implement. The ASE code is well-designed, but as I am sure you are aware, in any large piece of complex software, a seemingly small change can have large ramifications.

=> Yes, in general. But not in this particular case.

As one small example, I can think of several practical situations where you would not want this feature unless there were further flags that could be used to de-activate the feature for given phrases under a merge box. These small changes start to add up.

=> See below.

As another small example, to be fully general, the program would need to properly deal with multiple ordering links between multiple pairs of merge boxes.

=> Yes, but it is trivial. See my explanation below.

A complex set of rules would then have to be implemented and enforced. All of this can be done, but I hope you can see that it is not a trivial change.

=> All we need are translation rules (A) and (B).

Rule (A):

(1) An ordering constraint O between two OR merge boxes, OR(p1, p2, ..., pn) and OR(q1, q2, ..., qm),

is equivalent to:

(1)':

O between p1 and q1, between p1 and q2, ..., between p1 and qm,

O between p2 and q1, between p2 and q2, ..., between p2 and qm,

...

O between pn and q1, between pn and q2, ..., between pn and qm.

Right? Rather than having the user specify (1)', let us have them specify (1) and let the system translate it into (1)'. This translation rule is obvious.

Of course, each pi and qj is either a Merge box or a Word box. If they are Word boxes, the translation is complete. If they are OR Merge boxes, translation rule (A) is applied again. If they are AND Merge boxes, rule (B) is applied again.

Rule (B):

(2) An ordering constraint O between two AND merge boxes, AND(p1,p2,...,pn) and AND(q1,q2,...,qm),

is equivalent to:

(2)':

O between the last daughter of the first AND box, pn, and the first daughter of the second AND box, q1.

Here I assume that pn is the last word of the first AND box, and q1 is the first word of the second AND box. See (C) below for this.

Of course, pn and q1 are either Merge boxes or Word boxes. If they are Word boxes, the translation is complete. If they are OR Merge boxes, translation rule (A) is applied again. If they are AND Merge boxes, translation rule (B) is applied again.

So, given an ordering constraint between OR boxes or AND boxes, it can be translated into equivalent ordering constraints between their daughters, recursively down the query tree.
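To make the recursion concrete, here is a minimal Python sketch of rules (A) and (B). It is only an illustration of the translation, not ASE code: the `Word` and `Merge` classes, the `endpoints` and `translate` functions, and the way a mixed pair (say an OR box against an AND box, or a Merge box against a Word box) is handled are all my own assumptions. The sketch returns only the word-level pairs that the constraint O ends up attached to.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Word:
    text: str


@dataclass(frozen=True)
class Merge:
    kind: str                      # "AND" or "OR" (hypothetical model of a merge box)
    daughters: Tuple["Box", ...]


Box = Union[Word, Merge]


def endpoints(box: Box, side: str) -> List[Word]:
    """Word boxes that a constraint touching `box` finally lands on.

    side == "from": box is the source of the ordering constraint.
    side == "to":   box is the target of the ordering constraint.
    """
    if isinstance(box, Word):
        return [box]
    if box.kind == "OR":
        # Rule (A): every daughter of an OR box takes part.
        return [w for d in box.daughters for w in endpoints(d, side)]
    if box.kind == "AND":
        # Rule (B): last daughter on the source side, first on the target side.
        daughter = box.daughters[-1] if side == "from" else box.daughters[0]
        return endpoints(daughter, side)
    raise ValueError(f"unknown merge kind: {box.kind}")


def translate(source: Box, target: Box) -> List[Tuple[str, str]]:
    """Expand one constraint between two boxes into word-level pairs."""
    return [(p.text, q.text)
            for p, q in product(endpoints(source, "from"), endpoints(target, "to"))]


# Example: OR(p1, AND(a, b)) --O--> AND(OR(q1, q2), q3)
left = Merge("OR", (Word("p1"), Merge("AND", (Word("a"), Word("b")))))
right = Merge("AND", (Merge("OR", (Word("q1"), Word("q2"))), Word("q3")))
pairs = translate(left, right)
```

In the example, the source side contributes p1 and b (the last daughter of the inner AND box), and the target side contributes q1 and q2 (the daughters of the first daughter of the outer AND box), so O is copied onto four word pairs.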

------------------------------------------------------------

Michael, you said:

you would not want this feature unless there were further flags that could be used to de-activate the feature for given phrases under a merge box.

=> I do not understand this. Perhaps you misunderstood the translation rule for ordering constraints, which I explained above in a more algorithmic manner.

Because an ordering constraint between Merge boxes is a simpler and equivalent way of specifying the same constraints repeatedly between the daughter boxes of the Merge boxes, there is no "de-activation" here.

Am I missing something?

(C) There is another concern about ordering constraints, which is related to the translation rule I am talking about but is also worth discussing in its own right.

In the manual, I read that an AND merge box, AND(p1,p2,p3), does not say anything about the ordering between p1, p2, and p3. Without any further ordering constraints on them, the search engine finds all 6 permutations of p1, p2, p3 within individual verses. This should hold even when the pi's are AND or OR Merge boxes. Right? So AND(AND(p1,p2), AND(q1,q2)) will be interpreted as saying: find permutations of p1, p2, q1, q2 in verses. I think this is very confusing.

I think AND(p1,p2,p3), without further ordering constraints, should be interpreted as saying:

Find a verse in which p1, p2, p3 occur in that order.

That is, the relative order of p1, p2, p3 is determined; only the distances between them are not given.

If I want to find some other permutations of p1, p2, p3, say p2, p1, p3 and p1, p3, p2, then I will say so by creating an OR merge box:

OR( AND(p1,p2,p3), AND(p2,p1,p3), AND(p1,p3,p2) ).
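As a rough illustration of the interpretation I am proposing, the following Python sketch treats a verse as a list of words and an AND box as an ordered list of terms with no distance limits. The function names `occurs_in_order` and `matches_any_ordering` and the sample verse are made up for this example; they say nothing about how the search engine is actually implemented.

```python
from typing import Sequence


def occurs_in_order(verse: Sequence[str], terms: Sequence[str]) -> bool:
    """True if the terms occur in the verse in the given order, at any distance."""
    words = iter(verse)
    # `term in words` consumes the iterator up to and including the first match,
    # so each later term must be found after the previous one.
    return all(term in words for term in terms)


def matches_any_ordering(verse: Sequence[str],
                         orderings: Sequence[Sequence[str]]) -> bool:
    """OR( AND(...), AND(...), ... ): the verse matches if any listed order occurs."""
    return any(occurs_in_order(verse, order) for order in orderings)


verse = ["in", "the", "beginning", "god", "created", "the", "heaven"]

in_order = occurs_in_order(verse, ["beginning", "god", "created"])
out_of_order = occurs_in_order(verse, ["god", "beginning", "created"])
either = matches_any_ordering(verse, [["god", "beginning", "created"],
                                      ["beginning", "god", "created"]])
```

Here the first query matches, the second does not, and the OR of the two orderings matches because one of its daughters does.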

This rule has several advantages.

(1) Under this rule, it is possible to translate an ordering constraint between AND(p1,p2,p3) and AND(q1,q2,q3) into the same ordering constraint between p3 and q1, when there are no further ordering constraints among p1, p2, p3 and among q1, q2, q3.

(2) It frees the user from the burden of specifying distances among p1, p2, p3, e.g. "at most 10", even when the user is NOT concerned with them. I find this burden quite unsatisfying. In fact, this is another instance where the system forces the user to specify too many details that the system could take care of by itself.

(3) The user need not specify a backward ordering arrow, say from p3 to p1, for the AND box AND(p1,p2,p3). Such a user interface is too confusing and complicated. Again, when there is a need for such an ordering constraint, I can introduce another AND box and an OR box, so that the new and the old AND boxes are daughters of the OR box. No backward arrows! A query language is also a language. It should be easy to interpret and easy to modify. I doubt very much that users will ever use backward ordering arrows. See (4) for another way in which this feature is wasteful.

(4) The search engine does not have to waste its time finding all permutations of p1, p2, p3 and then throwing most of them away because they do not satisfy the specified ordering constraints. I think this can become critical when a query becomes general and complex.
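To put a number on point (4): if an AND box with n daughters is read as "any order", there are n! candidate orders, against exactly one under the ordered reading. The small Python sketch below only counts candidate orders under each reading; the function `orderings_to_check` is hypothetical, and I am assuming, not claiming, that the engine's work grows with the number of orders it has to consider.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def orderings_to_check(terms: Sequence[str], ordered: bool) -> List[Tuple[str, ...]]:
    """Candidate orders of the daughters under each reading of an AND box."""
    if ordered:
        # Proposed reading: only the order in which the daughters are written.
        return [tuple(terms)]
    # Current reading as I understand it from the manual: every permutation.
    return list(permutations(terms))


three = ["p1", "p2", "p3"]
unordered_count = len(orderings_to_check(three, ordered=False))
ordered_count = len(orderings_to_check(three, ordered=True))
```

For three daughters that is 6 orders against 1; for five daughters it is already 120 against 1.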

Are there any good arguments for interpreting AND(p1, p2, p3) as meaning all permutations of p1, p2, p3?

------------------------------

Moon Jung

Associate Professor

Dept of Media Tech

Sogang Univ, Seoul, Korea
