[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: [mizar] Reference distances?
Hi Freek,
On Fri, 3 Feb 2006, Freek Wiedijk wrote:
I wonder whether anyone here can tell me the answer to the
following:
I think Krzysztof Retel and Robert Milewski did some related data-mining
some time ago.
I would like to know the statistics about the "distances" of
references to labels in justifications. Specifically:
You can get the data easily from the recent MPTP version, e.g. for the
lemma IRRAT_1:E2_3 - the first by-inference in the proof of theorem
IRRAT_1:1
(http://lipa.ms.mff.cuni.cz/~urban/xmlmml/html.937/irrat_1.html#T1), which
is:
then E3: c1 > 1 by INT_2:def 5;
the MPTP format is:
fof(e2_3,lemma-derived, ~ r1_xreal_0(c1_3,1),
file(irrat_1,e2_3),
[mptp_info(2,[3],proposition,position(49,16),[3]),
inference(mizar_by,[],[e1_3,d5_int_2])]).
mptp_info(2,[3], ... ) (as well as e2_3) tells you that this is the second
proposition in the third top-level block,
inference(..., [e1_3,d5_int_2]) tells you that it was proved from e1_2,
i.e. the first (i.e. previous) proposition in the same block (and also
from INT_2:def 5).
Generally, the block paths can be longer, e.g. e2_49_3_1 (with
mptp_info(2,[49,3,1],...)) means that this is the second proposition in
the first subblock of the third subblock of the 49-th top-level block.
The data are in the pl/*.xml2 files of the MPTP0.2 distro, something like
cat pl*.xml2 | grep mizar_by | sed -e 's/^fof(\([^,]*\),.*mizar_by,\[\],\(.*\)..../\1:\2/'
will print all these data in the following format:
e2_3:[e1_3,d5_int_2]
It should be fairly easy to do the data-mining you describe on it
afterwards.
In the article that you're interested in, put _one_ label on
everything that you can put a label on. Then, restrict
yourself to labels that are "inside" a proof/end (or now/end,
hereby/end, etc.) block. Then, for a justification "count
back" to accessible labels. So the label of the statement
accessible with "then" would be counted as "1", the one before
that as "2", etc. For instance in
theorem Th1: ...
proof
...
end;
theorem Th2: ...
proof
A1: ...
consider ... such that A2: ...
assume A3: ...
proof
A4: ...
end;
A5: ...
then A6: ... by Th1,A1;
end;
the distances for this "by" at the justification of A6 would
be 1 (for the implicit A5 that's in the "then") and 4 (for the
A1). The "Th1" label should be ignored as it is outside the
proof/end blocks.
But this will generally happen often, references will be lemmas
proved/assumed in parent blocks, are you sure you do not want them?
Best,
Josef