Ben,
the MPTP export keeps at this moment quite a lot of the original Mizar proof
structure in the TPTP 'useful info' slot, but not all of it. Depending on
what kind of learning you want to do, it might or might not suffice. The
biggest ommission is probably the lack of further description of unproved
propositions (precisely in TPTP syntax: fofs with role 'unknown' and
mptp_info(_,_,proposition,_,_)). Most of them are probably natural deduction
(ND) assumptions, but some are not (e.g. propositions about constants
introduced by the 'consider' keyword - they can be rather thought of as
definitions).
It should not be difficult to add the missing info there - the TPTP format is
produced directly from the XML, which contains all the proof structure. There
have been several reasons for postponing this so far - my focus on getting
the reproving of the simple steps right (which is sort of a precondition for
ATP cross-verification of Mizar, which in turn is a precondition for
productive use of deductive tools like ATP as part of larger knowledge-based
systems tailored for Mizar), and also a focus on getting the reproving of
theorems from their external references right, which in a sense gives you a
large-scale proof structure (which is not only usable for learning, but also
- unlike the ND internal proofs - understandable to ATPs, and thus allowing
things like the MPTP challenge). Another reason was that I did not want to
decide about the details of MPTP ND annotations, until I decided about the
export of ND proof structure to TPTP. The latter accidentally happened last
week (not only as a next step for the cross-verification, but also as a
megalomanic plan to build the detailed MML DAG with milions of nodes :-). So
some missing annotations will appear in the next few weeks (maybe even days),
and more importantly, the Mizar ND proofs will become TPTP proofs (if needed
without any ND - though there is a good chance that assumptions will become
acknowledged and processed parts of TPTP proofs).
To sum up:
- you can have 'raw Mizar' loaded by using the XML - that gives you access to
the formulas and Mizar proof structure; the disadvantage is that for any
deductive tool which you might want to use, you'll have to define the
translation to its logic
- there is the 'raw MPTP', with formulas in extended TPTP syntax containing
the mptp_info annotations; this is sort of a middle way between the XML and
standard TPTP; the annotations are likely to get a bit better in near future,
what they annotate is still the Mizar ND structure
- there are pre-generated 'standard TPTP' problem sets in the MPTP distro and
the MPTPChallenge distro; these you can feed to ATP systems, and also
learning systems (the symbols and proposition names are stable there - always
the same semantics); for quite a lot of them an ATP proof can be found (and
used for learning)
- there will (hopefully soon) be a full TPTP (i.e. mostly/completely non-ND)
proof structure export, compatible with the proofs produced by ATP systems
like EP.
Josef
On Sun, 18 Mar 2007, Ben Goertzel wrote:
Josef,
Thanks for your reply! However, I'm not sure we are fully understanding
each other.
I agree that the TPTP language has a level of simplicity similar to that of
KIF, so that converting TPTP into KIF wouldn't be a big deal.
And, from the point of view of my own Novamente AI system, writing a
specialized TPTP loader and bypassing KIF wouldn't be a big deal either.
However, I am not sure from your email if the "MPTP export" of Mizar, at
this point, actually exports the **proofs** in the existing Mizar corpus
(Journal of Formalized Mathematics), or just the theorem statements. Can
you clarify? For my purposes, I need the proofs not just the theorems, as
my desire is to do inductive learning of proof-patterns based on automated
analysis of the proof-corpus.
Thanks
Ben Goertzel
Novamente LLC
ben@goertzel.org
Hi Ben,
On Sat, 17 Mar 2007, Ben Goertzel wrote:
Hi all,
I am the leader of an AI project (www.novamente.net) and am interested
to apply my AI system to automated theorem-proving based on a method
of having it ingest the existing corpus of Mizar proofs, inductively
infer
"patterns of proof" therefrom, and then use these patterns to guide its
work in further proofs.
Good, hope you'll submit your system to the MPTP Challenge
(http://www.cs.miami.edu/~tptp/MPTPChallenge/), and write a paper about it
to ESARLT (http://www.cs.miami.edu/~geoff/Conferences/ESARLT/).
However, the first issue confronted along this path is that my AI system
does not possess a Mizar loader. This is because Mizar's syntax is
bloody complicated! (Understandably so, because of its design goals.)
Thus, it seems necessary to convert Mizar to some simpler and more
standard
format. A sensible choice would seem to be KIF
http://www.ksl.stanford.edu/knowledge-sharing/kif/
but of course other similarly simple formats would do as well.
Mizar is translated to TPTP by the MPTP export
(http://kti.ms.mff.cuni.cz/~urban/MPTP2/mptp0.2.tar.gz). TPTP2KIF can be
probably achieved through TPTP2X from the TPTP distro (not that there
would be many "really running systems" fully implementing KIF (AFAIK)).
Also, just for initial learning experiments, the MPTPChallenge problems
(http://www.cs.miami.edu/~tptp/MPTPChallenge/MPTPChallenge.tgz) might be
enough.
My question is whether anyone has created a converter of this sort, or
has
thought seriously about it.
If no one has created one, does anyone have an educated estimate
regarding
how long this task would take for an individual with suitable background
in mathematics and software engineering?
I hope that MPTP is enough for you, you are welcome to play with it and
improve it - there are number of ways how one can "translate" Mizar and
its proofs, e.g. with respect to optimizations available in various
reasoning systems, and I hardly did the "basic" approach so far. Also
Mizar is available in XML-ized format
(http://mmlquery.mizar.org/mizardoc/xml/Mizar.html) which solves the
low-level technical "loading" issue, and may be also usable e.g. for
machine learning (fo some hints read e.g. section 3.5 of
http://ktiml.mff.cuni.cz/~urban/mizxml.ps). However it is certainly still
"Mizar world", not a standardized well-known format like TPTP.
Best regards,
Josef Urban