[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: [mizar] Size of Mizar/MML



Hello,


Fabian, if you are interested in more detailed statistics of MML, I suggest using some XML files generated during the work of the Mizar verifier.
If you like, I can give you the files and teach you what information is in the files.
Having the files you can count, for example, how many times the given notion is used or how many existential quantifier is used, or …

You do not have to install the Mizar system; you can download the files and prepare some simple software to process them.


What do you think?


Kind regards


Artur




W dniu 5.06.2025 o 19:22, Fabian Huch huch _AT_ in.tum.de pisze:
On 6/5/25 18:53, Alex Nelson thmprover _AT_ gmail.com wrote:
Hello,

I am not sure what you mean by "autogenerated theorems", I do not think Mizar has such a feature.
Yes, this part of my project description applies to other ITPs.

Two bits of advice spring to mind:

Advice 1: Use the abstract files. If you use the latest MML, the $MIZFILES/abstr/ directory will number the theorems and definitions automatically for you. After thinking about it for a bit, I think this is "the right way to go". These are the local files corresponding to what's available online (https://mizar.uwb.edu.pl/version/current/html/).

You can simply "grep -r theorem $MIZFILES/abstr/ | wc -l" to get the number of theorems (for me, 75571) and  "grep -r ":: .+:sch " $MIZFILES/abstr/ | wc -l" counts the number of schemes (for me, 914). Note that this will not count "lemmas" which are local to files, but "egrep -nr "Lm[0-9]+:" $MIZFILES/mml/ | wc -l" suggests there are 7345 lemma; I would not count them, personally, because they are often made into theorems later in the article (or they prove half of an "iff" statement, or...).

Definitions are a bit trickier, since multiple definitions may appear within a single "definition block". One crude estimate may be give by looking at the number of times ":: ARTICLE:def " appears (running "grep -nr ":: .+:def "  $MIZFILES  abstr/ | wc -l" produces 14134). A similar grep may be done for synonyms (495) and antonyms (138), which are "abbreviations" of sorts.
That is a great suggestion. Thank you!

These numbers are for mizar-8.1.15_5.94.1493 which is the latest version of the MML. 

I should stress, these numbers are approximate. In particular, the theorem count will also include comments like ":: Binomial theorem" (the command "egrep -nr "::.+theorem" $MIZFILES/abstr/ | wc -l" tells me there are only 432 instances of this, which means there are 75139 theorems in the MML).
Certainly; my goal is merely an approximation. For uniformity, I will count the comments -- for some ITPs it is not feasible to spearate them.

I am certain there is a cleverer way to do a more accurate count of these things.

Advice 2: About axioms. Also, the articles (files) TARSKI_0 and TARSKI_A contain the axioms to the underlying set theory, so presumably you'd want to exclude them? These are merely 6 axioms and 1 scheme.

I think it is reasonable to include them (but hardly matters for the approximation).


Best,

Fabian