[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: [mizar] Size of Mizar/MML



On 6/5/25 18:53, Alex Nelson thmprover _AT_ gmail.com wrote:
Hello,

I am not sure what you mean by "autogenerated theorems", I do not think Mizar has such a feature.
Yes, this part of my project description applies to other ITPs.

Two bits of advice spring to mind:

Advice 1: Use the abstract files. If you use the latest MML, the $MIZFILES/abstr/ directory will number the theorems and definitions automatically for you. After thinking about it for a bit, I think this is "the right way to go". These are the local files corresponding to what's available online (https://mizar.uwb.edu.pl/version/current/html/).

You can simply "grep -r theorem $MIZFILES/abstr/ | wc -l" to get the number of theorems (for me, 75571) and  "grep -r ":: .+:sch " $MIZFILES/abstr/ | wc -l" counts the number of schemes (for me, 914). Note that this will not count "lemmas" which are local to files, but "egrep -nr "Lm[0-9]+:" $MIZFILES/mml/ | wc -l" suggests there are 7345 lemma; I would not count them, personally, because they are often made into theorems later in the article (or they prove half of an "iff" statement, or...).

Definitions are a bit trickier, since multiple definitions may appear within a single "definition block". One crude estimate may be give by looking at the number of times ":: ARTICLE:def " appears (running "grep -nr ":: .+:def "  $MIZFILES  abstr/ | wc -l" produces 14134). A similar grep may be done for synonyms (495) and antonyms (138), which are "abbreviations" of sorts.
That is a great suggestion. Thank you!

These numbers are for mizar-8.1.15_5.94.1493 which is the latest version of the MML. 

I should stress, these numbers are approximate. In particular, the theorem count will also include comments like ":: Binomial theorem" (the command "egrep -nr "::.+theorem" $MIZFILES/abstr/ | wc -l" tells me there are only 432 instances of this, which means there are 75139 theorems in the MML).
Certainly; my goal is merely an approximation. For uniformity, I will count the comments -- for some ITPs it is not feasible to spearate them.

I am certain there is a cleverer way to do a more accurate count of these things.

Advice 2: About axioms. Also, the articles (files) TARSKI_0 and TARSKI_A contain the axioms to the underlying set theory, so presumably you'd want to exclude them? These are merely 6 axioms and 1 scheme.

I think it is reasonable to include them (but hardly matters for the approximation).


Best,

Fabian